Every time an artificial intelligence answers a question or guides a robot's hand, it spends energy, and lately it has been spending far more than the task seems to deserve. A team at Tufts University's School of Engineering has now offered a working counterexample. In the laboratory of Matthias Scheutz, the Karol Family Applied Technology Professor, researchers built a proof-of-concept AI that completed structured robotic tasks more accurately than conventional systems while drawing as little as one-hundredth of the energy. The work was presented in June at the IEEE International Conference on Robotics and Automation (ICRA) 2026 in Vienna and published in the conference proceedings.

Rather than the screen-bound large language models most people interact with, the Tufts group studies vision-language-action (VLA) models — systems that extend language models with sight and movement so a robot can read a scene, interpret an instruction, and act on it by driving its wheels, arms, and fingers. To put their approach to the test, they used the Tower of Hanoi, a deceptively simple puzzle whose strict, rule-bound logic offers no shortcut through pattern-matching. On the standard version, the team's system succeeded 95% of the time, against just 34% for a conventional VLA. On a harder variant the robot had never encountered in training, the gap widened into a chasm: the new system solved it 78% of the time while standard models failed on every single attempt.
The efficiency figures are where the result becomes hard to ignore. Training the new model took 34 minutes, compared with more than a day and a half for the conventional VLA, and consumed only 1% of the energy. The savings carried into operation as well, with the system using roughly 5% of the power a standard model needed to carry out the same tasks. The headline claim of up to 100 times less energy reflects the training figure, where 1% of the energy is a 100-fold reduction; the 5% figure in operation corresponds to a 20-fold reduction.
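For readers who want to check the arithmetic behind those ratios, here is a minimal Python sketch. It uses only the figures quoted above; the baseline training time is assumed to be exactly 36 hours, which is a lower bound on "more than a day and a half", and the variable and function names are illustrative rather than taken from the study.

```python
# Figures quoted in the article. The baseline training time is given only as
# "more than a day and a half", so 36 hours is an assumed lower bound and the
# resulting speedup is a lower bound as well.
NEW_TRAINING_MINUTES = 34
BASELINE_TRAINING_MINUTES = 1.5 * 24 * 60  # 36 hours expressed in minutes

TRAINING_ENERGY_FRACTION = 0.01   # "only 1% of the energy"
OPERATION_ENERGY_FRACTION = 0.05  # "roughly 5% of the power"


def reduction_factor(fraction: float) -> float:
    """Convert 'uses this fraction of the baseline' into an 'N times less' factor."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return 1.0 / fraction


training_speedup = BASELINE_TRAINING_MINUTES / NEW_TRAINING_MINUTES
training_energy_factor = reduction_factor(TRAINING_ENERGY_FRACTION)
operation_energy_factor = reduction_factor(OPERATION_ENERGY_FRACTION)

print(f"Training time: at least {training_speedup:.1f}x faster")
print(f"Training energy: about {training_energy_factor:.0f}x less")
print(f"Operating energy: about {operation_energy_factor:.0f}x less")
```

Under these assumptions, the training-energy factor is the one that matches the "up to 100 times" headline, while the operating figure works out to a smaller but still substantial factor.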

The secret is not a bigger network but a different architecture. The Tufts system is neuro-symbolic: it pairs the pattern recognition and generation of neural networks with symbolic reasoning, the kind of step-by-step, rule-based thinking humans use when they break a problem into categories and stages. A purely neural VLA, like a language model, acts on statistical guesses drawn from enormous training sets, and those guesses can misfire — a misread shadow, a misplaced block, a tower that topples. By letting symbolic rules constrain the trial and error, the hybrid reaches a reliable solution with far less flailing, and far less compute.
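To make the idea of rules constraining trial and error concrete, here is a toy Python sketch on the Tower of Hanoi. It is not the Tufts system: the "neural" component is replaced by a hypothetical stand-in (`mock_policy_proposals`) that proposes every peg-to-peg move, legal or not, and the symbolic side is a simple rule check plus the classic recursive planner. All names are illustrative assumptions.

```python
from typing import Dict, List, Tuple

State = Dict[str, List[int]]  # peg name -> discs from bottom to top (bigger int = bigger disc)
Move = Tuple[str, str]        # (source peg, destination peg)


def is_legal(state: State, move: Move) -> bool:
    """Symbolic rule: move only a top disc, and never onto a smaller disc."""
    src, dst = move
    if src == dst or not state[src]:
        return False
    return not state[dst] or state[dst][-1] > state[src][-1]


def apply_move(state: State, move: Move) -> State:
    """Return a new state with the top disc of the source peg moved."""
    src, dst = move
    new_state = {peg: discs[:] for peg, discs in state.items()}
    new_state[dst].append(new_state[src].pop())
    return new_state


def plan(n: int, src: str, dst: str, aux: str) -> List[Move]:
    """Symbolic planner: the classic recursive Tower of Hanoi solution."""
    if n == 0:
        return []
    return plan(n - 1, src, aux, dst) + [(src, dst)] + plan(n - 1, aux, dst, src)


def mock_policy_proposals(state: State) -> List[Move]:
    """Hypothetical stand-in for a learned policy: proposes every peg pair."""
    pegs = sorted(state)
    return [(a, b) for a in pegs for b in pegs if a != b]


def filter_proposals(state: State, proposals: List[Move]) -> List[Move]:
    """Symbolic constraint: discard proposals that break the puzzle rules."""
    return [m for m in proposals if is_legal(state, m)]


def solve(n: int) -> Tuple[State, int, int]:
    """Run the planner, counting how many proposals the rules rule out."""
    state: State = {"A": list(range(n, 0, -1)), "B": [], "C": []}
    steps = 0
    rejected = 0
    for planned in plan(n, "A", "C", "B"):
        proposals = mock_policy_proposals(state)
        legal = filter_proposals(state, proposals)
        rejected += len(proposals) - len(legal)
        if planned not in legal:
            raise RuntimeError(f"planner produced an illegal move: {planned}")
        state = apply_move(state, planned)
        steps += 1
    return state, steps, rejected


final_state, steps, rejected = solve(3)
print(final_state, steps, rejected)
```

The point of the sketch is the division of labor: the rule check prunes candidate actions before any of them is tried, so the search never spends effort on moves that cannot be part of a valid solution. A real neuro-symbolic VLA would have to ground these symbols in camera input and motor commands, which this toy example does not attempt.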
Why It Matters
The backdrop to this research is an energy curve that is bending in the wrong direction. The International Energy Agency estimates that AI and data centers in the United States consumed about 415 terawatt-hours of electricity in 2024 — more than 10% of the nation's output that year — and projects that figure to roughly double by 2030. As AI is stitched into search engines, office software, and industrial systems, demand is feeding a competitive race to build ever larger data centers, some drawing hundreds of megawatts, more power than many small cities require.

Against that trajectory, the disproportion between effort and result is striking. Scheutz offers an everyday illustration: the AI summary that now sits atop a Google search can consume up to 100 times more energy than generating the ordinary list of website links beneath it. Multiply that overhead across billions of daily queries and the cost is no longer abstract. The Tufts result matters because it challenges the assumption that better AI must mean bigger, hungrier models — and suggests the curve can be bent through smarter design rather than simply more silicon.
There is a reliability dividend, too. The same hallucinations that produce invented legal citations or six-fingered hands in image generators show up in robots as dropped blocks and toppled towers. A system that reasons over explicit rules is not only cheaper to run but easier to trust, which matters a great deal once machines are asked to act in the physical world rather than merely chat.
Reaction
The finding lands in the middle of a long-running argument about where artificial intelligence should go next. For most of the past decade, the dominant strategy has been scale: more parameters, more data, more compute. A vocal minority of researchers has insisted that pure neural scaling would eventually hit a wall of cost and brittleness, and that reintroducing symbolic reasoning — an idea nearly as old as AI itself — would be necessary to get reliable behavior. Results like the Tufts study give that camp concrete numbers to point to.

Still, the researchers are careful about what the proof-of-concept does and does not show. The Tower of Hanoi is a clean, well-structured problem with clear rules, which is precisely the setting where symbolic reasoning shines; the messier, ambiguous tasks of an ordinary kitchen or warehouse are a harder proving ground. The team frames its work not as a finished replacement for today's models but as evidence that current LLMs and VLAs, for all their popularity, may not be the right foundation for energy-efficient, dependable AI. The honest reading is that hybrid approaches deserve a serious look, not that the scaling era is over.
What's Next
Presenting at ICRA — the field's flagship robotics gathering, held this year at Messe Wien in Vienna under the theme "Robots for All" — places the work squarely in front of the community most likely to build on it. The natural next step is to push neuro-symbolic VLAs beyond tidy puzzles toward the unstructured tasks that define real human environments, the domain Scheutz's lab focuses on through its work on robots that interact with people.

The timing also fits a broader shift in the industry. After years of chasing general-purpose mega-models, much of the field is now exploring specialized systems tuned for particular domains — video, scientific research, robotics — where efficiency and reliability matter as much as raw capability. A neuro-symbolic robot brain that trains in minutes on a fraction of the power is exactly the kind of specialized, sustainable tool that vision implies, and the conference proceedings give other labs a blueprint to test and extend.
Closing Thoughts
It is tempting to read every AI advance as another step up the same staircase — larger models, larger clusters, larger bills. The Tufts work is a quiet reminder that progress can also mean doing more with less. By borrowing the human habit of breaking problems into rules and steps, a modest hybrid system outperformed brute-force pattern matching while sipping energy rather than gulping it.

None of this guarantees that neuro-symbolic AI will become the default; one elegant result on a benchmark puzzle is not a paradigm shift. But it reframes a question the field has been reluctant to ask out loud: not how much intelligence we can buy with more power, but how much we are wasting for want of better design. If the next decade of AI is shaped as much by energy limits as by ambition, results like this one suggest the wall ahead may have a door in it.
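Summary
A research team led by Professor Matthias Scheutz at the Tufts University School of Engineering in the United States has unveiled a proof-of-concept "neuro-symbolic AI" that uses up to 100 times less energy than conventional AI while producing more accurate results. Its core idea is to combine the pattern recognition of neural networks with symbolic reasoning, the way humans solve problems by breaking them into rules and steps. The work was presented in June at the IEEE International Conference on Robotics and Automation (ICRA) 2026 in Vienna, Austria, and published in the conference proceedings.
The team tested vision-language-action (VLA) models for robots on the Tower of Hanoi puzzle. On the standard puzzle the new system achieved a 95% success rate, well ahead of the conventional VLA's 34%, and on a more complex version not seen during training the gap widened to 78% versus 0%. Training time fell from more than a day and a half to 34 minutes, while training energy was about 1% and execution energy about 5% of the conventional model's. Set against the estimate that AI and data centers in the United States consumed about 415 terawatt-hours in 2024, with that figure projected to double by 2030, the result suggests that "bigger models" may not be the only answer.
The researchers nonetheless drew a careful line: tasks with clear rules, such as the Tower of Hanoi, favor symbolic reasoning, and extending the approach to the ambiguous tasks of everyday life is a separate challenge. Even so, amid long-standing concerns that the race toward giant models may run into limits of cost and reliability, the work is regarded as a case showing that hybrid approaches, which deliver more stable performance with fewer resources, deserve serious consideration.
References / Sources: Tufts Now, ScienceDaily, ICRA 2026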