Test-Time Training Could Help Robot Learning Adapt on the Fly

By Brian Heater, Managing Editor, A3

07/08/2025

MIT

Humans are pretty good at picking things up on the fly; robots, not so much. While large language models (LLMs) offer a lot of potential for automation, they too can fall short when facing unforeseen challenges.

A new technique called test-time training (TTT) could be a key to increasing real-time robot adaptability in the field. According to the authors of a newly published paper titled “The Surprising Effectiveness of Test-Time Training for Few-Shot Learning,” TTT offers great promise for robotic reasoning when systems encounter new challenges not addressed by the LLMs that power them.

“Genuine learning — what we did here with test-time training — is something these models can’t do on their own after they are shipped,” says the paper’s lead author, Ekin Akyürek. “They can’t gain new skills or get better at a task. But we have shown that if you push the model a little bit to do actual learning, you see that huge improvements in performance can happen.”

The MIT team conducting the research points to a sixfold improvement in accuracy, as TTT temporarily adjusts the LLM based on the challenges the system encounters.

“We find that test-time training is a much stronger form of learning,” adds coauthor Mehul Damani. “While simply providing examples can modestly boost accuracy, actually updating the model with those examples can lead to significantly better performance, particularly in challenging domains.”
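
The idea of temporarily updating a model with a handful of examples can be illustrated with a toy sketch. This is not the researchers' implementation: a one-variable linear model stands in for the LLM, and the names `LinearModel` and `adapt_at_test_time`, the learning rate, and the step count are all assumptions chosen for illustration. What it shows is the pattern described above: the base model stays untouched, while a temporary copy is trained on the new task's examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearModel:
    """Toy stand-in for a shipped model (hypothetical, for illustration)."""
    w: float
    b: float

    def predict(self, x: float) -> float:
        return self.w * x + self.b


def adapt_at_test_time(base, examples, lr=0.1, steps=300):
    """Return a temporary copy of `base` fitted to a few task examples.

    Runs plain gradient descent on mean squared error. The base model
    is frozen, so it is never modified.
    """
    w, b = base.w, base.b
    n = len(examples)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in examples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in examples) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return LinearModel(w, b)


# The shipped model believes y = x; the new task actually follows y = 2x + 1.
base = LinearModel(w=1.0, b=0.0)
task_examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]

adapted = adapt_at_test_time(base, task_examples)

before = base.predict(3.0)     # prediction without any update
after = adapted.predict(3.0)   # prediction from the temporary copy
```

Once the task is finished, the adapted copy can simply be discarded, which mirrors the temporary nature of the adjustment the researchers describe.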

TTT doesn’t make for especially fast thinking, however. Processing can take five to 10 minutes, up from the more standard sub-minute timing. That slowdown is far from optimal when it comes to putting robots to work. For that reason, the researchers propose a combined approach that defaults to the standard LLM and shifts to TTT only for especially challenging tasks.
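
A combined approach of this kind could be sketched roughly as follows. The article does not say how a system would decide that a task is hard, so the difficulty score, the `threshold` value, and every function name here are hypothetical placeholders rather than anything taken from the paper.

```python
from typing import Callable


def answer(task: str,
           difficulty: Callable[[str], float],
           fast_llm: Callable[[str], str],
           slow_ttt: Callable[[str], str],
           threshold: float = 0.8) -> str:
    """Default to the fast path; fall back to TTT only for hard tasks."""
    if difficulty(task) >= threshold:
        return slow_ttt(task)   # slower path, reserved for hard cases
    return fast_llm(task)       # standard quick response


# Stand-in components, for illustration only.
def toy_difficulty(task: str) -> float:
    # Assumption: tasks flagged as "novel" are treated as hard.
    return 0.9 if "novel" in task else 0.1


def toy_llm(task: str) -> str:
    return f"llm: {task}"


def toy_ttt(task: str) -> str:
    return f"ttt: {task}"


easy = answer("stack the usual boxes", toy_difficulty, toy_llm, toy_ttt)
hard = answer("sort a novel part type", toy_difficulty, toy_llm, toy_ttt)
```

The design choice is a simple one: keeping the fast path as the default means the multi-minute cost is paid only when the quick answer is unlikely to be good enough.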

“We wouldn’t want to do this for all user queries, but it is useful if you have a very hard task that you want the model to solve well,” according to Akyürek. “There also might be tasks that are too challenging for an LLM to solve without this method.”