Beyond bigger models: OpenAI pioneers smarter AI with 'Test-Time Compute', stalling scaling for now
OpenAI and other top AI companies are moving away from building ever-bigger models in pursuit of smarter training techniques. The shift comes as scaling methods reach their practical and financial limits. According to a recent Reuters report, leading researchers, investors, and executives in the AI industry are focusing their efforts on making AI “think” in more human-like ways to improve model efficiency.
These new techniques are exemplified by the recently released OpenAI o1 model. Unlike traditional scaling, which improves models by adding ever more data and computing power during training, o1 relies on “test-time compute.” This approach lets the model generate and evaluate multiple candidate solutions in real time during the inference stage, when the AI is actively in use.
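OpenAI has not published the details of how o1 allocates compute at inference, but the general idea can be illustrated with a simple “best-of-N” sampling sketch: sample several candidate answers and keep the one a verifier scores highest. The functions `generate_candidate` and `score_candidate` below are hypothetical placeholders for a real language model and a real verifier or reward model; this is an illustrative sketch, not OpenAI's method.

```python
import random

def generate_candidate(prompt: str) -> str:
    """Stand-in for one stochastic sample drawn from a language model."""
    return f"candidate answer #{random.randint(1, 1000)} to: {prompt}"

def score_candidate(prompt: str, answer: str) -> float:
    """Stand-in for a verifier or reward model that rates an answer."""
    return random.random()

def best_of_n(prompt: str, n: int = 8) -> str:
    """Spend extra inference-time compute (larger n) to pick a better answer,
    instead of relying on a bigger pre-trained model."""
    candidates = [generate_candidate(prompt) for _ in range(n)]
    return max(candidates, key=lambda ans: score_candidate(prompt, ans))

if __name__ == "__main__":
    print(best_of_n("What is the capital of France?", n=8))
```

The key trade-off is that quality now scales with how much compute is spent per query (the size of `n`) rather than only with how large the model was made during training.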
Noam Brown, a researcher at OpenAI, spoke about the effect of this method during the TED AI conference in San Francisco last month. He said, “It turned out that having a bot think for just 20 seconds in a hand of poker got the same boosting performance as scaling up the model by 100,000x and training it for 100,000 times longer.”
Top AI researchers and investors believe this shift could tilt the AI race toward this branch of research. OpenAI declined to comment on the new techniques. Since the viral launch of ChatGPT in 2022, tech companies have upped the ante on ever-larger models, which boosted valuations and attracted investors. However, prominent AI scientists such as Ilya Sutskever, co-founder of Safe Superintelligence (SSI) and a former executive at OpenAI, are now weighing in on the limitations of the “bigger is better” philosophy.
Sutskever, one of the earliest advocates of scaling very large models, told Reuters that pre-training results have plateaued. He explained that scaling up pre-training on ever more data once delivered breakthroughs such as ChatGPT, but those same methods are now yielding diminishing returns. “The 2010s were the age of scaling,” Sutskever said. “Now we're back in the age of wonder and discovery once again. Everyone is looking for the next thing,” he added.
Meanwhile, researchers are running into delays and disappointing results in their efforts to surpass OpenAI's GPT-4, as training large language models costs tens of millions of dollars. Hundreds of chips must run simultaneously, and any hardware failure risks ruining months of work. Models also demand vast amounts of training data, which is increasingly scarce, while power shortages further slow these energy-hungry “training runs.”
Other AI labs, including Google DeepMind and Anthropic, are reported to be developing similar techniques. OpenAI's chief product officer, Kevin Weil, also described the company's strategy: “By the time people do catch up, we're going to try and be three more steps ahead.”
This shift could also reshape the AI hardware landscape, which Nvidia has long dominated with its training chips. A move toward inference, or real-time processing, creates an opening for competitors, and venture capital firms such as Sequoia and Andreessen Horowitz are watching closely.
Sonya Huang, a partner at Sequoia Capital, described the change: “This shift will move us from a world of massive pre-training clusters toward inference clouds, which are distributed, cloud-based servers for inference.” Nvidia, meanwhile, points to strong demand for inference. “We've now discovered a second scaling law,” CEO Jensen Huang noted, pointing to demand for the company's latest chip, Blackwell.
As AI firms push along new paths, these innovations are likely to shape not only the future of AI but also the competitive dynamics of the tech and chip industries.