Has AI's 'scaling law' become invalid?
2024-12-19
According to a recent report on Time magazine's website, companies have spent more than a decade betting on a tempting rule: as long as they keep finding ways to scale up their artificial intelligence (AI) systems, those systems will keep getting smarter. This is not mere wishful thinking. The term "scaling laws," coined by OpenAI in 2020, has become the industry's shorthand for the trend. The theory has prompted AI companies to wager billions of dollars on ever-larger computing clusters and datasets, and the gamble has paid off handsomely, turning crude text generators into articulate chatbots.

But the "bigger is better" creed is now being questioned. Reuters and Bloomberg recently reported that leading AI companies are seeing diminishing returns from scaling up their systems. Days earlier, the US outlet The Information reported that OpenAI's unreleased "Orion" model had fallen short of expectations in internal testing, leaving the company doubtful about continuing to push the approach. A co-founder of Andreessen Horowitz, the well-known Silicon Valley venture capital firm, has likewise noted that adding computing power no longer buys the same "intelligence gains."

Nevertheless, many leading AI companies appear confident that the technology is still advancing rapidly. A spokesperson for Anthropic, developer of the popular chatbot Claude, said in a statement: "We have not seen any signs of deviation from the scaling laws." OpenAI declined to comment, and Google DeepMind did not respond to a request for comment. Last week, a new experimental version of Google's Gemini model displaced OpenAI's GPT-4o at the top of an AI performance leaderboard, and Google CEO Sundar Pichai posted on the social platform X: "Stay tuned."

Recent model releases, however, paint a more complicated picture. Anthropic has updated its mid-sized model Sonnet twice since its release in March, making it more capable than the company's largest model, Opus, which has yet to receive such an update. In June, Anthropic said Opus would be updated "later this year," but last week co-founder and CEO Dario Amodei declined to give a specific timeline on Lex Fridman's podcast. Google updated its smaller Gemini Pro model in February but has not yet updated the larger Gemini Ultra. OpenAI's recently released o1-preview beats GPT-4o on several benchmarks but lags it in other respects; it is reportedly referred to internally as "GPT-4o with reasoning," suggesting that its underlying model is similar in scale to GPT-4.

With competing interests on every side, the truth is hard to pin down. Last week, Amodei said that if Anthropic cannot produce more powerful models, "we as a company will completely fail," laying bare the stakes for AI companies that have bet on perpetual progress: a slowdown could spook investors and trigger an economic reckoning. Meanwhile, Ilya Sutskever, OpenAI's former chief scientist and long a staunch champion of the scaling laws, now says that performance gains from larger models have hit a plateau. But his position carries its own bias: his new AI startup, Safe Superintelligence, founded in June, cannot match its competitors' funding or computing power.
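What exactly is in dispute can be stated quantitatively. The widely cited form of the scaling laws is a power law in which loss falls predictably as training compute grows. The sketch below is only an illustration: the exponent and constants are assumptions chosen for readability, not figures reported in this article.

```python
# Illustrative sketch of the power-law relationship usually meant by "scaling laws":
# model loss falls smoothly as a power law of training compute, so each extra 10x of
# compute buys a smaller absolute improvement. All constants here are illustrative
# assumptions, not values from this article.

def predicted_loss(compute: float, loss_ref: float = 3.0,
                   compute_ref: float = 1.0, alpha: float = 0.05) -> float:
    """Predicted loss L(C) = loss_ref * (compute_ref / C) ** alpha."""
    return loss_ref * (compute_ref / compute) ** alpha

if __name__ == "__main__":
    for c in (1, 10, 100, 1_000, 10_000):
        print(f"compute = {c:>6} (arbitrary units) -> predicted loss = {predicted_loss(c):.3f}")
```

The argument described in this article is, in effect, over whether real systems are still tracking a curve of this shape or have begun to fall off it.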
A breakdown of the scaling assumption could help level the competitive playing field. Gary Marcus, a prominent AI commentator and author of books including "Taming Silicon Valley," said: "They believe there are certain mathematical laws, and they make predictions based on those laws, but the systems have not met expectations." The recent diminishing returns, he argued, show that we have finally "hit a wall," something he has been warning about since 2022. "I didn't know exactly when we would hit it, and we did make some progress in the meantime. But now we seem to be stuck." The slowdown, Marcus said, may reflect the limits of current deep-learning techniques, or simply the fact that "there isn't enough new data left."

Some close observers of AI agree. Sasha Luccioni, AI and climate lead at Hugging Face, said there is only so much information to be extracted from text and images. She pointed out that text messages are more prone to misunderstanding than face-to-face conversation, an illustration of the limits of textual data. "I think the same goes for language models," she said. Ege Erdil, a senior researcher at Epoch AI, a nonprofit that studies AI development, said data is especially scarce in areas such as reasoning and mathematics, where we "really don't have that much high-quality data." In his view, that does not mean scaling is about to stop, only that scaling on its own will not be enough. "At every order of magnitude of scaling, different innovations need to be found," he said, adding that this does not mean AI progress will slow down overall.

Nor is this the first time critics have pronounced the scaling laws dead. "Every stage of scaling has been controversial," Amodei said last week. "The latest controversy is that data is running out, or the data isn't high-quality enough, or models can't reason... I have seen enough of this to believe that scaling will probably continue." On the Y Combinator podcast, OpenAI CEO Sam Altman looked back on the company's early days and credited part of its success to an "almost religious faith" in scaling, an idea that was considered "heretical" at the time. When Marcus recently claimed on X that his prediction of diminishing returns had come true, Altman responded: "There is no wall."

Jaime Sevilla, director of Epoch AI, said there may be another explanation for the steady stream of reports that new models have fallen short of internal expectations. After speaking with people at OpenAI and Anthropic, he came away with the impression that expectations had simply been set too high. "They expect AI to be able to write doctoral dissertations," he said. "Saying so may be a bit disappointing." Sevilla added that a temporary lull does not necessarily signal a broader slowdown. Historically, major breakthroughs have been spaced far apart: GPT-4, released 19 months ago, arrived 33 months after GPT-3. "We tend to forget that the computing power used for GPT-4 was about 100 times that of GPT-3," Sevilla said. "To get something 100 times more powerful than GPT-4 would require up to one million graphics processing units (GPUs)." That is larger than any GPU cluster currently known.
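Sevilla's arithmetic can be made concrete with a rough back-of-the-envelope sketch. Only the 100x compute ratio and the "up to one million GPUs" figure come from the article; the baseline cluster size below is an illustrative assumption.

```python
# Back-of-the-envelope version of Sevilla's point: each frontier jump so far has
# used roughly 100x the training compute of the last, so repeating the jump from
# GPT-4 implies a cluster on the order of a million GPUs.
# The 10,000-GPU baseline is an illustrative assumption, not a figure from the
# article; it is chosen so that a 100x jump lands on the article's "up to one million".

GPT3_TO_GPT4_COMPUTE_RATIO = 100        # stated in the article
NEXT_JUMP_RATIO = 100                   # another 100x jump, as Sevilla describes
ASSUMED_BASELINE_CLUSTER_GPUS = 10_000  # illustrative assumption

# Holding training duration and per-GPU throughput roughly constant, the GPU count
# scales linearly with training compute.
gpus_for_next_jump = ASSUMED_BASELINE_CLUSTER_GPUS * NEXT_JUMP_RATIO
print(f"GPUs needed for another {NEXT_JUMP_RATIO}x in compute: ~{gpus_for_next_jump:,}")
# Prints ~1,000,000 -- larger than any GPU cluster currently known, per the article.
```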
However, he pointed out that collaborative efforts to build AI infrastructure have already begun this year: Elon Musk's 100,000-GPU supercomputer in Memphis, the largest of its kind, was reportedly built in just three months. In the meantime, AI companies may explore other ways to improve performance after a model has been trained. OpenAI's latest model is seen as an example: given more time to "think," it surpasses earlier models in reasoning ability. "This is something we already knew could happen," Sevilla said, referring to a report Epoch AI published in July 2023.

Prematurely declaring that AI is slowing down could have consequences beyond Silicon Valley and Wall Street. The perceived pace of progress after GPT-4's release prompted an open letter calling for a six-month pause on training larger systems, to give researchers and governments a chance to catch up. The letter gathered more than 30,000 signatures, including those of Elon Musk and Turing Award winner Yoshua Bengio. By the same token, it is unclear whether a perceived slowdown would push people the other way, letting AI safety concerns slip off the agenda.

Much of US AI policy rests on the assumption that AI systems will keep growing in scale. A provision of the sweeping executive order on AI that Biden signed in October 2023 (which Trump is expected to repeal) requires AI developers to report to the government when training models above a certain compute threshold, a bar set higher than the largest models of the time on the assumption that future models would be bigger still. Export controls (restrictions on sales of AI chips and related technology to certain countries), intended to limit other countries' access to the powerful semiconductors needed to build large AI models, rest on the same assumption. But if breakthroughs come to depend more on factors other than computing power, such as better algorithms or specialized techniques, those restrictions will do less to hold back AI progress elsewhere.

"The first thing the United States needs to recognize is that, to some extent, export controls are built on a theory of technology timelines," said Scott Singer, a visiting scholar in the Technology and International Affairs program at the Carnegie Endowment for International Peace. If the United States "stagnates in cutting-edge fields," he said, other countries may step in to drive AI breakthroughs; and if America's lead in AI narrows, it may become more willing to negotiate safety principles with other countries. (New Society)
Editor: He Chenxi    Responsible editor: Tang Wanqi