The Cretaceous Period of AI Large Models
2022-10-11
The pleasant long holiday is drawing to a close, and it is time to get busy again. In the AI field, the busiest foundational technology of the past two years has surely been the large model. As AI painting, AI video generation, and other capabilities keep redrawing the public's sense of where AI's boundaries lie, the status of the large models behind these AI creators has risen with the tide. The vigorous campaign of "refining large models" seems to have arrived at harvest time.

However, as large models grow ever more popular, a problem is hard to miss: although pretrained large models have shown good results in many fields, the commercial value those results generate is difficult to equate with the cost of training the models and of the infrastructure investment behind them. The seemingly glamorous large model is in fact going through a somewhat awkward transition. The "magical" effects that large models keep demonstrating have attracted intense attention from capital, industry, and academia. Yet as one large model after another is trained and brought to market, it turns out that the application scenarios and commercial value of large models, while real, are not yet sufficient. How to go from "refining large models" to "using large models" is becoming a key test. China's AI industry in particular has been aggressive in investing in and building large models, so this test of application transformation will surface in the Chinese market earlier and more prominently.

The state of AI pretrained large models at the current stage reminds me of a word: Cretaceous. The Cretaceous is the last period of the Mesozoic era. The global climate was warming, and the continents were beginning to take their modern shape. Dinosaurs still dominated the world, but mammals had become active. Large models seem to be at just such a stage: the paradigm cemented by BERT and GPT-3 still hangs over the AI industry, yet how to carry large models into a new era of applications has become an urgent, if slightly bewildering, question that must be answered. New species are beginning to appear, while the old species still hold the mainstream.

Before discussing this transformation, it is worth spending a little space on the development logic of large models and how they are applied. A pretrained large model is a foundation model trained on large-scale, broad data. It seizes on a basic characteristic of deep learning algorithms, namely that the more data a model sees, the more robust it becomes, and "feeds" data to the model by brute force. After pretraining on large-scale data, the model can adapt to more varied and more complex downstream tasks, and so ultimately deliver a better intelligent experience. In this sense, the large-scale pretrained model is not an innovation in the technical path itself, but an engineering innovation that closely grasps the technology's characteristics.

The road to large models gained wide recognition with Google's release of BERT in October 2018. BERT was trained on large-scale data from BooksCorpus and Wikipedia, and it broke the industry records of the time on 11 downstream tasks.

We can think of the large-scale pretrained model as a kind of "prefabricated dish". Since cooking from scratch is too difficult for users and takes too much time and energy, it may be better to let merchants do the preparation first. After users buy the dish home, they only need to do a little simple reheating; likewise, a downstream developer only needs to lightly fine-tune a pretrained model rather than train one from scratch.
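To make the "prefabricated dish" analogy concrete, here is a minimal sketch of the pretrain-then-fine-tune workflow. The article names no toolkit, so the use of the Hugging Face Transformers and Datasets libraries, the bert-base-uncased checkpoint, and the choice of SST-2 (a GLUE sentiment task among those BERT was originally evaluated on) are all my assumptions for illustration:

```python
# A minimal sketch of the "prefabricated dish" workflow:
# reuse a pretrained BERT and fine-tune it on one downstream task.
# Assumes the Hugging Face `transformers` and `datasets` libraries;
# the dataset and hyperparameters are illustrative, not from the article.
from transformers import (AutoModelForSequenceClassification,
                          AutoTokenizer, Trainer, TrainingArguments)
from datasets import load_dataset

# Step 1: the "prefabricated" part -- weights already pretrained on
# BooksCorpus + Wikipedia, downloaded as-is.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # new task head, randomly initialized

# Step 2: the "reheating" part -- a small labeled downstream dataset
# (SST-2 sentence-level sentiment classification) is enough to adapt it.
dataset = load_dataset("glue", "sst2")

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True,
                     padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-sst2",
                           num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
)
trainer.train()  # hours of fine-tuning, versus weeks of pretraining
```

The point of the sketch is the division of labor: the expensive pretraining has already been paid for once by the "merchant", and each downstream "diner" pays only for a brief fine-tune on their own task.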
Editor: Li Jialang | Responsible editor: Mu Mu
Source: ithome.com