The "General Large Model Evaluation Standards" have been released
2024-10-14
Reporters learned from China Mobile that on October 12, at the 2024 China Mobile Global Partner Conference, China Mobile, together with the China Electronics Standardization Research Institute of the Ministry of Industry and Information Technology, China Telecom, State Grid, China Petroleum, iFlytek, and other industry partners, jointly released a new result of its work on a large model evaluation system: the "General Large Model Evaluation Standards", which give the industry an important reference for selecting high-quality AI large models.
China Mobile Chairman Yang Jie said at the conference that a new round of technological revolution and industrial transformation, characterized by digitalization, is deepening, and that data, computing power, and artificial intelligence have become important drivers of new quality productive forces. AI will develop at an accelerating pace and bring intelligence to every industry and household, strongly promoting integrated innovation, continually upgrading information consumption, and leading the economy and society from "Internet Plus" and "5G+" to "AI+".
Large model evaluation is understood to be an important part of applying artificial intelligence in industry. Many enterprises have started building large models and urgently need supporting evaluation systems for both general and industry-specific large models. China Mobile, together with the Electronics Standardization Institute, central state-owned enterprises, and other industry stakeholders, has completed a preliminary version of the general large model evaluation standards.
According to reports, the evaluation standards released this time are built on a "2-4-6" framework. The "2" stands for two evaluation perspectives: guided by the actual usage needs of key industries, evaluation tasks are divided into understanding and generation. The "4" stands for four evaluation elements drawn from the full evaluation lifecycle: evaluation tools, evaluation data, evaluation methods, and evaluation indicators. The "6" stands for six evaluation dimensions, which together cover the core capabilities a large model needs in application. The standards incorporate a broad range of opinions from industry, academia, research institutions, and users, and take industry-specific scenario requirements into account, providing an objective basis and important reference for comprehensively evaluating general large models. Next, China Mobile will work with partners across industry, academia, research, and application to explore in depth the application needs of key industries such as petroleum, electricity, transportation, and logistics, and will continue to build and improve the evaluation system for general and industry-specific large models to support the high-quality development of domestic large models. (Xinhua News Agency)
Editor: Yao Jue  Responsible editor: Xie Tunan
Source: XinHuaNet