Small Language Models: A New Hotspot in the AI Field

2024-12-18

For years, tech giants such as Google and startups such as OpenAI have tirelessly used massive amounts of online data to build ever larger and more expensive artificial intelligence (AI) models. These Large Language Models (LLMs) power chatbots such as ChatGPT, helping users with a wide range of tasks, from writing code and planning itineraries to composing poetry. Since ChatGPT's launch, AI models have been racing to become bigger and stronger. But now that the noise has died down, technology companies are increasingly turning to smaller, more streamlined Small Language Models (SLMs). They believe these compact models are not only well suited to specialized fields but also cheaper to deploy and more energy-efficient. In the future, AI models of varying scales will work together as humanity's capable assistants.

Small models have unique advantages

As AI technology has advanced rapidly, model sizes have grown by the day. OpenAI, the creator of ChatGPT, said last year that its GPT-4 model has approximately 2 trillion parameters. Parameter count is a measure of a model's size, and in general, the more parameters a model has, the more capable it is. Its enormous parameter count makes GPT-4 one of the most powerful AI models to date, able to answer questions on everything from astrophysics to zoology. However, if a company only wants an AI model to solve problems in a specific field (such as medicine), or if an advertising firm only needs a model to analyze consumer behavior so it can target ads more accurately, a model like GPT-4 may be overkill, and an SLM can better meet the user's needs. In a November report, the Forbes website called SLMs the "next big thing" in AI.
Sébastien Bubeck, Vice President of Generative AI at Microsoft, said that although there is no unified standard for how many parameters an SLM should have, the range is roughly 300 million to 4 billion, small enough to run on a smartphone. Experts say SLMs are better suited to simple tasks such as summarizing and indexing documents or searching internal databases. Laurent Daudet, head of the French startup LightOn, believes SLMs hold several advantages over LLMs: first, they respond faster, can handle more queries simultaneously, and can serve more users; second, they cost less to deploy and consume less energy. Daudet explained that many of today's LLMs require large numbers of servers for training and for processing queries. These servers are built from cutting-edge chips and need large amounts of electricity to run and to cool. Training an SLM requires fewer chips and less energy, making it cheaper and more energy-efficient. SLMs can also be installed and run directly on devices, without relying on data centers, which further protects data security. Forbes notes that SLMs can perform a variety of tasks with minimal computing resources, making them an ideal choice for mobile devices, edge devices, and more.

AI models spark a minimalist trend

Companies including Google, Microsoft, Meta Platforms, and OpenAI have all taken action, launching a variety of SLMs. At the end of last December, Microsoft released the Phi-2 language model, with only 2.7 billion parameters. Microsoft Research said on its official X account that Phi-2 outperforms other existing SLMs and can run on a laptop or mobile device. In April of this year, Microsoft released the Phi-3 series, with models as small as 3.8 billion parameters. In August, it followed up with the latest Phi-3.5-mini-instruct.
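The claim that a 300-million-to-4-billion-parameter model fits on a smartphone comes down to simple arithmetic: weight memory is roughly the parameter count times the bytes per parameter. A minimal sketch of that estimate, where the bytes-per-parameter figures (2 bytes for fp16 weights, 0.5 bytes for 4-bit quantized weights) are common conventions assumed here rather than numbers from the article:

```python
# Rough weight-memory arithmetic for the SLM parameter range cited above.
# Bytes-per-parameter values are illustrative assumptions: fp16 weights
# take 2 bytes each; 4-bit quantized weights take 0.5 bytes each.

def model_memory_gb(params: float, bytes_per_param: float) -> float:
    """Approximate weight memory in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9

# A 3.8-billion-parameter model (the size cited for the smallest Phi-3):
fp16_gb = model_memory_gb(3.8e9, 2.0)   # about 7.6 GB in fp16
int4_gb = model_memory_gb(3.8e9, 0.5)   # about 1.9 GB at 4-bit
```

At 4-bit precision, such a model's weights fit comfortably in the RAM of a modern phone, which is why quantization is usually part of on-device deployment; a 2-trillion-parameter model, by the same arithmetic, would need hundreds of gigabytes even when quantized.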
This SLM is tailored for efficient, advanced natural language processing tasks. In September, Nvidia open-sourced Nemotron-Mini-4B-Instruct, which the company says is particularly well suited to edge computing and on-device applications. According to the report, both SLMs strike a good balance between computing-resource use and performance, and in some respects their performance even rivals that of LLMs. OpenAI is not to be outdone: in July of this year, it released GPT-4o mini, calling it the company's most intelligent and affordable small model. Amazon, for its part, supports AI models of various scales on its cloud platform. Other companies are developing SLMs better suited to their own needs. For example, the American pharmaceutical giant Merck is collaborating with Boston Consulting Group (BCG) on an SLM for exploring how certain diseases affect genes; the model is expected to have between several hundred million and several billion parameters.

Large and small models will collaborate

Although SLMs have unique advantages in efficiency, LLMs still hold a clear edge in solving complex problems and drawing on broader data. Looking ahead, LLMs and SLMs will be "friends rather than rivals," and collaboration between them will become the mainstream trend. When a user raises a question, an SLM will take the lead in understanding it and then, depending on its complexity, send the relevant information to several AI models of different sizes, which work together to solve the user's problem. AI models on the market today are either too large and too expensive or too slow; collaboration between the two may be the best solution. (New Society)
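The collaboration pattern described above, where a small model triages each query and escalates the hard ones, can be sketched in a few lines. Everything here is an illustrative assumption: the tier names and the keyword-and-length complexity check stand in for whatever classifier a real router would use, and no vendor's actual routing logic is implied.

```python
# A minimal sketch of SLM-first routing: a cheap check decides whether a
# query stays with the small model or is escalated to a larger one. The
# heuristic and tier names ("small-slm", "large-llm") are hypothetical.

def estimate_complexity(query: str) -> str:
    """Toy triage: long or explicitly multi-step questions count as complex."""
    if len(query.split()) > 30 or "step by step" in query.lower():
        return "complex"
    return "simple"

def route(query: str) -> str:
    """Return the model tier that should handle the query."""
    return "large-llm" if estimate_complexity(query) == "complex" else "small-slm"

print(route("Summarize this memo"))  # prints: small-slm
```

In a production system the triage step would itself be a small classifier model, which is exactly why the article expects an SLM to "take the lead": the routing decision must be far cheaper than the answer it routes.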

Editor: Yao Jue    Responsible editor: Xie Tunan

Source: Science and Technology Daily
