AI Observation | Three Lessons from Groq's Breakout Chip for the Development of China's AI Chip Industry
2024-03-06
Recently, Silicon Valley startup Groq launched a new AI chip, claiming to deliver "the fastest inference on Earth": large models reportedly run inference on Groq at 10 times the speed of Nvidia GPUs, or even faster. The news quickly dominated the headlines of major technology media. Such inference speed is undoubtedly a technological leap in the AI chip field, and it also offers some new lessons for domestic AI chip companies seeking a path to break through.

Lesson 1: Focusing on specific scenarios to build a "comparative advantage" is a viable path. The Groq chip is an LPU (Language Processing Unit) with superior inference performance, once again demonstrating the value of specialized AI chips in specific scenarios. Following Groq's example and aiming to replace or surpass Nvidia in a particular application scenario may therefore be an effective development path for domestic AI chips at this stage.

This inference chip, for example, is built around one thing: speed. Mainstream generative AI today relies on Nvidia A100 and H100 chips for both training and inference, and waiting during large-model inference is common, with characters appearing one by one while an answer slowly takes shape. On Groq's demonstration platform, by contrast, the model generates answers almost instantly upon receiving the prompt. These answers are not only of fairly high quality, they also come with citations and can run to hundreds of words. More surprisingly, over three-quarters of the total time is spent retrieving information, while generating the answer itself takes only a fraction of a second. Although Groq chips still have various drawbacks, their advantages are prominent enough that in certain scenarios they could completely replace Nvidia, and even do better, which has naturally attracted a great deal of attention and recognition.
It is easy to imagine that once the cost of Groq chips is brought down to an appropriate range, a large number of practical application scenarios will adopt them.

Lesson 2: "Performance matching" with the application scenario matters. Groq chips stand out for their inference speed, which clearly illustrates how strongly a chip's performance profile is tied to its application scenario, and reminds us once again of the importance of the scenario itself. The domestic AI chip industry should pay close attention to performance matching in practical applications and pursue chip optimization and innovation on that basis.

On the language-inference track, the champion has not yet been decided. At present, Groq chips still have considerable shortcomings overall. For example, each Groq card carries only 230MB of memory: running the Llama-2 70B model requires 305 Groq cards, whereas the H100 needs only 8. At current prices, this means that at the same throughput, Groq's hardware cost is roughly 40 times that of the H100, and its energy cost about 10 times. In addition, Groq chips can currently handle inference for only a few large models and require extensive tuning, so their general applicability is relatively poor. These gaps are also innovation opportunities for Chinese companies. Ultimately, the success of an AI chip product depends on many factors, including the chip's own technology roadmap, the timing of the product launch, and the maturity of large models. For large
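The card-count comparison above can be sanity-checked with a quick back-of-envelope calculation. The sketch below assumes 8-bit (1 byte per parameter) weights and that all weights must fit in on-chip SRAM; neither assumption is stated in the article, but together they roughly reproduce the cited figure of 305 cards.

```python
# Back-of-envelope estimate of Groq cards needed to hold Llama-2 70B weights
# entirely in on-chip SRAM. Assumptions (ours, not the article's):
# weights quantized to 8 bits, i.e. 1 byte per parameter.
params = 70e9            # Llama-2 70B parameter count
bytes_per_param = 1      # INT8 assumption
sram_per_card = 230e6    # 230 MB of SRAM per Groq card (from the article)

cards = params * bytes_per_param / sram_per_card
print(round(cards))  # ≈ 304, close to the article's figure of 305
```

By the same logic, 8 H100 cards with 80GB of HBM each provide 640GB of memory, comfortably holding the 70GB of INT8 weights plus activations and KV cache, which is why the H100 cluster can be so much smaller.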