NVIDIA Challenger Raises $250 Million to Build the Fastest AI Computer Around the World's Largest Chip

2021-11-11

The star AI chip company Cerebras has just announced a Series F financing round of US$250 million, at a post-money valuation of more than US$4 billion. The company has built the world's largest computer chip, the WSE-2, and what it bills as the world's fastest AI computer, the CS-2, making it a serious rival to NVIDIA in the AI field.

On Wednesday local time, Cerebras announced that the round was led by Alpha Wave Ventures, a Falcon Edge Capital partnership, together with the Abu Dhabi Growth Fund. Other investors in the round include Altimeter Capital, Benchmark Capital, Coatue Management, Eclipse Ventures, Moore Strategic Ventures and VY Capital. With this round, Cerebras' total funding reaches US$750 million at a post-money valuation above US$4 billion; its 2019 Series E had valued the company at US$2.4 billion.

Andrew Feldman, co-founder and CEO of Cerebras, said: "The Cerebras team and our extraordinary customers have achieved incredible technological breakthroughs that are transforming artificial intelligence, making possible what was previously unimaginable. This funding allows us to extend our global leadership into new regions, further democratize artificial intelligence, usher in a new era of high-performance AI compute, and tackle today's most pressing societal challenges, such as drug discovery and climate change."

Cerebras now competes head-on with the AI chip giant NVIDIA: both are building complete computer systems and services, in Cerebras' case around the world's largest chip, which it calls a "wafer-scale" engine. The CS-2 built by Cerebras is billed as the fastest AI computer in the world, powered by the Cerebras wafer-scale engine, the WSE-2.
The WSE-2 is the largest chip ever made, with 2.6 trillion transistors on an area of more than 46,225 square millimetres. By contrast, NVIDIA's largest GPU has only 54 billion transistors on a die of roughly 826 mm². Cerebras claims the WSE-2 offers 123 times as many compute cores, 1,000 times the high-speed on-chip memory, 12,862 times the memory bandwidth and 45,833 times the fabric bandwidth of its GPU competitor. For AI workloads, the huge chip can process information faster and return answers in less time.

(Figure: performance comparison between the Cerebras WSE-2 and the NVIDIA A100 GPU)

In September this year, the company announced a partnership with the cloud provider Cirrascale to offer CS-2 rental services. Cerebras says its infrastructure can handle very large neural networks with trillions of parameters, and the company has also made notable progress on non-AI workloads in physics and other basic sciences. Cerebras initially served customers at US national research laboratories and later expanded to commercial enterprises. Its customer list now includes Argonne National Laboratory, Lawrence Livermore National Laboratory, the Neocortex AI supercomputer at the Pittsburgh Supercomputing Center, EPCC (the supercomputing centre of the University of Edinburgh), Tokyo Electron Devices, the pharmaceutical giant GlaxoSmithKline, AstraZeneca and others.

In recent years, neural networks have grown steadily larger. In the past year, OpenAI's GPT-3 natural language model briefly became the world's largest neural network at 175 billion parameters, only to be surpassed by Google's 1.6-trillion-parameter Switch Transformer. Models this large run into a compute bottleneck: their requirements far exceed what any single computer system can deliver.
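Some back-of-envelope arithmetic shows why a single system falls short. The bytes-per-parameter figures below are illustrative assumptions (fp16 weights, plus gradients and optimizer state for training), not numbers from the article:

```python
# Rough memory footprints for large models.
# Assumptions (not from the article): fp16 weights at 2 bytes/parameter,
# and ~16 bytes/parameter once gradients and Adam optimizer state are added.
BYTES_FP16 = 2
BYTES_TRAINING = 16  # weights + gradients + optimizer moments (assumption)

def footprint_gb(n_params, bytes_per_param):
    """Memory footprint in gigabytes for a model of n_params parameters."""
    return n_params * bytes_per_param / 1e9

gpt3 = 175e9        # GPT-3 parameter count
switch = 1.6e12     # Switch Transformer parameter count
gpu_memory_gb = 16  # single-GPU memory cited in the article

print(f"GPT-3 weights alone: {footprint_gb(gpt3, BYTES_FP16):,.0f} GB")
print(f"GPT-3 training state: {footprint_gb(gpt3, BYTES_TRAINING):,.0f} GB")
print(f"Switch Transformer training state: "
      f"{footprint_gb(switch, BYTES_TRAINING) / 1e3:,.1f} TB")
print(f"16 GB GPUs needed just to hold GPT-3 training state: "
      f"{footprint_gb(gpt3, BYTES_TRAINING) / gpu_memory_gb:,.0f}")
```

Even under these rough assumptions, the weights alone dwarf a single GPU's memory, which is why the article turns next to clustering.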
A single GPU has on the order of 16 GB of memory, while the largest models can require hundreds of terabytes. As before, simply piling on raw compute no longer meets the need, so clustering systems together becomes essential, and how to build the cluster is the critical question: every machine must be kept busy, or system utilization falls. This year, for example, NVIDIA, Stanford and Microsoft scaled a 1-trillion-parameter GPT-3-style model across 3,072 GPUs, yet the system reached only 52% of the peak throughput the machines should theoretically deliver. Cerebras is trying to solve this problem.

The first AI cluster built on the largest 7nm chip

In August this year, Cerebras unveiled the world's first brain-scale AI solution: a single system able to support AI models with 120 trillion parameters. How was it achieved? Beyond using the largest chip, Cerebras disclosed four new technologies. In combination, they make it straightforward to build brain-scale neural networks and distribute the work across clusters of AI-optimized cores.

Cerebras Weight Streaming: decoupling compute and memory

This new software execution mode separates computation from parameter storage, letting each scale independently and flexibly, and it addresses the latency and memory-bandwidth problems of clusters of small processors. For the first time, model parameters can be stored off-chip while delivering the same training and inference performance as if they were on-chip.

Cerebras MemoryX: enabling hundred-trillion-parameter models

This memory-extension technology stores model parameters off-chip and streams them efficiently to the CS-2, achieving the same performance as if they were on-chip.
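The weight-streaming idea can be sketched in a few lines. This is a toy illustration of the execution model described above, with hypothetical names, and is in no way Cerebras' actual software: parameters live in an external store (MemoryX's role) and are streamed to the compute device one layer at a time, so device memory never has to hold the whole model.

```python
# Toy sketch of "weight streaming": parameters live off-device and are
# streamed in one layer at a time; the device holds only one layer's
# weights plus the activations. Hypothetical names, illustration only.
import numpy as np

rng = np.random.default_rng(0)

class ParameterStore:
    """Stands in for external parameter storage (the MemoryX role)."""
    def __init__(self, layer_shapes):
        self.weights = [rng.standard_normal(s).astype(np.float32) * 0.1
                        for s in layer_shapes]

    def stream(self, i):
        # In a real system this would be a network transfer to the device.
        return self.weights[i]

def forward(store, x, n_layers):
    for i in range(n_layers):
        w = store.stream(i)         # fetch only this layer's weights
        x = np.maximum(x @ w, 0.0)  # compute (matmul + ReLU), then drop w
    return x

shapes = [(64, 64)] * 4
store = ParameterStore(shapes)
out = forward(store, rng.standard_normal((1, 64)).astype(np.float32),
              len(shapes))
print(out.shape)
```

The key property is that compute capacity and parameter storage can now be sized independently, which is exactly the decoupling the passage above describes.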
The architecture is highly flexible, supporting storage configurations from 4 TB to 2.4 PB and model sizes from 200 billion to 120 trillion parameters.

Cerebras SwarmX: larger, more efficient clusters

This AI-optimized, high-performance communication fabric extends Cerebras' on-chip fabric beyond the chip, scaling out AI clusters with linear performance scaling: ten CS-2s are expected to solve the same problem ten times faster than one. Given that each CS-2 provides 850,000 AI-optimized cores, Cerebras can connect clusters of up to 163 million AI-optimized cores.

Selectable Sparsity: shortening time to answer

The unique dataflow scheduling and enormous memory bandwidth of the Cerebras architecture enable this kind of fine-grained processing to accelerate all forms of sparsity.

A history of the giant chip

Back in 2019, Cerebras released its first-generation WSE (Wafer Scale Engine) chip. Ordinarily, chip makers fabricate transistors across a 12-inch silicon wafer and then dice it into hundreds of independent small chips; the WSE-1 is instead "grown" directly from an entire wafer. The largest AI chip of its day, the WSE-1 packed 400,000 AI-programmable cores and 1.2 trillion transistors, manufactured on TSMC's 16nm process. Compared with conventional chips, the WSE offered 3,000 times the high-speed on-chip memory and 10,000 times the memory bandwidth, with a total fabric bandwidth of 100 petabits per second. NVIDIA's strongest chip at the time, the GA100, with a die of more than 800 square millimetres and 54 billion transistors, simply paled next to the WSE-1. In April this year, Cerebras launched the latest Wafer Scale Engine 2 (WSE-2), reaching a record 2.6 trillion transistors and 850,000 AI cores on TSMC's 7nm process.
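The headline comparison figures above can be sanity-checked with simple arithmetic. The A100 numbers used below (54 billion transistors, 6,912 CUDA cores) come from NVIDIA's public specifications, not from this article, and the 192-system cluster size is an assumption based on Cerebras' own announcement:

```python
# Sanity-checking the WSE-2 vs. GPU ratios quoted in the article.
# A100 figures (assumptions from NVIDIA's public specs): 54e9 transistors,
# 6,912 CUDA cores.
wse2_transistors = 2.6e12
wse2_cores = 850_000
a100_transistors = 54e9
a100_cuda_cores = 6_912

print(f"transistor ratio: {wse2_transistors / a100_transistors:.0f}x")  # ~48x
print(f"core ratio: {wse2_cores / a100_cuda_cores:.0f}x")               # ~123x

# SwarmX scaling: the quoted 163 million cores corresponds to roughly
# 192 CS-2 systems (assumption based on Cerebras' cluster announcement).
print(f"192 CS-2s: {192 * wse2_cores / 1e6:.1f} million cores")
```

The core ratio reproduces the article's "123 times" figure exactly, which suggests Cerebras' comparison is against the A100's CUDA core count.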
Compared with the first-generation WSE, the WSE-2 has more than doubled every headline metric: transistor count, core count, memory, memory bandwidth and fabric bandwidth. With a more advanced process, Cerebras can pack far more transistors into the same roughly 8-by-8-inch chip of about 46,225 mm²; TSMC's 7nm process shrinks the spacing between circuits to seven billionths of a metre. This financing also moves Cerebras a step closer to the next-generation 5nm process, where the photomasks needed to manufacture a chip cost millions of dollars. (Xinhua News Agency)
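The generation-over-generation claim checks out from the figures quoted in the article itself:

```python
# WSE-1 -> WSE-2 arithmetic, using only numbers quoted in the article.
wse1 = {"transistors": 1.2e12, "cores": 400_000}
wse2 = {"transistors": 2.6e12, "cores": 850_000, "area_mm2": 46_225}

for k in ("transistors", "cores"):
    # Both ratios land slightly above 2x, matching "more than doubled".
    print(f"{k}: {wse2[k] / wse1[k]:.2f}x")

# Transistor density on the 7nm WSE-2: ~56 million transistors per mm².
print(f"density: {wse2['transistors'] / wse2['area_mm2'] / 1e6:.1f} M/mm2")
```

Both ratios come out slightly above 2x, consistent with "more than doubled" on the same wafer footprint.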

Editor: Chen Jie    Responsible editor: Li Ling

Source: Tencent News
