A brief history of the development of AI chips
Updated: 2022-01-04 12:07:11
The field of artificial intelligence continues to make breakthroughs. As a cornerstone of AI technology, the AI chip has great industrial value and strategic importance. As the key link and hardware foundation of the AI industry chain, AI chips face extremely high barriers to technology development and innovation. Judged by the overall trend of chip development, AI chips are still at an early stage. The coming years will be an important period for AI chip development, with huge room for innovation in both architecture and design concepts.
History of chip development
At the Dartmouth Conference in 1956, scientists John McCarthy, Claude Shannon and Marvin Minsky introduced the term "artificial intelligence".
In this era of rapid advances in computer technology, an optimistic climate led researchers to believe that AI would be "conquered" in no time. In 1970, Marvin Minsky was quoted in Life magazine as saying that "within the next three to eight years, we will have machines that are equivalent to the intelligence of the average person."
By the 1980s, AI had moved out of the lab and into commercialization, setting off a wave of investment. In the late 1980s, the AI bubble burst and AI returned to the research lab, where scientists continued to develop its potential. Industry observers called AI a technology ahead of its time, and some said it was the technology of the future. After a long period known as the "AI winter," commercial development began again.
In 1986, Geoffrey Hinton and his colleagues published a landmark paper describing an algorithm called "back-propagation" that could significantly improve the performance of multilayer, or "deep", neural networks. In 1989, Yann LeCun and other researchers at Bell Labs created a neural network that could be trained to recognize handwritten zip codes, demonstrating an important real-world application of this new technology. Training the deep convolutional neural network (CNN) took three days. Fast forward to 2009, when Rajat Raina, Anand Madhavan and Andrew Ng of Stanford University published a paper describing how modern GPUs offer far more computational power than multi-core CPUs for deep learning. AI was ready to take off again.
Thanks to two decades of Moore's Law, ample computing power can now deliver the performance that AI algorithms require at an acceptable price, power consumption and run time. According to a comparison of the processing capacity and retail prices of Intel's processors, the computing power that can be bought per unit price has risen 15,000-fold, so that the general-purpose central processing unit (CPU) can support a variety of AI tasks. The time is therefore ripe to greatly advance AI research and development through chip technology. However, because CPUs are designed and optimized for hundreds of different tasks, they cannot sacrifice flexibility to optimize for one particular class of application, and so may not be the best choice for every AI algorithm. For this reason, a variety of heterogeneous computing solutions that pair a CPU with a dedicated chip have emerged to address bottlenecks in computational resources and memory access. In addition, research on "brain-like" computing, as distinct from "brain-inspired" deep neural networks, has produced advanced neuromorphic chips that support natural learning methods with ultra-high energy efficiency.
The development stages of AI chips
The core computing chips for artificial intelligence have likewise gone through four major stages.
Before 2007, AI research and applications went through several ups and downs and never developed into a mature industry. At the same time, limited by algorithms, data and other factors, AI did not place particularly strong demands on chips, and general-purpose CPUs provided sufficient computing power. Later, driven by high-definition video, games and other industries, GPU products advanced rapidly. Researchers then found that the parallel computing characteristics of GPUs match the needs of AI algorithms for parallel computation over large data sets: a GPU can run deep learning algorithms 9 to 72 times more efficiently than a traditional CPU. They therefore began to use GPUs for AI computing. After 2010, cloud computing became widespread, and AI researchers could combine large numbers of CPUs and GPUs in the cloud for mixed computing; in fact, the cloud is still the main computing platform for AI today. However, the AI industry's demand for computing power keeps rising rapidly, so after 2015 the industry began to develop dedicated AI chips, which bring a further roughly 10-fold improvement in computing efficiency through better hardware and chip architecture.
Semi-custom FPGA-based AI chips
While demand for chips has not yet reached scale and deep learning algorithms are still unstable and need continuous iteration, using reconfigurable FPGA chips to build semi-custom AI chips is the best choice.
One company, for example, designed the Deep Processing Unit (DPU) chip, hoping to achieve better performance than a GPU with ASIC-level power consumption; its first products are based on an FPGA platform. Although this semi-custom chip relies on the FPGA platform, it can be developed and iterated quickly using an abstracted instruction set and compiler, and it also has clear advantages over dedicated FPGA accelerator products.
Fully customized AI chips for deep learning algorithms
These chips are fully customized using ASIC design methods, with performance, power consumption and area optimized for deep learning algorithms. Google's TPU and the Cambrian deep learning processor from the Institute of Computing Technology, Chinese Academy of Sciences, are typical representatives of this type of chip.
Cambrian pioneered the direction of deep learning processors internationally. The Cambrian processor currently comprises three prototype processor structures: Cambrian 1 (English name DianNao, a prototype processor structure for neural networks), Cambrian 2 (English name DaDianNao, for large-scale neural networks), and Cambrian 3 (English name PuDianNao, for a variety of machine learning algorithms).
Cambrian chips are scheduled for industrialization this year. Cambrian 2 has a clock frequency of 606 MHz, an area of 67.7 mm², and a power consumption of about 16 W on a 28 nm process. Its single-chip performance exceeds 21 times that of a mainstream GPU, while its energy consumption is only 1/330 of that of a mainstream GPU. A high-performance computing system composed of 64 chips delivers as much as 450 times the performance of a mainstream GPU, while its total energy consumption is only 1/150.
Brain-like computing chips
Such chips are no longer designed merely to accelerate deep learning algorithms. Instead, they aim to develop a new brain-like computer architecture at the level of the chip's basic structure and even of the devices themselves, for example by using new devices such as memristors and ReRAM to improve storage density. This kind of chip is still a long way from becoming a mature technology that can be deployed widely in the market, and it carries considerable risk, but in the long run brain-like chips may revolutionize computing systems.
IBM's TrueNorth processor is composed of 5.4 billion interconnected transistors, which form an array of 1 million digital neurons that communicate with each other through 256 million electrical synapses.
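To put these figures in proportion, the short calculation below divides the totals by the neuron count. It is only back-of-the-envelope arithmetic on the rounded numbers quoted in the text; the constant and function names are illustrative, and the transistors-per-neuron figure is crude because the same transistors also implement synapses and routing.

```python
# Rounded figures quoted in the text (not exact hardware counts).
TRANSISTORS = 5_400_000_000
NEURONS = 1_000_000
SYNAPSES = 256_000_000


def per_neuron(total, neurons=NEURONS):
    """Average amount of a resource available to each digital neuron."""
    return total / neurons


synapses_per_neuron = per_neuron(SYNAPSES)
transistors_per_neuron = per_neuron(TRANSISTORS)

print(f"synapses per neuron:    {synapses_per_neuron:.0f}")
print(f"transistors per neuron: {transistors_per_neuron:.0f}")
```

On these rounded numbers, each neuron has on average 256 synapses and a budget of about 5,400 transistors.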
The direction of AI research and development
In recent years, the application scenarios of AI technology have begun to shift toward mobile and embedded devices, such as autonomous driving in cars and face recognition on cell phones. Industry demand drives technical progress, and AI chips, as the foundation of the industry, must achieve stronger performance, higher efficiency and smaller size to move AI technology from the cloud to the terminal.
At present, AI chips are developing in two main directions: one is FPGA (Field-Programmable Gate Array) and ASIC (Application-Specific Integrated Circuit) chips based on the traditional von Neumann architecture, and the other is brain-like chips designed to imitate the neuronal structure of the human brain. FPGA and ASIC chips have already reached a certain scale in both R&D and application, while brain-like chips are still at an early stage of R&D but have great potential and may become the industry mainstream in the future.
[Figure: Brief development history of AI chips]
The main difference between these two development lines is that the former follows the von Neumann architecture, while the latter uses a brain-like architecture. Virtually every computer you see uses the von Neumann architecture. Its core idea is that the processor and the memory are separate, which is why a computer has a CPU (central processing unit) and a memory. The brain-like architecture, as the name suggests, mimics the neuronal structure of the human brain, so processing, memory and communication components are integrated together.
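The idea of keeping memory and computation together can be sketched with a leaky integrate-and-fire neuron, a common textbook model of a spiking neuron. This is a minimal illustration, not the design of any particular brain-like chip: the class name, the parameters `threshold` and `leak`, and the example values are all assumptions made for this sketch. The point is that each neuron stores its own synaptic weights and state, and communicates only through spikes, instead of fetching operands from a separate shared memory.

```python
class LeakyIntegrateFireNeuron:
    """Toy spiking neuron: weights and state live inside the neuron itself."""

    def __init__(self, weights, threshold=1.0, leak=0.9):
        self.weights = list(weights)  # synaptic weights, stored locally
        self.threshold = threshold    # firing threshold (assumed value)
        self.leak = leak              # fraction of potential kept each step
        self.potential = 0.0          # membrane potential (local state)

    def step(self, input_spikes):
        """Advance one time step; return 1 if the neuron fires, else 0."""
        # The stored potential decays a little at every step.
        self.potential *= self.leak
        # Add the weight of every synapse that received a spike.
        self.potential += sum(
            w for w, spike in zip(self.weights, input_spikes) if spike
        )
        if self.potential >= self.threshold:
            self.potential = 0.0  # reset after firing
            return 1
        return 0


def run(neuron, spike_train):
    """Feed a sequence of input spike vectors and collect output spikes."""
    return [neuron.step(spikes) for spikes in spike_train]


neuron = LeakyIntegrateFireNeuron(weights=[0.6, 0.6], threshold=1.0, leak=0.5)
print(run(neuron, [[1, 0], [1, 0], [1, 1], [0, 0]]))
```

In this run the neuron stays silent while single spikes arrive and fires only when both inputs spike in the same step.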
AI chip technology features and representative products
From GPUs to FPGA and ASIC chips
Before 2007, limited by the algorithms and data available at the time, AI did not place particularly strong demands on chips, and general-purpose CPUs provided sufficient computing power.
Then, driven by the rapid growth of the high-definition video and game industries, GPU (graphics processing unit) chips developed quickly. Because GPUs devote more arithmetic logic units to data processing and have a highly parallel structure, they have an advantage over CPUs in processing graphics data and complex algorithms. And because deep learning involves many model parameters, large data sets and heavy computation, GPUs replaced CPUs as the mainstream AI chip for some time afterwards.
[Figure: A GPU has more arithmetic logic units (ALUs) than a CPU]
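Why a highly parallel structure suits this workload can be seen in a matrix-vector product, the kind of operation that dominates neural-network computation: every output element is an independent dot product, so many arithmetic units can work at once. The sketch below uses Python's standard `concurrent.futures` thread pool purely to show that independence; it is not GPU code and will not run faster than the sequential version in CPython. The function names are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor


def dot(row, vector):
    """Dot product of one matrix row with the input vector."""
    return sum(r * v for r, v in zip(row, vector))


def matvec_sequential(matrix, vector):
    """One unit computes the rows one after another."""
    return [dot(row, vector) for row in matrix]


def matvec_parallel(matrix, vector, workers=4):
    """Rows are handed to several workers; no row depends on another."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input rows.
        return list(pool.map(lambda row: dot(row, vector), matrix))


matrix = [[1, 2], [3, 4], [5, 6]]
vector = [10, 1]
print(matvec_sequential(matrix, vector))
print(matvec_parallel(matrix, vector))
```

Both functions return the same result; the only difference is whether the rows are processed one at a time or handed out to several workers.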
However, GPUs are after all graphics processors, not chips dedicated to deep learning, and they naturally have shortcomings. For example, the performance of their parallel structure cannot be fully exploited when running AI applications, which leads to high energy consumption.
At the same time, AI technology is being applied ever more widely, and it can now be seen in education, medical care and autonomous driving. The high energy consumption of GPU chips cannot meet the needs of these industries, so GPUs are being replaced by FPGA and ASIC chips.
So what are the technical characteristics of these two chips? And what are the representative products?
The "universal chip": FPGA
The FPGA (Field-Programmable Gate Array) is a further development of earlier programmable devices such as PAL, GAL and CPLD.
An FPGA can be understood as a "universal chip". The user designs the hardware circuit in a hardware description language (HDL) and then loads a configuration file into the FPGA, which defines the gate circuits and the connections between them and the memories. After each configuration, the hardware circuits inside the FPGA are wired in the defined way and perform a specific function; the input data only needs to pass through the gate circuits in turn to produce the output.
In layman's terms, a "universal chip" is a chip that can take on whatever function you need it to have.
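The configuration idea described above can be sketched in software. FPGA logic cells are commonly built around lookup tables (LUTs) whose contents are set by the configuration file, so the same cell can implement any Boolean function of its inputs. The toy model below is an illustration under that assumption, not a model of any vendor's device; the class `LookupTable` and the helper names are hypothetical.

```python
from itertools import product


class LookupTable:
    """A k-input lookup table: its configuration bits are its truth table."""

    def __init__(self, num_inputs, config_bits):
        if len(config_bits) != 2 ** num_inputs:
            raise ValueError("need 2**num_inputs configuration bits")
        self.num_inputs = num_inputs
        self.config_bits = list(config_bits)

    def evaluate(self, *inputs):
        # The inputs form a binary address (first input is the top bit).
        address = 0
        for bit in inputs:
            address = (address << 1) | (bit & 1)
        return self.config_bits[address]


def configure_full_adder():
    """'Load' two 3-input LUTs so that together they act as a full adder."""
    rows = list(product((0, 1), repeat=3))  # (a, b, carry_in) in address order
    sum_bits = [a ^ b ^ c for a, b, c in rows]
    carry_bits = [(a & b) | (a & c) | (b & c) for a, b, c in rows]
    return LookupTable(3, sum_bits), LookupTable(3, carry_bits)


def add_bits(a, b, carry_in):
    """Pass the inputs through the configured LUTs: returns (sum, carry_out)."""
    sum_lut, carry_lut = configure_full_adder()
    return sum_lut.evaluate(a, b, carry_in), carry_lut.evaluate(a, b, carry_in)


print(add_bits(1, 1, 0))
# Loading different bits turns the same kind of cell into an AND gate.
and_gate = LookupTable(2, [0, 0, 0, 1])
print(and_gate.evaluate(1, 1))
```

Changing only the configuration bits changes the function, which is the sense in which the chip is "universal".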
Despite the name "universal chip", FPGAs are not without flaws. Because of the high flexibility of the FPGA structure, the per-chip cost in mass production is higher than that of an ASIC, and FPGAs also compromise on speed and energy consumption compared with ASIC chips.
In other words, although the "universal chip" is a jack of all trades, its performance is lower than that of an ASIC chip and its price is higher.
However, while chip demand has not yet reached scale and deep learning algorithms still need constant iteration, the reconfigurable FPGA is more adaptable. Using FPGAs to build semi-custom AI chips is therefore undoubtedly the safe choice.
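The trade-off between a higher per-chip price and a lower upfront investment can be expressed as a break-even volume. The sketch below is a simplified cost model and every number in it is hypothetical, chosen only to make the arithmetic easy to follow: `nre` stands for the one-time development (non-recurring engineering) cost and `unit_cost` for the price of each chip.

```python
import math


def total_cost(nre, unit_cost, volume):
    """One-time development cost plus per-chip cost times volume."""
    return nre + unit_cost * volume


def break_even_volume(nre_asic, unit_asic, nre_fpga, unit_fpga):
    """Smallest volume at which the ASIC is no more expensive than the FPGA."""
    if unit_fpga <= unit_asic:
        raise ValueError("model assumes the FPGA costs more per chip")
    volume = (nre_asic - nre_fpga) / (unit_fpga - unit_asic)
    return max(0, math.ceil(volume))


# Hypothetical figures for illustration only.
ASIC = {"nre": 2_000_000, "unit_cost": 10}
FPGA = {"nre": 100_000, "unit_cost": 60}

volume = break_even_volume(
    ASIC["nre"], ASIC["unit_cost"], FPGA["nre"], FPGA["unit_cost"]
)
print(volume)
print(total_cost(volume=volume, **ASIC), total_cost(volume=volume, **FPGA))
```

With these made-up figures the two options cost the same at 38,000 chips; below that volume the FPGA is cheaper overall, and above it the ASIC wins.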
At present, the FPGA chip market is divided between the American manufacturers Xilinx and Altera. According to MarketWatch, the former holds about 50% of the global market and the latter about 35%. Together the two manufacturers control 85% of the market and hold more than 6,000 patents, making them the two undisputed giants of the industry.
Xilinx's FPGA chips are divided into four series from low-end to high-end, namely Spartan, Artix, Kintex and Virtex, with process nodes ranging from 45 nm down to 16 nm. The more advanced the process, the smaller the chip. Spartan and Artix are mainly aimed at the civilian market, with applications including driverless vehicles and smart homes; Kintex and Virtex are mainly aimed at the military market, with applications including defense and aerospace.
[Figure: Xilinx's Spartan series FPGA chips]
Now for Xilinx's old rival, Altera. Altera's mainstream FPGA chips fall into two categories. One focuses on low-cost applications, with medium capacity and performance that meets general needs, such as the Cyclone and MAX series. The other focuses on high-performance applications, with large capacity and performance that meets all kinds of high-end needs, such as the Stratix and Arria series. Altera's FPGA chips are mainly used in consumer electronics, wireless communication, military aviation and similar fields.
Application-specific integrated circuit: ASIC
Before AI industrial applications emerged on a large scale, using general-purpose chips suited to parallel computing, such as FPGAs, for acceleration avoided the high investment and risk of developing custom chips such as ASICs.
However, as mentioned earlier, because such general-purpose chips are not designed specifically for deep learning, FPGAs inevitably hit bottlenecks in performance and power consumption. As AI applications scale up, these problems will become increasingly prominent. In other words, all our good ideas about AI depend on chips keeping pace with its rapid development; if chips cannot keep up, they will become the bottleneck.
Therefore, with the rapid development of AI algorithms and application areas in recent years, together with R&D achievements and the gradual maturing of manufacturing processes, ASIC chips are becoming the mainstream of AI computing chip development.
ASIC chips are specialized chips customized for specific needs. Although they sacrifice versatility, ASICs have advantages over FPGA and GPU chips in performance, power consumption and size, especially in mobile devices that need chips combining high performance, low power consumption and small size, such as the cell phones in our hands.
Because of this low versatility, however, the high R&D cost of ASIC chips can also bring high risk. Even so, once market factors are taken into account, ASIC chips are in fact a major trend in the industry's development.
Why? Because from servers and computers to driverless cars and drones, and on to all kinds of appliances in smart homes, a huge number of devices need AI computing and perceptual interaction capabilities. Because of real-time requirements, the privacy of training data and other considerations, these capabilities cannot rely entirely on the cloud and must be supported by local hardware and software platforms. The high performance, low power consumption and small size of ASIC chips meet exactly these needs.