Are the domestic GPU upstarts that sell themselves on AI a bubble?
Yesterday, the news that Biren Technology had completed its Series B round, bringing its cumulative financing to more than 4.7 billion yuan little more than a year after its founding, swept through the semiconductor community.
Biren Technology, established in September 2019, states on its official website that it will first focus on general-purpose intelligent computing in the cloud, then gradually catch up with existing solutions in artificial intelligence training and inference, graphics rendering, high-performance general-purpose computing and other fields, and achieve a breakthrough in domestic high-end general-purpose intelligent computing chips. The company’s founding team is said to consist of core specialists and R&D personnel from the chip and cloud computing fields at home and abroad, with deep technical expertise and distinctive industry insight in GPUs, DSAs (dedicated accelerators) and computer architecture.
Moore Threads, established in June 2020, has received billions of dollars in financing. The company is committed to building a computing platform for visual computing and artificial intelligence in China, developing world-leading, independently innovated GPU intellectual property, and helping to build China’s own high-performance computing ecosystem. Its GPU product line covers general-purpose graphics computing and high-performance computing. The core members of its founding team are said to come mainly from NVIDIA, Microsoft, Intel, AMD and Arm, and its key members have more than 10 years of experience in GPU drivers, compilers, AI chips, software algorithms and system design.
Muxi Integrated Circuit, established in September 2020, has completed a Pre-A+ round worth hundreds of millions. The founding team of this high-performance general-purpose GPU design company is said to come mainly from international companies such as AMD, with experience in designing, developing and mass-producing GPU chips on processes from 40nm to 7nm.
The founding team of Xintong Semiconductor, established in November 2019, comes from the GPU R&D team at Xi’an University of Posts and Telecommunications (Xiyou). This chip design start-up, which focuses on computer graphics and high-performance computing, will invest 150 million yuan in Nanjing to develop distinctive, independently designed domestic GPU and artificial intelligence chips that are high-performance, highly reliable and highly stable.
Hanbo Semiconductors, established in Shanghai in December 2018, has completed Series A financing totaling 50 million US dollars. Its core employees average more than 15 years of chip and software design experience, and it currently has more than 150 employees. Its products are optimized for computer vision and video processing, offering rich features and high performance per watt for a range of artificial intelligence applications.
Denglin Technology, established in November 2017, recently completed its A+ round of financing, and its first GPU+ product (a software-defined, on-chip heterogeneous general-purpose AI processor) has passed testing. In the three years since its founding, Denglin has been committed to a fully self-developed, multi-scenario AI computing platform. Its Goldwasser GPU+ product builds on the mainstream GPU architecture with a heterogeneous design based on hardware-software co-design, and delivers significantly better AI computing performance and energy efficiency than a traditional GPU.
Shanghai Tianshu Zhixin, founded in December 2015, recently completed a Series C round of 1.2 billion yuan. Its 7nm general-purpose GPU (GPGPU) cloud computing chip, the BI, was taped out in May 2020, came back from the fab in November, and was successfully “lit up” in December. Tianshu Zhixin will further accelerate research and development of cloud training and inference chips for 5G requirements, offer an alternative to today’s mainstream GPGPU ecosystem products, and help artificial intelligence reach more fields.
Most of these domestic GPU upstarts, founded by senior Chinese experts from international giants such as NVIDIA or AMD, so far have only ambitions and development plans, with no concrete products or application solutions. With so much VC money pouring in over such a short period, is this another domestic chip “bubble”?
To accurately answer and predict the prospects of this round of domestic GPU financing and entrepreneurship, we must first look at the development history of GPUs, the current status of the global and Chinese markets, and the potential for future application development.
Graphics Processing Unit (GPU) Development Process
Readers who are already familiar with GPUs can skip this part and go directly to the section “The global GPU market has entered an oligopoly pattern”.
The Graphics Processing Unit (GPU), also known as a display core, graphics card, visual processor, display chip or graphics chip, is a microprocessor dedicated to running graphics operations on PCs, workstations, game consoles and some mobile devices (such as tablets and smartphones).
The composition of the Graphics Processing Unit (GPU). (Source: Wikipedia)
“Graphics processor” is a concept first proposed by NVIDIA when it released the GeForce 256 graphics chip in August 1999. Until then, the graphics chip that handled display output in a computer was rarely seen as a separate computing unit. Its competitor ATI (later acquired by AMD) proposed a similar concept, the Visual Processing Unit (VPU). The graphics processor reduces the graphics card’s dependence on the CPU and takes over some of the work the CPU originally performed, an effect that is most obvious in 3D graphics. The core technologies used by the GPU include hardware transform and lighting (T&L), cubic environment mapping and vertex blending, texture compression and bump mapping, and a dual-texture, four-pixel, 256-bit rendering engine.
The GPU can be mounted on a dedicated circuit board to form a graphics card, embedded directly on the motherboard as a separate chip, or built into the motherboard’s northbridge; today it is also built into the CPU to form an SoC. In 2007, more than 90% of new desktop and notebook computers had integrated graphics chips, although these tended to perform worse than discrete graphics cards. After 2009, however, AMD and Intel each pushed hard to develop high-performance integrated graphics cores built into the CPU. By 2012 their performance had surpassed that of low-end discrete graphics cards, which gradually lost their market as a result. In handheld and mobile devices, as demand for graphics processing has grown, Qualcomm, Imagination, ARM and others have begun to “show their talents” in GPUs, though mostly in the form of GPU cores embedded in application processors.
Traditional CPUs such as Intel i5 or i7 processors have fewer cores and are designed for general-purpose computing. In contrast, a GPU is a special type of processor with hundreds or thousands of cores optimized to perform massive computations in parallel. While GPUs are best known for 3D rendering in games, they are especially useful for data analysis, deep learning, and machine learning algorithms. GPUs allow certain computations to be processed 10 to 100 times faster than conventional CPUs.
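As a rough illustration of the data-parallel style that GPUs are built for, the sketch below splits an element-wise computation into independent chunks and hands them to a pool of workers. It runs on ordinary CPU threads and is not GPU code; the function names (scale_and_add, parallel_scale_and_add) and the default worker count are assumptions made for this example. It is not meant to show a speedup, only that no output element depends on any other, which is the property that lets hundreds or thousands of GPU cores work at the same time.

```python
from concurrent.futures import ThreadPoolExecutor


def scale_and_add(a, xs, ys):
    """Sequential reference: a * x + y for every pair of elements."""
    return [a * x + y for x, y in zip(xs, ys)]


def parallel_scale_and_add(a, xs, ys, workers=4):
    """Same result, computed chunk by chunk on a pool of workers.

    Each chunk is independent of the others, so the chunks can be
    processed in any order or all at the same time.
    """
    n = len(xs)
    step = max(1, (n + workers - 1) // workers)  # ceiling division
    bounds = [(i, min(i + step, n)) for i in range(0, n, step)]

    def work(bound):
        lo, hi = bound
        return scale_and_add(a, xs[lo:hi], ys[lo:hi])

    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunks = list(pool.map(work, bounds))  # map preserves chunk order
    return [value for chunk in chunks for value in chunk]


xs = [float(i) for i in range(1000)]
ys = [1.0] * 1000
# The chunked version must agree with the sequential one.
assert parallel_scale_and_add(2.0, xs, ys) == scale_and_add(2.0, xs, ys)
```

On a real GPU the same idea is taken much further: each element, rather than each chunk, is typically assigned to its own hardware thread.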
AI acceleration lets GPUs and Nvidia take off
An AI accelerator is a specialized hardware accelerator or computer system designed to speed up artificial intelligence applications, especially artificial neural networks, machine vision and machine learning. Typical applications include algorithms for robotics, IoT and other data-intensive or sensor-driven tasks. AI accelerators are usually built from many processor cores and tend to focus on low-precision arithmetic, novel dataflow architectures or in-memory computing.
GPUs are specialized hardware for processing images and computing local image properties. The mathematical foundations of neural networks and image processing are similar, and both involve huge, highly parallel matrix workloads. Since AI took off in 2012, GPUs have been used more and more for machine learning, and since 2016 in particular they have become increasingly popular for AI workloads, with a clear shift toward deep learning. Whether the task is AI training in the data center or edge AI inference for autonomous driving, GPUs handle it with ease. With GPUs so widely used in AI, GPU-focused Nvidia has naturally become the darling of the AI era. It has shed the “underdog” image of a company that spent years squeezed between Intel and AMD, and has become a Wall Street favorite whose market value exceeds Intel’s.
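To make the shared mathematics concrete, here is a minimal sketch in plain Python, with no GPU involved, in which a neural-network layer and an image filter are both built from the same multiply-accumulate primitive. The function names (dot, dense_layer, convolve2d) and the tiny inputs are assumptions chosen for illustration; real frameworks implement these operations very differently.

```python
def dot(u, v):
    """Multiply-accumulate: the primitive shared by both workloads."""
    return sum(a * b for a, b in zip(u, v))


def dense_layer(weights, inputs):
    """One neural-network layer without bias or activation: W @ x."""
    return [dot(row, inputs) for row in weights]


def convolve2d(image, kernel):
    """'Valid' 2D filtering (cross-correlation, as used in CNNs)."""
    kh, kw = len(kernel), len(kernel[0])
    flat_kernel = [k for row in kernel for k in row]
    out = []
    for r in range(len(image) - kh + 1):
        out_row = []
        for c in range(len(image[0]) - kw + 1):
            # Flatten the patch under the kernel, then reuse dot().
            patch = [image[r + i][c + j] for i in range(kh) for j in range(kw)]
            out_row.append(dot(patch, flat_kernel))
        out.append(out_row)
    return out


layer_out = dense_layer([[1, 2], [3, 4]], [1, 1])
filtered = convolve2d([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[1, 0], [0, 1]])
```

Every output value in both functions is an independent dot product, which is why hardware designed for one workload maps so naturally onto the other.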
Deep learning frameworks and AI algorithms are still evolving, making it incredibly difficult to design custom hardware. Reconfigurable devices like field programmable gate arrays (FPGAs) can evolve with AI frameworks and software more flexibly than GPUs. Microsoft took the lead in using FPGA chips for AI inference acceleration, and the application prospects of FPGAs in AI acceleration also prompted Intel to acquire Altera, with the aim of integrating FPGAs into server CPUs so that CPUs can perform AI acceleration while performing general computing tasks.
While GPUs and FPGAs perform better than CPUs on AI-related tasks, custom-designed ASICs based on the Domain-Specific Architecture (DSA) concept can increase efficiency by up to 10x. Such AI accelerators use methods such as optimized memory usage and low-precision arithmetic to speed up computation and raise throughput. Internet giants such as Facebook, Amazon and Google are all designing their own AI ASICs, such as Google’s TPUs.
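The low-precision idea can be sketched in a few lines. The example below uses symmetric 8-bit quantization with one scale per vector, which is only one of several schemes in use and is an assumption here, as are the function names and sample values. It is not a description of how any particular chip or TPU works; it only shows that a dot product carried out on small integers can land close to the floating-point result.

```python
def quantize(values, bits=8):
    """Map floats to signed integers with one shared scale (symmetric)."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits
    largest = max(abs(v) for v in values)
    scale = largest / qmax if largest else 1.0
    return [round(v / scale) for v in values], scale


def quantized_dot(xs, ws, bits=8):
    """Dot product done in integer arithmetic, rescaled at the end."""
    qx, sx = quantize(xs, bits)
    qw, sw = quantize(ws, bits)
    acc = sum(a * b for a, b in zip(qx, qw))  # integer multiply-accumulate
    return acc * sx * sw


xs = [0.12, -0.53, 0.91, 0.07]
ws = [0.40, 0.25, -0.80, 0.66]
exact = sum(x * w for x, w in zip(xs, ws))
approx = quantized_dot(xs, ws)
error = abs(exact - approx)  # small relative to the result itself
```

Integer multipliers are smaller and cheaper than floating-point units, so a design that can tolerate this small error can fit more of them into the same silicon area and power budget.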
The global GPU market has entered an oligopoly pattern
According to forecasts from research institutions, the global GPU market was worth US$25.41 billion in 2020 and is expected to reach US$185.31 billion in 2027, a compound annual growth rate of 32.82%. By industry application, the market can be segmented into electronics, IT & telecom, defense & intelligence, media & entertainment, automotive, and others. The automotive segment is expected to see the highest CAGR because of the widespread use of GPUs in design and engineering applications.
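The quoted growth rate can be checked from the two market-size figures with the standard compound annual growth rate formula, (end / start) ** (1 / years) - 1. The short sketch below does the arithmetic; the function name cagr is just a label chosen for this example, and the seven-year span assumes the forecast runs from 2020 to 2027 as stated.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values."""
    return (end_value / start_value) ** (1 / years) - 1


# Market size in billions of US dollars, as quoted in the forecast.
rate = cagr(25.41, 185.31, 2027 - 2020)
print(f"Implied CAGR: {rate:.1%}")
```

The result comes out at roughly 32.8%, in line with the figure quoted in the forecast.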
In the global AI chip market, GPU accounts for about 1/3. The high-performance computing (HPC) space has historically been an important market for GPUs, with data predicting that by 2023, 10% of servers will be equipped with GPUs to accelerate AI workloads, up from less than 2% in 2018. With the accelerated convergence of HPC and AI, GPUs are redefining the data center and high-performance computing markets.
The global GPU market has entered an oligopoly. In the traditional GPU market, the combined revenue of the top three, Nvidia, AMD and Intel, almost equals the revenue of the entire GPU industry. In mobile phone and tablet GPUs, the designs of MediaTek, HiSilicon Kirin and Samsung Exynos are mainly based on the stock ARM Mali or Imagination PowerVR microarchitectures, while Qualcomm Snapdragon’s Adreno and Apple’s A series use self-developed GPU microarchitectures.
Nvidia is the recognized global leader in GPU computing, and its main GPU product line, GeForce, competes directly with AMD’s Radeon. NVIDIA’s four growth drivers are gaming, data center, professional visualization and autonomous driving, and its representative GPU solutions include GeForce, DGX, EGX, HGX, Quadro and AGX. The company’s fiscal 2021 revenue was $16.7 billion, with the gaming, data center, professional visualization and autonomous driving businesses contributing 47%, 40%, 6% and 3%, respectively. After reaching a 50% gross margin in 2014, Nvidia’s gross margin exceeded 60% in fiscal 2021.
In September 2020, NVIDIA announced the acquisition of ARM for $40 billion. If this merger is successful, the combination of NVIDIA’s leading AI computing platform and ARM’s huge processor ecosystem will create a world-class computing company in the AI era. The combined Nvidia will advance computing from the cloud, smartphones, PCs, autonomous driving and robotics to the edge IoT, expanding AI computing to global markets. At the same time, the developers of the NVIDIA computing platform will expand from 2 million to more than 15 million, thus forming the world’s largest computing platform and ecological community.
Development status and market potential of domestic GPUs
After years of exploration and development, domestic CPUs have reached a certain scale, and the industry and its ecosystem have gradually matured. Domestic CPUs represented by Loongson, Zhaoxin and Feiteng have begun to build ecosystems around their core products, growing with the help of the national Xinchuang (information technology application innovation) program and the drive for an independent semiconductor industry. However, the development of domestic GPUs lags far behind that of domestic CPUs. It was not until 2014 that Jingjia Micro successfully developed China’s first high-performance, low-power GPU chip, the JM5400.
The main reason is the GPU’s inherent dependence on the CPU. A GPU has no controller of its own and can only work under the CPU’s control. It is therefore in line with the development logic of the chip industry that domestic CPUs are one step ahead of GPUs. In addition, GPU technology is very difficult to develop, and the domestic talent gap is another reason for the slow progress of domestic GPUs.
However, the scale and potential of China’s GPU market are very large, and the country’s huge capacity for manufacturing complete systems translates into huge GPU purchases. Although production growth of complete computers and smartphones has hit bottlenecks in recent years, the sheer volume of these two product categories, their large demand for GPUs and the high value of each GPU mean that the market is still very considerable. At the same time, as server shipments grow rapidly, demand for server GPUs is growing rapidly too. According to statistics, domestic server shipments reached 3.304 million units in 2018, a year-on-year increase of 26%, with shipment growth in the Internet, telecommunications, finance and service industries all exceeding 20%. In addition, there is huge demand for GPUs in emerging computing fields such as the Internet of Things, the Internet of Vehicles and artificial intelligence.
Compilation of major domestic GPU manufacturers
In addition to the AI-focused domestic GPU start-ups mentioned at the beginning of this article, there are also domestic GPU manufacturers that have been working in specific fields for many years and are now seizing the opportunities of the Xinchuang market and “domestic substitution” to expand their application markets and accelerate the development of the domestic GPU industry.
Changsha Jingjia Micro was established in April 2006 and is currently the only listed company focused on domestic GPU chip design. Its main products include the JM5400, JM7200 and JM7201 graphics processors, whose application markets cover notebook computers, all-in-one computers, mobile workstations, blade motherboards and other desktop office and industrial control fields.
In April 2014, Jingjia Micro successfully developed the JM5400, the first domestic high-reliability, low-power GPU chip. It has fully independent intellectual property rights and broke the long-standing monopoly of foreign products in the Chinese GPU market. The first-generation JM5400 is used mainly in the military market, where it replaces American GPU chips such as the ATI M9, M54 and M72.
Jingjia Micro’s second-generation GPU, the JM7200 series, was successfully taped out in August 2018 and received its first order in March 2019. Compared with the previous generation, the JM7200’s theoretical performance has doubled and the process has moved to 28nm. However, the JM7200 still lags far behind the Nvidia GT640 of 2012, which uses the full GK107 core, in memory bandwidth, pixel fill rate and floating-point performance.
In 2019, based on the JM7200, the company launched the commercial version JM7201 to meet the high-performance display requirements of desktop systems, and fully support domestic CPUs and domestic operating systems, thus promoting the further improvement of the ecological construction of the domestic computer industry. JM7201 optimizes desktop applications in the civilian market, and introduces standard MXM and standard PCIE graphics cards, which reduce power consumption and size while ensuring performance.
Jingjia Micro’s GPUs have completed adaptation with major domestic CPU and operating system vendors such as Loongson, Feiteng, Kylin Software, Tongxin Software, Dao and Tianmai; the company has established cooperative relationships with manufacturers and carried out product testing; and it has completed mutual certification with many software and hardware vendors, including Kirin, Great Wall, Cangqiong, Baode, Chaotu, Kunlun, Zhongke Fangde, Zhongke Controllable and Ningmei, to jointly build a domestic computer application ecosystem.
Jingjia Micro GPU product line. (Source: Jingjia Micro and Founder Securities)
In December 2018, Jingjia Micro raised 1.088 billion yuan through a private placement to fund R&D and industrialization projects for high-performance general-purpose graphics processors and for general-purpose chips for consumer electronics. The high-performance general-purpose graphics processor project includes two GPU chips, the JM9231 and JM9271, which target mid-range and high-end products in different application fields. According to the company’s 2020 interim report, research and development of the next-generation graphics processor is in the back-end design stage. The JM9 series is Jingjia Micro’s first GPU with a unified rendering architecture, succeeding the local-rendering compute cores of the JM5400 and JM7200, and it has more programmable compute modules. The performance of the JM9231 and JM9271 is comparable to that of NVIDIA’s GTX1050 and GTX1080, both launched in 2016, respectively.
The launch of the JM9 series will narrow the gap between the company’s GPUs and those of the international leaders to about five years, greatly enhancing its competitiveness in the GPU field.
In October 2020, Wuhan-based Innosilicon announced a partnership with Imagination to develop “Fenghua” series of GPUs based on Imagination’s new top-of-the-line BXT multi-core architecture, using innovative SoC technologies such as multi-chip (chiplet) and GDDR6 high-speed memory.
Innosilicon “Fenghua” series graphics card GPU. (Source: Innosilicon)
For Xinchuang and computing security, the “Fenghua” series of GPUs has built-in iUnique Security PUF (physical unclonable function) encryption technology, which improves data security and resistance to attacks on computing power, and supports an independent and controllable ecosystem for desktop and data center GPU computing. The GPU offers floating-point and intelligent 3D graphics processing, a fully customized multi-stage pipeline compute core, and both high-performance rendering and AI computing power. Multiple chips can be cascaded to combine their processing capabilities, which greatly increases flexibility. It is suitable for 1080P/4K/8K high-quality display in the domestic desktop market and supports application scenarios such as VR/AR/AI, multi-channel server cloud desktops, cloud gaming and cloud office.
VeriSilicon, an IP and chip custom development service provider
VeriSilicon, which is listed on the Science and Technology Innovation Board, obtained its GPU IP through its 2016 acquisition of embedded GPU developer Vivante. In GPU IP, VeriSilicon has mastered core technologies such as support for mainstream graphics acceleration standards, an independently controllable instruction set and strong scalability, and its IP is widely used in IoT, automotive electronics, PC and other markets. VeriSilicon’s scalable Vivante GPU IP ranges from low-power IP for small IoT MCUs (the GPU Nano IP series) to powerful SoC IP for automotive and computer applications (GPU Arcturus graphics IP), providing cost-effective, high-quality graphics processor solutions across a wide range of chip sizes and power budgets.
VeriSilicon Vivante GPU IP product line and its applications. (Source: VeriSilicon)
VeriSilicon’s graphics processor technology supports the industry’s mainstream embedded graphics acceleration standards, including Vulkan 1.0, OpenGL 3.2, OpenCL 1.2 EP/FP and OpenVX 1.2. It has an independently controllable instruction set and a dedicated compiler, and supports 128 parallel shader processing units with floating-point performance of 250 billion operations per second (250 GFLOPS).
VeriSilicon’s ongoing graphics processor research and development includes the GC8400, a high-performance general-purpose graphics processor IP for automotive electronics that is still in the design verification stage. It supports double precision and has 512 parallel shader processing units.
Headquartered in Zhangjiang, Shanghai, Zhaoxin is the third microprocessor company in the world to hold an x86 license. It has mastered the three core technologies of CPU, GPU and chipset, and has the design and R&D capabilities for all three core chips and the related IP. Zhaoxin provides complete desktop machines, servers, industrial motherboards and system-level solutions, which are widely used in party and government offices, transportation, finance, energy, education and network security.
Zhaoxin’s KX-6000 is the first domestic general-purpose processor to fully integrate the CPU, GPU and chipset in a single SoC. Built on a 16nm process, it integrates high-performance graphics, supports DP/HDMI/VGA output, is compatible with mainstream APIs such as DirectX, OpenGL and OpenCL, and can drive up to three monitors at once at resolutions of up to 4K.
In the future, Zhaoxin will further upgrade the KX series processors, using a new CPU architecture, upgrading the memory from DDR4 to DDR5, and upgrading the bus from PCIe3.0 to PCIe4.0. The upgrade of memory and bus can increase the bandwidth of the graphics card and the communication speed between the CPU and the GPU, respectively. In addition to the integrated GPU, Zhaoxin also plans to release a discrete GPU chip using TSMC’s 28nm process and a TDP of 70W.
Xintong Semiconductor
Xi’an Xintong Semiconductor is committed to the research and development of high-performance GPU chips. The company’s founding team has more than 10 years of academic and engineering experience in the GPU field and is an R&D team with full-stack software and hardware capabilities. Its core technical staff come from the GPU core team at Xi’an University of Posts and Telecommunications (Xiyou) and from well-known software and hardware companies such as Intel, Mstar, Huawei Hisilicon, ZTE, RedHat, Tencent and ThoughtWorks.
The company’s first-generation GPU chip series (GenBu01) has completed adaptation with domestic CPUs and mainstream operating systems, and can be used in application scenarios such as embedded computing and equipment, office computers and industrial control displays. Together with Shenzhen Zhongwei Information, the company released the MXMGB01 graphics card based on the GenBu01, a purely domestic display solution that supports domestic BIOS firmware, domestic operating systems and domestic CPUs.
The high-performance products under development are intended for large-scale equipment such as servers and data centers. The company’s patents cover several core areas of GPU chip design, including video memory management, chip architecture modeling, graphics pipeline architecture and shader design.
Goldwasser, the first-generation product that Denglin Technology spent three years developing, entered mass production in the third quarter of 2020 and is currently undergoing integration and business testing with leading companies in the Internet and security fields. The company uses its independently developed Minsky architecture (a software-defined heterogeneous artificial intelligence computing platform), which provides CUDA/OpenCL-compatible hardware acceleration while fully supporting popular artificial intelligence frameworks and their underlying operators. Compared with NVIDIA’s current mainstream cloud inference product (the T4), Denglin’s product has a smaller die area on the same process. At the same power consumption, and depending on the AI network, computing efficiency can be 3 to 10 times higher. It also reduces the dependence of chip performance on external memory throughput.
Are Nvidia and AMD under threat?
Driven by AI-accelerated computing, the push for independent innovation in domestic semiconductors, and venture capital, the once quiet domestic GPU sector has suddenly taken off, setting off a storm in an already restless Chinese semiconductor industry. This is certainly good for the semiconductor industry and the domestic GPU industry, but the author thinks this capital-fueled storm is running a bit too hot. Even with senior GPU R&D experts and strong capital backing, it is unrealistic to build an industry from scratch that can compete with the global GPU giants. Are Nvidia and AMD feeling threatened? In my opinion, apart from losing some technical managers and having R&D staff poached, they will not see their position shaken by domestic GPUs in the short term.
Growing demand for GPUs in specific fields, such as the national Xinchuang market and industrial control, will give domestic GPU manufacturers such as Jingjia Micro opportunities to grow and expand their markets. As for the GPU upstarts that target AI as their main application market, beyond delivering GPU chips that are truly competitive, they also need to work hard on building an ecosystem and landing real AI use cases to prove that the large sums they have raised are “value for money”. Only then can they dispel the suspicion of a “bubble”.