AMD launches an AI chip challenge, but Nvidia still stands alone

Author: Zhao Jian

Source: Jiazi Guangnian

Image credit: Generated by Unbounded AI tools

Nvidia CEO Jensen Huang is trying to give the industry the impression that AI equals Nvidia.

Today, amid the explosion of large language models, Nvidia's AI GPUs are almost the only choice for AI training, which demands extremely high computing power.

This extreme imbalance between supply and demand has made Nvidia's GPUs hard to come by. Even OpenAI CEO Sam Altman has complained that the chip shortage is holding back the development of ChatGPT.

Jensen Huang must be happy to hear this. In 2023, driven by demand for AI, Nvidia's market value exceeded one trillion US dollars.

However, some are trying to end Nvidia's position as the unrivaled champion of artificial intelligence, one so dominant that it is left looking for a worthy opponent.

On Wednesday, AMD (Advanced Micro Devices) officially released its annual flagship chip, the Instinct MI300, at its first "artificial intelligence and data center" product launch event. It is a super chip intended to compete with Nvidia's Grace Hopper series.

The Instinct MI300 comes in two versions. The MI300X is GPU-only, designed specifically for AI model training, and packs 153 billion transistors. The MI300A is an APU (a product concept AMD proposed in 2011) that integrates multiple CPUs, GPUs and high-bandwidth memory, and packs 146 billion transistors.

The release of the Instinct MI300 means that Nvidia is no longer the only source of computing power for AI companies. AMD has indeed managed to attract some star AI unicorns, such as Hugging Face, which will optimize its models for AMD's CPUs, GPUs and other AI hardware.

The Instinct MI300 carries AMD's ambitions in artificial intelligence. AMD CEO Lisa Su recently said: "If you look out five years, you will see artificial intelligence in every AMD product, and it will become the biggest growth driver."

AMD is Nvidia's old rival. The two have competed in the GPU market for 17 years, and most rounds have ended in Nvidia's victory.

This time, can AMD, which has already proven itself once in the CPU market, repeat that success in the GPU market?

1. AMD wants Nvidia's AI crown

AMD is a long-established semiconductor company, founded in 1969. According to the ranking of global semiconductor companies released by Gartner this year, AMD ranks seventh.

The CPU is where AMD started. In 1981, AMD obtained a license for Intel's x86 processors and, riding the boom of the PC era, quickly became the industry's number two, a position it has held for decades.

Beyond the CPU, AMD has gradually built a complete "CPU+GPU+DPU+FPGA" chip portfolio through a series of mergers and acquisitions.

Some of the more important mergers and acquisitions include:

  • In July 2006, AMD spent US$5.4 billion to acquire ATI, then the No. 2 player in the GPU industry, officially entering GPU competition with Nvidia;
  • In February 2022, AMD completed its US$49.8 billion acquisition of FPGA maker Xilinx to strengthen its position in the data center business;
  • In April 2022, AMD announced the acquisition of DPU chip maker Pensando for US$1.9 billion to further expand its data center business.

AMD's business is divided into four segments: data center, client, gaming and embedded.

The data center segment includes all of AMD's server-related revenue. Client revenue comes mainly from desktop and notebook PCs; it used to be one of AMD's core businesses, but its share of revenue is no longer high. The gaming segment mainly covers the GPU product line, with Sony and Microsoft as stable major customers. The embedded segment comes mainly from the former Xilinx business.

As artificial intelligence takes off, the data center has become a business that the major cloud giants value highly and invest in heavily, and it is also a battleground for Nvidia, Intel and AMD.

On its Q1 2023 earnings call, AMD emphasized that AI is currently the company's top strategic priority and that it is committed to building a more diverse AI product lineup.

Yesterday's product launch was AMD's first to be themed "artificial intelligence and data center." Lisa Su emphasized at the event that, driven by large language models, the market opportunity for artificial intelligence is growing, and that the market could expand from about US$30 billion today to about US$150 billion by 2027.

AMD does not want to miss this AI feast, but Nvidia is a mountain it has to climb.

In its latest quarterly report, AMD's data center revenue was US$1.295 billion, compared with US$1.293 billion in the same period a year earlier, essentially no growth. In contrast, Nvidia's data center revenue hit a record high in the first quarter of this year, up 14% year-on-year to US$4.28 billion, more than three times AMD's.
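The comparison above is simple arithmetic. As a minimal sketch that uses only the revenue figures quoted in this article (the variable names are illustrative, not from any financial library), the ratio between the two companies and AMD's change against its comparison period can be checked like this:

```python
# Data center revenue in billions of US dollars, as quoted in the article.
amd_current = 1.295
amd_prior = 1.293       # AMD's comparison-period figure quoted in the article
nvidia_current = 4.28

# How many times larger Nvidia's data center revenue is than AMD's.
ratio = nvidia_current / amd_current

# AMD's change against the comparison period, in percent.
amd_change_pct = (amd_current - amd_prior) / amd_prior * 100

print(f"Nvidia / AMD: {ratio:.2f}x")
print(f"AMD change: {amd_change_pct:.2f}%")
```

On these figures, Nvidia's data center revenue comes out at roughly 3.3 times AMD's, while AMD's own change is well under 1%.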

According to quantitative hedge fund Khaveen Investments, Nvidia's share of the data center GPU market was as high as 88% in 2022, with AMD and Intel splitting the rest.

Although AMD is a veteran of the GPU market, its past GPU products were used mainly for image processing and AI inference. It entered AI training, which requires more parallel computing, relatively late.

The release of the Instinct MI300 means that AMD is trying to challenge Nvidia's dominance of the AI training market.

2. Entering AI training

The Instinct MI300 is the first high-performance "APU" for the data center, a concept pioneered by AMD.

In 2011 (the fifth year after AMD acquired ATI), AMD likened the CPU and GPU to the left and right hemispheres of the human brain in its product concept. On that basis it proposed a "CPU+GPU" heterogeneous product strategy and named the result the APU.

In this analogy, AMD sees the left brain as more like a CPU, responsible for logical processing of information, such as serial operations, numbers and arithmetic, analytical thinking, understanding, classification and sorting. The right brain is more like a GPU, responsible for parallel computing, multiple modalities, creative thinking and imagination.

Picture from Huatai Research

However, in 2011 AMD was at the bottom of its "lost decade." It failed to produce sufficiently strong products in either its CPU or GPU lines, and APU development was unsatisfactory.

In March 2020, AMD released a new microarchitecture, CDNA, designed specifically for high-performance computing and AI computing in data centers. Before that, AMD's GPUs used a single architecture to serve both gaming and computing, which made it hard to optimize for either scenario.

Instinct series products are designed for high-performance computing (HPC) and AI computing. The newly released MI300 goes head to head with Nvidia's Grace Hopper in specifications and performance.

The Instinct MI300 uses TSMC's 5nm process and comes in two versions: the MI300X is GPU-only, designed for AI model training, and packs 153 billion transistors; the MI300A is an APU that combines multiple CPUs, GPUs and high-bandwidth memory, and packs 146 billion transistors.

AMD claims that the Instinct MI300 delivers 8 times the AI performance of the previous-generation MI250, which could cut the training time of very large AI models such as ChatGPT and DALL-E from months to weeks and save millions of dollars in electricity costs.

At the event, AMD demonstrated an MI300X running the 40-billion-parameter Falcon model and had it write a poem about San Francisco. "Models are becoming more and more capacity-hungry, and you actually need multiple GPUs to run the latest large language models," Su said, noting that with more memory on AMD's chips, developers won't need as many GPUs.
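Su's point about memory can be illustrated with a back-of-the-envelope estimate. The sketch below is not AMD's methodology. It assumes 16-bit weights (2 bytes per parameter), an illustrative 20% allowance for activations and caches, and two hypothetical per-GPU memory capacities (80 GB and 192 GB) chosen only for comparison. The function name `gpus_needed` is also made up for this example.

```python
import math

def gpus_needed(num_params, bytes_per_param=2, overhead=1.2, gpu_memory_gb=80):
    """Estimate how many GPUs are needed just to hold a model in memory.

    All defaults are assumptions for illustration: 2 bytes per parameter
    (16-bit weights) and a 20% allowance on top of the weights.
    """
    required_gb = num_params * bytes_per_param * overhead / 1e9
    return math.ceil(required_gb / gpu_memory_gb)

falcon_params = 40_000_000_000  # the 40-billion-parameter model from the demo

small = gpus_needed(falcon_params, gpu_memory_gb=80)    # hypothetical 80 GB GPU
large = gpus_needed(falcon_params, gpu_memory_gb=192)   # hypothetical 192 GB GPU
print(small, large)
```

Under these assumptions the weights alone take about 80 GB, so a GPU with less memory has to share the model with at least one other GPU, while a GPU with more memory can hold it alone. Real deployments depend on precision, batch size and context length, so the numbers are only indicative.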

AMD has not yet announced pricing for the MI300, but management said on the FY23 Q1 earnings call that data center products will keep their previous cost-effective pricing, with the priority on opening up the market.

AMD expects the MI300 to launch by the end of this year. It will be installed in El Capitan, the exascale supercomputer at Lawrence Livermore National Laboratory, and used for the AI models of other large cloud customers.

Morgan Stanley analyst Joseph Moore gave an optimistic forecast, saying that AMD has seen "stable orders" from customers and that the company's AI-related revenue in 2024 is expected to reach US$400 million and may even reach US$1.2 billion, 12 times the earlier expectation.

However, although AMD is almost the only company capable of challenging Nvidia, the road ahead is bound to be very difficult.

3. Nvidia's moat

After AMD's product launch, the capital markets responded coolly. AMD's stock price fell by more than 3%, while Nvidia's rose 3.9%, pushing its market value above one trillion US dollars again.

In investors' eyes, AMD's flagship MI300 still seems unlikely to shake Nvidia's foundations.

For example, AMD did not disclose at the event which major customers have committed to its flagship chip. Kevin Krewell, principal analyst at TIRIAS Research, said: "I think the lack of any (big customer) saying it will use the MI300X or MI300A may disappoint Wall Street. They were hoping AMD would announce that it had already replaced Nvidia in some designs."

The only customers disclosed so far are Hugging Face, the open-source large-model unicorn, and Lawrence Livermore National Laboratory, which was disclosed earlier. Neither is in the same league as the cloud giants, whose demand for data center chips is far greater.

As for the chip itself, although the MI300 surpasses Nvidia on some specifications (its transistor count, for example, is higher than the A100's 54 billion), Nvidia may quickly close the gap through product iterations.

In fact, Nvidia is already doing so. On May 29, two weeks before the AMD event, Nvidia officially released the new GH200 Grace Hopper super chip at its COMPUTEX 2023 pre-show keynote. It has 200 billion transistors, more than the MI300.

More importantly, Nvidia also announced that Google, Microsoft and Meta will be the first major customers to adopt this super chip.

**Beyond its excellent products, Nvidia's other impregnable moat is its CUDA ecosystem.**

Nvidia released CUDA in 2007. With CUDA, developers can use Nvidia's GPUs for general-purpose computing, not just graphics processing.

CUDA provides an intuitive programming interface that lets developers write parallel code in C, C++, Python and other languages.

AI luminary Andrew Ng once commented: "Before CUDA, there were probably no more than 100 people in the world who could program GPUs. With CUDA, using a GPU became very easy."

AMD launched ROCm in 2016 with the goal of building an ecosystem that could replace CUDA. By 2023, CUDA had 4 million developers, including large enterprise customers such as Adobe. The more users a platform has, the stronger its lock-in. ROCm started late, and it will take time to build its developer ecosystem.

"While AMD is competitive in terms of hardware performance, people still don't believe that AMD's software solutions can compete with Nvidia," said Moor Insights & Strategy analyst Anshel Sag.

This moat is unique to Nvidia, and breaking through it will be extremely challenging for AMD.

4. AMD's success may be difficult to replicate

For AMD, facing a challenge is perhaps the thing it fears least.

The years from 2006 to 2016 were AMD's "lost decade." During this period, AMD's two biggest competitors, Intel and Nvidia, kept iterating their products at the pace of Moore's Law.

Intel followed its "Tick-Tock" strategy, delivering a major product update every two years (one year for the process, one year for the microarchitecture). Nvidia, for its part, aimed to double performance every six months and upgraded its products every half year accordingly.

AMD failed to keep up with the update rhythm of the two industry leaders, and the company was on the verge of collapse until Lisa Su took over as AMD's fifth CEO in 2014.

The AMD that Lisa Su took over was a mess. Intel had taken its notebook market, the emerging smartphone market had been carved up by Nvidia, Qualcomm and Samsung, and its server market share had shrunk from about a quarter to only 2%. AMD had to lay off about a quarter of its employees, its stock price hovered around $2, and analysts called it "uninvestable."

At the time, Intel CEO Brian Krzanich said of AMD: "This company will never come back, so we should focus on our new competitor, Qualcomm."

But everyone knows what happened next. Under Lisa Su's leadership, AMD staged an impressive turnaround in the CPU market. Not only did it steadily erode Intel's market share, but its market value also overtook Intel's for the first time in February 2022.

AMD was able to break through in the CPU market because it seized on the strategic mistakes of its rival Intel.

In chip manufacturing, AMD and Intel chose different routes. AMD divested its chip manufacturing business in 2009, spinning it off into the foundry joint venture GlobalFoundries, and has since focused only on chip design (the fabless model), which left it free to choose independent third-party foundries. Intel, by contrast, has integrated chip design and chip manufacturing (the IDM model) since its founding.

In the early days of the semiconductor industry, highly vertically integrated IDMs like Intel were the mainstream model. AMD co-founder Jerry Sanders famously said: "Real men have fabs." Ironically, AMD got the chance to mount its comeback precisely because it later divested its fabs.

After 2014, Intel's manufacturing process ran into technical difficulties. Poor yields on its 10nm chips (roughly equivalent to TSMC's 7nm) repeatedly delayed 10nm mass production, originally scheduled for the second half of 2016, and the chips were finally released in the second half of 2019. Intel also abandoned its long-standing Tick-Tock strategy because of these process problems.

Intel co-founder Gordon Moore proposed Moore's Law, yet Intel ended up suffering from the "curse of Moore's Law." This gave AMD its opportunity to overtake.

In 2018, AMD worked with GlobalFoundries to launch the Zen+ architecture on a 12nm process, moving ahead of Intel, which was still on a 14nm process, for the first time. Then in 2019, AMD worked with TSMC to launch the Zen 2 architecture on a 7nm process (roughly equivalent to Intel's 10nm), extending its lead. Intel has lagged behind AMD in manufacturing process ever since and has yet to catch up.

Today, a similar story of "number two challenges number one" seems to be playing out again, but the battlefield has moved from the CPU to the GPU. AMD is still led by Lisa Su, nicknamed "Su Ma" (Mama Su) by Chinese fans, but Nvidia under Jensen Huang is in a far stronger position than Intel was back then.

In Silicon Valley, Jensen Huang is known as an aggressive man. He likes to wear black leather jackets and is always ready to fight back. When the stock price rose to $100, he even had the Nvidia logo tattooed on his arm.

In 2016, Jensen Huang did not take AMD seriously, remarking that there was a gap of "9 and 0" between Nvidia and AMD. At the beginning of 2019, AMD rushed to release a 7nm graphics card ahead of Nvidia. Huang appeared unconcerned and said bluntly that "this graphics card is very ordinary."

Today, AMD is once again challenging Nvidia with better products. On one side is a high-spirited AMD; on the other is Nvidia, the unrivaled champion looking for a worthy opponent. The GPU war over artificial intelligence has only just begun.
