AMD talks 1.2 million GPU AI supercomputer to compete with Nvidia — 30X more GPUs than world's fastest supercomputer (2024)


Demand for more computing power in the data center is growing at a staggering pace, and AMD has revealed that it has had serious inquiries to build single AI clusters packing a whopping 1.2 million GPUs or more.

AMD's admission comes from a lengthy discussion The Next Platform had with Forrest Norrod, AMD's EVP and GM of the Datacenter Solutions Group, about the future of AMD in the data center. One of the most eye-opening responses was about the biggest AI training cluster that someone is seriously considering.

When asked if the company has fielded inquiries for clusters as large as 1.2 million GPUs, Forrest replied that the assessment was virtually spot on.

Morgan: What’s the biggest AI training cluster that somebody is serious about – you don’t have to name names. Has somebody come to you and said with MI500, I need 1.2 million GPUs or whatever.

Forrest Norrod: It’s in that range? Yes.

Morgan: You can’t just say “it’s in that range.” What’s the biggest actual number?

Forrest Norrod: I am dead serious, it is in that range.

Morgan: For one machine.

Forrest Norrod: Yes, I’m talking about one machine.

Morgan: It boggles the mind a little bit, you know?

1.2 million GPUs is an absurd number (mind-boggling, as Forrest quips later in the interview). AI training clusters are often built with a few thousand GPUs connected via a high-speed interconnect across several server racks or fewer. By contrast, creating an AI cluster with 1.2 million GPUs seems virtually impossible.

We can only imagine the pitfalls someone will need to overcome to try and build an AI cluster with over a million GPUs, but latency, power, and the inevitability of hardware failures are a few factors that immediately come to mind.

AI workloads are extremely sensitive to latency, particularly tail latency and outliers, wherein certain data transfers take much longer than others and disrupt the workload. Additionally, today's supercomputers have to mitigate GPU and other hardware failures that, at their scale, occur every few hours. Those issues would become far more pronounced when scaling to 30X the size of today's largest known clusters. And that's before we even touch on the nuclear power plant-sized power delivery required for such an audacious goal.
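To get a rough feel for how failure frequency scales, here is a back-of-the-envelope sketch. It assumes failures are independent and equally likely on every GPU, so the expected time between failures shrinks in proportion to cluster size. The three-hour baseline is a hypothetical stand-in for "every few hours," not a measured figure, and the function name is purely illustrative.

```python
# Back-of-the-envelope sketch: how the interval between hardware failures
# might shrink as a cluster grows, ASSUMING independent failures with the
# same rate on every GPU (a simplification, not a measured model).

def scaled_failure_interval_hours(baseline_interval_hours: float,
                                  scale_factor: float) -> float:
    """Expected hours between failures after growing the cluster by scale_factor."""
    return baseline_interval_hours / scale_factor


BASELINE_INTERVAL_HOURS = 3.0  # hypothetical stand-in for "every few hours"
SCALE_FACTOR = 30              # the "30X" figure quoted in the article

interval_hours = scaled_failure_interval_hours(BASELINE_INTERVAL_HOURS, SCALE_FACTOR)
interval_minutes = interval_hours * 60

print(f"Roughly one failure every {interval_minutes:.0f} minutes")
```

Under those assumptions, a failure every three hours becomes a failure every six minutes or so, which hints at why fault tolerance would dominate the design of a machine this large.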

Even the most powerful supercomputers in the world don't scale to millions of GPUs. For instance, the fastest operational supercomputer right now, Frontier, "only" has 37,888 GPUs.
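The "30X" comparison in the headline can be checked with simple arithmetic. The short sketch below uses only the two GPU counts quoted in this article; the variable names are illustrative.

```python
# Quick arithmetic check of the headline's "30X" comparison,
# using only the GPU counts quoted in the article.

FRONTIER_GPUS = 37_888      # Frontier's GPU count, as cited above
PROPOSED_GPUS = 1_200_000   # the cluster size discussed in the interview

gpu_ratio = PROPOSED_GPUS / FRONTIER_GPUS

print(f"{gpu_ratio:.1f}x as many GPUs as Frontier")
```

The result lands a little above 30, so "30X" is a slightly conservative rounding.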

The goal of million-GPU clusters speaks to the seriousness of the AI race that is molding the 2020s. If it is in the realm of possibility, someone will try to do it if it means greater AI processing power. Forrest didn't say which organization is considering building a system of this scale but did mention that "very sober people" are contemplating spending tens to hundreds of billions of dollars on AI training clusters (which is why million-GPU clusters are being considered at all).


24 Comments
Comment from the forums

  • A Stoner

    And even with that, they would not even have the processing power of an insect at their disposal. We still have no factual intelligence from all the AI spending that has happened. No AI knows anything at all as of yet.


  • JRStern

    Well Musk was just out there raising money for 300,000 GPUs, we're talking billions or trillions before they're all installed and usable, not to mention gigawatts of power to run. OTOH this is crazy stuff, IMHO, and perhaps Elon isn't hip to the news that much smaller LLMs are now being seen as workable so maybe nobody will need a single training system with more than 300 or perhaps 3000 GPUs, to do a retrain within 24 hours. And maybe whole-hog retrains won't be as necessary anymore, either.

    So AMD is just trolling, is what this comes down to, unlikely to actually build it out.


  • Pierce2623

    The record Dynex set recently was only a quantum record and the record they beat wasn’t even real quantum computing. The record they beat only involves 896 GPUs


  • jeremyj_83

    It literally said "For instance, the fastest operational supercomputer right now, Frontier, "only" has 37,888 GPUs." in the article. Frontier has 1.1 exaFLOPs of computing power just so you know.


  • DS426

    Usually business is all about ROI and profit but... really, c'mon, someone show me the math on how investments like this pay off without losing money?? We're also talking about cooling, electric bills, sys admins, and so on, so... wtf is so magically about a (relatively?) well-trained and advanced AI LLM or such that costifies this?

    Seriously, not being a hater just to hate but again being on the business side of things in IT, I need to see some math.

    On another note, at least some folks are seeing the value in not paying ridiculous cash just to have "the best" (nVidia) whereas AMD can honestly and probably provide a better return on investment. Kind of that age-old name brand vs. generic argument.

    Still mindblown over here. How many supercomputers have more than 1.2 million CPU's? I know this doesn't account for core counts but holy smokes, we're clearly not talking apples to apples here!! Pretty sure a mini power plant is literally needed to sit beside a datacenter/supercomputing facility like this.


  • oofdragon

    I honestly don't get it. Ok so someone like Elon is considering 300 thousand GPUs like Blackwell's, spending in the order of billions just to buy them, then you have the electric bill and maintenance as well every month. In what way can he possibly make a profit out of this situation?


  • abufrejoval

    Nice to see you reading TNP: it's really one of the best sites out there and on my daily reading list.

    And so are the vultures next door :-) (the register)


  • ThomasKinsley

    Not to get all cynical, but this sounds like a bit of a stretch to me. The reporter gave the random number 1.2 million and the AMD staff member responded with, "It’s in that range? Yes." A range needs more than one number. Are we talking 700,000? 1 million? 1.4 million? There's no way to know.


  • kjfatl

    If Musk is serious about the 300,000 GPU's it makes perfect sense that the design would support an upgrade path where compute modules could be replaced with future modules with 2X or 4X the capacity.
    The most obvious use for such a machine is for constant updates to self-driving vehicle software. Daily or even by-the-minute upgrades are needed for this to be seamless. This is little different than what Google or Garmin does with maps. When 'interesting' data is seen by vehicles it would be sent to the compute farm for processing. Real-time data from a landslide just before the driver ran off the side of the road would qualify as 'interesting'. Preventing the crash in the next landslide would be the goal.

    This sort of system is large enough to justify custom compute silicon supporting a limited set of models. This alone might cut the hardware requirements by a factor of 4. Moving to Intel 14A or the equivalent from TSMC or Samsung might give another factor of 8 toward density. Advanced packaging techniques might double it again. Combining all of these could provide a machine with the same footprint and power envelope of today's supercomputer with 30,000 GPUs.


  • shawman123

    How much power would a million GPUs consume? It seems off the charts if all of them are fully used!

