Elon Musk's xAI Has Begun Training on 100,000 Liquid-Cooled NVIDIA H100 GPUs, the Most Powerful AI Training Cluster in the World
Elon Musk's AI venture xAI has officially begun training on NVIDIA's most powerful data center GPU, the H100. Musk proudly announced the milestone on X, calling the system 'the most powerful AI training cluster in the world!'. In the post, he said the supercluster runs 100,000 liquid-cooled H100 GPUs on a single RDMA fabric, and he congratulated the xAI, X, and NVIDIA teams on starting training in Memphis.
Nice work by @xAI team, @X team, @Nvidia & supporting companies getting Memphis Supercluster training started at ~4:20am local time.
With 100k liquid-cooled H100s on a single RDMA fabric, it’s the most powerful AI training cluster in the world!
— Elon Musk (@elonmusk) July 22, 2024
The training started at 4:20 am Memphis local time, and in a follow-up post, Musk claimed that the world's most powerful AI will be ready by December this year. According to reports, Grok 2 will be ready for release next month and Grok 3 by December.
Elon Musk, as recently seen in Memphis
xAI had been renting NVIDIA's AI chips from Oracle but decided to build its own cluster, ending an existing deal with Oracle that was supposed to run for several more years. The project now aims to build a supercomputer superior to anything Oracle could provide, using one hundred thousand high-performance H100 GPUs. Each H100 costs roughly $30,000; Grok 2 was trained on 20,000 of them, and Grok 3 reportedly requires five times that compute. For more, read the full report by WCCF TECH.
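For a sense of scale, here is a rough back-of-envelope sketch using only the figures quoted above (~$30,000 per H100, 20,000 GPUs for Grok 2, and a reported 5x scale-up for Grok 3); actual pricing and GPU counts have not been confirmed by xAI or NVIDIA.

```python
# Back-of-envelope estimate based on the figures reported in this article.
# All inputs are approximate and unconfirmed.

H100_UNIT_COST = 30_000   # reported price of ~$30,000 per H100 GPU
GROK2_GPUS = 20_000       # reported GPU count used to train Grok 2
SCALE_FACTOR = 5          # Grok 3 reportedly needs 5x Grok 2's compute

grok3_gpus = GROK2_GPUS * SCALE_FACTOR        # 100,000 -- matches the cluster size
hardware_cost = grok3_gpus * H100_UNIT_COST   # GPUs only, excluding networking/cooling

print(f"Implied Grok 3 GPU count: {grok3_gpus:,}")   # 100,000
print(f"Implied GPU spend: ${hardware_cost:,}")      # $3,000,000,000
```

By these reported numbers, the cluster represents roughly $3 billion in GPUs alone, before accounting for networking, liquid cooling, power, and the facility itself.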