Elon Musk’s recently established xAI data centre in Memphis achieved a significant milestone this week by activating all 100,000 advanced Nvidia chips simultaneously, as confirmed by sources familiar with the development.
This accomplishment makes the data centre, known as “Colossus,” the most powerful computer ever constructed. It also marks a noteworthy technological feat for xAI, a relatively young company that brought the colossal facility online in under six months.
While Musk has said the facility is the largest in the world, industry experts have questioned whether xAI can power and manage that many GPUs, specifically Nvidia’s H100 chips, when they all run at full capacity.
No other company has successfully interconnected 100,000 GPUs due to the limitations of the networking technology required to connect the chips and make them function as a unified computer.
This milestone was achieved earlier this week, enabling the company to train an AI model with more computational power behind it than any other known model in history. xAI is using the data centre to train the AI model that powers Grok, the company’s chatbot, which is positioned as a less filtered alternative to ChatGPT.
xAI has been aggressive in its approach, even connecting natural gas turbines as a temporary supplement to grid power so that training could continue while utility officials work to expand the facility’s power supply.
Energy has emerged as a significant challenge in developing more powerful AI models. Bloomberg reported that OpenAI CEO Sam Altman had sought assistance from US government officials to build data centres requiring five gigawatts of power, roughly the output of five nuclear power plants.
Microsoft, BlackRock, and Abu Dhabi’s MGX are collaborating on a $30 billion investment fund focused on infrastructure projects for massive data centres utilised for AI.
The competition to construct larger data centres is the driving force behind OpenAI’s pursuit of billions of dollars in new funding and its exploration of changing its corporate structure to enable larger investments.
While computing power alone does not guarantee a superior AI model, the industry generally believes that more computing power yields more capable models. It is also possible to combine multiple specialised sub-models into one larger system that routes each query to the most relevant of them, an approach often referred to as a “mixture of experts.”
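The “mixture of experts” idea can be sketched in a few lines of Python. The snippet below is purely illustrative and makes no claim about how xAI or any other lab implements it: a small gating function scores a handful of stand-in “experts” (here, random linear maps) and blends their outputs, whereas production systems typically route each token to only the top-scoring experts.

```python
# Toy mixture-of-experts sketch (illustrative only, not any vendor's actual code).
import numpy as np

rng = np.random.default_rng(0)
DIM_IN, DIM_OUT, NUM_EXPERTS = 8, 4, 3

# Each "expert" is a random linear map here; in a real model each would be a
# trained sub-network specialised for part of the input distribution.
experts = [rng.normal(size=(DIM_IN, DIM_OUT)) for _ in range(NUM_EXPERTS)]

# The gating network assigns a weight to each expert for a given input.
gate_weights = rng.normal(size=(DIM_IN, NUM_EXPERTS))

def softmax(z):
    z = z - z.max()          # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def moe_forward(x):
    """Run x through every expert and blend the outputs by the gate's scores.
    Large-scale systems usually keep only the top-k gate scores (sparse
    routing) so each input activates just a few experts."""
    gate = softmax(x @ gate_weights)              # shape: (NUM_EXPERTS,)
    outputs = np.stack([x @ w for w in experts])  # shape: (NUM_EXPERTS, DIM_OUT)
    return gate @ outputs                         # weighted combination

x = rng.normal(size=DIM_IN)
print(moe_forward(x))
```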
Historically, the more GPUs interconnected under a single roof, the more powerful the models they have produced.