At the beginning of November 2024, the US Federal Energy Regulatory Commission (FERC) rejected Amazon’s request to buy an additional 180 megawatts of power directly from the Susquehanna nuclear power plant for a data center located nearby. The commission’s reasoning was that letting a data center buy power directly, rather than drawing it from the grid like everyone else, would work against the interests of other grid users.
Demand for power in the US has been flat for nearly 20 years. “But now we’re seeing load forecasts shooting up. Depending on [what] numbers you want to accept, they’re either skyrocketing or they’re just rapidly increasing,” said Mark Christie, a FERC commissioner.
Part of the surge in demand comes from data centers, and their increasing thirst for power comes in part from running increasingly sophisticated AI models. As with all world-shaping developments, what set this trend into motion was vision—quite literally.
The AlexNet moment
Back in 2012, Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton, AI researchers at the University of Toronto, were busy working on a convolutional neural network (CNN) for the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), an image-recognition contest. The contest’s rules were fairly simple: A team had to build an AI system that could categorize images sourced from a database of over a million labeled pictures.
The task was extremely challenging at the time, so the team figured they needed a really big neural net—way bigger than anything other research teams had attempted. AlexNet, named after the lead researcher, had multiple layers, with over 60 million parameters and 650,000 neurons. The problem with a behemoth like that was how to train it.
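For a sense of that scale, modern deep learning libraries ship reimplementations of the architecture, so the parameter count is easy to check. The sketch below assumes PyTorch and torchvision are installed and uses torchvision’s AlexNet variant, which follows the original’s overall design of five convolutional layers followed by three fully connected ones, though some details differ from the 2012 two-GPU network.

```python
from torchvision.models import alexnet

# torchvision's AlexNet variant: five convolutional layers followed by three
# fully connected layers, close to (but not identical with) the 2012 network.
model = alexnet()  # randomly initialized; we only care about its shape

total = sum(p.numel() for p in model.parameters())
print(f"parameters: {total:,}")  # roughly 61 million
```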
What the team had in their lab were a few Nvidia GTX 580s, each with 3GB of memory. As the researchers wrote in their paper, AlexNet was simply too big to fit on any single GPU they had. So they figured out how to split AlexNet’s training phase between two GPUs working in parallel—half of the neurons ran on one GPU, and the other half ran on the other.
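Their two-GPU scheme was an early form of what is now called model parallelism: instead of copying the whole network onto each card, each card holds a different slice of the network’s neurons. The sketch below illustrates that idea in modern PyTorch terms rather than the authors’ original CUDA code; the layer width, batch size, and device names are illustrative placeholders, and it splits a single fully connected layer across two GPUs.

```python
import torch
import torch.nn as nn

class SplitLinear(nn.Module):
    """One fully connected layer whose output neurons are divided between
    two GPUs, loosely mirroring AlexNet's half-and-half split."""

    def __init__(self, in_features, out_features, dev0="cuda:0", dev1="cuda:1"):
        super().__init__()
        half = out_features // 2
        # The first half of the neurons lives on GPU 0, the rest on GPU 1.
        self.half0 = nn.Linear(in_features, half).to(dev0)
        self.half1 = nn.Linear(in_features, out_features - half).to(dev1)
        self.dev0, self.dev1 = dev0, dev1

    def forward(self, x):
        # Each GPU computes its share of the neurons on its own copy of the
        # input; the partial activations are then gathered onto one device.
        y0 = self.half0(x.to(self.dev0))
        y1 = self.half1(x.to(self.dev1))
        return torch.cat([y0, y1.to(self.dev0)], dim=-1)


if torch.cuda.device_count() >= 2:
    layer = SplitLinear(4096, 4096)
    out = layer(torch.randn(128, 4096))
    print(out.shape)  # torch.Size([128, 4096])
```

In AlexNet itself the split applied to the convolutional feature maps as well, and the paper notes that the two GPUs exchanged data only at certain layers, which kept the communication overhead manageable on the 3GB cards.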