Mosharaf Chowdhury, University of Michigan – Smarter AI Training to Slash Energy Waste

AI takes a lot of energy, so how do we lighten the load?

Mosharaf Chowdhury, associate professor of computer science at the University of Michigan, gets computer chips working together toward a solution.

Dr. Mosharaf Chowdhury is a computer scientist interested in all facets of efficient systems: from software runtimes to the hardware resources they run on, both in the cloud and across the planet. His research aims to make AI/ML cheaper, faster, and more energy-efficient, thereby making it broadly accessible around the world.

AI is becoming an integral part of our daily lives – helping with everything from drafting emails to diagnosing diseases. These tools are powerful because they’re trained on oceans of data, but training AI takes a lot of electricity. In fact, creating one large AI model can use as much energy as thousands of U.S. homes do in an entire year. And as the demand for AI grows, so does its appetite for electricity.

To train these massive models, tech companies use thousands of computer chips at once. Ideally, they’d all share the work evenly and finish together. But in reality, some chips get more work to do while others get a very light load. The chips with less work finish sooner and sit idle while they wait for the slow chips to catch up, but the idling chips still use energy. On top of that, delays from hardware hiccups or slow networks can make things worse. The result? We found that up to 30% of the energy used in training AI can go to waste.

As part of my lab’s ML Energy initiative, we have built Perseus, a simple, effective way to cut energy waste in AI training. Instead of letting faster chips waste energy, Perseus slows them down just enough so all chips finish together. That means less waste, without slowing training down or hurting the model’s accuracy. In real-world tests across a wide range of AI models, Perseus reduced the bulk of the energy waste in training.
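
For readers who want to see the idea in numbers, here is a minimal toy sketch in Python. It is not Perseus's actual algorithm. It assumes a simplified power model (dynamic power growing with the cube of clock speed, a common rule of thumb, plus a constant draw while the chip is powered on), and every name and constant in it (`chip_energy`, `training_step_energy`, `static_power`, `k`, `f_min`, the workloads) is hypothetical and chosen only for illustration.

```python
def chip_energy(work, freq, deadline, static_power=50.0, k=100.0):
    """Energy one chip uses during a training step (toy model, assumed).

    Dynamic power is modeled as k * freq**3 while the chip is busy;
    static_power is drawn for the whole step, even while the chip idles.
    """
    busy_time = work / freq
    assert busy_time <= deadline + 1e-9, "chip would miss the deadline"
    dynamic = k * freq**3 * busy_time
    static = static_power * deadline
    return dynamic + static


def training_step_energy(workloads, f_max=1.0, f_min=0.5, slow_down=False):
    """Return (step_time, total_energy) for one step across all chips."""
    # The step ends when the most heavily loaded chip finishes at full speed.
    deadline = max(workloads) / f_max
    total = 0.0
    for work in workloads:
        if slow_down:
            # Run just fast enough to finish at the deadline,
            # within the chip's allowed frequency range.
            freq = min(f_max, max(f_min, work / deadline))
        else:
            # Baseline: race at full speed, then sit idle.
            freq = f_max
        total += chip_energy(work, freq, deadline)
    return deadline, total


# Hypothetical, uneven workloads for four chips.
workloads = [10.0, 8.0, 6.0, 10.0]
t_fast, e_fast = training_step_energy(workloads)
t_slow, e_slow = training_step_energy(workloads, slow_down=True)

print(f"step time: {t_fast:.1f} vs {t_slow:.1f}")
print(f"energy:    {e_fast:.0f} vs {e_slow:.0f}")
```

In this toy setting the step takes the same time either way, because the busiest chips still run at full speed, while the lightly loaded chips use less energy by running slower instead of racing ahead and waiting. Real hardware has more complicated power behavior, so the size of the saving here should not be read as a prediction.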

By making AI training smarter, we are reducing its cost and making this technology more accessible to everyone.

Read More:
The ML.ENERGY Initiative