While it’s widely known that the computers powering generative AI consume a ton of power and water, the technology’s actual environmental impact is often harder to pin down.
In a push toward greater transparency, French model builder Mistral AI this week published a peer-reviewed report, produced in collaboration with consulting firm Carbone 4 and France’s ecological transition agency (ADEME), that attempts to quantify the environmental impact of its Mistral Large 2 LLM across three key metrics: greenhouse gas (GHG) emissions, water consumption, and materials use.
In the 18 months since Mistral started work on the model, training it and running it (the latter a process known as inference) accounted for the lion’s share of GHG emissions (85.5 percent) and water consumption (91 percent).
By Mistral’s estimate, training the 123 billion parameter model produced approximately 20 kilotons of CO2 equivalents (CO2e) and consumed 281,000 cubic meters of water – the equivalent of roughly 112 Olympic-sized swimming pools.
“These figures reflect the scale of computation involved in GenAI, requiring numerous GPUs, often in regions with carbon-intensive electricity and sometimes water stress,” the company explained.
Mistral also tracked the materials consumed over this period by the datacenters, servers, and other hardware involved. Interestingly, less than two-thirds (61 percent) of that consumption was attributed to manufacturing, transportation, and end-of-life. Mistral notes 29 percent occurred during the training and inference stage, suggesting a fairly high rate of hardware failures.
El Reg reached out to Mistral for clarification; we’ll let you know what we find out.
As you might expect, actually running the completed model generated far fewer CO2 equivalents and consumed a fraction of the water for each individual request.
To synthesize a 400-token response (about a page’s worth of text), Mistral Large 2 consumed about 45 ml of water and generated about 1.14 grams of CO2e. According to the startup, that’s roughly the water needed to grow a small pink radish, and about the GHG emissions of watching a streaming video for 10 seconds in the US or 55 seconds in France (presumably because France generates a much higher proportion of its electricity from sources that don’t emit CO2, like nuclear power).
That might not sound like much, but remember that these figures are proportional to the user base. The more people pinging the model, the bigger its environmental impact.
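To get a feel for how quickly those per-request figures add up, here’s a quick back-of-envelope calculation using the numbers above; the daily request volume is an arbitrary assumption for illustration, not something Mistral has disclosed.

```python
# Rough scaling of Mistral's per-response figures (45 ml of water, 1.14 g CO2e
# per 400-token response). The request volume below is a made-up illustration,
# not a number Mistral has published.
WATER_ML_PER_RESPONSE = 45        # millilitres per 400-token response
CO2E_G_PER_RESPONSE = 1.14        # grams of CO2 equivalent per response

requests_per_day = 10_000_000     # hypothetical daily request volume

water_m3_per_day = requests_per_day * WATER_ML_PER_RESPONSE / 1_000_000    # ml -> m³
co2e_tonnes_per_day = requests_per_day * CO2E_G_PER_RESPONSE / 1_000_000   # g -> tonnes

print(f"{water_m3_per_day:,.0f} m³ of water and {co2e_tonnes_per_day:.1f} t CO2e per day")
# At 10 million requests a day: 450 m³ of water and 11.4 t CO2e
```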
AI’s thirst confirmed
The findings closely align with prior research into AI’s drinking habits.
“Mistral AI’s disclosure matches extremely well with our earlier estimate that 10-50 medium-sized responses from a medium-sized LLM (GPT-3-175B) consume 500 ml of water,” Shaolei Ren, associate professor of electrical and computer engineering at UC Riverside, told The Register.
Ren and his team have been studying AI’s impact on things like air quality and public health for several years now. In 2023, the team published a detailed report estimating the amount of water consumed during training and inference.
As a quick refresher, AI datacenters consume a lot of power and produce heat as a byproduct. To keep this equipment from overheating, datacenters often employ a form of air conditioning called cooling towers, which function like industrial-scale swamp coolers, evaporating water to chill the air.
As we’ve previously explored, these cooling towers are extremely energy-efficient, requiring a fraction of the power needed for more traditional refrigerant-based systems, but can be problematic in drought-prone regions where water is scarce and expensive.
It’s worth noting that even if a datacenter itself doesn’t consume water directly, the power plants that supply its electricity often do. The massive cooling towers found outside nuclear power plants are among the most recognizable, but they’re also commonly employed at gas- and coal-fired plants. Because of this, reducing datacenter water consumption isn’t as simple as switching to alternative thermal management tech, like closed-loop liquid coolers, dry coolers, or conventional AC units.
Key insights
According to Mistral, the study showed that AI’s environmental impact is heavily influenced by its geographic location. Training models in cool climates with an ample supply of carbon-free energy can significantly reduce their carbon footprint and water consumption.
Mistral also contends that customers can minimize the environmental impact of GenAI by opting for smaller, use-case-specific models, which need fewer resources to train and run. (Bonus: They tend to work better, too.)
“Benchmarks have shown impacts are roughly proportional to model size: a model 10 times bigger will generate impacts one order of magnitude larger than a smaller model for the same amount of generated tokens,” the model builder wrote. “This highlights the importance of choosing the right model for the right use case.”
The AI startup also suggested grouping queries — likely a reference to a technique called continuous batching, which seeks to pack as many compute-heavy prefill operations into a single run as possible — to minimize wasted compute cycles.
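For the curious, the sketch below shows the scheduling idea in heavily simplified form: keep a rolling batch of requests so each decode step is shared, and slot new requests in as soon as earlier ones finish. The mock decode function stands in for the model; this is an illustration of the concept, not a description of Mistral’s actual serving stack.

```python
import random
from collections import deque

# A toy illustration of continuous batching: rather than running each request
# to completion on its own, the server keeps a rolling batch and slots new
# requests in as soon as earlier ones finish. The "model" here is a random
# token generator -- this is a scheduling sketch, not a real inference engine.

MAX_BATCH = 4   # how many sequences share each decode step (assumed value)

def decode_step(active):
    """Pretend batched forward pass: append one new token to every sequence."""
    for seq in active:
        seq["tokens"].append(random.randint(0, 31_999))  # stand-in for sampling

# Ten queued requests, each wanting a different number of new tokens.
pending = deque(
    {"id": i, "tokens": [], "limit": random.randint(3, 8)} for i in range(10)
)
active, finished = [], []

while pending or active:
    # Admit new requests into the in-flight batch whenever a slot is free.
    while pending and len(active) < MAX_BATCH:
        active.append(pending.popleft())

    decode_step(active)  # one step advances every active sequence together

    # Retire completed sequences immediately so their slots can be reused.
    finished += [s for s in active if len(s["tokens"]) >= s["limit"]]
    active = [s for s in active if len(s["tokens"]) < s["limit"]]

print(f"Completed {len(finished)} requests using shared decode steps")
```

The payoff is that the GPU isn’t left idling on a half-empty batch while it waits for the longest request to finish.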
While not mentioned in the blog post, techniques like speculative decoding or sparse model architectures such as mixture of experts (MoE) could also serve to reduce AI’s environmental impact by increasing the number of tokens generated from the same compute. Mistral Large 2 is a dense model, but the startup is a pioneer in MoE models.
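As a rough illustration of speculative decoding, the toy sketch below mocks out both the cheap draft model and the expensive target model; a production implementation would verify draft tokens against the target model’s full probability distribution, which this greedy variant glosses over.

```python
# Toy greedy speculative decoding: a cheap "draft" model proposes a short run
# of tokens, and the expensive "target" model checks them all in one pass,
# keeping the longest agreeing prefix. Both models are mocked here; the point
# is that several tokens can be accepted per call to the large model.

DRAFT_LEN = 4  # tokens proposed by the draft model per round (assumed value)

def draft_propose(context, n):
    """Stand-in for a small, cheap model: propose n next tokens (sometimes wrong)."""
    return [(context[-1] + i + 1 + (i == 3)) % 100 for i in range(n)]

def target_greedy_tokens(context, proposed):
    """Stand-in for one batched pass of the large model: for each draft position,
    return the token the target model would have picked itself."""
    out = []
    for i in range(len(proposed)):
        prefix = context + proposed[:i]
        out.append((prefix[-1] + 1) % 100)  # mock deterministic "greedy" choice
    return out

def speculative_generate(context, max_new_tokens):
    generated = []
    while len(generated) < max_new_tokens:
        proposal = draft_propose(context + generated, DRAFT_LEN)
        checked = target_greedy_tokens(context + generated, proposal)

        # Accept draft tokens while they match the target's own choice; on the
        # first mismatch, keep the target's token instead and start a new round.
        for drafted, verified in zip(proposal, checked):
            generated.append(verified)
            if drafted != verified or len(generated) >= max_new_tokens:
                break
    return generated[:max_new_tokens]

print(speculative_generate([7], 12))
```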
A call for reporting standards
“They didn’t include every detail, but it’s a step towards greater transparency and that’s very important for understanding the real true environmental impact of AI computing,” Ren said of the Mistral AI report. “Having a standardized reporting or measurement methodology is clearly very important.”
Ren notes that while AI’s water consumption isn’t all that large compared to some other sectors, like agriculture, you need to understand its impact before you can take steps to minimize it.
While the report is a step in the right direction, Mistral admits there’s still room for improvement. The startup notes that the general lack of reporting standards for AI meant it had to make certain assumptions.
In releasing this information, it’s clear Mistral AI would like to see other model devs follow suit.
“To improve transparency and comparability, AI companies ought to publish the environmental impacts of their models using standardized, internationally recognized frameworks,” the company wrote. “This could enable the creation of a scoring system, helping buyers and users identify the least carbon-, water-, and material intensive models.”
In particular, Mistral has identified three key details: (1) the impact of training the model, (2) the ongoing environmental cost of running that model, and (3) the portion of the model’s lifespan spent on inference versus training.
This last one, they contend, is essential to ensuring the resources sunk into training a model are effectively amortized and not wasted.
Mistral argues the first two are essential knowledge for users, developers, and policymakers, while the third, they suggest, could either remain an internal metric or be released to the public. ®