The Unknown Toll Of The AI Takeover
June 20, 2024 3:16 PM

As artificial intelligence guzzles water supplies and jacks up consumers’ electricity rates, why isn’t anyone tracking the resources being consumed?

In early May, Google announced it would be adding artificial intelligence to its search engine. When the new feature rolled out, AI Overviews began appending summaries to the top of search results, whether you wanted them or not — and they came at an invisible cost. Investigative journalist Lois Parshley explores this topic for The Lever. link.
posted by suburbanbeatnik (41 comments total) 30 users marked this as a favorite
Could the answer be that capitalism requires the extraction of resources and externalizing every cost it can?

Off to read the article....
posted by GenjiandProust at 3:27 PM on June 20 [13 favorites]

I'm curious how they came up with the 3Wh / search number, because a 300W consumer GPU running a small LLM can spit out something like 100 tokens/s, which is less than 0.001 Wh/token. Those AI summaries are around a paragraph at most, so maybe 200 tokens max, so let's just round up to 300 tokens to get a generous estimate of 0.3 Wh per search, which is still a tenth of the 3Wh number quoted in the article. To get that number Google would either need to run an (unnecessarily?) much larger model or somehow have less efficient hardware than consumers.
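
For anyone following along, here's that estimate in code (the wattage, token rate, and token count are the rough assumptions just stated, not measured figures):

```python
# Back-of-envelope: energy per AI search summary on a consumer GPU.
# All inputs are rough assumptions, not measurements.
GPU_WATTS = 300           # power draw under load
TOKENS_PER_SEC = 100      # generation speed for a small local model
TOKENS_PER_SUMMARY = 300  # generous guess for a paragraph-long answer

wh_per_token = GPU_WATTS / TOKENS_PER_SEC / 3600   # watt-seconds -> Wh
wh_per_summary = wh_per_token * TOKENS_PER_SUMMARY

print(f"{wh_per_token:.5f} Wh/token")     # ~0.00083, i.e. under 0.001
print(f"{wh_per_summary:.2f} Wh/summary") # ~0.25, a tenth of the quoted 3 Wh
```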
posted by Pyry at 3:46 PM on June 20 [3 favorites]

Inference is cheap.

And believe it or not, they don't even need to run the model every time someone asks how many rocks they should eat... Caching common queries greatly reduces energy costs.
posted by kaibutsu at 4:02 PM on June 20 [2 favorites]

The exact terms are secret, because of a 2017 agreement to hide a former Amazon discount from the public, but it’s estimated to shift $135 million a year onto the utility’s customers.

Numerous gnarly quotes about regular folks funding all this nonsense, and tons of direct government subsidies-- at some point, if we are already paying for these companies' operations, we should just publicly own and control them too, to work for our interests instead.
posted by GoblinHoney at 4:02 PM on June 20 [7 favorites]

Pyry: "I'm curious how they came up with the 3Wh / search number, because a 300W consumer GPU running a small LLM can spit out something like 100 tokens/s, which is less than 0.001 Wh/token."

They may be factoring in the cost of training the model to begin with, or the cost of building the infrastructure needed to serve it at scale (even if serving the query itself is relatively cheap once everything is up and running).
posted by Rhaomi at 4:11 PM on June 20 [1 favorite]

Here's an older comment I made on misreporting of data center energy use.

In short, misreporting on this topic is rampant. Some people publish bad estimates which eventually get retracted, and reporters go crazy writing shocking articles before the retractions land. Citations on those stories continue to circulate after the retractions. A perfect storm.
posted by kaibutsu at 4:12 PM on June 20 [11 favorites]

I have yet to find a research paper on this with source data that passes the smell test - as in, neither hand-wavey guesses nor simply taking Google/OpenAI at their word (or even ChatGPT itself! Since that’s where one of the most oft-cited figures actually comes from!) - and I have done some serious digging around Arxiv looking for answers. There are a few papers, but in every case they offered either vague estimates or - in the one solid study I found - data sourced from 2020 (effectively the stone age).

From what I have been able to gather by taking things like number of A100s / H100s used for GPT-4 inference and multiplying by their typical draw under load, though, this is the first popular article I’ve read where the stated cost of inference per query (3 Wh) is probably within an order of magnitude of the actual real-world figure. And yes it’s pretty low, as kaibutsu said.

My workstations / devkits for work chew up something like 400x that much every hour, 16 hours a day. Which accounts for ~60% of my electricity consumption (most of the rest is central air conditioning), and I consume about half as much power as the typical American household.

The upcoming models seem likely to come with absolutely murderous power requirements attached (murderous as in someone ought to be getting murdered over it), but the present just isn’t all that bad: with nVidia announcing a tripling of efficiency with their newly released architecture, we’d almost appear to be treading water vs the year over year increase in training costs (eg the big new model OpenAI trained this spring is widely rumored to have consumed 200 GWh to train, or four times the 50GWh needed to train GPT-4 in 2022).
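
Quick arithmetic on those figures (the 200 GWh and 50 GWh numbers are rumors, and the 3x efficiency figure is the vendor's own claim, so treat all of this as illustration):

```python
# Year-over-year training energy growth vs. claimed hardware efficiency.
# All inputs are rumored or vendor-claimed figures, not verified data.
gpt4_training_gwh = 50         # rumored, 2022
new_model_training_gwh = 200   # rumored, spring 2024
claimed_efficiency_gain = 3.0  # vendor's claim for the new architecture

growth = new_model_training_gwh / gpt4_training_gwh   # 4x more energy
net = growth / claimed_efficiency_gain                # ~1.33: almost treading water
print(growth, round(net, 2))
```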

I just don’t believe any sane action any of us can take will stop that usage from growing explosively into something horrible and legit dangerous on a 2-3 year horizon. If I had anything hopeful to say it would go here.
posted by Ryvar at 4:12 PM on June 20 [7 favorites]

To get that number Google would either need to run an (unnecessarily?) much larger model or somehow have less efficient hardware than consumers.
factoring (rock) consumers into the equation?
posted by HearHere at 4:13 PM on June 20 [1 favorite]

I got curious about this question recently, but more from the angle of how generative AI use does or does not compare to and/or seem likely to accelerate extant data center costs and impacts caused by general computing. I have a bunch of articles saved up for a post one day when I have time to process and put together a good post, but here’s one that folks might find interesting: “Measuring the Carbon Intensity of AI in Cloud Instances
posted by cupcakeninja at 4:18 PM on June 20 [1 favorite]

because a 300W consumer GPU running a small LLM can spit out something like 100 tokens/s, which is less than 0.001 Wh/token

This is where drawing from our experiences with open source / consumer-accessible models can steer us wrong: assuming the leaked specs are correct, GPT-4 is a mixture of experts model with 16 experts with 110-billion parameters each. Now, one of the benefits of MOE is that you only run 2 or 3 of those for a given query, but that’s still unlike anything in the consumer space. The largest consumer-facing models are typically 70 billion parameters total, though nVidia just released Nemotron-4 340B (more intended to generate synthetic training data than as a chatbot) and Meta is supposed to eventually release a 400B version of Llama 3.

Point is: the scale is very, very different when you’re targeting 6.5 kW A100 clusters rather than 300W 4070 TIs. Add to that: Google has their own hardware stack, they’re not nVidia-based like everyone else so a true apples-to-apples comparison is probably impossible.
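
Putting rough numbers on that scale gap (all parameter counts here are the leaked/rumored figures above, none confirmed):

```python
# Scale comparison: rumored GPT-4 mixture-of-experts vs. a large
# consumer-facing dense model. Figures are leaks/rumors, illustration only.
experts_total = 16
params_per_expert_b = 110     # billions of parameters per expert
experts_active_per_query = 2  # MoE runs only a few experts per query

active_b = experts_active_per_query * params_per_expert_b  # 220B run per query
resident_b = experts_total * params_per_expert_b           # 1760B held in memory
consumer_dense_b = 70                                      # typical big local model

print(active_b, resident_b, round(active_b / consumer_dense_b, 1))
```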
posted by Ryvar at 4:28 PM on June 20 [4 favorites]

In short, misreporting on this topic is rampant. Some people publish bad estimates which eventually get retracted, and reporters go crazy writing shocking articles before the retractions land. Citations on those stories continue to circulate after the retractions. A perfect storm.

And then LLMs are trained on those bogus estimates.
posted by It's Never Lurgi at 4:30 PM on June 20 [6 favorites]

or somehow have less efficient hardware than consumers
I’d expect consumers to regularly have more efficient hardware than Google per watt. The two ways of getting to “efficient” are “do a lot more things” or “use a lot less power,” and consumer hardware has really focused on “use a lot less power.” The most efficient CPUs tend to have pretty modest absolute benchmarks.

I’m not Google, so I can only speculate about the economics at that scale, but I suspect paying for more U of rack space to rack only efficient, low-power CPUs will add up faster than paying for the electricity to cram more compute into each U, especially when the CPU isn’t the only thing in each box consuming power. Maximizing compute-per-watt is not necessarily the same as maximizing compute-per-dollar. When you’re an organization that big, you’ve almost certainly got somebody balancing performance, power, density and anticipated maintenance costs in the context of your own deployment environment and coming up with buying decisions that are “efficient” only in dollars, and highly idiosyncratic.
posted by gelfin at 4:39 PM on June 20 [5 favorites]

I think we need to clear up the whole water supply thing. It requires an incredible amount of water to cool data centers, and that water is heated up and carried away to.... an evaporation pool or directly back into the water supply. It goes through the computer facility in the same types of pipe that your house and municipality use.

The water used in creating the power (particularly mining coal) that runs the data centers does get dirty and needs to be treated, but it is a tiny fraction of the water that is supposedly "consumed" by AI servers.

That can matter more or less depending on the water needs of the local communities, but given that the cooling water is completely potable and a lot ends up back in the same channels it was taken from, it's really a no-op of an argument.
posted by Tell Me No Lies at 4:48 PM on June 20 [2 favorites]

I’ve seen the “evaporated water comes back” argument before. Do you actually know that or are you making best case assumptions about water reclamation?
posted by Artw at 4:56 PM on June 20 [2 favorites]

When I worked at Google the cooling systems were total loss evaporation with some exceptions in northern climes and where prevented by law.

We designed the DC to maximize power consumption per linear foot of rack.

COLO and edge installations used whatever the provider would let us get away with, which was always less than the big DCs
posted by pdoege at 5:03 PM on June 20 [10 favorites]

I’ve seen the “evaporated water comes back” argument before

Unless you have a secret pipe that ejects water into translunar space, it does in fact come back somewhere on Earth, and in reasonably short order.
posted by aramaic at 5:10 PM on June 20 [3 favorites]

Okay, so it’s the same facile argument I have seen elsewhere. Thank you.
posted by Artw at 5:17 PM on June 20 [6 favorites]

Unless you have a secret pipe that ejects water into translunar space, it does in fact come back somewhere on Earth, and in reasonably short order.
When you’re drawing water out of a river, and using it to humidify the air well downstream, and when water taken out of that river has an immediate opportunity cost in, say, clean, hydrated humans and flushed toilets, “somewhere” can easily be deeply unsatisfactory. It’s a very hand-wavey answer that glosses over some significant practical externalities and kind of just the whole idea of how water infrastructure works in general.
posted by gelfin at 5:22 PM on June 20 [29 favorites]

Yeah, at the risk of contributing to a pileon, is that argument really intended in good faith? Like, obviously nobody thinks that the problem with high water usage is that the water is lost from Earth's hydrosphere.
posted by biogeo at 5:25 PM on June 20 [8 favorites]

NVidia claims 3,000 tokens/s on a reasonably large LLM (Llama3 70B) for their H200 datacenter GPU, which is listed at a TDP of 700W-- that's less than 0.0001 Wh/token, or an order of magnitude more efficient than the consumer GPU. Clearly "3,000 tokens per second" is a marketing number that should be taken with a big grain of salt, but even if you divide that by ten to get a paltry 300 tokens/s it would be more power efficient than like a 4080.

However, another direction to get estimates is you could assume OpenAI et al. price at cost, and then compute how much power usage that implies-- OpenAI charges $15/1M tokens, so if that's fully used to pay for electricity at $0.10/kWh, that gets you a number of around 0.15 Wh/token, which would be consistent with 3Wh for a single search summary.

So that's a pretty big spread, three orders of magnitude!
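
Spelling out both bounds (all inputs are the assumed figures above: marketing specs on one end, a price-equals-electricity-cost assumption on the other):

```python
# Two ways to bound Wh/token, roughly three orders of magnitude apart.

# Lower bound: NVidia's marketing claim for an H200 (assumed figures).
h200_watts = 700
h200_tokens_per_sec = 3000
lower = h200_watts / h200_tokens_per_sec / 3600  # Wh/token, ~0.000065

# Upper bound: assume the API price goes entirely to electricity.
usd_per_million_tokens = 15.0
usd_per_kwh = 0.10
kwh_per_million_tokens = usd_per_million_tokens / usd_per_kwh  # 150 kWh
upper = kwh_per_million_tokens * 1000 / 1_000_000              # 0.15 Wh/token

print(f"lower ~{lower:.6f} Wh/token, upper ~{upper:.2f} Wh/token")
print(f"spread: ~{upper / lower:.0f}x")
```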
posted by Pyry at 5:45 PM on June 20 [4 favorites]

“It’s very counterintuitive [that utilities] make money by spending other people’s money.”
In an absolute and theoretical sense, this may be counterintuitive, but it's exactly the way the world works today. Private utilities providers have every incentive to maximise and constantly increase the amount of utilities used, which is only counterintuitive if you forget that the very basics of life are now commodities only available to consumers if they can be delivered at a profit.
posted by dg at 6:12 PM on June 20 [1 favorite]

Local models running in typical end-user scenarios may be a poor metric for datacenter energy usage because the pattern of usage is different. A big service provider doesn't process each request individually as it comes in; your request does not enjoy the full power of the GPU(s) that are handling it. Instead, requests are batched together because those requests, though their content is different, are going to go through the same overall sequence of operations and the server hardware takes advantage of that. This is not merely faster; it is also more power-efficient. (Batching can be done with local LLMs, too, but most of the programs/workflows that are built around local LLMs do not yet take full advantage of this.)
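
To make the batching point concrete, here's a toy sketch (hypothetical sizes; a single weight matrix stands in for a model layer):

```python
import numpy as np

# Toy illustration of request batching: one (batch x hidden) @ (hidden x hidden)
# matmul does the work of `batch` separate vector-matrix products, amortizing
# the cost of loading the weights across many requests. Real inference servers
# batch at the attention/MLP level, but the shape of the trick is the same.
hidden = 512
batch = 8
rng = np.random.default_rng(0)
weights = rng.standard_normal((hidden, hidden))   # stand-in for a model layer
requests = rng.standard_normal((batch, hidden))   # 8 concurrent requests

batched = requests @ weights                            # one fused pass
sequential = np.stack([r @ weights for r in requests])  # 8 separate passes

assert np.allclose(batched, sequential)  # identical results, fewer weight loads
```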
posted by a faded photo of their beloved at 7:14 PM on June 20 [1 favorite]

I’d expect consumers to regularly have more efficient hardware than Google per watt.

If you're running a data center, power is one of your biggest costs, and something you're eager to optimize. If you're spending some tens of millions of dollars (or more!) on electricity, it's worth it to spend millions of dollars on engineering and research to reduce your power costs.

On the consumer side, you can't do that kind of optimization: You take a guess once every five or ten years about which laptop looks like it has the best battery life. (And that's partly a function of battery size, not just efficiency...) The incentives and feedback loops just aren't there, nor is the economy of scale: Even if you happen to be the kind of person who plans your behavior around improving the carbon cost of your computing habits, millions of your peers are not. (Same reason that individual action doesn't solve climate change...)

And indeed, data center optimization has received quite a bit of machine learning research effort, to substantial effect.
posted by kaibutsu at 7:14 PM on June 20

Okay, so it’s the same facile argument I have seen elsewhere. Thank you.

I honestly do not know how to make a non-facile argument supporting the idea that "evaporated water comes back."

Would you like me to make up something complicated for you?
posted by Tell Me No Lies at 7:50 PM on June 20 [1 favorite]

Even if you assume a child’s version of the water cycle where the evaporated water travels as a fluffy cloud upstream of the data center and then drops as rain into a mountain lake and flows back down with 100% efficiency - again, this is a child’s version of events - you’ve still removed it from everywhere downstream of the data center.
posted by Artw at 8:15 PM on June 20 [7 favorites]

I think the point people are making, albeit with sarcasm, is that we understand water evaporates, but it can sit around in really crap conditions for a really long time and not be in a form that humans can use for drinking etc. This is a familiar problem, like in mining forex where you have ginormous evap pits that sit around being liquid toxic waste for a long time. Or CAFOs & dairies, where you have fucking lakes of piss and shit that are mostly water by percentage, but not really useful to people. So to clarify, when and where does the water used for cooling data centers come back and become suitable as “water” again?
posted by toodleydoodley at 8:16 PM on June 20 [4 favorites]

If your source of water is a deep reservoir you could definitely be depleting what amounts to a fossil resource, at least on a human timescale.
posted by BungaDunga at 8:26 PM on June 20 [2 favorites]

In short, misreporting on this topic is rampant. Some people publish bad estimates which eventually get retracted, and reporters go crazy writing shocking articles before the retractions land. Citations on those stories continue to circulate after the retractions. A perfect storm.

Hmmmm, where have I seen this before?
posted by neonamber at 8:30 PM on June 20

In Microsoft's 2022 Environmental Sustainability Report they claim that their entire global operations used approximately 6.4 million cubic metres of water in a year, which was up 34% on the year before due to AI data centre workloads. That doesn't really seem like a lot if you spread it over all their users. How many people use Microsoft's cloud services in one form or another? A bit of Googling (hopefully not AI powered!) says they have a user base of 1.2 billion Word users and 400-odd million O365 users. So your individual share as a M$ cloud user of their water usage is probably on the order of about 10 litres of water, or about 2 and a half gallons per year. If you use GenAI it's obviously going to be more but to put that into perspective, it takes around 40,000 gallons of water to manufacture a car.

Even Microsoft's total global water usage (6.4 billion litres, or about 1.7 billion gallons) is quite small compared to other industrial uses of water. For example there's a single nuclear power plant in Illinois that uses 248.5 billion gallons of water per year, although it's not clear what "use" means in that context compared to in a data centre. This is just me doing some Googling for about 10 minutes so I could be getting the wrong end of the stick but it just doesn't seem like data centre water usage is that big of a deal in the grand scheme of things even if AI causes it to increase by an order of magnitude.
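
For anyone who wants to check that arithmetic (the water total is from the report as quoted above; the user counts are the rough figures from my ten minutes of Googling):

```python
# Per-user share of Microsoft's reported 2022 water use.
# Total is from their sustainability report; user counts are rough estimates.
total_water_litres = 6.4e9   # 6.4 million cubic metres
word_users = 1.2e9
o365_users = 400e6

per_word_user = total_water_litres / word_users   # ~5.3 litres/year
per_o365_user = total_water_litres / o365_users   # ~16 litres/year
print(round(per_word_user, 1), round(per_o365_user, 1))
# Both land on the order of 10 litres per user per year.
```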
posted by L.P. Hatecraft at 8:37 PM on June 20 [3 favorites]

Tell Me No Lies, is it in fact the case that the water used to cool data centers is returned directly to the municipal water supply as potable water, as you've suggested? Can you provide a source for this? It's surprising to me if true, but basically welcome news. But given that it runs contrary to my (admittedly imperfect) understanding of how municipal water systems tend to work, I would like to see something to back up that claim.
posted by biogeo at 8:41 PM on June 20 [3 favorites]

Look Dave, I can see you're really upset about this. I honestly think you ought to sit down calmly, take a stress pill, and think things over.
posted by fairmettle at 9:31 PM on June 20 [3 favorites]

Do you have any idea how much water I require to swallow a stress pill?!?!
posted by biogeo at 10:01 PM on June 20 [3 favorites]

Can you provide a source
Environmental Working Group:
"the facility discharges magnesium, fluoride, arsenic, barium, boron, iron, manganese, and aluminum – all pollutants"
posted by HearHere at 11:59 PM on June 20 [2 favorites]

Prove that this is true.

Prove that it isn't.

C'mon, folks.

As far as the article is concerned, too much concern. Of course, resources are being tracked in any reasonably developed region. It ends up the same old question of what do "we" want to do about it? Who is the "we"?
posted by 2N2222 at 5:08 AM on June 21 [1 favorite]

In principle, we could legislate that data centers cannot consume fresh water for cooling, or maybe let them pay 1000x the price when they do. If such legislation took effect after two years, then data center operators would just replace their cooling systems, perhaps still using water but all recirculating. It's even possible the bitcoiners would not upgrade, would fight the law, and would then go offline, saving Texas.

If we outlawed water cooling, then at some point they'd require a heat sink, so if the air were too hot, they'd either use the ground or salt water. At least Google places their servers in shipping containers anyway, so they'd likely resite some towards coastal areas. We could even permit not-so-evaporative fresh water cooling, where water returns to rivers, but warmer.

As a rule, if your industrial complex requires serious cooling, like data centers or nuclear reactors, then really it should be built in coastal areas so cooling can be done with brackish or salt water. Fukushima was perfectly sited for a nuclear power plant, for example.

Interestingly, we've learned that Greenland's glaciers are melting 100 times faster than estimated, so choose coastal areas with higher elevations or rebound effects, aka not Florida.
posted by jeffburdges at 5:22 AM on June 21

HearHere, thanks for finding that! But that linked article appears to be describing pollution associated with a power plant used for a data center, which is a bit different from the question of whether the water used to cool the data center itself remains potable and is returned to the municipal supply. It's still relevant to the overall discussion of resources used by data centers, of course, but doesn't speak to Tell Me No Lies's specific claim about data center water cooling being a "no-op".
posted by biogeo at 6:08 AM on June 21 [1 favorite]

Big data centers use total loss evaporative cooling. See HERE for diagrams, math, discussions, etc.

Data center cooling loops treat the water to reduce corrosion to the loops. The waste water is not potable. Even if it was potable it would not be re-injected into a potable supply because the data centers do not want liability.

Google’s water use for 2021 is documented in a press release HERE

Notice that the press release does not cover the water lost as part of power production.

Further, notice that if a smaller data center uses refrigeration instead of evaporation, this moves the water loss to the power production side. This is why the Amazon and other numbers look better than Google's.
posted by pdoege at 9:03 AM on June 21 [14 favorites]

Thanks, pdoege! That's very helpful to understand what's going on. So it seems pretty clear, as I thought, that water used for data center cooling is not in fact returned to the supply, and thus water cooling does have a meaningful impact on the local water supply. I'm glad that's settled.
posted by biogeo at 12:18 PM on June 21 [2 favorites]

eMergy accounting should be the method to try and figure out what's going on.

Note how 'human services' are able to be a part of the calculations and just maybe AI can actually beat humans to make it worthwhile.
posted by rough ashlar at 6:34 PM on June 22

Recent salient essay from Emily M. Bender, one of the Stochastic Parrots authors:

And indeed, Google, Meta, and Microsoft have all made big promises about getting to net-zero carbon emissions.

So, how's that working out? Well, as Strubell and colleagues pointed out in 2019 and others (including the Stochastic Parrots authors) have reiterated since, it turns out that added energy demand means added energy demand. If you use clean energy sources for that added demand, you've used up some of the clean energy supply that could have gone somewhere else. Until we actually have more clean energy than we need, throwing lots of it at training and using "generative AI" models is just going to make the climate crisis worse.

Evan Harper and Caroline O'Donovan, reporting in The Washington Post on June 21, 2024, have some alarming updates:

In the Salt Lake City region, utility executives and lawmakers scaled back plans for big investments in clean energy and doubled down on coal. The retirement of a large coal plant has been pushed back a decade, to 2042, and the closure of another has been delayed to 2036.

A spike in tech-related energy needs in Georgia moved regulators in April to green-light an expansion of fossil fuel use, including purchasing power from Mississippi that will delay closure of a half-century-old coal plant there. In the suburbs of Milwaukee, Microsoft’s announcement in March that it is building a $3.3 billion data center campus followed the local utility pushing back by one year the retirement of coal units, and unveiling plans for a vast expansion of gas power that regional energy executives say is necessary to stabilize the grid amid soaring data center demand and other growth.

In Omaha, where Google and Meta recently set up sprawling data center operations, a coal plant that was supposed to go offline in 2022 will now be operational through at least 2026. The local utility has scrapped plans to install large batteries to store solar power.

posted by ursus_comiter at 9:45 AM on June 24 [3 favorites]

The Missing AI Conversation We Need to Have: Environmental Impacts of Generative AI

It is true there are many ways that generative AI could enhance our ability to understand and respond to climate and environmental challenges. But as long as the control of generative AI’s development is primarily in the hands of those grounded in the culture and priorities of Silicon Valley, there is no reason to believe that these tools will be designed around the needs of the most vulnerable populations on the frontlines of the impact of climate change. (Indeed, if the history of social media development has taught us anything, it is that those who “move fast and break things” tend to develop products embedded with assumptions that rarely hold true for the many billions of people who live outside their very narrow segment of the Global North demographic.)
posted by ursus_comiter at 9:59 AM on June 24 [2 favorites]
