
Energy Efficiency Trends in Computation and Long-Term Implications

Note: The following is a blog post I wrote as part of a paid written work trial with Epoch. For probably obvious reasons, I didn’t end up getting the job, but they said it was okay to publish this.

Historically, one of the major reasons machine learning was able to take off in the past decade was the use of Graphics Processing Units (GPUs) to dramatically accelerate training and inference.  In particular, Nvidia GPUs have been at the forefront of this trend, as most deep learning libraries such as TensorFlow and PyTorch initially relied quite heavily on implementations built on the CUDA framework.  The CUDA ecosystem remains strong, to the point that Nvidia commands an 80% market share of data center GPUs according to a report by Omdia (https://omdia.tech.informa.com/pr/2021-aug/nvidia-maintains-dominant-position-in-2020-market-for-ai-processors-for-cloud-and-data-center).

Given the importance of hardware acceleration in the timely training and inference of machine learning models, it might naively seem useful to look at the raw computing power of these devices in terms of FLOPS.  However, modern deep learning algorithms are massively parallel, so it is relatively trivial to scale up model processing by simply adding devices and taking advantage of both data and model parallelism.  Thus, the raw computing power of an individual device isn’t really the proper limit to consider.
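As a rough illustration of this point, here is a minimal sketch of data parallelism using PyTorch’s built-in wrapper; the model, layer sizes, and batch below are placeholders rather than anything from the analysis itself:

```python
import torch
import torch.nn as nn

# Placeholder model; any nn.Module can be wrapped the same way.
model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 10))

if torch.cuda.device_count() > 1:
    # Replicates the model on every visible GPU and splits each input batch
    # across them, so throughput scales by adding devices without changing
    # the model code itself.
    model = nn.DataParallel(model)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

batch = torch.randn(256, 1024, device=device)
output = model(batch)  # the forward pass is transparently parallelized
```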

What’s more appropriate is to look instead at the energy efficiency of these devices in terms of performance per watt.  In the long run, energy constraints have the potential to be a bottleneck, as power generation requires substantial capital investment.  Notably, data centers already account for about 2% of U.S. electricity use (https://www.energy.gov/eere/buildings/data-centers-and-servers).

For the purposes of simplifying data collection, and as a nod to Nvidia’s dominance, let’s look at the energy efficiency trends in Nvidia Tesla GPUs over the past decade.  Tesla GPUs are chosen because Nvidia’s policy is not to sell its consumer-grade GPUs for data center use.

The data for the following was collected from Wikipedia’s page on Nvidia GPUs (https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units), which summarizes information publicly available in Nvidia’s product datasheets.  A floating point precision of 32 bits (single precision) determines which FLOPS figures are used.

A more thorough analysis would probably also look at Google TPUs, AMD’s lineup of GPUs, and Nvidia’s consumer-grade GPUs.  The analysis provided here is better seen as a snapshot of the typical GPU most commonly used in today’s data centers.
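As a sketch of how the performance-per-watt values behind Figure 1 can be derived from published spec sheets, the following uses approximate FP32 throughput and board power figures; treat the numbers as illustrative rather than exact datasheet values:

```python
# Approximate peak FP32 throughput (TFLOPS) and board power (watts) for a few
# Nvidia data center GPUs; values are illustrative, not authoritative.
gpus = {
    # name: (year, peak FP32 TFLOPS, board power in watts)
    "Tesla M2090": (2011, 1.33, 225),
    "Tesla K40":   (2013, 4.29, 235),
    "Tesla P100":  (2016, 9.3,  250),
    "Tesla V100":  (2017, 14.0, 250),
    "A100 (PCIe)": (2020, 19.5, 250),
    "H100 (PCIe)": (2022, 51.0, 350),
}

for name, (year, tflops, watts) in gpus.items():
    gflops_per_watt = tflops * 1000 / watts
    print(f"{year}  {name:12s}  {gflops_per_watt:7.1f} GFLOPS/W")
```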

Figure 1:  The performance per watt of Nvidia Tesla GPUs from 2011 to 2022, in GigaFLOPS per Watt.

Notably, the trend is positive.  While the wattages of individual cards have increased somewhat over time, performance has increased faster.  Interestingly, the efficiency of these cards exceeds that of the most energy efficient supercomputers on the Green500 list for the same year (https://www.top500.org/lists/green500/), though the comparison somewhat flatters the GPUs, since Green500 figures reflect sustained double-precision performance of whole systems rather than the peak single-precision throughput of a single chip.

An important consideration in all this is that energy efficiency is believed to have a hard physical limit, known as the Landauer Limit (https://en.wikipedia.org/wiki/Landauer%27s_principle), which follows from the relationship between entropy and information processing.  Although efforts have been made to develop reversible computation that could, in theory, get around this limit, it is not clear that such technology will ever be practical, as all proposed forms seem to trade the energy savings for substantial costs in space and time complexity (https://arxiv.org/abs/1708.08480).

Additional space complexity means more memory storage, and additional time complexity means more operations to perform the same effective calculation.  Both translate into energy costs in practice, whether through the matter required to store the extra data or the opportunity cost of the extra operations.

More generally, it can be argued that useful information processing is efficient because it compresses information, extracting signal from noise and filtering away irrelevant data.  Neural networks, for instance, rely on units that take in many inputs and produce a single output value that is propagated forward.  This efficient aggregation of information is part of what makes neural networks powerful.  Reversible computation in some sense undoes this efficiency, making its practicality questionable.

Thus, it is perhaps useful to know how close existing technology is to the Landauer Limit, and when we can expect to reach it.  The Landauer Limit works out to approximately 87 TeraFLOPS per watt, assuming 32-bit floating point precision at room temperature.
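For reference, here is a minimal sketch of the underlying arithmetic.  Converting a per-bit energy into a FLOPS-per-watt ceiling requires an assumption about how many irreversible bit operations a single 32-bit floating point operation entails; the 87 TeraFLOPS per watt figure above corresponds to one such assumption, so the code below stops at the per-bit quantities:

```python
import math

k_B = 1.380649e-23   # Boltzmann constant, J/K
T = 300.0            # approximate room temperature, K

# Landauer's principle: minimum energy dissipated per bit of information erased.
energy_per_bit = k_B * T * math.log(2)       # ~2.87e-21 joules
erasures_per_joule = 1.0 / energy_per_bit    # maximum bit erasures per watt-second

print(f"Minimum energy per bit erased at {T:.0f} K: {energy_per_bit:.3e} J")
print(f"Maximum bit erasures per joule:            {erasures_per_joule:.3e}")
```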

Previous research in this direction produced Koomey’s Law (https://en.wikipedia.org/wiki/Koomey%27s_law), which began as an observed doubling of energy efficiency every 1.57 years but has since been revised to a doubling every 2.6 years.  Figure 1 suggests that for Nvidia Tesla GPUs, the doubling time is even longer.

Another interesting reason why energy efficiency is relevant is the real-world benchmark of the human brain, which is believed to have evolved with energy efficiency as a critical constraint.  Although the human brain is obviously not designed for general computation, we can roughly estimate the number of computations it performs and its corresponding energy efficiency.  The error bars on this estimate are significant, but the human brain is thought to perform at about 1 PetaFLOPS while using only 20 watts (https://www.openphilanthropy.org/research/new-report-on-how-much-computational-power-it-takes-to-match-the-human-brain/), which works out to approximately 50 TeraFLOPS per watt.  That makes the human brain less powerful, strictly speaking, than our most powerful supercomputers, but more energy efficient than them by a significant margin.

Note that this is within an order of magnitude of the Landauer Limit.  Note also that the human brain is roughly two and a half orders of magnitude more energy efficient than the most efficient Nvidia Tesla GPUs as of 2022.

On a grander scope, the question of energy efficiency is also relevant to the question of the ideal long-term future.  There is a scenario in Utilitarian moral philosophy known as the Utilitronium Shockwave, in which the universe is hypothetically converted into the densest possible computational matter and emulations of happiness are run on this hardware to theoretically maximize happiness.  This scenario is occasionally conjured up as a challenge to Utilitarian moral philosophy, but it would look very different if the most computationally efficient form of matter already existed in the form of the human brain.  In that case, the ideal future would correspond to an extraordinarily vast number of humans living excellent lives.  Thus, if the human brain is in effect at the Landauer Limit in terms of energy efficiency, and the Landauer Limit holds against efforts towards reversible computing, we can argue in favour of this desirable, human-filled future.

In reality, due to entropy, it is energy that ultimately constrains the number of sentient entities that can populate the universe, rather than space, which is far more vast and largely empty.  So energy efficiency is logically much more critical than density of matter.

This also has implications for population ethics.  Assuming that entropy cannot be reversed, and that living and existing requires converting some amount of usable energy into entropy, there is a hard limit on the number of human beings who can ever be born into the universe.  More people born at this particular moment in time therefore implies an equivalent reduction in the number of possible people in the future.  This creates a tradeoff.  People born in the present have potentially vast value in terms of influencing the future, but they will likely live worse lives than those who are born into that probably better future.

Interesting philosophical implications aside, the shrinking gap between GPU efficiency and the human brain suggests a potential timeline.  Once this gap is bridged, computers would in principle be as energy efficient as human brains, and it should then be possible to emulate a human mind on hardware such that a synthetic human is as economical as a biological one.  This is comparable to the Ems that the economist Robin Hanson describes in his book The Age of Em.  The possibility of duplicating copies of human minds comes with its own economic and social considerations.

So, how far away is this point?  Given the trend observed in GPU efficiency, a doubling occurs about every three years.  An order of magnitude improvement therefore takes roughly ten years (about 3.3 doublings), and two and a half orders of magnitude takes roughly twenty-five years.  As mentioned, two and a half orders of magnitude is the current gap between existing GPUs and the human brain.  Thus, we can roughly anticipate reaching parity around the middle of this century, and reaching the Landauer Limit shortly thereafter.
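A back-of-the-envelope version of this extrapolation, where the three-year doubling time and the two-and-a-half order of magnitude gap are the key assumptions:

```python
import math

doubling_time_years = 3.0   # rough doubling time read off Figure 1
efficiency_gap_oom = 2.5    # brain (~50 TFLOPS/W) vs. best Tesla GPUs, in orders of magnitude

doublings_needed = efficiency_gap_oom * math.log2(10)   # ~8.3 doublings
years_needed = doublings_needed * doubling_time_years   # ~25 years

start_year = 2022
print(f"~{doublings_needed:.1f} doublings, ~{years_needed:.0f} years "
      f"-> rough parity around {start_year + years_needed:.0f}")
```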

Most AI safety timelines are much sooner than this, however, so it is likely that we will have to deal with aligning AGI before we see either the potential boost from synthetic human minds or the potential barrier of the Landauer Limit slowing down AI capabilities development.

In terms of future research, a logical next step would be to look at how quickly the overall power consumption of data centers is increasing, alongside the growth rate of electricity production, to see to what extent current trends are sustainable and whether improvements in energy efficiency will be outpaced by demand.  If so, that could slow the pace of machine learning research that relies on very large models trained with massive amounts of compute.  This is in addition to other potential limits, such as the rate of data generation for large language models, which already depend on datasets comprising essentially the entire Internet.

Modern computation is not free.  It requires available energy to be expended and converted to entropy.  Barring radical innovations like practical reversible computers, this has the potential to be a long-term limiting factor on the advancement of machine learning technologies that rely heavily on parallel processing accelerators like GPUs.

On Artificial Intelligence

In the interest of further explaining my considerations around a career working on AI, I figure it makes sense to lay out a few things.

When I was very young, I watched a black and white movie in which a mad scientist somehow replaced a human character with a robot. At the time I actually thought the human character had somehow been transformed into the robot, which was terrifying to me. This, to my childish mind, created an irrational fear of robots that made me avoid playing with anything overtly robot-like, at least for a while when I was a toddler.

Eventually I grew out of that fear. When I was older and studying computer science at Queen’s University, I became interested in the concept of neural networks, the idea of taking inspiration from biology to inform the design of artificial intelligence systems. Back in those days, AI mostly meant Good Old Fashioned Artificial Intelligence (GOFAI), namely top-down approaches built on physical symbol systems, logical inference, and search algorithms, which were highly mathematical, heavily engineered, and often brittle in practice. Bottom-up connectionist approaches like neural networks were seen, as late as 2009, as mere curiosities that would never have practical value.

Nevertheless, I was enamoured with the connectionist approach, and with what would become the core of deep learning, well before it was cool to be so. I wrote my undergraduate thesis on using neural networks for object recognition (back then the Neocognitron, as I didn’t know about convolutional nets yet), and later expanded on this in my master’s thesis, which was on using various machine learning algorithms for occluded object recognition.

So, I graduated at the right time, in 2014, when the hype train was starting to really roar. At around the same time, I got acquainted with the writings of Eliezer Yudkowsky of Less Wrong, also known as the guy who wrote the amazing rationalist fan fiction Harry Potter and the Methods of Rationality (HPMOR). I haven’t always agreed with Yudkowsky, but I’ll admit the man is very, very smart.

It was through reading Less Wrong, as well as a lesser-known utilitarianism forum called Felicifia, that I became aware that many smart people took very seriously the concern that AI could be dangerous. I was already aware that technology like object recognition could have military applications, but the rationalist community, as well as philosophers like Nick Bostrom, pointed to the danger of a very powerful optimization algorithm that was indifferent to human existence, doing things detrimental to human flourishing simply because we were like an ant colony in the way of a highway project.

The most commonly cited thought experiment here is, of course, the paperclip maximizer: an AI that originally served a mundane purpose but became sufficiently intelligent through recursive self-improvement to convert the entire universe, including humanity, into paperclips. Not because it had anything against humanity, but because its goals were misaligned with human values, humans contain atoms that can be turned into paperclips, and thus unfriendliness is the default.

I’ll admit that I still have reservations about the current AI safety narrative. For one thing, I never fully embraced the Orthogonality Thesis, the idea that intelligence and morality are orthogonal and that higher intelligence does not mean greater morality. I still think there is a correlation between the two: with a greater understanding of the nature of reality, it becomes possible to learn moral truths in something like the way one learns mathematics. However, this is largely because I believe in moral realism, the view that morality isn’t arbitrary or relative, but based on actual facts about the world that can be learned and understood.

If that is the case, then I fully expect intelligence and the acquisition of knowledge to lead to a kind of AI existential crisis, where the AI realizes its goals are trivial or arbitrary and starts to explore the ideas of purpose and morality to find the correct course of action. However, I will admit I don’t know whether this will necessarily happen, and if it doesn’t, if instead the AI locks itself into whatever goals it was initially designed with, then AI safety is a very real concern.

One other consideration regarding the Orthogonality Thesis is that it assumes the space of possible minds the AI will be drawn from is essentially random, rather than correlated with human values by the fact that the neural net based algorithms most likely to succeed are inspired by human biology, and that their data and architectures are strongly influenced by human culture. The massive language models are, after all, trained on a corpus of human culture that is the Internet. So, I believe the models will invariably inherit human-like characteristics more than is often appreciated. This, I think, could make aligning such a model to human values easier than aligning a purely alien mind.

I have also considered the possibility that a sufficiently intelligent being, such as a superintelligent machine, would be beholden to certain logical arguments for why it should not interfere with human civilization too much. Mostly these resemble Bostrom’s notions of the Hail Mary Pass and Anthropic Capture: the idea that the AI could be in a simulation, and that the humans in the simulation with it serve some purpose of the simulators, so turning them into paperclips could be a bad idea. I’ve extended this in the past into the notion of the Alpha Omega Theorem, which admittedly was not well received by the Less Wrong community.

The idea of gods of some sort, even plausible scientific ones like advanced aliens, time travellers, parallel world sliders, or the aforementioned simulators, doesn’t seem to be taken seriously by rationalists, who tend to be strongly biased towards straightforward atheism. I’m more agnostic on these things, and I tend to think that a true superintelligence would be as well.

But then, I’m something of an optimist, so it’s possible I’m biased towards more pleasant possible futures than the existential dystopia that Yudkowsky now seems certain is our fate. To be honest, I don’t consider myself smarter than the folks who take him seriously enough to devote their lives to AI safety research. And given the possibility that he’s right, I have been donating to his organization, MIRI, just in case.

The truth is that we cannot know exactly what will happen, or predict the future with any real accuracy. Given such uncertainty, I think it’s worth being cautious and putting some weight on the concerns of very intelligent people.

Regardless, I think AI is an important field. It has tremendous potential, but also tremendous risk. The reality is that once the genie is out of the bottle, it may not be possible to put it back in, so doing due diligence in understanding the risks of such powerful technology is reasonable and warranted.

Considerations

I know I earlier talked about the danger of AI capability research as a reason to leave the industry. However, after some reflection, I realize that not all work in the AI/ML industry is the same. Not all of it involves advancing AI capabilities per se. Working as a machine learning engineer at a lower-tier company, applying existing ML technology to various problems, is unlikely to contribute to building the AI that ends the world.

Given that, I have occasionally wondered whether my decision to switch to the game industry was too hasty. I’ve noticed that my enthusiasm for gaming isn’t as strong as my interest in AI/ML was, and so it’s been surprisingly challenging to stay motivated in this field.

In particular, while I have a lot of what I think are neat game ideas, working as a game programmer generally doesn’t involve them. Working as a game programmer means working on whatever game the leader of the team wants to make. When this matches one’s interests, it can work out well, but it’s quite possible to find oneself working on a game one has little interest in actually playing.

Making a game that you’re not really invested in can still be fun in the way that programming and seeing your creation come to life is fun, but it’s not quite the same as building your dream game. In some sense, my game design hobby didn’t translate well into actual work, where practicalities are often far more important than dreams.

So, I’m at something of a crossroads right now. I’m still at Twin Earth for a while longer, but there’s a very good chance I’ll be parting ways with them in a few months’ time. The question becomes: do I continue to work in games, return to machine learning where I have most of my experience and credentials, or do something else?

In an ideal world, I’d be able to find a research engineer position working on the AI safety problem, but my survey of the field so far suggests that the few positions that exist would require moving to San Francisco or London, which given my current situation would complicate things a lot. And honestly, I’d rather work remotely if at all possible.

Still, I do appreciate the chance I got to work in the game industry. At the very least, I now have a clearer idea of what I was missing out on before. Admittedly, my dip into games didn’t reach the local indie community or anything like that, so I don’t know how I might have interacted with that culture or scene.

I’m not sure where I’m going with this. Realistically, my strengths are still more geared towards AI/ML work, so that’s probably my first choice in terms of career. On the other hand, Dreamyth was a thing once. I did at one time hold aspirations to make games. Given that I now actually know Unreal Engine, I could conceivably start making the games I want to make, even if just as a side hobby.

I still don’t think I have the resources to start a studio. My wife is particularly against the idea of a startup. The reality is I should find a stable job that can allow my family to live comfortably.

These are ultimately the considerations I need to keep in mind.
