Book Review: The Coming Wave by Mustafa Suleyman

Mustafa Suleyman’s The Coming Wave is a book in two parts: the first details how technological advancements have propelled humanity forward in waves — he uses the analogy of waves and how these natural forces can change the world around us (e.g., think of massive floods and tsunamis). He argues that these metaphorical waves of innovation are both unstoppable and transformative. The second part of the book serves as a warning about the potential dangers of artificial intelligence and other rapidly developing technologies, questioning whether humanity can harness these creations or if they will spiral beyond our control.

Suleyman co-founded DeepMind, an AI research company ultimately acquired by Google in 2014.  DeepMind was known for its work in artificial intelligence — particularly in developing systems like AlphaGo, which defeated human world champions in the game of Go (once thought to be an impossible task for AI). Suleyman illustrates how these innovations have reshaped industries, improved lives, and spread rapidly throughout society:

“General-purpose technologies become waves when they diffuse widely. Without an epic and near-uncontrolled global diffusion, it’s not a wave; it’s a historical curiosity. Once diffusion starts, however, the process echoes throughout history, from agriculture’s spread throughout the Eurasian landmass to the slow scattering of water mills out from the Roman Empire across Europe.”

He gives a number of interesting examples to support this. Such as:

“Or take electricity. The first electricity power stations debuted in London and New York in 1882, Milan and St. Petersburg in 1883, and Berlin in 1884. Their rollout gathered pace from there. In 1900, 2 percent of fossil fuel production was devoted to producing electricity, by 1950 it was above 10 percent, and in 2000 it reached more than 30 percent. In 1900 global electricity generation stood at 8 terawatt-hours; fifty years later it was at 600, powering a transformed economy.”

However, the book shifts dramatically in tone as it progresses, focusing on the challenges of controlling and regulating these emerging technologies. Suleyman presents a case for why containment is necessary (and that it is even possible) in order to ensure these technologies positively serve humanity rather than disrupt it. Though he acknowledges that this will be difficult, especially in today’s highly charged political environment:

“Going into the coming wave, many nations are beset by a slew of major challenges battering their effectiveness, making them weaker, more divided, and more prone to slow and faulty decision-making. The coming wave will land in a combustible, incompetent, overwrought environment. This makes the challenge of containment—of controlling and directing technologies so they are of net benefit to humanity—even more daunting.”

Well, that’s fun! But I think he’s mostly right.

However, in my opinion, I think trying to contain these technologies is no longer possible. Pandora’s box has already been opened, and it’s likely too late for any meaningful containment or regulation to happen due to the pace at which these advancements are occurring. It’s effectively an arms race as various AI laboratories build upon each others’ work and compete to outdo one another. An earlier passage in the book says as much:

“Of course, behind technological breakthroughs are people. They labor at improving technology in workshops, labs, and garages, motivated by money, fame, and often knowledge itself. Technologists, innovators, and entrepreneurs get better by doing and, crucially, by copying. From your enemy’s superior plow to the latest cell phones, copying is a critical driver of diffusion. Mimicry spurs competition, and technologies improve further. Economies of scale kick in and reduce costs. Civilization’s appetite for useful and cheaper technologies is boundless. This will not change.”

Looking at the reviews of this book on Goodreads, I noticed a lot of 1-star reviews. They seem to mostly be from those who dislike, fear, or otherwise loathe this technology. While I can understand their concerns, I think The Coming Wave offers a balanced take from someone on the inside, someone who is working (and has worked) on creating these AI models. Some of the arguments made in these reviews call into mind Neo-Luddism. Which Suleyman has an answer for:

“The Luddites were no more successful at stopping new industrial technologies than horse owners and carriage makers were at preventing cars. Where there is demand, technology always breaks out, finds traction, builds users.”

Overall, I thought that The Coming Wave was a good read, balancing optimism with caution. Suleyman’s first-hand expertise in developing state of the art AI models lends credibility to his arguments, and makes this an interesting read for anyone who wants to know about the potential societal impacts of AI tools.

TokenFlow: Visualize LLM token streaming speeds

Have you ever wondered how fast your favorite LLM really compares to other SoTA models? I recently saw a Reddit post where someone was able to get a distilled version of Deepseek R1 running on a Raspberry Pi! It could generate output at a whopping 1.97 tokens per second. That sounds slow. Is that even usable? I don’t know!

Meanwhile, Mistral announced that their Le Chat platform can output tokens at 1,100 per second! That sounds pretty fast? How fast? I don’t know!

So, that’s why I put together TokenFlow. It’s a (very!) simple webpage that lets you see the speed of different LLMs in action. You can select from a few preset models / services or enter a custom speed, and boom! You watch it spit out tokens in real time, showing you exactly how fast a given inference speed is for user experience.

Check it out: https://dave.ly/tokenflow/

The code is also available on Github.

Comparing reasoning in open-source LLMs

Alibaba recently released their “QwQ” model, which they claim is capable of chain-of-thought reasoning comparable to OpenAI’s o1-mini model. It’s pretty impressive — even more so because we can run this model on our own devices (provided you have enough RAM).

While testing the chain-of-thought reasoning abilities, I decided to compare my test prompt to Llama3.2 and was kind of shocked at how good it was. I had to come up with ever more ridiculous scenarios to try and break it.

That is pretty good, especially for a non chain-of-thought model. Okay, come on. How do we break it! Can we?

Alright, magical unicorns for the win.

Project: Super Simple ChatUI

I’ve been playing around a lot with Ollama, an open source project that allows one to run LLMs locally on their machine. It’s been fun to mess around with. Some benefits: no rate-limits, private (e.g., trying to create a pseudo therapy bot, trying to simulate a foul mouthed smarmy sailor, or trying to generate ridiculous fake news articles about a Florida Man losing a fight to a wheel of cheese), and access to all sorts of models that get released.

I decided to try my hand at creating a simplified interface for interacting with it. The result: Super Simple ChatUI.

As if I need more side projects. So it goes!