generative-ai - Dave Schumaker

Gemini: replace “this” with “this”

May 7, 2025

For the most part, I’ve had pretty positive experiences using AI tools to help enhance my coding activities (though there was the one time…).

A recent experience with Google’s new Gemini model left me frustrated. After prompting it to help me find and update some relevant code, it confidently informed me that it had identified the exact snippet that needed replacing. Great news, I thought, until I realized it was instructing me to replace the code with… exactly the same code.

I pointed out the issue. Gemini politely apologized for the confusion and assured me it would correct its mistake. To my disbelief, it promptly suggested the very same replacement again! And again!

Oh, I have receipts. Join me on this little adventure!

Maybe we don’t have to worry about AI taking our jobs just yet!

OpenAI’s new image generation models are… insane

March 26, 2025

You can probably repeat this blog post headline for any given service every week at this point…

Anyway! I’ve been on board the generative AI train for a few years now and it’s amazing to see how far it’s come. In October 2023, I got access to DALL-E 3 and was pretty impressed with its ability to render text.

Yesterday, OpenAI announced 4o Image Generation and boy does it kick things up a notch or two!

It’s ability to generate images and render text according to your exact prompt is incredible. We can now have full on automated AI memebots.

A four panel cartoon strip

first panel: a software engineer sitting in front of a computer screen on a Zoom meeting

second panel: the software engineer tells the participants (with a speech bubble): “I’m telling you, AI is coming for our jobs!”

third panel: we just see a slight closeup of the software engineer (the computer monitor isn’t visible)

fourth panel: same as the first panel except all the participants are now robots

Same angle and setup in every panel, reduced art style, broad outlines

Or, how about:

Cartoon drawing of a bored computer programmer sitting in front of a computer just pressed “enter” over and over. He is sarcastically excited and says, “Vibe coding. Wooooo.”

You can also feed it source images and it will run with it as well. So, obviously we need to use the Canine Calibration System.

I even gave it an image of me and told it to make a movie poster:

Create a dramatic cyberpunk 1980s horror movie poster image featuring a Computer Monster (We see an LCD screen with evil eyes and fangs and it has robotic legs) in a dark alley. In front of the monster, we see the man in this source image passed out on the ground, broken glasses lay next to him. At the top of the poster is the title of the movie in digital writing: “BUFFER OVERFLOW” at the bottom in the billing area, we see text that says, “Some bugs were never meant to be fixed.”

Or rewrite history…

Or really, really rewrite history…

It’s just wild. It’s coming for us as engineers, as musicians, as artists, as writers. This 2024 post on Twitter sums it up:

You know what the biggest problem with pushing all-things-AI is? Wrong direction. I want AI to do my laundry and dishes so that I can do art and writing, not for AI to do my art and writing so that I can do my laundry and dishes.

– Joanna Maciejewska on Twitter

Hmm, this sounds like a 4-panel comic to me!

Lazy AI

March 4, 2025

This is a first for me. Cursor attempted to “fix” an issue I was having with TypeScript by adding a // @ts-nocheck statement to the top of the file, essentially preventing TypeScript from running validation checks against the code.

TokenFlow: Visualize LLM token streaming speeds

February 11, 2025

Have you ever wondered how fast your favorite LLM really compares to other SoTA models? I recently saw a Reddit post where someone was able to get a distilled version of Deepseek R1 running on a Raspberry Pi! It could generate output at a whopping 1.97 tokens per second. That sounds slow. Is that even usable? I don’t know!

Meanwhile, Mistral announced that their Le Chat platform can output tokens at 1,100 per second! That sounds pretty fast? How fast? I don’t know!

So, that’s why I put together TokenFlow. It’s a (very!) simple webpage that lets you see the speed of different LLMs in action. You can select from a few preset models / services or enter a custom speed, and boom! You watch it spit out tokens in real time, showing you exactly how fast a given inference speed is for user experience.

Check it out: https://dave.ly/tokenflow/

The code is also available on Github.

It’s AI all the way down

February 6, 2025

Back in November, I went with some friends to play paintball — it was the first time I ever played. We had booked a 3 hour session that would feature multiple matches. I don’t think any of us had ever played before and we were all pretty nervous about getting hit.

Lo and behold, within the first 30 seconds of the game, I took a paintball to the knee (cue the “I used to be an adventurer like you…” meme from Skyrim). Somehow, I twisted my leg as I rag dolled into the ground.

Of course, you can’t just give up after 30 seconds, right? So, on I played. The result is that I ended up tearing my ACL (the doc said he had no idea how this could have happened), have a bone contusion, and will likely need reconstructive surgery at some point. Fun!

Anyway, the point of all of this — for funsies, I tried to create a song about the situation using Suno’s generative music service (see previously). I used ChatGPT to come up with some initial lyrics and then did some work to refine them.

Then! I decided to use OpenAI’s generative video tool, Sora, to attempt to create a bunch of clips. I strung everything together in iMovie and the result is this rowdy music video: “This is What I Get“

Using AI: Extracting reading list from GoodReads

January 24, 2025

I feel like I want to start creating some posts related around how I, as a software engineer, personally use generative AI tools. I think they are a huge boon for increasing productivity, exploring new ideas, and even learning new things

Reading Reddit, Hacker News, and various other forums, there’s a lot of anxiety among software engineers about how AI is going to steal our jobs. It’s not without merit:

Mark Zuckerberg says that Meta hopes to have AI do the work of mid-level engineers later this year.
OpenAI is working on a system that can carry out tasks and thinks like a level 6 engineer (ah ha, the joke is on them — we mostly go to meetings, write lots of documentation, and review pull requests)
Google says 25% of new code contributed to their repositories is written by AI.

A recent blog post by Dustin Ewers adds some needed sanity to the discussion. In a post titled, “Ignore the Grifters – AI Isn’t Going to Kill the Software Industry“, he argues that:

He argues that we should ignore the grifters.

I feel like half of my social media feed is composed of AI grifters saying software developers are not going to make it. Combine that sentiment with some economic headwinds and it’s easy to feel like we’re all screwed. I think that’s bullshit. The best days of our industry lie ahead.

It’s highly unlikely that software developers are going away any time soon. The job is definitely going to change, but I think there are going to be even more opportunities for software developers to make a comfortable living making cool stuff.

I am inclined to agree. Hey, I will drink this Kool-Aid!

—

Well, let’s get to the real reason I’m making this post. I posted about my 2024 reading list and shared all the books I had read during the year. Try to compile that list and add links by hand would be a huge pain. There has to be an easier way. (Cue super hero music)

There is!

If you go to my GoodReads “read” list, it looks like this. And it keeps going. It’s a lot of data.

If we open up the browser console, we can see that it’s just a good old fashioned HTML table.

So, using an AI tool like ChatGPT or Claude, how do you get this data in a manageable way? One area that I’ve personally seen people struggle with is how to write a prompt in a way that helps them. You need to provide context. You say:

Describe the problem: “I want to output a list of books from an HTML table into a JSON object using a JavaScript function that I can paste into the browser console.”
Provide some example date: “Here is the table’s head with names for each column: [paste block of code]. Oh! Here is also an example row of data: [paste block of code]”
Provide an example of the output: “Can you create a JSON object in the following shape?”

Using Claude as an example, here is what that looks like and you can also see the generated output:

Moment of truth — does it work? Let’s paste it into the browser console and see the result:

Yes! Victory! One problem though. I did not read 60 books in 2024. Oh, no. We are pulling all books visible from the page. This isn’t a problem. We can fix it by simply asking a followup question: “Can we modify the function so that it only returns books where date read is in the year 2024?”

Claude modifies the function to add a filter for 2024. If we paste that into the browser console, we now get the correct number of hooks: 30!

There is still another thing to do. I want to make this into a nice, unordered list that I can just add into my blog post. Again, we follow the steps outlined above:

Can you create an unordered HTML list that shows links to each book? Please add a link around the title, but keep the author name unlinked.
Here is my JSON object: [paste block of code]
I essentially want a list that looks like this: <li><a href=”[booklink”>Book title</a> by Author</li>

Hot diggity! It works. It generates a block of code that I can just paste into my blog’s text editor. Pretty neat. It took a total of 5 minutes. (Hey, writing this post took a lot longer than that.)

Anyway, this has been a production of “How I use AI”. Stay tuned for more exciting updates, coming to a blog near you.

Book Review: The Alignment Problem by Brian Christian

January 23, 2025

The Alignment Problem, (released in 2020 but still highly relevant today, especially in the age of generative AI hype), is a fascinating exploration of one of the most interesting issues in artificial intelligence: how to ensure AI systems safely align with human values and intentions. The book is based on four years of research and over 100 interviews with experts. Despite the technical depth, I feel that this book is written to be accessible to both newcomers and seasoned AI enthusiasts alike. A word of warning though: this book is has A LOT of info.

Before we get too deep into this review, let’s talk about safety and what it means in the context of AI. When we talk about AI safety, we’re referring to systems that can reliably achieve their goals without causing unintended harm. This includes:

The AI must be predictable, behaving as expected even in novel situations.
It must be fair, avoiding the amplification of existing societal biases.
It needs transparency, allowing users and developers to understand its decision-making process.
It must be resilient against failures and misuse.

Creating safe AI tools is both a technical challenge, as well as a psychological challenge: it requires understanding human cognition, ethics, and social systems, as these elements become encoded in AI behavior.

The book is divided into three main sections: Prophecy, Agency, and Normativity, each tackling different areas of aligning artificial intelligence with human values.

Prophecy explores the historical and technical roots of AI and highlights examples of unintended outcomes, such as the biased COMPAS recidivism prediction tool. COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) is a risk assessment algorithm used in the criminal justice system to predict the likelihood of a defendant reoffending. However, investigations revealed that the tool disproportionately flagged Black defendants as higher risk compared to white defendants, raising critical questions about fairness and bias in that AI system.

Agency delves into reinforcement learning and the parallels of reward-seeking behavior in human, showcasing innovations like AlphaGo and AlphaZero. His explanation of reinforcement learning, and its connection to dopamine studies, is particularly insightful. Christian dives into psychological experiments from the 1950s that revealed the brain’s pleasure centers and their connection to dopamine. Rats in these studies would press a lever to stimulate these areas thousands of times per hour, foregoing food and rest. Later research established that dopamine serves as the brain’s “reward scalar,” which helps influence decision-making and learning. This biological mechanism has parallels in reinforcement learning, where AI agents maximize reward signals to learn optimal behaviors.

Normativity examines philosophical debates and techniques like inverse reinforcement learning, which enables AI to infer human objectives by observing behavior. Christian connects these discussions to ethical challenges, such as defining fairness mathematically and balancing accuracy with equity in predictive systems. He also highlights key societal case studies, including biases in word embeddings and historical medical treatment patterns that skew AI decisions.

Christian interweaves these sections with interviews, anecdotes, and historical case studies that breathe life into the technical and ethical complexities of AI alignment.

He also delivers numerous warnings, such as:

“As we’re on the cusp of using machine learning for rendering basically all kinds of consequential decisions about human beings in domains such as education, employment, advertising, health care and policing, it is important to understand why machine learning is not, by default, fair or just in any meaningful way.”

This observation underscores the important implications of deploying machine learning systems in critical areas of human life. When algorithms are used to make decisions about education, employment, or policing, the stakes are insanely high. These systems, often trained on historical data, can perpetuate or amplify societal biases, leading to unfair outcomes. This calls for deliberate oversight and careful design to ensure these technologies promote equity and justice rather than exacerbate existing inequalities. (Boy, oh boy — fat chance of that in light of current events in January 2025)

Christian also highlights some of the strengths of machine learning. These systems can detect patterns in data that are invisible to human eyes, uncovering insights that were previously thought impossible. For example:

“They (doctors) were in for an enormous shock. The network could almost perfectly tell a patient’s age and sex from nothing but an image of their retina. The doctors on the team didn’t believe the results were genuine. ‘You show that to someone,’ says Poplin, ‘and they say to you, “You must have a bug in your model. ‘Cause there’s no way you can predict that with such high accuracy.” . . . As we dug more and more into it, we discovered that this wasn’t a bug in the model. It was actually a real prediction.”

Examples like this show the real-world potential of machine learning to revolutionize fields such as healthcare by identifying patterns that humans might overlook. However, these benefits are accompanied by significant challenges, such as the “black box” nature of AI decision-making, where it remains difficult to determine what features a model is actually using.

Christian shows how understanding these technical challenges, alongside ethical frameworks, can lead to more robust and equitable AI systems. These considerations emphasize the nature of AI safety, which requires combining insights from cognitive science, social systems, and technical innovations to address both immediate and long-term risks.

While the book is dense (very dense!) and information-rich, this strength can also be a drawback. Some sections felt overly detailed, and the pacing, especially in the latter half, left me feeling fatigued.

Despite this, The Alignment Problem remains a compelling and optimistic exploration of how researchers are tackling AI safety challenges. I think this book is an insightful read for anyone interested in AI and will leave you thinking about our future AI overlords long after you’ve turned the last page.

Project: Super Simple ChatUI

May 22, 2024

I’ve been playing around a lot with Ollama, an open source project that allows one to run LLMs locally on their machine. It’s been fun to mess around with. Some benefits: no rate-limits, private (e.g., trying to create a pseudo therapy bot, trying to simulate a foul mouthed smarmy sailor, or trying to generate ridiculous fake news articles about a Florida Man losing a fight to a wheel of cheese), and access to all sorts of models that get released.

I decided to try my hand at creating a simplified interface for interacting with it. The result: Super Simple ChatUI.

As if I need more side projects. So it goes!

Fun with GPT-3

June 6, 2022

Earlier this evening, I was playing around with OpenAI’s GPT-3 language model for generating “human-like text.”

I gave it the following prompt:

Write a plot about a movie that involves rogue unicorns who want to
conquer Earth and the planet’s only hope is an Australian Cattle dog named Benson who sleeps a lot.

The response?

The unicorns, led by their king, Uni, have been planning their conquest of Earth for years. They have been observing the humans and have concluded that they are weak and vulnerable. The unicorns have developed a powerful magic that will allow them to take over the planet.

Benson, an Australian Cattle dog, is the only one who can stop them. Benson is a lazy dog who loves to sleep. He is content to spend his days lounging around and eating food. However, when the unicorns begin their attack, Benson is the only one who can stop them. He rallies the other dogs and leads them in a fight against the unicorns.

The unicorns are powerful, but they are no match for Benson and the other dogs. Benson is able to defeat Uni and the other unicorns, and the Earth is saved.

Summer blockbuster coming soon to a theater near you!

Tag: generative-ai