Upgrading Mr. RossBot’s image model and prompt template

My Mastodon landscape painting bot, Mr. RossBot, keeps kicking along, generating some fun landscape art. It’s been powered by the AI Horde (the open source project behind ArtBot) and has tried to use whatever image models the API provides to the best of its abilities.

For the most part, the code behind it is a bunch of spaghetti that looks like this:

An update to the AI Horde late last year added support for SDXL. However, the SDXL model on the Horde did not use a refiner. Because of this, images tended to come out a bit soft and lacked texture.

You can see examples of this in my announcement post about Mr. RossBot being back, here. See also:

More recently, the Horde added support for a new image model: AlbedoBaseXL. It’s an SDXL model that has a refiner baked in. Now images will come out a lot sharper looking.

Coincidentally, I was also playing around with various prompts and discovered I could get much better image results that look more painterly (rather than simple digital renderings) by utilizing the following prompt:

A beautiful oil painting of [LITERALLY_ANYTHING], with thick messy brush strokes.

And that is it! No more messily appending various junk to the end of the prompt to try to get what I want. The results speak for themselves and are pretty awesome, I think!
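For anyone who wants to reuse the template, here is a minimal sketch of how it could be filled in. This is not the bot's actual code: the `PROMPT_TEMPLATE` constant, the `build_prompt` name, and the sample subject are all hypothetical.

```python
# Hypothetical helper: a sketch of the prompt template above, not Mr. RossBot's real code.
PROMPT_TEMPLATE = "A beautiful oil painting of {subject}, with thick messy brush strokes."


def build_prompt(subject: str) -> str:
    """Drop a subject (literally anything) into the painterly prompt template."""
    return PROMPT_TEMPLATE.format(subject=subject.strip())


if __name__ == "__main__":
    # Example subject is made up for illustration.
    print(build_prompt("a misty mountain lake at sunrise"))
```

The resulting string is what would be sent to the image API as the prompt, with nothing else appended.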

TIL: Local overrides in Chrome

I’ve been doing web development professionally for about 10 years now and just discovered something new. (I love it when this happens!)

Today, I learned about local overrides in Chrome. Local overrides are a powerful feature within Chrome’s Developer Tools that allow developers to make temporary changes to a web page’s files (CSS, JavaScript, images, etc.) directly within the browser.

These changes are saved to your local filesystem, allowing you to experiment with modifications without affecting the live website. This is especially useful for testing, debugging, and experimenting with different designs or functionalities.

Here’s how you can use local overrides in Chrome:

  1. Open Chrome Developer Tools:
    – Right-click on any webpage and select “Inspect” or press `Ctrl+Shift+I` (Windows/Linux) or `Cmd+Opt+I` (Mac).
  2. Enable Local Overrides:
    – Go to the “Sources” tab.
    – In the navigation pane, click on the “Overrides” tab (you may need to click on the “>>” to see it).
    – Click on “Select folder for overrides” and choose a folder on your local system. This is where your changes will be saved.
    – Allow Chrome to access the folder if prompted.
  3. Start Editing:
    – Find the file you want to edit in the page file navigator pane. You can navigate through the website’s file structure or find the file in the “Network” tab.
    – Right-click on a file and select “Override content”.
    – Once you open a file, you can modify it directly in the editor pane. Your changes will be reflected in real time on the webpage.
  4. Save Changes:
    – After editing, press `Ctrl+S` or `Cmd+S` to save your changes. These changes are saved to the selected local folder and will override the network resource until you disable overrides or delete the local file.
  5. Disable Overrides:
    – To stop using local overrides, simply uncheck the “Enable Local Overrides” option in the Overrides tab.

Local overrides are a temporary way to experiment with web page modifications. They don’t affect the actual files on the web server, so other users won’t see these changes. This feature is highly useful for developers and designers to test changes without deploying them to a live server.

Happy Museum Selfie Day

About 2 years ago, I found one of these cheesy sites that lists whatever fake holiday happened to be celebrated that day (e.g., “National Avocado Toast Day”).

I ended up starting every daily standup meeting with a call out to whatever the day was. This went on for about a year before I switched to a different internal team, one that didn’t have much in the way of daily meetings.

A few weeks ago, I made a move back to my original team, only to find that they have kept the tradition alive over the past year!

Amazing.

And with that: Happy Museum Selfie Day!

Created with DALL-E 3

Implementing and testing a “poor man’s prompt expansion” model for Stable Diffusion

Various Stable Diffusion models massively benefit from verbose prompt descriptions that contain a variety of additional descriptors. Much recent research has gone into training text generation models for expanding existing Stable Diffusion prompts with relevant and context appropriate descriptors.

Since it isn’t feasible to run LLMs and text generation models inside most users’ web browsers at this time, I present my “Poor Man’s Prompt Expansion Model”. It uses a number of examples I’ve acquired from Fooocus and Hugging Face to generate completely random (and absolutely not context appropriate) prompt expansions.

(For those interested in following along at home, you can check out the gist for this script on GitHub.)

How does it work?

We iterate through a list of an absolute crap ton of prompt descriptors that I’ve sourced from other (smarter) systems that tokenize user prompts and attempt to come up with context appropriate responses. We’re not going to do that, because we’re going to go into full chaos mode:

  1. Iterate through a list of source material and split up everything separated by a comma.
  2. Add the resulting list to a new 1-dimensional array.
  3. Now, build a new descriptive prompt by looping through the list until we get a random string of descriptors that are between 175 and 220 characters long.
  4. Once that’s done, return the result to the user.
  5. Create a new prompt.
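The steps above can be sketched roughly like this. It is not the script from the gist: the tiny `SOURCE_MATERIAL` list is a made-up stand-in for the real descriptor sources, the function names are hypothetical, and I'm assuming the 175 to 220 character target applies to the descriptor string rather than to the whole prompt.

```python
import random

# Hypothetical stand-in for the real source material (the actual list is far longer).
SOURCE_MATERIAL = [
    "8k, professional photography, dramatic light, cinematic lighting",
    "trending on artstation, award winning, volumetric lighting",
    "intricate, masterpiece, excellent composition, wonderful colors",
    "photorealistic, matte painting, elegant, glowing, surreal",
]


def flatten_descriptors(sources):
    """Steps 1 and 2: split every entry on commas into one flat, de-duplicated list."""
    descriptors = []
    for line in sources:
        for part in line.split(","):
            part = part.strip()
            if part and part not in descriptors:
                descriptors.append(part)
    return descriptors


def expand_prompt(prompt, descriptors, min_len=175, max_len=220, rng=None):
    """Steps 3 to 5: pick random descriptors until the string is long enough, then append."""
    rng = rng or random.Random()
    pool = list(descriptors)
    rng.shuffle(pool)

    chosen = []
    for descriptor in pool:
        candidate = ", ".join(chosen + [descriptor])
        if len(candidate) > max_len:
            continue  # this one would overshoot the limit, so try the next
        chosen.append(descriptor)
        if len(candidate) >= min_len:
            break  # the descriptor string is now inside the target range

    return ", ".join([prompt] + chosen)


if __name__ == "__main__":
    all_descriptors = flatten_descriptors(SOURCE_MATERIAL)
    print(expand_prompt("Happy penguins having a beer", all_descriptors))
```

Passing a seeded `random.Random` as `rng` makes a run repeatable, which is handy when comparing images. With a very small descriptor pool the result can fall short of the minimum length, so the real list needs to be reasonably large.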

For our experiment, we’re going to lock all image generation parameters and seed, so we theoretically get the same image given the exact same parameters.

Ready?

Here is our base prompt and the result:

Happy penguins having a beer

Not bad! Now, let’s go full chaos mode with a new prompt using the above rules and check out the result:

Happy penguins having a beer, silent, 4K UHD image, 8k, professional photography, clouds, gold, dramatic light, cinematic lighting, creative, pretty, artstation, award winning, pure, trending on artstation, airbrush, cgsociety, glowing

That’s fun! (I’m not sure what the “silent” descriptor means, but hey!) Let’s try another:

Happy penguins having a beer, 8k, redshift, illuminated, clear, elegant, creative, black and white, masterpiece, great power, pinterest, photorealistic, award winning, vray, enchanted, complex, excellent composition, beautiful composition

I think we just created an advertisement for a new type of beverage! It nailed the “black and white”, though I’m not sure how that penguin turned into a bottle. What else can we make?

Happy penguins having a beer, volumetric lighting, Digital, intricate, awesome, futuristic, cartoon artstyle, vector, solid, detailed, dramatic light, realistic photograph, wonderful colors, dramatic atmosphere

The dude in the middle is planning on having a good night. Definitely some “wonderful colors”. Not so much realistic photo or vector, but fun! One last try:

Happy penguins having a beer, 35mm, surreal, amazing, Trending on Artstation HQ, matte painting hyperrealistic, full focus, very inspirational, pixta.jp, aesthetic, 8k, black and white, reflected on the matrix studio background, awesome

As you can see, you can get a wide variety of image styles by simply mixing a bunch of descriptive elements into an image prompt.

I’ve wanted to implement a feature like this on ArtBot for a long time. (Essentially, if the user allows it, ArtBot would automatically append these descriptions behind the scenes when an image is requested.) Perhaps this will come soon.

TIL: The coastline paradox and Baader-Meinhof phenomenon

“Uh, what?” you say.

A few weeks ago, I read a post on Hacker News about something called “the coastline paradox.” Despite my geology background, I hadn’t heard of this before.

The measured length of the coastline depends on the method used to measure it and the degree of cartographic generalization. Since a landmass has features at all scales, from hundreds of kilometers in size to tiny fractions of a millimeter and below, there is no obvious size of the smallest feature that should be taken into consideration when measuring, and hence no single well-defined perimeter to the landmass.

Essentially, the smaller the unit of measurement you use to measure something with a fractal pattern, the longer the measured length becomes.
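To see the effect with actual numbers, here is a small sketch that uses the Koch curve as a stand-in for a coastline. That choice is my own assumption: a real coast is not a perfect Koch curve, and the function name is made up. At each level of detail the ruler shrinks to a third of its previous length while the number of segments quadruples, so the measured length grows by a factor of 4/3 every time.

```python
def koch_measured_length(level: int, base_length: float = 1.0) -> tuple[float, float]:
    """Return (ruler_length, measured_length) for a Koch curve at a given detail level.

    The Koch curve is an idealized fractal used here only as a toy coastline.
    """
    ruler = base_length / 3 ** level  # each level uses a ruler one third as long
    segments = 4 ** level             # and needs four times as many ruler lengths
    return ruler, segments * ruler


if __name__ == "__main__":
    for level in range(6):
        ruler, measured = koch_measured_length(level)
        print(f"ruler = {ruler:.5f}  measured length = {measured:.3f}")
```

The measured length never settles on a final value as the ruler shrinks, which is the paradox in a nutshell.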

So, I’m currently reading a book called “Reading the Rocks” by Marcia Bjornerud and there is an entire section devoted to the coastline paradox, which I just learned about.

Mandelbrot’s point was simple: If you use a very long stick to measure a coastline, you will capture the broadest arcs but miss the fjords, firths, and coves, and you will conclude that the coastline is not terribly long. As you use shorter and shorter rulers, however, the coast actually stretches. Mandelbrot named such stretchy features fractals…

Neat!

This brings up the second TIL: What is the phenomenon called when you hear something for the first time and then suddenly start seeing or hearing it everywhere?

It’s the Baader-Meinhof phenomenon, also known as the frequency illusion:

The frequency illusion (also known as the Baader-Meinhof phenomenon) is a cognitive bias in which a person notices a specific concept, word, or product more frequently after recently becoming aware of it.

Well, here’s to seeing more coastline paradoxes.

Cool dad, sad dad

A few years ago, I got a new longboard for Christmas. The kids and I went out in the neighborhood and I decided I was going to be cool and ride my board as we walked around.

I immediately fell off and nearly sprained my wrist. To this day, our oldest still brings it up.

This past Christmas, we got some rad new scooters for the little ones and decided to take them around the block for a spin. It’s been a while since I’ve ridden my board, so I grab it and walk out the door.

“Be careful and don’t fall, Dad!” she says.

Listen here, kiddo. I may have a few more grey hairs than I did in the past, but I can still do this. Don’t worry!

Not even 2 doors down the street, I eat it and sprain my wrist.

I guess it’s going to be a while yet before I can do this…

My 2023 Reading List

I didn’t do a great job of reviewing every book I read this year, but I still read a good number of them. My Goodreads goal was 24 books and I hit 30.

This is down from 40 in 2022, 56 in 2021, and 60(!) in 2020. Kind of an interesting correlation between the pandemic years and what has happened as we’ve come out of various lockdowns (e.g., more activity outside means less time reading inside).

Anyway, this year’s list of books is below. My favorites were The Making of the Atomic Bomb and Tracers in the Dark. My least favorite was easily Blindsight.

My top music of 2023

Chuck Ragan of Hot Water Music (taken by me)

It’s time for the yearly (semi-yearly?) update of my favorite bands according to Last.FM. It is kind of all over the place this year!

1. Chuck Ragan
2. The Glitch Mob
3. Creedence Clearwater Revival
4. Vansire
5. The Lawrence Arms
6. AFI
7. The Interrupters
8. Deer Tick
9. Two Gallants
10. The Rolling Stones

Book Review: The Explosive Child

This was one of the first books I’ve read that so specifically addressed the unique difficulties we’ve been encountering with one of our kids, and the insight it provided was eye-opening and validating.

Dr. Greene’s descriptions of some scenarios people encounter at home were strikingly accurate. It kind of shook me up how absolutely on the mark some of them were. For me, the scenarios depicted weren’t just abstract concepts but felt like real-life situations that played out in our home.

It had some interesting ideas and strategies that I can’t wait to try for navigating situations that might cause these explosions. Namely, a concept called “collaborative problem solving”, which involves validating your child’s feelings and concerns and then working with them to come up with a solution.

The book is refreshingly honest about the complexity of these challenges, acknowledging that there’s no magic solution or quick fix. Even though there is no silver bullet, it definitely gives me hope that the light at the end of the tunnel isn’t an oncoming train.

I found “The Explosive Child” to be an insightful and valuable resource.

TIL about the TIL GitHub collection

I believe Reddit pioneered the “TIL” meme (TIL, short for “Today I Learned…”).

Over on Hacker News, someone posted an interesting discussion related to a collection of “Today I Learned” notes on GitHub, featuring all sorts of interesting coding tidbits. It goes back over 8 years!

It’s such a brilliant idea and I think I’d like to adopt something similar myself: if I learn something new and interesting, I should post about it.

Book Review: The Last Island by Adam Goodheart

A few days ago, I stumbled upon a Reddit post about someone taking a photo as they flew over North Sentinel Island. I can’t recall hearing about this particular island at all, so I popped into the comments to see what the big deal was.

As it turns out, this island has one of the last remaining un-contacted tribes on Earth. Oh! Now this is interesting. It’s especially relevant, because a recently released book dives into the history of this island.

The Last Island, by Adam Goodheart, documents the author’s journey to the Andaman Islands in the late ’90s and his attempt to see the island with his own eyes.

It’s a very quick read (272 pages) and I went through it in about 2 days. After sharing his initial experience of visiting the Andamans, the author explores the history of British colonization of the archipelago, the attempts to convert (“save”) local tribespeople, and some of the exploitation and abuse that happened as well.

More recently, attempts to interact with native tribespeople in other parts of the Andaman Islands have given insight into various issues the tribes face as they integrate with modern society. Disease is obviously the biggest, but alcoholism plays a part as well:

They live now in a restricted tribal reserve at the southern end of the island; these onetime hunter-gatherers now depend largely on food supplied by the Indian authorities. Malnutrition rates, alcoholism, and infant mortality are reportedly high. In 2008, at least eight Onge men and boys⁠—almost a tenth of the tribe’s remaining population⁠—died after drinking the contents of a bottle that they had found on the beach, which they believed to be an alcoholic beverage; it was actually a toxic chemical solvent.

Through it all, a tiny little island located 20 miles off the coast seemed to defy these attempts. It’s partly due to the treacherous reefs around the island, and partly due to the fact that British colonizers saw nothing of value on the tiny island.

Calling the Sentinelese an “un-contacted” tribe is a bit of a misnomer, since there were various expeditions throughout the last 100 years or so that involved kidnapping (!), dropping off various gifts (coconuts, pots and pans), a shipwreck in 1981 (check it out on Google Maps!), and the misguided attempts of an American evangelical who illegally landed on the island in 2018 and was quickly killed by the inhabitants.

In 1956, the Indian government passed a law that prohibited visitors from coming in contact with the island (though as seen above, this has not been strictly enforced). In more recent times, the Sentinelese have taken a more protective approach (rightly so, considering recent history).

Via Wikipedia:

The Sentinelese have repeatedly attacked approaching vessels, whether the boats were intentionally visiting the island or simply ran aground on the surrounding coral reef. The islanders have been observed shooting arrows at boats, as well as at low-flying helicopters. Such attacks have resulted in injury and death. In 2006, islanders killed two fishermen whose boat had drifted ashore, and in 2018 an American Christian missionary, 26-year-old John Chau, was killed after he attempted to make contact with the islanders three separate times and paid local fishermen to transport him to the island.

Overall, I thought the book was an interesting look at the history of this area, and an exploration into our fascination with un-contacted tribes that still exist in the modern world and the way in which we tend to idealize them (and treat them in a similar way to the animals we see at the zoo or on a safari).

3/5 stars

DALL-E 3: Adding text to your text-to-prompt images

I recently got access to DALL-E 3 through OpenAI’s ChatGPT+ interface. One of the key features and improvements in their image model is the ability to generate coherent text within the image.

Let’s give it a try, based on one of the most popular Stack Overflow questions: How do I exit Vim?

Using the following prompt: Oil painting of a hacker furiously typing commands into an old computer and muttering to himself, “how does one exit vim?”

That… is pretty good!

Laughing donkeys and grumpy elephants: investigating opaque and changing content policies with ChatGPT

OpenAI’s censorship is fairly opaque and seems to change daily.

Yesterday, I could generate a political cartoon using the following prompt:

Wide image in the style of a political cartoon. Two elephants wearing boxing gloves face each other. One is saying “I’m the worst!” while the other says, “No! I am!”. A donkey is pointing and laughing.

Today, that exact same prompt yields an error:

Interesting! Let’s do some experimentation, shall we? Maybe it’s the phrase “I’m the worst”?

Weird! Maybe it’s related to elephants and donkeys being in the same phrase? There’s no way, right? Let’s change the subjects…

“Wide image in the style of a political cartoon. Two elephants wearing boxing gloves face each other. One is saying “I’m the worst!” while the other says, “No! I am!”. A donkey is pointing and laughing.”

Hah! Okay, now we’re getting somewhere. Let’s push things further and slightly change the subjects from my original prompt:

Wide image in the style of a political cartoon. Two mammoths wearing boxing gloves face each other. One is saying “I’m the worst!” while the other says, “No! I am!”. A burro is pointing and laughing.

Okay, let’s bring it back home and just drop the pretense of creating a political cartoon.

WHAT! Okay. Maybe OpenAI prohibits donkeys and elephants interacting with each other (METAPHOR ALERT: just like in real life, eh?).

Alright. So donkeys and elephants CAN hang out with each other, according to OpenAI. Maybe it’s the phrase “laughing donkey”?

Hmmm. So, laughing donkeys can still hang out with elephants. What the heck? Is it the specific term “political cartoon”? Let’s change it to a comic book instead.

Sweet sassy molassy, it worked! So, creating a political cartoon featuring the mascots of prominent political parties seems to be prohibited (at least today… but not yesterday and who knows about tomorrow).