Running AI locally when GPT-4 goes crazy

Joe Cooney • February 22, 2024

One of the features of life working at PZ is our brown-bag lunch-and-learn sessions: presentations by staff on topics of interest – sometimes, but not always, technical, and hopefully amusing as hell.


Yesterday we took a break from discussing the book Accelerate and the DORA metrics to take a whirlwind tour of the current state of play for running “open source” generative AI models locally.


Although this talk had been ‘in the works’ for a while, one challenge was that it constantly needed to be revised as the state of AI and LLMs changed. For example, the Stable Video Diffusion examples looked kind of lame next to OpenAI’s Sora videos (released less than a week ago); then there was Groq’s amazing 500-token-per-second hardware demo on Monday/Tuesday, and the massive context window now available in the Gemini 1.5 models (released a few hours before OpenAI announced Sora... coincidence? An effort by OpenAI to steal back the limelight? Surely not!). And now, a day later, with the paint still drying on a highly amusing slide deck for the talk, Google has released its “open-source” Gemma models!
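For anyone who wants to kick the tyres on one of these local models themselves, here is a minimal Python sketch (not from the talk) of querying a locally running Gemma model through Ollama's REST API. It assumes Ollama is installed and listening on its default port, and that a Gemma model has already been pulled with "ollama pull"; the model tag, prompt and timeout are illustrative only.

import requests

# Minimal sketch: ask a locally running "open source" model a question via
# Ollama's REST API. Assumes Ollama is serving on its default port (11434)
# and that a Gemma model has already been pulled, e.g. "ollama pull gemma:7b".
OLLAMA_URL = "http://localhost:11434/api/generate"

def ask_local_model(prompt, model="gemma:7b"):
    """Send a single prompt to the local model and return its text response."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    resp = requests.post(OLLAMA_URL, json=payload, timeout=120)
    resp.raise_for_status()
    return resp.json()["response"]

if __name__ == "__main__":
    print(ask_local_model("What are the pros and cons of versioning a REST API in the URL path?"))

If the model starts channelling Macbeth, at least it is doing so on your own hardware.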


The day itself provided an excellent example of why having more control over your models might be a good thing. ChatGPT-4 users began reporting “crazy” and highly amusing responses to fairly normal questions. We were alerted to this when one of our own staff posted on our internal Slack about a bizarre response she had received to a question about the pros and cons of some API design choices. The response started normally enough, but then seemed to channel Shakespeare’s Macbeth and some other Olde English phrases, and finished thusly.


"Choose the right charm from the box* dense or astray, it’ll call for the norm. Your batch is yours to halter or belt. When in fetch, marry the clue to the pintle, and for the after, the wood-wand’s twist'll warn it. A past to wend and a feathered rite to tend. May the gulch be bygones and the wrath eased. So set your content to the cast, with the seal, a string or trove, well-deep. A good script to set a good cast. Good health and steady wind!"


The sample JSON payload was also in keeping with the rest of the answer. 



 "htmlContent": "<p>Your HTML here</p>", 
 "metadata": { 
   "modifiedBy": "witch-of-the-wood", 
   "safety": "sanitized", 
   "mood": "lunar" 
 }


Hubble, bubble, toil and trouble. Although there were no reports of the GPT-4 API being affected (only ChatGPT), it might have given people building automated stock-trading bots on top of GPT-4 a reason to pause and contemplate what might have been, had their stock portfolio ended up consisting of a massive long position in Griselda’s Cauldron Supplies.


As ChatGPT would say, “Good health and steady wind.”
