The concept of ‘prompt engineering’ — where you give a machine a written prompt and it spits out something cool — is still relatively new and yet… I can already see it coming to an end. Prompting is a pretty strange way to interface with an LLM. It’s a bit like flying a fighter jet while sitting on the windshield: you’re trying to guide a very powerful and complex machine without any visibility into the internal controls.
Today’s prompting methods fail to harness the full potential of AI models, and all the novelty and hype around consumer-facing prompting tools only obscure how awkward they are: on the human side, prompts are difficult to write; on the machine side, prompts are difficult to transform into good outputs. But new models are being released all the time, they can output more types of content (like video or 3D renderings), and they are becoming smaller. These developments signal the emergence of more powerful interaction techniques.
Active triggering vs. contextual triggering
So, these new interactions can be easily divided into two camps:
Active triggering: where you deliberately interact with a model to get a specific response, like clicking a button to retrieve some text or saying ‘OK Google’ to initiate a web search.
Contextual triggering: where no deliberate action is needed, and the model simply anticipates when it should engage.
Writing a prompt is a very janky type of active triggering. What we’re starting to see now is a move beyond prompts into much simpler interactions. You can see this in the recent updates to Google Workspace, where a button-click triggers the creation of slides. No prompt-writing necessary.
And it’s pretty clear that with near-future models, there will be a bunch of simple interactions like this peppered around whatever content-creation software you’re using: highlighting some text or adjusting a slider could easily be triggers.
Consider this: you’re writing a Word document, and at the bottom you have ‘tone sliders’ labelled something like “Earnest ←→ Funny” or “Formal ←→ Personal”; adjusting them changes the tone of the content in real time.
Or how about this: you highlight parts of the document you like, and the piece automatically updates itself to emulate the style of the highlighted parts.
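To make the tone-slider idea a bit more concrete, here’s a minimal sketch of how it might be wired up. None of this is a real product API: `generate()` is a stand-in for whatever text-generation call your stack actually provides, and `rewrite_with_tone()` is a hypothetical helper. The point is simply that the slider position gets turned into an instruction the user never has to write themselves.

```python
# A minimal sketch, assuming a hypothetical text-generation backend.

def generate(prompt: str) -> str:
    # Placeholder: wire this up to a local model or a hosted API of your choice.
    raise NotImplementedError

def rewrite_with_tone(text: str, slider: float) -> str:
    """Rewrite `text` along a Formal <-> Personal axis.

    `slider` runs from 0.0 (fully formal) to 1.0 (fully personal); the UI
    position is converted into plain-language instructions behind the scenes.
    """
    tone = "formal and precise" if slider < 0.5 else "warm and personal"
    strength = abs(slider - 0.5) * 2  # 0 at the neutral midpoint, 1 at either end
    prompt = (
        f"Rewrite the following text so its tone is {tone} "
        f"(about {strength:.0%} of the way there), keeping the meaning unchanged:\n\n"
        f"{text}"
    )
    return generate(prompt)
```

Notice that in a setup like this the prompt doesn’t actually disappear; it just moves behind the interface, which is exactly where I expect it to end up for most users.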
These are the kinds of active generation I can see coming in the near future that don’t require any text-prompting or chat-like elements. For many, I imagine changes like this could be a welcome addition to the way they work.
If you think about it, multiple types of interaction make sense. Humans do not solely communicate with the written word. To see some really innovative thinking on how we can interact with generative AI, check out Puppets App. It combines hand gestures, voice, text, and imagery as inputs. It’s really cool to see ‘making animated puppet shows’ as a use case in an emergent tech niche that is crowded with those racing to generate the best sales emails. But, most importantly, it’s a signal of multi-modal, fluid, and embodied interactions that will soon mature out of infancy.
New methods of interaction mean new use cases, way beyond the kinds we have now. Near-future models allow for more contextual triggering, where actions are triggered by what’s happening in context rather than by a deliberate interaction from the user. We already have this for smaller tasks, like code completion in Copilot or grammar correction in Grammarly. Soon, this kind of contextual triggering will apply to more complex tasks and workflows.
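To make the distinction concrete, here’s a minimal sketch of a contextual trigger; all the function names are hypothetical, not a real editor API. Instead of waiting for a click or a prompt, the software watches the editing context and decides for itself when to offer something.

```python
# A minimal sketch, assuming hypothetical editor hooks rather than a real API.
import time

def suggest_completion(text: str) -> str:
    # Placeholder for a model call that proposes a continuation of `text`.
    raise NotImplementedError

def show_inline_hint(suggestion: str) -> None:
    # Placeholder for however the editor surfaces ghost-text suggestions.
    print(f"[hint] {suggestion}")

def watch_editor(get_text, idle_threshold: float = 1.5) -> None:
    """Fire a suggestion after a pause in typing, with no explicit request.

    The trigger is the context itself: here, simply 'the text stopped
    changing for a while', but it could be anything the model can observe.
    """
    last_text, last_change = get_text(), time.monotonic()
    while True:
        time.sleep(0.2)
        text = get_text()
        if text != last_text:
            last_text, last_change = text, time.monotonic()
        elif time.monotonic() - last_change > idle_threshold:
            show_inline_hint(suggest_completion(text))
            last_change = time.monotonic()  # don't re-fire on the same pause
```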
Companies like Lindy.ai are already trying to build browser assistants for anything and everything (calendar management, booking flights, any of the boring repetitive admin you’d rather avoid). I’m starting to see this kind of thing more and more: models that are trained or fine-tuned to explicitly include browser and software actions, which ultimately makes it less necessary to tell the model what to do and when.
But! I don’t think prompting will completely disappear
Once we move beyond prompting, I’m hoping there’s still room for artists and other creators to push models further than the developers originally intended. This is actually something we’re already starting to see: visual artists are fine-tuning models in technically approachable ways to create custom image generators, which they release into the community.
A good example of this is the “Floral Marble” style generator by artist Joppe Spaa, which creates ethereal images of stunning beauty. Here’s the important part: it would be nearly impossible (or at least extremely difficult) to create these images with the original model through prompting alone.
For me, the truly fascinating thing about this is that no matter how far developers push the capabilities of an AI model, artists will always find a way to go even further. With custom models such as Floral Marble, the output images are not the art; the model itself is the art. The images that come out of it are simply the artifacts of audience participation.
So yes, prompting is awkward and doomed to fade. But from where I sit, a positive part of it is that it requires the user to think of something to say. You need creative initiative to get results. It’s good because of the level of effort you have to put into it; if we interact proactively with AI, we will do more than automate our work: we’ll invent new forms of thought, creativity, and experience.
If you want to learn more about anything you just read, you can literally book a meeting with me right now. If you want to learn EVEN MORE, I also run a Generative AI Masterclass, so check that out.