Hello friends of gen AI! Holy crap we have a lot to share this month…
We’re running a remote event on the 9th of May about AI agents! Register here. There will be lightning talks, followed by a Q&A. So far we have a rep from MultiOn and the co-founder of Text Alchemy on as speakers. We hope to see you there!
Lev was interviewed by Beth York again on her ON AI podcast. In this episode he discusses the differences between Claude 3’s more natural-sounding outputs and ChatGPT’s traditional corporate slant. He’s been interviewed by York before, so here’s the other episode if you missed it!
Jeremy and Georgia also made another video together in what is now becoming a series: we record ourselves making something with gen AI tools, but Jeremy won’t know what it is until Georgia tells him on the call. This time we made promotional materials for a cult that was trying to look like a day spa. Enjoy!
We always try and use this newsletter as a kind of bucket for all the content we’ve made in the past month — sign up so you don’t miss anything!
Some thoughts on AI agents
Feels like over the last month or so people have been going kinda mad for AI agents: they’re either using them, building them, or just eager to learn more about how they work. The way I see it, agents represent a huge emerging shift in how we interact with apps and services. When agents hit the mainstream, the way we think about AI (and actually, the web in general) will be completely turned on its head.
ICYMI: what I mean by ‘agent’ is a transformer-based system (or similar architecture) that can carry out a series of actions on a device, or just within a browser. An agent can execute a sequence of tasks, often autonomously, without constant human oversight. A user might still be involved in bits that they deem critical, such as reading an agent-drafted tweet, but they wouldn’t be involved in every single micro-decision, such as opening up a browser, navigating to the right place, or inputting credit card details. And maybe soon, users won’t be involved at any stage at all. Today’s models that sit behind services like ChatGPT can kind of do this, but they aren’t very good at it. In 6 months, many models will be excellent at this.
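If it helps to see that split in code, here’s a minimal sketch of the idea: the agent runs through its steps on its own and only pauses on the bits the user has marked as critical. Everything here (the `Step` class, the `critical` flag, the `run_agent` function) is made up for illustration, and no real model or browser is involved.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    critical: bool = False  # hypothetical flag: the user wants to review this step


def run_agent(steps: List[Step], approve: Callable[[Step], bool]) -> List[str]:
    """Run steps in order, asking for approval only on critical ones."""
    log = []
    for step in steps:
        if step.critical and not approve(step):
            # The human said no, so the agent stops here.
            log.append(f"stopped before: {step.name}")
            break
        log.append(f"done: {step.name}")
    return log


plan = [
    Step("open browser"),
    Step("navigate to the right place"),
    Step("draft tweet"),
    Step("post tweet", critical=True),  # the human reads the draft first
]

# Stand-in for a real approval prompt: this user approves everything.
print(run_agent(plan, approve=lambda step: True))
```

Swap the `approve` function for one that returns `False` and the agent stops before posting, while the micro-decisions before it still run without anyone being asked.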
So what we’re looking at here is a departure from both the way we interact with the web — where you directly control browsers and web apps — and the way we interact with AI models, where content is generated based on a single input, and then it awaits further instructions. AI agents are comprehensive systems that leverage both AI and non-AI based tools or APIs. E.g. with a click of a button, an agent should be able to reliably generate Good Content™️, edit and repurpose it for a range of social channels, and post to all those channels. Or, just with a basic instruction such as ‘organize my vacation to Mexico. I want to stay in Cancún and I want a sea view. My budget is xyz’ it should then book flights, accommodation, car rental, and whatever else you might need, just from that one instruction.
In terms of the content production example, we’re already there (you can achieve this in Zapier Canvas). And with the vacation example, we probably aren’t too far from it being as easy and frictionless as I described. Either way, these advances will shift our dominant UX paradigm away from being command-based and toward being intent-based. A command-based UX is one where you, the human, must give a machine a series of instructions in order to achieve an overarching goal: open browser, go to supermarket website, select groceries, etc. An intent-based system, by contrast, allows the user to simply convey their intentions (e.g. ‘plan dinner party’) and then get their desired results.
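Here’s a rough sketch of that difference in code. It’s a toy, not how any real agent works: the `PLANS` lookup table is a hypothetical stand-in for whatever model would turn an intent into steps, and pairing ‘plan dinner party’ with the grocery steps is my own assumption for the sake of the example.

```python
from typing import Dict, List

# Hypothetical lookup standing in for a model that plans steps from an intent.
PLANS: Dict[str, List[str]] = {
    "plan dinner party": [
        "open browser",
        "go to supermarket website",
        "select groceries",
    ],
}


def execute(command: str) -> str:
    """Pretend to perform a single low-level action."""
    return f"ran: {command}"


def command_based(commands: List[str]) -> List[str]:
    """Command-based: the human supplies every step."""
    return [execute(command) for command in commands]


def intent_based(intent: str) -> List[str]:
    """Intent-based: the human supplies the goal, the system finds the steps."""
    return [execute(command) for command in PLANS[intent]]


# Same outcome, very different amount of work for the human.
by_hand = command_based(
    ["open browser", "go to supermarket website", "select groceries"]
)
by_intent = intent_based("plan dinner party")
print(by_hand == by_intent)
```

The actions that get run are identical in both cases. What changes is who has to know and spell out the steps.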
One context I’ve thought a lot about recently is business operations: if the new paradigm is intent, managers, executives, and other workers will have to learn to coordinate agentic systems — the ability to convey context and information to machines efficiently and clearly may become a key virtue in the future. Humans will strategize, and machines will execute on those strategies. This unlocks more capability for humans to do what they do best: handle dilemmas and manage uncertainty.
If you find this interesting and want to have more conversations like this, join our Discord. We’re always there and up for a chat!
While we revel in the novelty of these potential transformations, I also want to note that the word ‘agent’ is not the best descriptor here, because it could be misleading to end consumers. ‘Agent’ is good in that it distinguishes the system from something like a straightforward image generator: something called an ‘agent’ can clearly do more than just stand still and generate content. But the word ‘agent’ also evokes some impression of consciousness or inherent motivation. An automated system that carries out a string of tasks with little human intervention may look like it could ‘get out of control’ or somehow ‘develop a mind of its own’. Agents will not have a mind of their own any more than ChatGPT does.
Questions about agency are not the right ones. The more urgent question is how we manage these new systems. They will have access to logins, files, and even credit card information, if we so choose. Benefiting from agents without being mindlessly cavalier about our personal digital spaces means establishing secure workflows, so that we don’t inadvertently pay for superfluous services or defame ourselves online. There’s a balance that needs to be struck between rejecting agents outright, and deploying them freely on the internet with damaging or embarrassing results — but it’s not 100% clear what that balance looks like yet, because this is still super new.
Ultimately, AI agents are about more than just task automation: they’re also about reimagining the ways in which humans interact with machines. If agents truly take off, and are woven into our technology ecosystems, we could see apps, websites, and software in general be slowly phased out in favor of simplified, intent-based interfaces. It’s going to be interesting. But for now, you can watch me try and post something on LinkedIn with MultiOn. And umm…
Not to dunk on MultiOn. They are absolutely one of the most interesting companies in the space, and you should try them out whenever you can. Their team is working really hard and is going to do amazing stuff… soon 🙂
Speaking of new paradigms… what if instead of prompting we had this?
I’ve been experimenting with building a tool that can edit generated text based on the preferences on this matrix. I really want to go beyond just prompting, and instead surface more interesting and powerful interactions. Try it out here!