AI (Without the Hype)
It’s a Christmas surprise! My AI (Without the Hype) talk from NSSpain is now available for everyone to watch.
This talk is wide-ranging and covers a lot of ground. I start by showing you a day in the life of using AI, discuss the fundamentals of how AI systems work, and then share techniques for shaping the way you work with AI to get great results. Whether you’re a beginner or an expert, a developer or a designer, or just someone who wants to learn more advanced ways of thinking about AI — there’s something for everyone.
I’m not saying you should ignore your family on Christmas Eve, but if you’re looking to learn a lot about AI once the mice are no longer stirring about — I think you’ll really enjoy my talk.
- Here’s a direct link
- Better yet, you can save it to my app Plinky and watch it later.
- And here’s the full transcript below. It was pulled from YouTube and cleaned up with Claude Code, but I skimmed it, made a few minor edits, and didn’t notice anything that misquoted me.
Introduction
Daniel Steinberg: I know I talk about how old I am a lot, but one of the reasons is people ask me why I don’t retire. And the reason I don’t retire is because of this. I mean, most of my friends are elsewhere.
Daniel Steinberg: So, I know I don’t remember names, but I really appreciate talking to people. And on the bus ride up from Madrid to Logroño, which is quite a bus ride, Joe said, “Hey, mind if I sit up here and we chat?” And it’s a great way to meet somebody that you only kind of know online. And for somebody you know from online, he’s much nicer in person. And I’ll tell you that he’s an indie iOS developer. You might know him from his app Plinky. He also teaches an AI workshop, because that’s what everyone does these days. He’s going to talk to you about AI without the hype. Please welcome Joe.
Who Am I?
Thank you so much for having me here at NSSpain. It’s truly an honor to be here at this wonderful conference with so many of my favorite iOS developers. As you can guess, I’m here to talk to you about AI. But first, I’d like to tell you a little bit about who I am.
I’m an indie developer and I spend my days working on an app called Plinky. It’s the easiest way to save links for later, available on the App Store, and people really seem to like it. I’ve worked at every size company, from a startup of two to internet success stories like Timehop and Bitly. Most recently, I spent four years helping build the health client team at Twitter when it was Twitter, with a focus on combating misinformation and abuse at a global scale. I’m an open source maintainer and a writer, and these days I’m also a teacher.
I help people of all backgrounds, technical and non-technical, learn all about AI, as we’ll be doing here today. I love teaching people about this technology that’s emerging in real time. So, I work with individuals and teams alike to develop highly personalized hands-on workshops.
What I Love About Code
A simple and reusable abstraction tickles my brain, as does the personal fulfillment of solving a really difficult challenge. Code on its own is interesting, but I also love using code to build products that people love.
Don’t get me wrong, as an indie, I have to do design, marketing, sales, customer support, and of course, build my product. I love doing each of these things a little less than I love to code. Some, a lot less, actually.
But what I love most is making a real connection with another person through the software that I build. I feel so lucky that I get to build personal, playful productivity apps that make people happy. I’m always moved when a person reaches out to tell me, and that’s what drives me to do all the things that I don’t necessarily want to do. These are the roles that I take on to enable me to keep doing what I love most.
And that’s why I’m going to be talking to you today about how to use AI for your job as a software developer to do more of the things that you love most and less of the things that you don’t.
What We’ll Cover
We’ll start by talking about where AI is today and how I use AI in my day-to-day life.
Then we’ll talk about the technology behind AI and why AI tools work the way that they do. This will help explain why AI will continue to improve and diffuse across the industry, reshaping the way that we all work. Most importantly though, we’ll discuss the techniques that are required for successfully working with AI to accomplish your goals. There’s a lot that AI can do today and plenty that it can’t. I’m not here to convince you that it’s a magical tool that can do anything and everything. And that’s why I have this disclaimer.
A Disclaimer
As I mentioned, this talk is about AI, and people have a lot of feelings about AI — understandably so. Because I help people learn about AI, I hear all of these feelings all of the time. Some people think that AI can’t do very much and that it’s just a lying hallucination machine. Others worship at the altar of Sam Altman and think that we’re building a digital god.
Much like everything in life, I believe the answer is somewhere in the middle. There are many moral, societal, and security concerns to be had about AI — there are so many of them — and I genuinely wish that we were thinking deeply when having those discussions. The field is changing very quickly, so a few of the things that I present won’t be the only approach or the only solution.
I personally use AI a lot, and I’ll walk you through how it’s been a multiplier for my productivity. While I get a lot of utility out of AI, I definitely do not delegate all of my thinking to it. What remains most important as a software developer and as a person is your ability to think critically and to adapt to new technologies. That’s why I want to ask that for the next 30 minutes, you put aside your preconceived notions about AI, regardless of what they might be. And then I’d love for you to keep an open mind to the ideas that I present. In 31 minutes, you can believe whatever you like. I promise.
A Day in My Life with AI
I want to start by walking you through a regular day in my life, specifically the ways that I use AI. I start most mornings with a 15-minute bike ride up and down a quiet street where I live in New York City.
Sometimes I listen to a podcast. Sometimes I enjoy the sounds of the city. But lately, I’ve been opening ChatGPT’s voice mode and simply saying, “Teach me something new.”
Learning with AI
It’s like having an encyclopedia on demand. If I don’t know something, I just ask. If I find something interesting, I dive deeper down the rabbit hole. And of course, if I’m skeptical about something, I follow through and get more clarity. LLMs are like lossy encyclopedias. The more factual information there is about something in the training data, the more accurate you can expect an LLM to be. There’s a lot of highly sourced information about bioluminescence. So, I feel pretty good about the validity of what I’ve learned.
You can learn about almost anything. If it’s available on the internet, you can probably learn about it. It could be science, it could be history, or it can even be code. Lately, my bike rides have been filled with question after question about recent Swift evolution proposals. I’m not looking for an easy substitute for learning. Building a deep understanding requires practice and hard work. Instead, I’m using a new approach to break down a dense subject by asking the questions that I need answers to.
Building Tools on Demand
As I’m walking home from my bike ride, I have an idea. I get back to working on my NSSpain talk and I use AI to build this slide. Every slide in this talk with a chat bubble was made with a little tool that I built in a Claude artifact. The chat bubbles were created to be an exact replica of ChatGPT but with my talk’s color scheme. It was so simple that I built it with the prompts that you see on screen in less than 5 minutes. I didn’t have to write any code. I just had to describe my problem and give that to Claude.
Artifacts are accessible via the web, so you can go play with mine or even create your own based on it.
Now, is this chat bubble generator my life’s work? Of course not. It’s just a simple tool. But humans are tool builders. Being able to build tools on demand with AI is what makes AI so empowering to a software developer.
This XKCD comic about automation used to be a joke, but now it’s our reality. Rather than sitting in Keynote all day manipulating shapes until they look just right, I can now automate the process of creating a piece of software. Three prompts and five minutes later, I had exactly what I needed. No code and no thinking required to create a usable and reusable piece of software. Many problems are harder to describe than this, but sometimes it’s actually this easy to describe and build something that you need. The cost of making throwaway software that helps you accomplish a specific task is now almost zero.
Natural Language API Testing
Take a look at these two APIs and tell me which one you like more. In the spirit of automation, halfway through a curl request to test Plinky’s API, I realized I’m sick and tired of writing curl requests. I could use a GUI tool designed for API requests like Paw or Postman. But it would be even better to have a personalized tool. What if I could simply type “create tag NSSpain” and AI would create the correct curl request for me?
I think about how I can do this and I have a brilliant idea. I open up Codex, which is OpenAI’s CLI tool similar to Claude Code. First I ask Codex to read through the code for Plinky’s API layer to understand all of the endpoints that I have. Codex generates an OpenAPI spec from my code. So now we have a formal document that describes all of our endpoints and all of our parameters. Then Codex builds a Python script called plinky-api, which itself uses GPT to translate my natural language request into the correct API call: the script analyzes my command line input and reads through the OpenAPI spec to determine which routes I’m most likely trying to call.
Then we get this: now when I run this command, it chooses the right endpoint and constructs a curl request to add a new tag with the name NSSpain. I can now test my server much faster without having to keep all of the API routes in my head.
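To give you a feel for the shape of a tool like this, here’s a minimal sketch. I’ll stress that this is an illustration rather than the actual plinky-api script that Codex wrote; the model name, the openapi.yaml path, and the prompt are all my assumptions:

```python
#!/usr/bin/env python3
# A minimal sketch of a natural-language API runner. This is NOT the real
# plinky-api script; it just shows the shape of one. Assumes an openapi.yaml
# spec next to the script and the openai package with OPENAI_API_KEY set.
import subprocess
import sys
from pathlib import Path

from openai import OpenAI  # pip install openai

client = OpenAI()

def build_curl(request: str) -> str:
    spec = Path("openapi.yaml").read_text()
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system", "content": (
                "Translate the user's request into a single curl command "
                "against the API described by this OpenAPI spec. "
                "Reply with only the curl command.\n\n" + spec)},
            {"role": "user", "content": request},
        ],
    )
    return response.choices[0].message.content.strip()

if __name__ == "__main__":
    command = build_curl(" ".join(sys.argv[1:]))  # e.g. plinky-api "create tag NSSpain"
    print(command)
    subprocess.run(command, shell=True)
```

All of the heavy lifting lives in the OpenAPI spec: the model just needs enough context to pick the right route and fill in the parameters.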
If you have a ChatGPT subscription, Codex is included for free. This means that right now you can build any tool that transforms natural language into something meaningful. I came up with and built this idea in 15 minutes. Most of that time I was off doing something else while Codex was writing the code.
It did take me a couple of iterations to perfect, but that’s nothing compared to the time that I’ve saved since, let alone making my life a little less annoying.
Lunch Break
That was hard work. So, I pause to have lunch with my lovely, supportive wife. I tell her about my brilliant new invention, and she says, “That sounds nice.”
She’s not a developer, so she doesn’t get how insanely cool this was. But it’s also proof that I don’t spend all day talking to AI.
Design Assistance
It’s time to get back to work. So, I start on my iOS 26 redesign for Plinky. I’ve been adjusting Plinky’s color palette to feel more modern and to stand out in a world of stock system apps. I realized early on that Liquid Glass would look better on a creamy white or espresso brown background rather than the pure white, black, or gray that many apps use. The colors that I choose will be used everywhere across my brand. So, I need to be thoughtful with these color choices.
I know the look that I’m going for and I’ll know it when I see it, but I don’t want to spend all day endlessly trying out hex codes to get there.
So, once again, I turn to ChatGPT to act as a design assistant.
ChatGPT isn’t a designer and it’s not replacing my design process, but it’s a great way to try a lot of options quickly. So, I ask ChatGPT to generate five color choices. I try out all of these colors on my device and settle on a slightly different shade.
I consider ChatGPT’s suggestions a great head start. They are advice. They’re not requirements. But those suggestions help me not spend all day aimlessly trying to pick just the right color.
I ask it to generate the P3 variants of our new colors to really pop on an iPhone. This is the perfect task for a computer. It’s an algorithmic problem. So why should I spend my time trying to generate the right transformations manually?
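To show why this is a computer’s job, here’s a rough sketch of the mechanical piece: re-expressing an sRGB hex color in Display P3 component values. This is my illustration, not what ChatGPT generated, and the example hex value is made up:

```python
# A sketch of an sRGB hex -> Display P3 conversion (illustrative, not
# ChatGPT's output). Both spaces share the sRGB transfer curve, so we
# linearize, swap primaries through XYZ, and re-encode.
import numpy as np

SRGB_TO_XYZ = np.array([[0.4124564, 0.3575761, 0.1804375],
                        [0.2126729, 0.7151522, 0.0721750],
                        [0.0193339, 0.1191920, 0.9503041]])
P3_TO_XYZ = np.array([[0.4865709, 0.2656677, 0.1982173],
                      [0.2289746, 0.6917385, 0.0792869],
                      [0.0000000, 0.0451134, 1.0439444]])

def decode(c):  # gamma-encoded -> linear
    return np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)

def encode(c):  # linear -> gamma-encoded
    return np.where(c <= 0.0031308, 12.92 * c, 1.055 * c ** (1 / 2.4) - 0.055)

def hex_to_display_p3(hex_color: str) -> list[float]:
    srgb = np.array([int(hex_color.lstrip("#")[i:i + 2], 16) / 255
                     for i in (0, 2, 4)])
    linear_p3 = np.linalg.inv(P3_TO_XYZ) @ SRGB_TO_XYZ @ decode(srgb)
    return [round(float(v), 4) for v in encode(linear_p3)]

print(hex_to_display_p3("#F5EFE0"))  # a hypothetical creamy white
```

A conversion like this reproduces the same visual color in P3 coordinates; picking variants that actually use the wider gamut to pop is the judgment call I handed to ChatGPT.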
Lastly, I realized that the purple that I use for reminders is a bit stale on this new cream background. I don’t even have to tell ChatGPT the colors in my palette. I just drop in my entire color palette as a file and ask ChatGPT to find some alternate shades of purple that won’t clash.
Web Development Made Easy
Now that I have these new colors, I have to go apply them to Plinky’s landing page. I open Cursor and I tell it to update the website’s background color. It does so asynchronously while I’m working on something else.
Every app needs a landing page. And it used to be that not being an expert in web technology would stand in your way. But now, building a web page as an iOS developer is no longer a challenge.
Previously, I would spend weeks overcoming my mediocre CSS skills to build the perfect landing page, which would never come out quite right. Now, the answer is three simple steps:
- If you don’t know what technology to use, describe your problem to your favorite AI and talk through the trade-offs.
- Pick the web development stack that seems best for you from that conversation, and share a bunch of website designs that you like.
- Continue iterating with AI. Describe what you like about the design and what you don’t like and what it got wrong. Keep nudging it in the right direction. All in plain English or Spanish. Almost no HTML or CSS necessary.
And I’ll let you in on a secret step four: when it’s all done, a stupid trick that I absolutely cannot believe works is to have an LLM clean up after itself. We all know that Cursor probably wrote some subpar code and didn’t realize, but it can read its own code after the fact and identify what’s subpar. Simply say, “Go through the codebase to refactor, clean up, and improve the code that you just wrote.” And it does.
Git Recovery
If there’s one thing I believe about AI, it’s that AI lowers the barrier to entry for solving a problem. Few things have a higher barrier to entry than Git. I’ve been using Git for 15 years now, and I can use Git pretty well, but I still struggle with complex multi-stage problems like this.
In my redesign, I’ve accidentally overwritten my previous theme with our new theme. I wanted the old theme to be accessible to users as a setting. Restoring this theme is very doable, but it’s tedious. It’s error-prone, and it’ll probably take me about 15 to 20 minutes to get right. Once again, I can simply describe the problem and the context, and have Claude Code fix my screw-up.
There’s a big shift that occurs in how you work when you only need to explain the problems rather than type out the solutions. Instead of memorizing or looking up arcane git incantations, I can work at a higher level of abstraction and get better results.
Scaling Challenges
While Claude Code is saving me from my git mistakes in one tab, I open up another instance of Claude Code. I need to think through the last phases of a feature that I’ve been working on for a few months. Plinky is going to start letting users import their links from other apps like Pocket or Instapaper.
Building an indie app means I know how to store data in Postgres well enough, but scaling is a new challenge to me. I do know the things that I need to know, but that doesn’t mean I have a lot of experience doing them. I’ll probably make a bunch of mistakes along the way, and I find that when that’s the case, what I often need is another set of eyes and someone to talk to.
Rather than asking Claude to write my code, I describe the concern that I have. I worry that Plinky won’t be able to scale to meet the needs of users who have tens or hundreds of thousands of links. This is an example where as a developer, I know the problems that can arise, but I may not know how to prevent those problems. I have Claude read over the code for my feature and provide an assessment. It makes recommendations and I understand almost all of them. I’m not afraid to ask questions and get recommendations about what I don’t understand because I want to learn how to build scalable systems.
These are good suggestions. So, I learn more about monitoring to make sure that I can catch and fix errors as quickly as possible. Agentic tools like Claude Code are great for helping get me up to speed on new APIs, new libraries, and new subjects entirely. This is how I spend the remainder of my day, learning the tools and technologies that will make me a better developer and help me build a better product.
That was a lot of work. So I wind down by playing with my ridiculous cat who is also not artificial intelligence.
Understanding AI Fundamentals
Good news, we’re finally done talking about me. I’ve just shown you some of the ways that I use AI on a day-to-day basis, but I barely mentioned writing code. What I did mention were plenty of important parts about being a professional software developer. I can assure you that in the hands of a skilled developer, writing code is what AI does best. To understand why, we need to break AI down into its fundamental parts. We want to build production software beyond vibe coding. And to do so, we need to understand the fundamentals of AI.
Where AI Lives Today
AI is everywhere these days. Sometimes it’s in places where people go looking for AI, like in ChatGPT. But you can also find AI in places where people don’t necessarily want it, like at the top of Google search results.
There are three main modalities for how software developers interact with AI: chat (like the aptly named ChatGPT), IDEs (like Cursor or VS Code), and agentic coding tools such as Claude Code, Gemini CLI, or Codex. As you’ve seen, all these can play a meaningful role in a developer’s workflow.
ChatGPT Is Not a Chatbot
So I’m going to make a bold claim. From the first time that you use ChatGPT, you’re being lied to.
What’s confusing about the name ChatGPT is that it wasn’t made for chatting. Instead, I like to think of ChatGPT as the world’s best report generator. ChatGPT synthesizes the inputs that people provide it and transforms that into tangible outputs.
This is where large language models shine. They are language transformers, whether that language is English, Spanish, or Swift. Being a skilled communicator places you ahead of most other people using ChatGPT.
Well-articulated inputs lead to higher quality outputs. Otherwise, it’s a lot like the expression “garbage in, garbage out.” I use the term “outputs” rather than “text” because you don’t always want text back. As you’ve seen, we can have ChatGPT generate text, files, or most valuably, software.
I have a simple rule: don’t talk to ChatGPT the way you talk to your bestie. Talk to ChatGPT the way you talk to a coworker who you want to get help from. That approach will more easily tap into an LLM’s ability to reason about text and transform it into something valuable.
Can You Write Code Better Than AI?
The state-of-the-art in AI coding changes seemingly every month. So, I want to ask you about your experiences. Raise your hand if you think that you can write code better than AI.
Okay, good challenge. That seems to be about 70% of the audience. Correct me if I’m wrong, but the problem is you’re all wrong.
On the one hand, none of you should be raising your hands, but on the other hand, all of you should be raising your hands. The question is incomplete. What does it mean to write better software than someone else? Is it following patterns and best practices? Is it writing performant code? Is it inventing novel approaches? Or is it solving the problems that users have? Of course, the answer is all of the above depending on the context.
A claim I’m willing to make: I bet that on a greenfield project, AI can generate high-quality production code faster than you. And I bet it can do it in any language — ones you’ve probably never even tried, some you may not have even heard of. And you may say, “Well, I’m not working on a greenfield project.” And you’d be right. That’s not how developers spend most of their time, but it does happen sometimes. As I demonstrated earlier with my natural language curl tool, I think we’re going to be building a lot more software on demand in the future. But I also use LLMs in a large codebase all the time. I know they can deliver positive results that make me more productive. And yet I still will assert that sometimes an LLM will do the stupidest possible thing imaginable.
A Poll on Agentic Tools
So let’s take another poll. Right now Claude Code is the coding tool that people love most, but it won’t necessarily be forever. Feel free to substitute OpenAI Codex or Google’s Gemini whenever I mention Claude Code. I’m going to ask a series of questions. So this time raise your hand and keep it up until the answer is no.
Have you used Claude Code, Gemini, or Codex?
Okay, remember keep it up. Seems to be almost 100% of the audience. I’m pretty sure of that.
Have you used Claude Code for your main way of working for at least one week?
Hands go down.
Have you sat and worked on a prompt for at least 30 minutes?
More hands down.
How about an hour?
Have you set up any advanced features like slash commands, sub-agents, or custom output styles?
Have you used an LLM in concert with Claude to guide and review Claude’s work? I’m looking at one person who still had his hand up.
Okay, so that’s almost none of you. And the number of hands keeps going down, which makes me think that many of you are not using a tool like Claude Code to its fullest. And that’s not a judgment. It’s actually a positive. It tells me that there’s still a lot of untapped potential to improve your agentic coding workflows.
If I came on stage and showed you some new exciting feature in the latest version of Swift, you’d probably start ignoring my talk and go try that feature out immediately. I’m not saying to go ignore my talk, but do adopt the same mindset when it comes to trying out new AI features early and often.
Why LLMs Do Dumb Things
I’m going to make one more bold claim. Large language models are stupid.
Why do LLMs do dumb things? The answer is simple. LLMs have no mental model. They have no sense of coherence. They’re just a computer program trained to provide the next word in a sequence based on the previous words in that sequence. The word that they provide is based on what they’ve seen in their training process, which itself is imperfect.
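If it helps to picture it, generation is just a loop. Here’s a toy sketch where model stands in for the entire neural network; nothing about this is a real LLM:

```python
# A toy picture of next-token prediction, not a real LLM. `model` takes the
# tokens so far and returns one probability per entry in the vocabulary.
def generate(model, prompt_tokens: list[int], steps: int = 50) -> list[int]:
    tokens = list(prompt_tokens)
    for _ in range(steps):
        probs = model(tokens)  # a probability distribution over the vocabulary
        best = max(range(len(probs)), key=lambda t: probs[t])  # greedy pick
        tokens.append(best)  # the "next word" in the sequence
    return tokens
```

Real systems sample from that distribution rather than always taking the most likely token, but the shape of the loop is the same.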
This is not thinking. This is a smarter form of guessing. But if LLMs are so dumb, then why am I up here telling you that they can write production-ready code?
LLMs Are Just One Piece
LLMs are just one piece of a complete AI system like Claude Code or Cursor or ChatGPT. They’re the part that we talk about most because they’re truly innovative and they’re very new and shiny. But any meaningful AI system that you interact with is more than an LLM. LLMs generate text, but AI systems add memory, planning, reasoning, and guardrails to give that text context, structure, direction, and purpose. These additional layers provide the foundations for mental models that are missing from an LLM alone.
These more complete AI systems have a name that you’ve probably heard everywhere. They’re called agents. The term “agent” is a bit of a buzzword, but it’s not just a buzzword. It’s a design pattern.
The Agent Pattern
Think about what you do when approaching a problem: you ask for context. Too little and you can’t form a clear mental model. Too much and you drown in details, pushing out what actually matters. Striking the right balance is just as important for us as it is for agentic systems.
Then we have tools. You write your code in an IDE. This is your tool. But that IDE is built upon fundamental primitives. Your IDE is made with a UI framework, AppKit, the Swift compiler, the C compiler under that, all the way down to the operating system, the kernel, and the hardware input pipeline that turns your key presses into code.
Coding agents use Unix tools like ls to list files, grep to search through code, cat to read files, and echo to write them. And then at a higher level, they also call on abstract tools like web search, a code interpreter, or even specialized systems like another large language model to generate and refactor code. This tool usage is what separates an LLM from an agent, the same way that advanced tool usage separates humans from animals.
Setting the Right Goal
Setting the right goal for an agent is the single most important thing that you can do. The better that you define the goal, the better your results will be. The more explicit you are about what an agent should or shouldn’t do, the higher the odds that an agent will succeed at the task in the way that you envision success.
Think about when you’ve been handed a design that left out key details. You only discover the gaps while coding, which means rework, frustration, and wasted time. But if those gaps are caught early, the wrong thing doesn’t get built at all. Vague or incomplete requirements always mean more work later on. Whether you’re delegating work to a teammate or to an agent, clarity upfront saves you from surprises down the road.
Putting this all together: given success criteria to achieve, the right context, and a bunch of tools, agents will run in a loop until they solve the problem that they’ve been told to solve. Agents don’t work like us and we shouldn’t expect them to. They don’t reason with a real mental model. They brute force their way to answers. But surprisingly, brute force works remarkably well.
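Sketched as code, the pattern looks something like this. Every name here is illustrative: choose_action stands in for the LLM and is_done stands in for your success criteria:

```python
# A minimal sketch of the agent pattern: context, tools, success criteria,
# and a loop. Real agents like Claude Code wrap planning, memory, and
# guardrails around this same core, but the loop is the heart of it.
from typing import Callable

def run_agent(goal: str, context: str,
              tools: dict[str, Callable[[str], str]],
              choose_action: Callable[[str], tuple[str, str]],
              is_done: Callable[[str], bool],
              max_steps: int = 25) -> str:
    history = [f"Goal: {goal}", f"Context: {context}"]
    for _ in range(max_steps):
        # The LLM reads everything so far and picks the next tool call.
        tool_name, tool_input = choose_action("\n".join(history))
        observation = tools[tool_name](tool_input)  # e.g. ls, grep, web search
        history.append(f"{tool_name}({tool_input}) -> {observation}")
        if is_done(observation):  # have we met the success criteria?
            return observation
    raise RuntimeError("Hit the step limit without meeting the goal")
```

Given good tools and a crisp is_done, even this naive loop will brute force its way to a solution, which is exactly the behavior I just described.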
Why Everyone Is Building Agents
With clear goals and the right tools, an agent can tackle problems that once required bespoke and hard-to-build algorithms — all without just typing out a solution.
Thinking even bigger: this is why everyone is rushing to build agents and why you as a developer have an edge. You can think systematically and wire apps together in ways that others simply cannot.
The Anatomy of a Plan
Now that we have an understanding of the fundamentals of agentic systems, we’re ready to finally start coding. Well, not writing code — working with an agentic system that will write code for us.
Working with an agent is all about planning. In my experience, this is the hardest part to teach. People think and work so differently. So rather than a singular prescriptive solution, I want to share a framework for thinking: the anatomy of a plan.
Let’s build a hypothetical plan for a hypothetical weather app. The app already shows people the weather based on their location. And now we want to add a new feature that alerts users when a rain shower is expected.
Working with AI every day for almost 3 years has helped me define a successful plan by these four principles: context, description, success criteria, and additional considerations.
Context Is Everything
As you can see from this tweet, context is everything. Context is a story of how you got here and where you want to go. Imagine you’re telling a coworker that you need help building this alert feature. What do they need to know before we even begin crafting a solution?
To provide good context, we want to share everything that’s relevant to the problem that we’re solving. To do so, we need to answer these four questions:
- What does our app already do?
- Why are we building this feature?
- Who is the user?
- What are future considerations we should know ahead of time?
Breaking Down the Problem
As we start to gather context for our new alert feature, I realize the scope is too big. What we actually need to do is break down the problem into a set of smaller problems. Building rain alerts actually requires multiple work streams, not just a single linear path:
- We’ll need to register for push notifications in the app.
- To do that, we’ll need to build server infrastructure for sending push notifications.
- We’ll need server-side business logic to generate dynamic copy for push notifications based on the actual weather.
We can continue going deeper. Previously, the user would simply open the app and the server would make an API call to retrieve the latest weather as needed. Now, we’ll have to rebuild parts of our server to always poll for new data so we can send timely push notifications.
Countless more considerations will come up. This is our context. The context we share will continue to grow as we better define the feature.
Assembling Context
Our job is to assemble as much context as we can to solve our problems today and in the future. Nothing is ever going to be as simple as telling an agentic system, “Build a new feature.” We’re going to have to work hard to communicate the complexity. This is as true of agentic systems as it is of people.
If you’re talking to a coworker with a lot of background knowledge, they’ll understand nuance and subtleties by filling in details from their experience. But if you’re trying to explain the problem to someone new at your company, you often have to provide additional context and be very explicit.
That’s exactly what working with an LLM is like. They’re very smart, but they have a fresh perspective. So, you need to fill the gaps in for them. And that is why we build up context for an agentic system. Otherwise, AI systems will make a bunch of assumptions that we then have to fix later after it’s made plenty of mistakes.
Sources of Context
We should pull in context from every source we have. Your codebase is an obvious starting point. If you want an agent to write good code, the number one factor that determines success is the existing quality of your codebase. This is the opposite of how most people first experiment with LLMs. They try to vibe code something and are blown away by how much progress they’ve made so quickly. We’re here trying to build production-grade software, so that’s not going to work for us.
Your results will be much better if your codebase is well modularized with small files, readable functions, and consistent patterns. All of the things that we’ve been saying are important for years. But no codebase is perfect. So if you don’t have that yet, use an agent to help get you there one task at a time.
Besides your code, there are many other sources for context:
- Jira tickets help break down the work that needs to be done, which is helpful to AI systems.
- Notion and Google Docs should be added as well to provide meaningful product documentation.
- Designs from Figma provide visual direction.
- Sharing API docs is the best way to have an agent build a feature to spec.
- You can even include screenshots of other apps that you want to emulate and guide an agent to build something similar as a starting point.
As we continue, we’ll generate markdown docs that contain more valuable information: context about our system’s architecture, how features work, specific design choices that we’ve made, and anything that explains an important decision.
We don’t even have to write this documentation. We can tell an agent to read through our working sessions and write helpful documentation for us. Then all we have to do is review and edit the results afterwards.
Writing Your Description
Context and description go hand in hand. Your prompt is essentially your description plus your context: the description is a plain English articulation of your problem statement, while context is everything that leads up to that problem statement.
Most people prompt like this: “Build a rain alert feature.”
But you need to continue being detailed, describing your problem from every perspective involved. Very detailed.
I actually have a lot more to say, but it’s very hard to fit an hour’s worth of prompt writing on a slide. Now that we have a bunch of context, the problem we describe will make a lot more sense to an agent. If we say that we’re building a rain alert feature that notifies a user 15 minutes before rain starts in their location, the agent will understand why we’re building this feature and how we should approach the problem.
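To give you a flavor of the difference, an opening prompt might read more like this. It’s an illustrative sketch, not my actual plan:

“We’re adding rain alerts to our weather app. The app already shows users the weather for their location. We want to notify a user 15 minutes before rain starts where they are: not 5 minutes before, not 2 hours before. The server currently fetches weather on demand when the app opens, so we’ll need to poll for forecasts and build push notification infrastructure. Read the existing weather and location code first, then ask me clarifying questions before proposing a plan.”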
Over time, you get used to writing long introductory prompts and going back and forth with your agent to develop a plan. But if you have trouble with that, just ask ChatGPT to help you develop a better prompt. Don’t just ask your agent what to do. Give it a starting point and have it ask you probing questions to get you there.
Parallel Workflows
As you develop more advanced workflows, you can even start building multiple features at once. For our app, I can build the server’s notification architecture in one instance of Claude Code while building the client-side code in another tab. And in a third tab, I’m asking Cursor to provide step-by-step instructions for generating APNs certificates so I can finally start sending those push notifications.
This is why planning is so powerful. It’s worth the trade-off of spending an hour to document all this when that means an agentic system can do a whole day’s worth of work or even a week’s worth of coding with only minimal involvement from me.
Defining Success
We’ve described the problem that we want to solve, but what does “done” look like? You have to be specific with your goals. Otherwise, you’ll end up with unexpected results.
We’ve established that we want a user to receive a notification 15 minutes before it starts to rain with location-appropriate accuracy. Success is not just sending the notification. Success has multiple criteria:
Timeliness: It’s important that a user receives this notification 15 minutes before it starts to rain. Not 5 minutes before, not 2 hours before. That’s our success metric: properly alerting users at the right time.
Accuracy: If it’s going to rain in 15 minutes, we have to tell the user. If we don’t tell the user and they get rained on, then we haven’t succeeded.
Annoyance: A sufficiently smart AI would think, “Well, if I just send a user a push notification every 15 minutes, the user will never get rained on.” And that’s why it’s important to tell an agent what’s right and what is wrong.
Once we set success criteria, the agent will do whatever it can to succeed. It’s crucial to ensure that an agent knows what success looks like. That’s how we make non-deterministic systems more deterministic.
Edge Cases
And lastly, we have edge cases — because edge cases are always the last thing a developer thinks about.
I’m on stage right now. So, this is my location, but people move around. I could be on a bike. I could be on a train. I could be in a car. Or I could be walking at different speeds. Maybe I haven’t opened the app since yesterday and I’m in a completely different city.
Now, we can mitigate this edge case by making the server always send a silent push notification to synchronize a user’s location before deciding to send a rain alert notification.
This is exactly the kind of detail that we need to mention in our plan if we want an agent to build production-grade software. There are going to be countless other edge cases like this, which is true in our usual product development process as well.
Planning Is Thinking
The process that I’ve described isn’t just writing and documentation. It’s thinking.
If you began building a large production system without thinking ahead, I wouldn’t expect you to see good results. And if you don’t think ahead before asking an agentic system to complete a task, you shouldn’t expect good results either.
There are unending techniques, styles, and preferences for the process of constructing your plan. Lately, I’ve been working with Claude in a GitHub issue, so it has access to my code and the ability to generate draft pull requests while we’re planning. This is a nice trick that helps make sure that we’re similarly aligned about how the code should work since ultimately the code is what matters. But these practices change all the time, so you should experiment and find what works for you.
One Last Tip
One last tip for planning is to add a phrase like this to the end of your initial planning prompt:
“Before we begin, ask me any clarifying questions you need to understand the problem fully.”
This will help you work with AI to catch a lot of the assumptions that you’ve made, and make sure that you don’t miss any important context that an agent needs. You’ll find that there’s at least one thing you didn’t consider every time.
One More Thing
Speaking of one more thing, I have just one more thing to share.
I want to go back to earlier in the talk where I told you that agents don’t solve problems like we do.
- Hummingbirds can see colors that humans aren’t even able to imagine.
- Elephants communicate through the ground by producing and detecting seismic rumbles.
- Orb weaver spiders capture sound with their silk, using their webs to hear.
- Sharks can feel electric fields, and sea turtles cross oceans using the Earth’s magnetic field for navigation.
Why am I up here rattling off a bunch of animal facts after talking about AI for 30 minutes? Because the way that we perceive the world is grounded in our experiences. Humans have five senses, and it’s impossible for us to imagine what other senses would feel like. A shark probably doesn’t know that other animals don’t sense electric fields because it’s all the shark has ever known.
What if we begin to question our assumptions for how software should be built? Maybe it doesn’t have to be the way it is just because it’s the only way that we’ve ever known.
Final Thoughts
I can’t teach you everything about AI. I don’t know everything about AI. Instead, I came here for two reasons:
- I wanted to share the foundations of how AI systems work.
- I wanted to ask you to imagine that maybe your experience building software with AI isn’t complete. Neither is mine.
Behaviors of AI systems are so emergent that we’re all kind of figuring things out as we go. And while I don’t want you to abandon everything you know about software, I think it’s important to have epistemic humility — humility about a technology that’s constantly changing and that none of us had ever used three years ago.
Thank you all so much for listening.
Q&A
Q: If I had to choose an animal, I’m a dinosaur. And every once in a while when I’m stuck writing something, I’ll pull out paper and a pencil. I’m not afraid of AI, but I like writing code. What do you say to someone like me? Maybe my code isn’t as pretty as what I can get Claude to write, but I like what I do.
A: Yeah. I mean, I think you should write software how you want to write software. I’m not actually here to force anyone into this. I just want to show that there is definitely more possibility in the world than we imagine. And there’s this quote — telling someone that a love song already exists doesn’t stop them from writing another love song. Same thing for programming.
Q: Telling someone that a to-do list exists doesn’t mean they won’t build another one.
A: Exactly.
Q: Someone wants to know how do you judge if the context that you give an agent is working for or against the task? Is it just vibes or how do you know?
A: It’s all vibes, man. No. I think that it’s just like any skill. Someone asked me a couple days ago, how am I supposed to teach a junior developer how to code when they have AI? And it’s the same thing. They didn’t know how to code. They went through the same mistakes; a senior developer has probably forgotten that they too at one point didn’t know everything. And so you learn by iterating and trial and error and practicing, and then you see what techniques work and what don’t work — in the same way that when you were starting to program, you had to try things out. You’d go to Stack Overflow, search Google, and then YouTube came around. There are infinite ways to learn. There are infinite ways to continue getting better at this skill.
Q: How does someone become a senior developer without doing the things that we used to do as junior developers?
A: Yeah, I actually think that you still need to do them. That’s kind of the reason I said in this talk that this is a process of learning. This is not a substitute for learning. This is a different modality for learning. It’s a new way to learn. It’s a new way to build things. And it’s true that there are some people who use AI and they start to forget things. But if you want to be the best version of yourself that you can be, then it’s imperative for you to continue to learn. And I use AI in that way, to help me continue learning. But there are definitely people who use AI to stop thinking. And I don’t think that’s a good thing.