Step One to Using AI Well: Stop Chatting with AI

By 2026, AI has become widespread. Companies are all-in on it. Meta even blocked out an entire week for mandatory AI training. But here's what I keep noticing: most people, including heavy users, are interacting with AI the same way they did two years ago. They open a chat window, type a question, wait for an answer. The only difference is they've swapped GPT-4o for GPT-5.2 or Doubao, or upgraded from free to Pro.

That's better than not using AI at all, but it's nowhere close to optimal. I'm convinced, and I'll show you evidence below, that the productivity gap between using AI and using AI well isn't 30%. It's an order of magnitude. Just because you use ChatGPT, even heavily, doesn't mean you've joined some AI-native vanguard where you can sit back and relax. Most people are using AI like someone who got a car but still drives horse-carriage routes: same roads, same speed, just a different engine. The real gap comes down to whether your way of working actually matches how AI is capable of operating.

Here's a recent real example from my own work. I needed to improve an algorithm. From the initial meeting to map out the direction, through analyzing failure cases, to implementing the fix and verifying results, AI (specifically Cursor) ran autonomously for about 45 minutes. It completed the full loop on its own: design, implement, test, find issues, diagnose, fix, verify again. Every failing case was resolved. My role throughout was to set the direction and review the outcome. Doing the same thing in ChatGPT would conservatively take five to ten times longer. Where does that 10x gap actually come from? I'll explain the why first, then walk through the complete example to show the how.

Why the Chat Window Is a Ceiling

Starting around late 2024, a new category of AI tools emerged: Cursor, Claude Code, Codex. On the surface they look like coding tools, but they represent a fundamentally different way of using AI compared to ChatGPT. A lot of people assume they're just ChatGPT for programmers, but my experience is that they're useful for almost all knowledge work. The difference plays out on three levels.

Level 1: The feedback loop. You ask ChatGPT to write some Python. It writes it. You copy it to your IDE, run it, it errors. You paste the error back, it gives you a revision, you run it again, still wrong, you paste again. In this cycle, you become the human errand runner in the feedback loop: AI produces, you test, you shuttle information back and forth, AI revises. You've gone from the person directing AI to the person doing the legwork.

The core difference with Cursor is that it's connected to your execution environment. It writes code and runs it directly. Sees an error, fixes it, runs it again. The loop is AI-driven. This turns AI from a consultant who gives advice and walks away into an employee who can work independently. The consultant says their piece and leaves, with no idea if it was right and no accountability. The employee validates their own work and fixes problems when they find them.

This is also why a lot of people think AI is unreliable: they've been using open-loop AI that fails and doesn't know it. Give it a closed loop, and reliability improves dramatically.

Level 2: Context supply. The bottleneck on AI output quality, much of the time, isn't how smart the model is. It's how much relevant context the model can see. Same model, enough context: correct result. Same model, guessing blind: it fills in the gaps with something that might be completely wrong.

A reader recently commented: between Deep Research from the major AI providers versus plugging a search API into a local tool, which is better? My answer: I haven't opened Deep Research in months. The search quality isn't the issue. It's just too limited in what it can actually solve. Say I want to compare two algorithms for my specific use case at work. "My use case" requires careful description, because it directly determines what dimensions matter for the comparison: what my data looks like, whether I care about latency or accuracy, what the deployment constraints are. With Deep Research, I have to spend a lot of time explaining all that background. In Cursor, I just @ a few internal docs and meeting notes, and AI immediately has all the context. Even if the search capability is slightly weaker, the results are more relevant and the whole thing is faster.

ChatGPT's bottleneck is often context supply: it's hard to feed it enough information. Cursor-style tools solve exactly that problem.

Level 3: Asset accumulation. ChatGPT's usage pattern is consumptive. You put in time, you get an answer, the answer gets used, then it's gone. Every conversation starts from zero. Cursor is investment-style. You needed an internal doc? Save it to the project folder. AI keeps making the same mistake? Spend two minutes writing a rule. Your team has conventions everyone follows? Write them down so AI knows too. Each of these is a one-time investment with compounding returns.

Over time, this creates a flywheel: the more you use it, the more you've accumulated, the better AI understands your project, your preferences, your working style. ChatGPT is always a stranger who needs a full briefing every time. Cursor becomes a collaborator who gets more in sync with you over time. One resets to zero; the other compounds.

These three levels, feedback loop, context, and asset accumulation, are why that 45-minute example was possible. But knowing the reason isn't enough. What matters is how to actually make this work day to day. The full example below shows that.

Three Tiers: A Complete Example

Before walking through it, let me introduce a framework I've developed through practice, which I call the Three Tiers. Every step in your work produces information. How you handle that information determines how much AI can help you. The Bad tier: information disappears (neither you nor AI can see it later). The Better tier: information gets recorded in a human-readable format (human-friendly, AI-unfriendly). The Best tier: information gets stored AI-first, then made human-readable. I'll apply this framework to every step below.

Step 1: The meeting. The team's weekly sync, where we discussed cases where an algorithm was failing on certain data and brainstormed hypotheses and improvement ideas.

Bad tier: meeting ends, nothing is captured. Better tier: write up a Google Doc with meeting notes. This is already a solid practice. It increases your visibility, your colleagues know what happened, and it's easy to reference later. But AI can't easily access this content: Google Docs require login, the format is messy, and every time you want AI to reference it you have to manually copy and paste. Better tier is human-friendly and AI-unfriendly.

Best tier: use Zoom AI Companion or a similar tool to auto-transcribe the meeting, save it as a .md file, put it in a meeting_notes directory inside your work folder. Time cost is nearly zero, but AI can now directly reference every detail from that meeting going forward.

Step 2: Analyzing the data. I needed to look at how the algorithm performed across different inputs, and document the specific failure scenarios and their causes. Same Three Tiers logic: Bad tier is jotting a few URLs in a sticky note and clicking through them when you need to show someone. Better tier is writing it up in Confluence. Best tier is creating an analysis_notes.md in your work folder with each failure case's link, failure reason, and observations.

Worth noting: the Best tier in these two steps takes about as much time as the Better tier, sometimes less, because .md formatting is far simpler than Confluence, and you can have AI help you organize it.

Step 3: Writing code to improve the algorithm. This is where the Best tier really shows its value. Because all the information from the first two steps lives in the same folder, I open Cursor, @ the meeting notes, @ the analysis notes, and tell AI: based on this, design an improvement and implement it, then verify that the failing cases are fixed.

Look at how complete the context is that AI has at this point. It knows why the algorithm needs to change. It has improvement ideas (the meeting notes have that discussion). It knows the specific failure patterns and their causes (the analysis notes have that). It knows the success criteria (which cases need to be fixed). That last piece is the most critical. A lot of people tell AI what to do but skip what "done" looks like. It's like a race with no finish line: AI runs by feel, you judge by feel. But give AI a clear finish line (all these failing cases must pass), and it can run the entire loop from design to implementation to verification on its own: write code, run tests, find problems, diagnose, fix, verify again. That's what happened in those 45 minutes.

(What's going on behind the scenes is actually more complex than it sounds: AI automatically broke the task into subtasks, scheduled multiple agents to work in parallel, with the main agent handling design and review while sub-agents handled coding and testing. But that's a more advanced topic.)

What if you did this same thing in ChatGPT? You'd have to manually paste in every piece of context. Maybe you paste the meeting notes as background, then open another chat for the code changes, copying back and forth between your Python environment and the chat window constantly. Beyond the inefficiency, this approach lacks any self-correction ability. You have to review every intermediate result yourself, decide where things went wrong, and manually feed that feedback back in. The hassle is secondary; the bigger cost is all the detours. AI can skim a thousand lines of logs and identify the problem in seconds. A human needs specialized visualization tools just to see what's happening. That's where the 10x gap comes from: on one side, information fully connected, loop automated; on the other, information siloed, loop driven by hand.

Step 4: Writing documentation and preparing the presentation. Because all the analysis, code, and results are in the same folder, I have AI generate a technical document directly from that content, then paste it to Confluence.

Notice the order: generate in Cursor first, then copy to Confluence. AI first, then humans. This reversal is actually the deepest mindset shift in the entire workflow. The traditional approach is human-first: I write the document, then maybe have AI polish it. The Best tier is AI-first: information lives in a format AI can consume (.md files), AI does the main work (generates the document), and only then does it get converted to a human-readable form (Confluence page). The result is less time spent, higher quality output, and the AI-consumable source material stays in your folder for future use.

From the meeting to finished documentation, the whole thing took half a day.

When you handle every step with the Best tier, all information converges in the same folder, forming what I called the Mono Repo pattern in a previous post. AI can naturally access all the context across every topic. At that point, AI's capability takes a noticeable leap, because it finally has access to your complete information map. Think back over your work last week. How many steps were Bad tier? How many were Better tier? If most of your answers are Bad and Better, that's the gap between where you are and 10x productivity.

Stepping back and looking at this workflow, there's a fundamental shift: in the traditional model, the human is the primary executor and AI is the assistant. In this workflow, it's reversed. AI is the primary executor; the human's role is to set direction, define success criteria, and make judgment calls. Put it another way: our conception of AI should upgrade from have AI help me write code to have AI help me solve problems. Writing code is just one piece of solving problems. If you give AI enough context and a clear definition of success, it can complete the entire loop independently, and your role becomes the one who sets the problem. Your value lies in knowing which direction the algorithm should go, and knowing what a successful result looks like. That kind of judgment is your core capability as a professional, and it's exactly what AI depends on you to provide.

This applies to every profession. Engineer, data analyst, product manager, researcher. If your work involves gathering, analyzing, deciding, and producing information, the Three Tiers apply, and the value of a feedback loop is there. The only difference is whether the loop AI runs for you involves writing code, doing analysis, writing documents, or something else.

Getting Started

The tools will change. Today it's Cursor and Claude Code; tomorrow it'll be something else. But three things are durable: a feedback loop that lets AI correct itself, context supply that lets AI understand your world, and asset accumulation that makes your collaboration with AI more efficient over time. These are the underlying principles, independent of any specific tool.

If you do one thing today, here's my suggestion: find a project you're currently working on, create a folder, and spend 30 minutes copying all the relevant documents, notes, and meeting records into it. Then, even for work you'd normally turn to ChatGPT for, resist that impulse, open Cursor instead, and start your next conversation with AI from there. You'll feel the difference immediately. Start now.