The Brain Is Outside the App

Last year I got really into weightlifting. I pushed myself too hard, too fast, and ended up blowing out my knee doing heavy squats like an idiot.

After nine months of recovery, I was excited to get back into lifting - but this time I wanted to do it correctly. So naturally, I spent a lot of time with ChatGPT learning the principles of weightlifting. I added the following points to my thread context:

  • 35-year-old male, returning to lifting after a knee injury from heavy squats last year
  • 45 minutes, 4x per week to train
  • Goal: maximize strength while feeling good and keeping injury risk low
  • Focus on the 80/20 power law principles
  • Program must be indefinitely sustainable

I learned a ton about lifting (probably worth its own post), but here are the highlights:

  • There’s a scale called Rate of Perceived Exertion (RPE) from 1 to 10, where 10 means lifting to failure, 9 means leaving one rep in the tank, 8 means leaving two, etc. Most of my training - especially at the beginning - should sit around RPE 7 (about three high-quality reps still in the tank).
  • I should focus my attention on the quality of each muscle contraction to develop my neuromuscular connection. Weightlifting is about the quality of the contraction - not the number on the bar.
  • For most lifts I should use a 3-1-1 tempo. For squats, that means three seconds down, a one-second hold at the bottom, and one second controlled up.

My whole conception of weightlifting was turned upside down. I realized I’d been mindlessly training at RPE 9–10 indefinitely, with almost no attention to form, tempo, or how my body actually felt. It’s no wonder I got injured. Motivated by everything I’d learned, I asked it to generate me a workout plan. It came back with deep explanations of each lift - form, breathing, and psychological tips - along with a specific breakdown of which exercises to do, how many sets and reps, how much weight, and what the expected RPE should be.

There were all kinds of interesting details, such as "If the first two sets feel RPE 6–7, complete the second two sets with 5 lb more weight." I could keep asking follow-up questions until I fully understood the intention behind each workout.
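Rules like that are simple enough to encode directly. Here's a minimal Python sketch of that auto-progression logic - the function name and the 0.5-RPE threshold are my own illustration, not anything ChatGPT prescribed:

```python
def next_set_weight(weight_lb: float, completed_rpes: list[float],
                    target_rpe: float = 7.0, increment_lb: float = 5.0) -> float:
    """Auto-progression rule: if the completed sets came in clearly below
    the target RPE, add a small increment for the remaining sets."""
    if not completed_rpes:
        return weight_lb
    avg = sum(completed_rpes) / len(completed_rpes)
    # e.g. first two sets at RPE 6-7 against a target of 7 -> add 5 lb
    if avg <= target_rpe - 0.5:
        return weight_lb + increment_lb
    return weight_lb
```

The agent applies this kind of rule in prose, but writing it out as code makes the intent unambiguous.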

For the next several weeks, I did the proposed workouts and kept notes on the questions and comments that came up while lifting. After each session, I'd paste those notes into the chat thread and use the updated context to generate the next ideal workout. It was the beginning of a real iteration loop between my workouts, my notes, and ChatGPT - just with a lot of manual glue at every step. It worked, but copying formatted text from a mobile app into a chat window while sweating in a gym rack got old fast. The loop was good enough that I decided to streamline it.

I found an app called Hevy, which had a really clean interface for creating workouts, logging them, and adding notes. It had a large library of lifts, the ability to specify RPE, and - most importantly - an open API.

I spun up codex-cli (a command-line coding agent that can run arbitrary bash commands and edit files; OpenAI’s response to Claude Code). I vibe-coded a set of CLI tools that used the Hevy API to read and write workouts. Next, I wrote my AGENTS.md file, which explained my lifting objectives and how to call the tools I had just built. Within an hour, I had a nice CLI-based UX with the following loop:

  1. Download all my workouts from Hevy
  2. Load the last three workouts into the context
  3. Answer the questions and respond to the comments from those workouts. Based on my performance, generate the ideal next workout and ask for my feedback. Keep iterating until I’m satisfied with the plan.
  4. Once I approve the workout, export it back to Hevy
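Steps 1 and 2 of that loop amount to one authenticated GET plus a sort. Here's a minimal Python sketch of what those CLI tools look like - the base URL and `api-key` header follow Hevy's public API as I understand it, but treat the exact endpoint paths and field names as assumptions to verify against their docs:

```python
import json
import urllib.request

HEVY_BASE = "https://api.hevyapp.com/v1"  # assumed base URL; verify against Hevy's API docs

def hevy_get(path: str, api_key: str) -> dict:
    """GET a Hevy API endpoint, authenticating with the api-key header."""
    req = urllib.request.Request(f"{HEVY_BASE}{path}", headers={"api-key": api_key})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def latest_workouts(workouts: list[dict], n: int = 3) -> list[dict]:
    """Pure helper: the n most recent workouts, assuming an ISO-8601
    'start_time' field so lexicographic order matches chronological order."""
    return sorted(workouts, key=lambda w: w["start_time"], reverse=True)[:n]

# Usage (steps 1 and 2), with assumed query parameters:
#   data = hevy_get("/workouts?page=1&pageSize=10", api_key)
#   recent = latest_workouts(data["workouts"])
```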

I’ve been running this loop for a couple weeks now and it’s been incredible. The Hevy frontend is really great: I can log sets and add specific comments and questions with as much detail as I want. On the backend, the agent uses all of those detailed logs and notes to generate the next workout. Best of all, I’ve been making steady progress, feel fantastic, and don’t feel at risk of injuring myself at all.

In and of itself, this has been a crazy experience. But what really fascinates me is what it points to more broadly: an early glimpse of how software and AI will work in the coming years.

Principle 1: The brain is outside the app

We’ve had more than a decade of apps. Our phones have hundreds of them. The UX within individual apps is generally pretty good, but the UX across apps is absolutely horrible. That’s not an accident. Product teams have spent years building walled UX gardens where data is meant to flow in, not out. Exposing clean, high-fidelity exports has been treated as a strategic risk - it makes it easier for users to leave. So while some companies offer integrations (e.g. send your workouts to Apple Health), they’re usually brittle and lossy: each side has a different, often incompatible data model, or simply doesn’t bother to send a complete payload. As a result, most of the time we live inside individual apps. I think this is really going to start to change.

Consider the lifting example above. In the pre-AI world, I would have spent almost all of my time in Hevy. Any questions I had about lifting would have to go to a different app (Google Search, YouTube, etc.). The friction to do this was so high that I usually didn’t bother.

In the AI world, Hevy is just a frontend. The data from Hevy is immediately extracted, centralized, and paired with rich context about who I am and what I’m trying to achieve. All of my interaction with that data happens through a conversational codex chat where I can ask arbitrary questions, get clarifications, and reshape the plan. Once I’m done, I simply export the resulting workout back into Hevy. The brain is outside the app.

Principle 2: Using codex to build its own tools

One of the most surreal parts of this setup is the stack of abstraction layers. I’m talking to codex in natural language, asking it to design and refine the CLI tools that talk to Hevy’s API. Then, in the very same thread, I’m asking it to call those tools and do something even more abstract: reason about my training history, answer my questions, and design the next workout.

The key shift is that the agent isn’t just using tools - it’s authoring and maintaining the tools based on the specific needs of the task at hand. In the old world, I’d open an editor, read the API docs, write a script, run it, fix the bugs, and slowly arrive at something usable. That something would be static. Iterating on it was (relatively) expensive and so software would be slow to improve.

Now I can say, “Build a cli command that fetches my last three workouts and prints the logs and notes in markdown format,” and codex will sketch the interface, write the code, run it, and wire it into the loop. If I don’t like the behavior, I just ask it to change the tool, and it does.
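The tool that prompt produces is mostly string formatting. A sketch of its core, with an invented field layout (Hevy's real export schema will differ):

```python
def workout_to_markdown(workout: dict) -> str:
    """Render one workout log (sets, RPE, notes) as markdown for the agent's
    context. Field names here are illustrative, not Hevy's exact schema."""
    lines = [f"## {workout.get('title', 'Workout')} ({workout.get('date', '?')})"]
    for ex in workout.get("exercises", []):
        lines.append(f"### {ex['name']}")
        for s in ex.get("sets", []):
            lines.append(f"- {s['weight_lb']} lb x {s['reps']} reps @ RPE {s.get('rpe', '?')}")
        if ex.get("notes"):
            lines.append(f"> {ex['notes']}")  # free-form comments become blockquotes
    return "\n".join(lines)
```

Markdown is a good interchange format here because the agent reads it natively and I can skim it myself.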

Because the agent can continuously edit its own tools, the boundary between “product features” and “one-off scripts” starts to dissolve. I can ask for super-specific behavior like, “Tag any sets where my knee felt weird, surface them in the summary, and bias future workouts away from those movements unless progress is stable,” and it can memorialize that logic into the tools and context. The tools become a living extension of the conversation, shaped continuously by what I ask for and how I respond, instead of a static feature-set defined in advance by a vendor.

Principle 3: It’s all about the iteration loop

Underneath all of this is something simple: an iteration loop. In the old world, my loop was bad. I would show up at the gym, push myself to RPE 9–10 in every set, maybe remember roughly what I’d lifted last time, and call it a day. I wasn’t systematically capturing what was happening in my body, what felt off, or which cues actually helped. The cost of turning raw experience into learning was so high that I mostly didn’t bother - and the result was predictable: I got stronger for a little bit and then blew out my knee.

A good iteration loop does three things:

  1. Capture exactly what actually happened
  2. Reflect on what it means (vs prior expectations)
  3. Turn that reflection into the next concrete (and optimal) action

The new setup does all three with almost no friction. Every workout, I log sets and RPE in Hevy, but I also add free-form notes: moments of hesitation, questions about form, small signals from my knee. The agent pulls all of that into context, reasons about it in light of my goals, and proposes a next workout that explicitly explains what changed and why. If something doesn’t feel right, I say so, and the loop updates again.

The result is a genuinely tight feedback cycle between my body, my notes, and the agent. I’m not just following a program, I’m actually learning how my body responds to different stresses and adjusting in real time. Progress feels smoother and more grounded, and the risk of injury feels much lower - not because I’m being more conservative, but because the iteration loop has turned me into a better lifter.

Principle 4: Context is everything

Every agent run lives inside a context. I like to think about that context in two layers: static context and dynamic context.

Static context is the canned information that gets loaded into every run or thread, no matter what tools are called. For me, that’s captured in AGENTS.md: who I am, what I’m optimizing for, how aggressively I want to push, how I think about risk vs reward, what kind of explanations motivate me, and the structure of the lifting loop itself.

Dynamic context is everything that moves. These are the tools and APIs that pull in my actual workouts, notes, and logs from Hevy. I can vibe-code these tools pretty quickly, but they’re only as good as the static context they’re plugged into. The tools say what happened and the static context encodes what I care about and how to respond.

Getting the static context right is the highest-leverage part of the whole system, because it guides every future agent run. My best iteration loops don’t just end with a better next workout - they end with an update to AGENTS.md so the learning is permanently baked into the static context.
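For concreteness, here's roughly the shape my AGENTS.md takes - an illustrative sketch rather than the real file, with `fetch_workouts` and `export_workout` standing in for whatever the CLI tools are actually named:

```markdown
# AGENTS.md (illustrative sketch)

## Who I am
- 35-year-old male, returning to lifting after a knee injury from heavy squats
- 45 minutes, 4x per week to train

## Objectives
- Maximize strength; keep injury risk low; indefinitely sustainable
- Most sets around RPE 7; default 3-1-1 tempo; contraction quality over bar weight

## The loop
1. Run `fetch_workouts` to pull the last three sessions from Hevy
2. Answer my logged questions, then propose the next workout and iterate until I approve
3. Run `export_workout` to push the approved plan back to Hevy
4. If we learned something durable, propose an edit to this file
```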

Principle 5: Advait’s Super Custom Lifting Agent™

The more I’ve refined this setup, the more obvious it feels that no off-the-shelf product can catch up to it. Advait’s Super Custom Lifting Agent™ is built from the inside out: it knows my injury history, my risk tolerance, my weird psychological quirks, and the exact iteration loop I actually stick to. It’s not a generic “fitness solution”. It’s a bespoke workflow that grew out of my own behavior and keeps adapting with every session.

Commercial products have to design for large segments and lowest-common-denominator workflows. My agent doesn’t. It can afford to be hyper-specific and opinionated because the only user it has to please is me. If I decide I care more about long-term joint health than short-term PRs, that shift shows up directly in AGENTS.md and in the tools it calls. Within a day, the entire system is behaving differently.

Once you get used to this level of fit, it’s hard to imagine going back to vendor-defined experiences. And this is just the starting point. Next, I can wire in my runs from Strava, or even pull in scanned blood test results and let the agent reason across all of it. There’s effectively zero chance any single vendor is going to integrate this deeply with the random constellation of tools I use.

Principle 6: Opening the walled gardens

For decades, software vendors have treated data leaving their products as a threat. The whole game was to build a sticky product, keep data inside the walls, and make switching costs as high as possible. Open APIs and clean exports were considered nice-to-have marketing bullet points at best.

The agentic world flips that logic. The more accessible an app’s data and actions are to my personal agent, the more valuable that app becomes to me. I didn’t pick Hevy because it had the flashiest workout UI. I picked it because it had a solid data model and an open API that I could plug directly into my lifting agent. As a result, Hevy is now 100x more valuable than any closed competitor, because it’s wired straight into my context, tools, and iteration loop.

This goes beyond just having an API. Apps can go a step further and ship an llms.txt file that explains the app, its data model, and its key APIs in a format that’s easy for agents to consume. Instead of forcing every agent to reverse-engineer the product from scattered docs and UI flows, llms.txt gives them a clean, LLM-friendly map of how to work with it.
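There's no official llms.txt for Hevy that I know of, but following the llms.txt convention (a markdown file at the site root: a title, a one-line summary, then link sections), a hypothetical one might look like this - the URLs are placeholders:

```markdown
# Hevy

> Workout tracker with an open REST API for reading and writing workouts,
> routines, and exercise templates.

## API
- [Getting started](https://example.com/docs): authenticate every request with an `api-key` header
- [Workouts](https://example.com/docs/workouts): paginated read/write access to logged workouts

## Data model
- A workout contains exercises; each exercise contains sets with weight, reps, and optional RPE
```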

In this world, the vendors that make it trivial for agents to interface with their products on behalf of their users will become instantly more valuable than those who don’t. The best apps will look less like walled gardens and more like high-quality frontends on top of deeply interoperable, well-documented data and action surfaces.

Principle 7: Every business will grow its own agents

What I’m doing with lifting is just a tiny, personal version of what every business will end up doing for their own loops. Instead of buying monolithic products and trying to contort workflows around them, companies will build their own agents that sit at the center and orchestrate their loops.

These agents will have rich static context: how the business makes money, what products and services it offers, how its customers behave, and what good looks like in each function. They will wire up dynamic context via tools that read and write data across CRMs, ticketing systems, data warehouses, financial systems, and internal docs. The result is the same pattern as my lifting setup: custom static context, bespoke tools, and a tight iteration loop that reflects how the organization actually operates.

In that world, vendors don’t disappear, but their role shifts. Businesses will still buy best-in-class apps for specific surfaces (sales UI, support UI, analytics, etc.), but those apps will be treated as frontends with data stores that are plugged into the company’s own agents. Operational intelligence and decision-making will live in the agents and their context, not in any single vendor’s product.

What’s next?

In some sense, nothing here is that complicated. I hurt my knee because my iteration loop was bad. I wasn’t paying attention, I wasn’t learning from my own data, and I outsourced “what should I do next?” to vibes. Many businesses operate the same way. All kinds of incredible ground-truth information flows through these businesses every day in the form of tickets, emails, dashboards, meeting notes, and more. Very little of this information is actually used to change how the organization thinks or behaves. The iteration loop is incredibly slow and leaky.

I’m really excited to see the next generation of businesses adopt the agent-first mindset to tighten their iteration loops. It feels like an order-of-magnitude improvement is within reach. The way we think about business operations will change entirely. Best of all, it’s a positive-sum game: businesses will operate more efficiently and effectively, employees will generate more value with higher-leverage tools, customers will benefit from lower costs and better service, and the world will be a fundamentally better place. Very exciting times ahead.