
The AI Productivity Tax Nobody Budgets For

Saul Mateos

I'm the Guy Who Tells Everyone to Adopt AI

I'm also the guy losing 40 hours a week to AI training, debugging, and context maintenance. Both things are true.

I have multiple Claude terminals running right now. One is working through our board prep. One is building a financial model. One is drafting a client update. That's not a brag. That's what month eight looks like after you've committed to this for real.

What I'm about to describe isn't a cautionary tale. AI adoption has been one of the best operational decisions I've made. But the story your vendor tells you, the one about 10x productivity and frictionless implementation, leaves out about 40 hours a week. This is what those hours look like.

The Wins Are Real

Let's start with what actually works, because the gains are not small.

Board prep used to take me six hours: pulling the latest numbers, reconciling variances, writing commentary, formatting slides, and running two or three revision loops with our managing partner. Now it takes 90 minutes. The AI handles the first draft of every section. I review, add the judgment layer, tighten the narrative. Done.

Weekly reporting dropped from four hours to about 45 minutes. I still review every output carefully; I'm not sending AI-generated numbers to clients without eyes on them. But the mechanical work of pulling, formatting, and table-building happens in the background while I'm doing something else.

Data pulls that used to require manual queries against multiple sources now surface as automated summaries every morning. I open my laptop and the overnight numbers are already organized, compared to the prior period, and flagged for anything unusual.

On paper, across a small team, we're saving 15 or more hours per week per person. That's real. It changes what's possible.

Here's the part the productivity charts don't capture.

The Tax Nobody Budgets For

The savings are real. The overhead is also real. And nobody in the AI vendor ecosystem has any incentive to tell you about the overhead.

Training overhead is relentless. Every new workflow needs to be explained. Every prompt needs testing, refinement, and periodic review as the underlying work evolves. When we built our board prep workflow, I spent roughly 12 hours getting the prompt architecture right. I tried different phrasings, validated outputs against past decks, caught the places where the AI made confident-sounding errors, and built in safeguards. Those 12 hours appear nowhere in the "six hours down to 90 minutes" math. They're buried.

And training isn't a one-time event. The business changes, so the context the AI works from has to change. New client, new reporting structure, new metrics the managing partner cares about. Someone has to update the reference documents, test whether the prompts still produce accurate outputs, and catch the places where the old context is now producing wrong answers. That someone is usually me.

Debugging loops eat more time than I expected. AI hallucinates. Not constantly, and not randomly, but enough that you cannot skip the review step. The scary version isn't when the AI is obviously wrong. It's when the AI is confidently, plausibly wrong. A number that's slightly off. A variance explained in a way that sounds reasonable but misses a key driver. A cash flow projection that uses a stale assumption because the reference file wasn't updated.

When we catch one of these, the loop starts. Reverse-engineer what the AI did. Find where the context was wrong or the prompt was ambiguous. Fix the prompt. Test again. Verify the fix didn't break something downstream. This is typically a one-hour exercise for a simple issue. For anything that touched a complex workflow, it can be a morning.
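For the finance leaders who ask what "test again" means in practice, here's a minimal sketch. The metric names, the draft format, and the tolerance below are illustrative, not our production setup; the point is that every fix gets re-checked against known-good figures before anything ships downstream.

```python
# Illustrative regression check for a reporting prompt. "draft" stands in
# for whatever the model produces; the harness itself is runnable as-is.

import re

# Known-good figures from a past deck, keyed by metric name.
# In practice these would come from the source-of-truth system, not a literal.
EXPECTED = {"revenue": 1_240_000, "gross_margin_pct": 61.2, "cash_balance": 890_000}

TOLERANCE = 0.005  # allow 0.5% drift for rounding differences

def extract_numbers(draft: str) -> dict:
    """Pull 'metric: value' pairs out of the AI draft (assumes a fixed format)."""
    pairs = re.findall(r"(\w+):\s*([\d.,]+)", draft)
    return {k: float(v.replace(",", "")) for k, v in pairs}

def check_draft(draft: str) -> list[str]:
    """Return every mismatch between the draft and the expected figures."""
    found = extract_numbers(draft)
    problems = []
    for metric, expected in EXPECTED.items():
        if metric not in found:
            problems.append(f"{metric}: missing from draft")
        elif abs(found[metric] - expected) > abs(expected) * TOLERANCE:
            problems.append(f"{metric}: got {found[metric]}, expected {expected}")
    return problems

# Stubbed draft standing in for the model's output.
draft = "revenue: 1,240,000\ngross_margin_pct: 61.2\ncash_balance: 912,000"
for issue in check_draft(draft):
    print("FAIL:", issue)  # cash_balance drifted past tolerance -> fails loudly
```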

Context management is a full-time job. AI is only as good as the context it has. That means someone has to maintain the knowledge base. Update the reference documents when the business changes. Verify that the prompts written three months ago still make sense given what's changed. Deprecate workflows that are no longer needed. Build new ones for problems that didn't exist last quarter.

This is not technical work, exactly. It's organizational work. It requires understanding both the AI's capabilities and the business deeply enough to know when the two are out of sync. On a small team, that role defaults to whoever implemented the system. That's me.

Integration maintenance is the silent killer. Our current stack has Claude connected to Notion for knowledge management, Notion feeding into our reporting workflows, the reporting workflows pulling from our database, and the output flowing into the client reporting layer. When it works, it's magic. When one piece breaks, everything downstream is wrong, and nobody necessarily knows it right away.

These integrations don't break on predictable schedules. They break when an API changes, when a third-party service updates their schema, when a prompt that worked under one model version produces different results under an updated one. They tend to break at the worst possible time. The Thursday at 4pm version of this is now a recurring joke on our team, because it happens more than I'd like to admit.
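One mitigation that has earned its keep: make staleness loud. The sketch below is illustrative (the stage names and thresholds are placeholders, and real timestamps would come from each tool's API, like Notion's last-edited time or a database query), but the idea is that a refresh halts at the first stale stage instead of quietly shipping wrong numbers to clients.

```python
# Illustrative freshness check for a chained pipeline. Stages and max-age
# thresholds are placeholders; wire real timestamps in from each tool's API.

from datetime import datetime, timedelta, timezone

now = datetime.now(timezone.utc)

# (stage name, last successful update, maximum acceptable age)
stages = [
    ("knowledge base (Notion)", now - timedelta(hours=3),  timedelta(hours=24)),
    ("reporting workflow",      now - timedelta(hours=30), timedelta(hours=24)),
    ("client reporting layer",  now - timedelta(hours=1),  timedelta(hours=24)),
]

def check_pipeline(stages):
    """Fail loudly at the FIRST stale stage: everything downstream of a
    broken stage is suspect, even if its own timestamp looks fresh."""
    for name, last_update, max_age in stages:
        age = now - last_update
        if age > max_age:
            raise RuntimeError(f"{name} is stale ({age} old); "
                               f"halting downstream refresh")
        print(f"OK: {name} updated {age} ago")

check_pipeline(stages)  # raises at "reporting workflow" before output ships
```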

The Adoption Curve Has Three Phases

Phase 1 is the hype period. Weeks one through four, roughly. Everything seems possible. You're measuring productivity gains and sending screenshots of the outputs to your partners. You're already planning the second and third use cases. The numbers look great because you're counting the outputs without counting the implementation time.

Phase 2 is the dip. This usually sets in around month two and can last through month six. The 40-hour tax shows up. The first major debugging loop happens. The prompt that worked perfectly for three weeks suddenly produces garbage because something in the context shifted. You realize the knowledge base needs constant maintenance. A team member who was excited in month one starts going back to their spreadsheets because the AI workflow "feels too complicated."

Most teams I've talked to abandon ship in Phase 2. They declare that "AI didn't work for us" and go back to the way things were. What they actually experienced was a normal adoption curve, and they got off the ride at the low point.

Phase 3 is the climb. This starts around month six if you stay in it. The gains start compounding. The team has enough experience with the tools that basic issues get handled without escalating to you. The prompts are dialed in. The workflows are stable. New implementations take less time because you've built a library of patterns that transfer across use cases. The 40-hour weekly tax starts declining.

The catch: you have to survive Phase 2 to get to Phase 3. And most teams don't, because they never budgeted for Phase 2 in the first place.

What the 40 Hours Actually Looks Like

This is the question I get from every finance leader who's heard the headline number. "Forty hours a week on what?"

It's not sitting in front of a computer typing prompts. It's distributed across the week in ways that are hard to see.

It's the 20 minutes in the morning reviewing the automated reporting output before it goes anywhere. It's the hour on Tuesday because the board prep AI produced a number that didn't match the source data and I had to trace where the disconnect happened. It's the 90-minute session on Thursday building a new workflow for the client analysis that used to be done manually. It's the Friday afternoon reviewing the knowledge base documents and updating three of them because we closed a new client and the context needs to reflect that.

Individually, none of these feel like "AI overhead." Collectively, they add up to a substantial chunk of the week. The difference is they're building something. The debugging makes the system more reliable. The context updates make the outputs more accurate. The new workflows compound over time.

Compare that to the hours they replaced. The six hours on board prep didn't compound. Doing it manually next month would take six hours again. The AI overhead is an investment. The manual work was pure expense.

That reframe doesn't make the 40 hours easy. But it changes whether you stick with it.

Month 8: Honest Update

I'm in the climb. The 40 hours is down to about 25.

The team can troubleshoot basic issues themselves now. When a prompt produces something unexpected, the first instinct is no longer "get Saul." It's "let me check the context file and see what changed." That's a real shift. It took about five months to get there.

The workflows that have been running for four-plus months are genuinely stable. I rarely touch them. New workflows still require the full implementation investment, but the build time has dropped because I have a library of patterns to pull from.

I'm still the AI tutor and debugger-in-chief for anything complex. That won't fully change until we have more people with the right combination of business context and AI fluency. Building that takes time.

The 15 hours of savings per person per week is real and holding. The system is net positive and getting more positive each month.

What to Budget Before You Start

If you're planning an AI implementation, here's what the honest budget looks like.

Budget a minimum of 20 hours per week of senior time for the first six months. Not junior time. The person maintaining AI context needs deep enough business knowledge to know when an AI output is wrong, not just technically wrong but strategically wrong. That's not a task you can delegate to someone who doesn't understand the business.

Assume the productivity gains don't show up on net until month four or five. The gross gains come quickly. The net gains (gross savings minus implementation overhead) take longer. If your budget model shows positive ROI in month one, adjust the model.
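If you want to see why, run the arithmetic. The sketch below reuses the numbers from this post and adds two labeled assumptions, a three-person team and a linear ramp and taper. With those, weekly net turns positive around month five, and cumulative payback comes later still.

```python
# Back-of-envelope ROI timing using this post's numbers plus two labeled
# assumptions: a three-person team and linear ramp/taper shapes. Adjust both.

WEEKS_PER_MONTH = 4
TEAM = 3                   # assumption: small team of three
FULL_GROSS = TEAM * 15     # 15 hours saved per person per week, at steady state

def gross_savings(month):  # assumption: gains ramp in over the first 5 months
    return min(FULL_GROSS, month * 10)

def overhead(month):       # 40 h/week through the dip, tapering to 25 by month 8
    return 40 if month <= 5 else max(25, 40 - 5 * (month - 5))

cumulative = 0
for m in range(1, 9):
    net_weekly = gross_savings(m) - overhead(m)
    cumulative += net_weekly * WEEKS_PER_MONTH
    print(f"month {m}: net {net_weekly:+} h/week, cumulative {cumulative:+} h")

# Net weekly turns positive around month 5; cumulative payback comes later.
# A budget model showing positive ROI in month one is missing the overhead row.
```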

Pick one workflow, not five. The temptation in Phase 1 is to automate everything immediately. The teams that survive Phase 2 are the ones that went deep on one workflow before expanding. Get one workflow fully stable, with a documented prompt architecture and a clear review process, before you add the next one.

Assign context ownership before you start. Someone needs to own the knowledge base and the prompt maintenance. Define who that is, what their responsibilities are, and roughly how much time per week you expect it to take. If you don't assign it, it defaults to whoever cares the most, which is usually whoever implemented it, which means it defaults to you.

The Both/And Is Actually True

AI is worth doing. The teams that figure this out will have a structural advantage for the next several years, and the gap will widen as the tools improve and the workflows mature.

AI adoption is also expensive. Not in licensing fees. In the time, attention, and cognitive load required to implement it correctly and maintain it over time.

Neither of those things cancels the other out. The productivity gains are real. The implementation tax is real. Budget for both, stay through Phase 2, and the math eventually works in your favor.

P.S. Three Claude terminals open as I finish this. Planning, executing, drafting simultaneously. The prompts are finally dialed in. Month eight looks nothing like month one. That's the point.

Want to talk about your finance function?

I spend 30 minutes with CFOs and finance leaders every week discussing how AI fits into their operations. No pitch, just a conversation.

Book a 30-Minute Conversation

or email us at hello@strategiq.so
