Why the Org OS Cannot Run AI

V02 Agent Is Not the Default Answer

This is Part 2 of the Human in the Loop series.

It is not that AI cannot run. It is the company that cannot run.

Last piece, I said AI is not a new tool. It is a new division of labor.

This piece has to ask the harder question.

Why does the new division of labor, the moment it enters a company, jam up so many organizations?

On the surface, the jam looks different from company to company.

Cash-rich companies bought the tools, opened the accounts, granted the tokens, ran the training. Then the leadership team sits down and finds they cannot say whether the next step is layoffs, redeployment, more output, or higher quality.

Cash-poor companies cannot afford the big system. They tell their people to learn AI on their own. Six months later, everyone has tried a stack of tools, the group chat looks lively, and the operating numbers have not really moved.

One is jammed on "how do we use all this resource."

The other is jammed on "how do we even start with so little."

They look opposite.

Underneath, it is the same problem.

The old Org OS cannot run the new division of labor.

Org OS is not some fancy term.

It is the stuff a company quietly runs on every day. How roles are defined. How workflows move. Where knowledge lives. Who is on the hook for a piece of judgment. How errors get reviewed and rolled back.

Once AI is in, tasks get split. Judgment gets split. Knowledge gets called up in new ways.

A lot of companies are still using the old roles, the old workflows, the old performance reviews, the old approval chains to catch all of that.

The model is not failing to run.

It is the operating system inside the company that did not get rewritten.

So "the Org OS cannot run" is not a technical fault.

It is an organizational compatibility fault.

What a cash-rich company really fears is not the missing tool. It is money covering up the problem.

The problem in cash-rich companies is rarely "we are not on AI."

They can approve budget. They can buy seats. They can hand out tokens. They can hire vendors. They can train everyone.

Once those motions roll out, the organization quickly produces a surface prosperity.

Tools, in.

Training, in.

Pilots, in.

Reporting decks, in.

It looks advanced.

The real vacuum opens up at the next step.

I have seen a lot of conference tables where the jam is not "we do not believe AI is useful." It is that no one at the table has first cleared the ledger.

If one team frees up 200 hours a week, does that count as cost savings, faster customer response, higher quality, or a budget for new business experiments?

Until that question gets answered, every other department will interpret the value of AI in its own language.

Finance will see cost.

The business will see delivery.

HR will see roles.

IT will see system stability.

Every department is right.

The company has no shared answer.

Once AI makes a chunk of work faster, what is the company going to do with the capacity it just freed? Layoffs are one option. They are not the only one. Redeploying people, producing more, raising quality, shortening response time, opening new lines of business: those are options too.

The catch is that each of these options sits on a completely different organizational logic.

Layoffs require the company to know which work has actually been replaced versus which has just been reshuffled. Redeployment requires the company to know who can move into higher-value work. More output requires a workflow that can absorb it. Higher quality requires a new acceptance standard. Innovation requires a leadership team willing to point freed-up time at uncertain directions.

If these are not discussed up front, once AI goes live, leadership ends up making gut calls under pressure.

This is the decision vacuum inside a cash-rich company.

The tool is already in.

The capacity-allocation mechanism is not.

The company knows AI makes some things faster. It does not know who owns the speed, what to count it as, or how to allocate it.

At this point, staring at usage rates does not help much.

What matters is not how many times an employee opened the AI.

It is whether the organization has put the freed-up time, judgment, and knowledge back into the operating ledger.

The easiest miscalculation in a cash-rich company is converting the value of AI straight into "how many people we cut."

That math is too coarse.

Cutting headcount is one outcome. It is not even the first one worth asking about.

The better question is this.

Should the freed capacity go back to profit, back to customer experience, or back to organizational learning?

Back to profit, the motion goes toward cost down.

Back to customer experience, the motion goes toward faster response and higher quality.

Back to organizational learning, the motion goes toward product experiments, workflow rewrites, and people moving into new roles.

All three are reasonable.

The organizational design behind each is completely different.

If the CEO does not settle this first, the AI project will get pulled by every department toward its own KPI.

So what cash-rich companies really lack is not AI investment.

It is a decision mechanism for reallocating capacity.
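To make "a decision mechanism for reallocating capacity" concrete, here is a minimal sketch of what settling the question up front could look like. It is an illustration, not a method any company in this piece uses. The names `AllocationPolicy` and `allocate_freed_hours` are hypothetical, the 200 hours a week comes from the example earlier in this section, and the 50/30/20 split is an assumption picked only to show the arithmetic.

```python
from dataclasses import dataclass

# Hypothetical destinations, named after the three options in the text.
DESTINATIONS = ("profit", "customer_experience", "organizational_learning")


@dataclass(frozen=True)
class AllocationPolicy:
    """Share of freed capacity sent to each destination; shares must sum to 1.0."""
    profit: float
    customer_experience: float
    organizational_learning: float

    def shares(self) -> dict[str, float]:
        return {name: getattr(self, name) for name in DESTINATIONS}


def allocate_freed_hours(freed_hours: float, policy: AllocationPolicy) -> dict[str, float]:
    """Split freed hours across the destinations according to the policy."""
    shares = policy.shares()
    if any(share < 0 for share in shares.values()):
        raise ValueError("shares must be non-negative")
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1.0")
    return {name: freed_hours * share for name, share in shares.items()}


# Illustrative numbers only: the split below is an assumption, not a recommendation.
policy = AllocationPolicy(profit=0.5, customer_experience=0.3, organizational_learning=0.2)
weekly_plan = allocate_freed_hours(200, policy)

for destination, hours in weekly_plan.items():
    print(f"{destination}: {hours:.0f} hours/week")
```

The code is trivial on purpose. The hard part is that someone has to write the three shares down and own them before the tool goes live, so that finance, the business, HR, and IT are reading the same ledger.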

Klarna is not a punchline. It is an engineering blueprint.

Klarna deserves its own dissection.

Without it, "the Org OS cannot run" can easily sound like methodology.

Dissect it, and you see it is not an AI-fail meme.

It is a blueprint.

On February 27, 2024, OpenAI published the Klarna case. In that moment, it was almost the cleanest poster for the AI customer-service replacement story. The AI assistant, in its first month, handled 2.3 million conversations, roughly two-thirds of Klarna's customer-service chat volume. Workload equivalent to 700 full-time agents. Average resolution time dropped from about 11 minutes to under 2. Projected profit improvement of $40 million for Klarna in 2024.

That set of numbers was perfect for a board deck.

It satisfied three appetites at once. Finance saw cost. The business saw response time. The CEO saw an efficiency story he could tell capital markets. On top of that, customer satisfaction did not collapse, at least not at the time. The OpenAI page said the AI assistant matched human agents on satisfaction, with higher accuracy and a 25% drop in repeat inquiries.

Stop reading there and you get a very clean conclusion.

Standardized customer service is ready for large-scale replacement.

That conclusion only saw the first stretch of the road.

By May 2025, Bloomberg reported that Klarna had started re-hiring human agents. CEO Sebastian Siemiatkowski admitted that cost considerations had weighed too heavily in the organizational decision, and the result was lower quality. That interview was not a technical bug report. It was a signal that the organization was reversing course. The company discovered that AI being able to handle a lot of conversations does not mean the company can hand service quality, complex emotion, exception judgment, and brand trust to the same automation logic at the same time.

This is the fourteen-month full cycle I keep pointing at.

From February 2024 — "AI equivalent to 700 agents, projected $40M profit improvement" — to May 2025 — "customers must always be able to reach a human, company is re-staffing." Klarna did not walk a failure story. It walked a complete organizational experiment curve. The replacement narrative held. The cost-benefit held. The speed metrics held. And then quality, accountability, and exception handling came back to bite.

The easiest misread here is treating Klarna as evidence that "AI does not work."

It is not. Klarna's later filings actually keep saying the AI assistant brought ~$39 million of cost savings in 2024 and another ~$59 million in 2025. In 2025 it handled 80% of customer-service chats, workload equivalent to more than 850 full-time agents. The AI line did not get killed. It kept running.

What got killed was the single-line algorithm of "organize AI purely on cost replacement."

The pressure this puts on a CEO sits here. You cannot only ask how many people AI replaced, how much money it saved, or how much average response time dropped. You also have to ask which questions AI can clear on its own, which questions must escalate to a human, when customers must have a human exit, who holds the adjudication authority on complex cases, who defines the signal that quality is dropping, and who pulls the trigger on the re-hiring threshold.

Not one of these is a model-parameter question.

They are all responsibility-chain, quality-boundary, and human-machine division questions.

Klarna's fourteen months took "replacement" out of the pretty numbers and back onto the operating ground. Stage one, AI proves it can run volume. Stage two, the organization discovers that after volume comes quality. Stage three, the company starts putting humans back into the loop. The moment that sequence shows up, human in the loop (HITL) stops being a conservative slogan and becomes an organizational necessity at the deep end of the replacement path.

So Klarna's value is not "do not use AI in customer service."

What it actually tells you is this. If your organization can only evaluate AI on a cost lens, then the more successful AI is, the more likely the organization will, in the next stage, expose holes in quality and responsibility. Replacement is not the endpoint. Replacement just lights up the cracks in the old OS more clearly.

The most dangerous place for a cash-rich company sits right here.

It has the budget, the vendor, the tool, the pilot, the shiny numbers. So everyone slips into thinking the stage-one metric is the answer for the full cycle. Klarna's fourteen months are there to remind the CEO: the prettier the stage-one number, the harder you should be pressing on who carries stage two.

$40M saved in a year. Fourteen months later, humans put back into the loop.

That turn is not a punchline.

It is a blueprint for the person in the corner office.

What a cash-poor company really fears is not the missing budget. It is mistaking activity for progress.

The problem in cash-poor companies is more hidden.

No big budget. No dedicated team. No way to roll out a system in one shot.

So the most common motion is:

The boss calls on everyone to embrace AI.

Leadership organizes learning sessions.

Employees try tools on their own.

Departments share cases.

These motions are not wrong.

They just tend to stop at the posture layer.

A salesperson uses AI to write a few emails. An HR person edits a few JDs. An ops person makes a few images. These are individual productivity gains. They do not necessarily move organizational capability at the company level.

What the boss should actually be asking is not "is everyone learning AI."

It is:

Which part of the operating chain got shorter because of AI?

Which wait got removed?

Which kind of knowledge moved out of one person's head and became an asset the organization can call on?

If these questions cannot be answered, AI will not show up in the P&L.

A cash-poor company should not start with a big system. It should not treat "everyone is learning" as the result either.

It should start by connecting AI to the operating ledger.

Which work eats the most time. Which workflows wait the longest. Which knowledge depends most on senior employees.

Without that audit, AI stays at the individual-tool layer.

The cognitive vacuum in a cash-poor company is not that the boss does not care about AI.

It is that the boss has not yet translated AI from an attitude question into an operating question.

The easiest misread in this kind of company is treating "learning" as progress.

The employees did learn.

Leadership did meet.

The group chat did pass around a lot of tool links.

But if no business owner can say cleanly: this month, this workflow waited two fewer days; this category of material no longer has to be checked back with senior staff; this delivery used to take three people coordinating and now takes one person plus AI — then AI is still standing outside the organization.

A cash-poor company has to hold itself back more.

It cannot copy the big-company move of building a platform, buying a system, and forming an AI transformation office on day one.

It has to find the smallest operating incision.

A delivery workflow that always gets rework. A business judgment that new hires can never quite pick up. A cross-department wait point the boss has to personally coordinate every week.

Only when those points get compressed does AI start turning from a tool into organizational capability.

This is why I prefer to call a cash-poor company's first step "running the ledger."

Not "transformation."

Five layers of breakpoints. They do not fail in order.

Break the Org OS open and what cannot run is five layers of breakpoints.

First, roles.

Job descriptions still say one complete person is doing one complete thing. AI has already split that thing into machine recognition, machine generation, human review, exception handling, and final accountability. Leave the role description untouched and people get squeezed between old responsibilities and new tools.

Second, workflows.

The workflow diagram still assumes every step is a human step. AI has already slid into the generation, retrieval, judgment, and recommendation slots. Leave the workflow unchanged and AI runs in a grey zone. When something breaks, only then does everyone discover the workflow never wrote down which steps can pass automatically and which steps must stop for a human.

Third, knowledge.

A lot of companies think they have a knowledge base. What they actually have is a pile of documents. A document library can store files. It cannot explain on its own which role, in which scenario, should call which knowledge — and it has a hard time letting the judgments that came out of that call get reviewed afterward.

Fourth, responsibility.

Once an AI suggestion gets adopted, whose responsibility is it? The tool user, the workflow owner, the business owner, the vendor?

Leave the responsibility chain unwritten, and the deeper AI goes into the workflow, the more responsibility blurs.

Fifth, governance.

Permissions, audit trails, logging, rollback. If they do not enter daily operations, AI ends up running on individual conscience. Individual conscience can hold a small pilot. It cannot hold organization-scale operation.
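The workflow, responsibility, and governance layers can be written down in a few lines. The sketch below shows one way to state which steps pass automatically and which must stop for a human, while recording the reasons and an accountable owner. Everything here is hypothetical: the `Step` flags, the stop conditions, and the owner field are assumptions for illustration, not a description of any real system.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTO_PASS = "auto_pass"    # AI output goes through without review
    HUMAN_STOP = "human_stop"  # a person must decide before anything ships


@dataclass(frozen=True)
class Step:
    name: str
    ai_has_answer: bool     # hypothetical flag: the AI produced a usable answer
    touches_money: bool     # refunds, credit, pricing
    customer_unhappy: bool  # complaint or negative signal detected upstream


@dataclass(frozen=True)
class Decision:
    step: str
    route: Route
    reasons: tuple[str, ...]  # kept so the call can be reviewed afterward
    accountable_owner: str    # a named owner, not "the tool"


def route_step(step: Step, owner: str) -> Decision:
    """Apply explicit stop conditions instead of leaving them to habit."""
    reasons = []
    if not step.ai_has_answer:
        reasons.append("ai_has_no_answer")
    if step.touches_money:
        reasons.append("touches_money")
    if step.customer_unhappy:
        reasons.append("customer_unhappy")
    route = Route.HUMAN_STOP if reasons else Route.AUTO_PASS
    return Decision(step.name, route, tuple(reasons), owner)


# Invented example steps.
faq = route_step(Step("delivery status question", True, False, False), owner="support lead")
refund = route_step(Step("refund dispute", True, True, True), owner="support lead")

print(faq.route.value, faq.reasons)
print(refund.route.value, refund.reasons)
```

The point is not the code. It is that each stop condition and each owner is something leadership has to decide on purpose, before go-live rather than after the first incident.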

Put these five layers back onto Klarna and it gets clearer.

Role layer. Customer service is no longer "the person who answers questions." It is split into basic Q&A, complex judgment, emotional handling, refund adjudication, brand-trust maintenance. AI can carry the first half. Someone has to catch the second.

Workflow layer. When AI cannot answer, when the customer is unhappy, when the case touches money and credit, the escalation path has to be written down up front. Not figured out after the complaint.

Knowledge layer. An AI assistant handling high-frequency questions does not mean the organization knows which knowledge has been structured and which judgment still leans on humans. After high-frequency knowledge gets automated, what is left is harder, fuzzier, more experience-bound.

Responsibility layer is the hardest.

AI made a suggestion. The customer accepted the result. Later the experience drops. Who answers for that: the model, the customer-service team, the cost decision, or the CEO?

When Sebastian Siemiatkowski admits in a media interview that cost weighed too heavily, the responsibility chain has just been pushed back to the corner office.

Governance layer decides whether any of this can be caught early.

What is the signal that quality is dropping? Repeat inquiries, complaints, refunds, brand negatives, human-escalation rate — which one trips the rollback? Without those thresholds, "AI customer service" can only prove itself with short-term cost numbers.
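A threshold only counts once it is written down. Below is a minimal sketch of what "which one trips the rollback" could look like as a check. The signal names follow the list above. Every limit is a made-up placeholder, and `breached_signals` is a hypothetical helper, not anything Klarna has published.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    signal: str
    limit: float  # a rollback review trips when the observed value exceeds this


# Signal names follow the text; every limit is a placeholder assumption.
THRESHOLDS = [
    Threshold("repeat_inquiry_rate", 0.15),
    Threshold("complaint_rate", 0.02),
    Threshold("refund_rate", 0.05),
    Threshold("human_escalation_rate", 0.30),
]


def breached_signals(observed: dict[str, float]) -> list[str]:
    """Return the signals whose observed value exceeds its limit.

    A signal missing from `observed` is reported as well: an unmeasured
    signal cannot catch anything early.
    """
    breached = []
    for threshold in THRESHOLDS:
        value = observed.get(threshold.signal)
        if value is None or value > threshold.limit:
            breached.append(threshold.signal)
    return breached


# Invented weekly readings; refund_rate is deliberately left unmeasured.
this_week = {
    "repeat_inquiry_rate": 0.10,
    "complaint_rate": 0.03,
    "human_escalation_rate": 0.20,
}
print(breached_signals(this_week))
```

Treating an unmeasured signal as a breach is a design choice in this sketch. The alternative, silently skipping it, is how a company ends up proving itself with short-term cost numbers alone.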

These five layers are not a consulting framework.

They are the operating questions that show up after go-live.

Leave any one of them empty for long and the organization will instinctively absorb the new division of labor with the old playbook. The result: it looks like you adopted AI; in reality you pressed AI back into the old workflow.

The worse part is that these five layers do not break in order.

Often knowledge breaks first. The model cannot find the right material, so people start patching by hand. The patching breaks the workflow, because every department patches its own way. Once the workflow goes loose, responsibility breaks, because nobody can say which step was whose call. Once responsibility blurs, governance turns into after-the-fact blame. Only then does the role layer surface — turns out, this role stopped being the old role a long time ago.

So what leadership sees is usually the symptom at the tail.

Employees complain the tools are bad. The business complains AI does not understand the context. IT complains the requirements are unclear. HR complains the role boundary is getting blurry.

Underneath, it is the same system problem.

James March named a brutal organizational learning paradox in 1991: adaptive processes refine exploitation (what we already know, what works in the short term) faster than they refine exploration, and that pattern can become self-destructive over time. Klarna's problem sits right there. Cost savings, faster response, conversation volume: those are all short-term effective moves. But if service quality, exception judgment, and the responsibility loop do not get rewritten alongside them, short-term effectiveness pushes the organization toward long-term fragility.

So this piece does not use "Org OS" as a metaphor.

It is not a clever phrase.

It is a reconciliation statement.

Have the roles been rewritten?

Does the workflow have a downgrade path?

Has knowledge turned into an asset?

Is responsibility assigned?

Does governance have thresholds?

Leave any one of those five blank and the faster AI goes live, the louder the old OS will throw errors.

No budget? Stop shouting "transformation." Build three ledgers first.

If a company has no money for a big system and no mature AI transformation team, my suggestion is: stop shouting strategy.

Build three ledgers first.

First, the time ledger.

Not the lazy version where you ask employees "where could we improve efficiency." The version where you break out the most time-consuming work: who is doing it, for how long, how many repetitions, and why this person is the one doing it right now.

The point of a time ledger is to make AI stop serving abstract efficiency and start serving specific time sinks.

Second, the waiting ledger.

What a lot of organizations actually waste is not action time.

It is waiting time.

Waiting on approvals. Waiting on confirmations. Waiting on materials. Waiting on cross-department replies. Waiting on the boss to decide. If AI only speeds up individual actions and does not remove waits, operating results will not move much.

Third, the knowledge ledger.

Which experience lives only in senior heads. Which judgments new hires cannot pick up. Which information has to get re-asked of a human every time. Which SOPs were written but barely used.

The point of a knowledge ledger is to turn individual experience into something the organization can call on.

These three ledgers are not fancy.

They work.

They turn "learning AI" from an attitude question into an operating motion. The boss is no longer just asking whether people are using AI. The boss is asking:

Which time sink got compressed?

Which wait got removed?

Which kind of knowledge got structured?

Once the ledgers are on the table, AI starts to enter the company.

The three ledgers have another benefit.

They force leadership to stop talking abstractly.

Once the time ledger is on the table, everyone knows where AI should hit first.

Once the waiting ledger is on the table, everyone sees that what is really slowing the company down is not always the frontline employee — it may be the approvals, the reviews, the cross-department confirmations.

Once the knowledge ledger is on the table, everyone sees what "experienced" actually means: which categories of judgment never got distilled out.

That is when AI gets a real entry point.

Not a generic learning task. A specific problem: compress this time, remove that wait, structure this knowledge.

For a cash-poor company, this matters more than "everyone embrace AI."

Because the ledger lets the boss see one thing.

Whether AI is actually making the company better at operating.
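For anyone who wants to see how small the three ledgers can be, here is a minimal sketch. A spreadsheet does the same job. The field names are my assumptions, taken from the questions each ledger asks above, and all the sample rows are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeEntry:
    task: str
    owner: str
    hours_per_run: float
    runs_per_month: int
    why_this_person: str

    @property
    def hours_per_month(self) -> float:
        return self.hours_per_run * self.runs_per_month


@dataclass(frozen=True)
class WaitEntry:
    workflow: str
    waiting_on: str  # approval, confirmation, materials, cross-department reply
    days_waited: float


@dataclass(frozen=True)
class KnowledgeEntry:
    judgment: str
    held_by: str        # the senior person everyone keeps asking
    written_down: bool  # True once it is structured enough to be called on


def biggest_time_sinks(entries: list[TimeEntry], top: int = 3) -> list[TimeEntry]:
    """Rank tasks by total hours per month, largest first."""
    return sorted(entries, key=lambda e: e.hours_per_month, reverse=True)[:top]


def longest_waits(entries: list[WaitEntry], top: int = 3) -> list[WaitEntry]:
    """Rank wait points by days waited, longest first."""
    return sorted(entries, key=lambda e: e.days_waited, reverse=True)[:top]


def locked_up_knowledge(entries: list[KnowledgeEntry]) -> list[KnowledgeEntry]:
    """Judgments that still live only in someone's head."""
    return [e for e in entries if not e.written_down]


# Invented sample rows, for illustration only.
time_ledger = [
    TimeEntry("weekly sales report", "ops analyst", 4.0, 4, "only one with data access"),
    TimeEntry("proposal drafting", "account manager", 3.0, 10, "knows the client history"),
    TimeEntry("invoice matching", "finance clerk", 0.5, 40, "always done it"),
]
wait_ledger = [
    WaitEntry("quote approval", "boss sign-off", 3.0),
    WaitEntry("contract review", "legal reply", 5.0),
]
knowledge_ledger = [
    KnowledgeEntry("when to offer a discount", "senior sales lead", False),
    KnowledgeEntry("refund eligibility rules", "support lead", True),
]

print([e.task for e in biggest_time_sinks(time_ledger, top=2)])
print([e.workflow for e in longest_waits(wait_ledger, top=1)])
print([e.judgment for e in locked_up_knowledge(knowledge_ledger)])
```

Sorting is the whole trick. Once the rows exist, the first target for AI is whatever sits at the top of each list, not whatever tool was shared in the group chat last week.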

Boss B is the most dangerous one. He thinks he already gets it.

The one most likely to fall in is not the boss who knows nothing about AI.

The boss who knows nothing knows they know nothing. They go ask someone.

The truly dangerous one is the other type.

He has been to the courses. He has heard the talks. He has bookmarked the tool lists. He forwards AI cases in the group chat every day. He says "we have to embrace AI" in meetings.

He is not a refuser.

He is often the most active person in the company.

Six months later, the operating numbers have not moved.

Sales are still waiting on materials.

Delivery is still reworking.

New hires are still asking senior staff.

The boss is still personally unblocking cross-department jams every week.

The employees did learn a lot of tools. Leadership did hold a lot of meetings. But the workflows that actually got shorter, the waits that actually got removed, the knowledge that actually got distilled — there are not many to put on the table.

This is the Boss B cognitive trap.

He thinks he has stepped inside AI.

He is still standing outside it.

Why?

Because he is treating AI as a learning topic, not an operating topic.

The learning topic asks: are people using AI? Which tool is good? How do I write this prompt? Which department can share a case?

These questions create activity and lower anxiety. They do not, on their own, change the company's operating structure.

The operating topic asks completely different things.

Which workflow waited two fewer days this month?

Which rework category got compressed?

Which judgment that used to require a senior employee can now be caught by the knowledge ledger?

Change the question and AI moves out of the toolbar and into the P&L.

So the first step in a cash-poor company is not building an AI strategy.

It is building three ledgers.

The time ledger answers "where is time leaking."

The waiting ledger answers "where is the organization jammed."

The knowledge ledger answers "where is experience locked up."

These three ledgers are not sexy. They turn "embrace AI" from an attitude question into an operating question.

The hardest moment for Boss B is realizing the actions he was most proud of may have walked right past the core problem.

He had people learn tools. Right move. Without a time ledger, tools just scatter into individual productivity.

He had departments share cases. Right move. Without a waiting ledger, cases will not automatically cut through approval, review, and cross-department confirmation.

He had senior staff mentor new hires. Right move. Without a knowledge ledger, experience still migrates from head to head, not into an organizational asset.

This is not the boss being lazy.

This is the boss treating AI as "a new skill employees need to learn" instead of "a new operating motion the company needs to rewrite."

I have stepped into this trap inside my own projects. At the start I always reached for a stronger tool, a better agent, a more complete automation. The real moves, when I pushed, kept landing in unglamorous places. Which step always gets reworked. Which step always waits on someone. Which kind of judgment forever has to be checked back with a small handful of senior people.

Once the ledger is on the table, the tool knows where to hit.

We are still dissecting this. I am not claiming it is solved.

One thing I can say cleanly already.

The boss knowing AI the best does not mean the company uses AI the best.

The boss being able to recite AI trends does not mean the organization has stepped inside AI.

Only when the time ledger, the waiting ledger, and the knowledge ledger enter the operating review does AI stop being a learning atmosphere and start becoming an operating account — one that can be assigned, reviewed, and continuously invested in.

Boss B's exit is not another course.

It is moving himself from "the head of AI learning" to "the first owner of the AI operating account."

The next cut lands on the HR three-pillar model

When the Org OS cannot run, the pressure travels fast to HR.

Not because HR should "be in charge of AI."

Because the moment AI enters the new division of labor, roles get rewritten.

Once roles get rewritten, hiring profiles change. Training content changes. Performance metrics change. Talent reviews change.

The tech side can speak first on whether the system runs.

The business side can speak first on whether the result is good.

But once this enters daily operations, the company has to answer:

What kind of person does this role actually need now?

How do we develop them?

How do we evaluate them?

How do we assign responsibility?

These questions do not go around HR.

HR used to split a lot of this across three pillars: COE on policy, HRBP on the business, SSC on delivery. In a stable business environment, that structure had value.

The moment AI splits tasks, judgment, and knowledge apart, all three pillars get interrogated at once.

Can the policy COE wrote actually explain the responsibility boundary in human-machine collaboration?

When HRBPs sit with the business, can they read which roles have already been split open inside?

If SSC only ships standard delivery, can it carry workflows that AI has rewritten?

This is not "should HR learn AI."

This is "should HR's own division of labor get rewritten."

So the next piece is not about HR using AI.

It is about why, once AI is in, the HR three-pillar model is the first thing that gets punched through.

Read on

Previous: AI Is Not a New Tool. It Is a New Division of Labor.
Series hub: Human in the Loop
Next: Why the HR Three-Pillar Model Breaks
