Everyone has AI. Few can turn it into work.

Abstract

Everyone has AI now. Access is no longer the advantage. The advantage is whether an organization can turn model output into accountable work: work with context, ownership, source records, review paths, measurable outcomes, and a way to learn from what happened.

The market has moved past the simple adoption question. McKinsey reports broad AI use, but much smaller shares of respondents report scaled programs, EBIT impact, or high-performer status. BCG finds that many firms still see minimal material value despite substantial investment. The gap is no longer only about who has tools. It is about who can absorb the output into the way the business actually runs.

This is where structured intelligence becomes concrete. It is the operating layer that gives AI work usable context, evidence, controls, ownership, review, and institutional learning. Without that layer, useful output can die before it becomes useful work.

Adoption has become too broad a word. The denominator changes the story.

Many AI programs stall after the demo because the business process around the output is missing.

Productivity is a workflow property, not a model property.

Structured intelligence is the missing layer: context, evidence, controls, ownership, review, and learning.

Measurement discipline

AI adoption numbers hide the real problem

AI adoption is often discussed as a single curve. That framing hides the real problem. Different sources measure different units: organizations, firms, workers, employment-weighted firms, product usage, and tasks.

McKinsey reports that 88% of survey respondents say their organizations use AI regularly in at least one business function. The Census Bureau's Business Trends and Outlook Survey (BTOS) reports a much lower share of businesses using AI in the past two weeks. The Federal Reserve shows why both can be true: adoption looks different when the unit is the firm, the worker, or the worker inside an AI-adopting firm.

Figure 1 Adoption changes when the unit of analysis changes.

Use these rows as a denominator map, not as a funnel or a benchmark average.

Unit | Measure | What it can support
Survey respondents | 88% | Regular AI use in at least one business function was widely reported.
Business function | 70% | Generative AI has entered at least one function in many organizations.
BTOS businesses | 17-20% | Past-two-week current use remained materially lower.
RPS workers | 41% | Work-related generative AI use is ahead of many firm-level measures.
Employment-weighted firms | 78% | Many workers sit inside firms that have adopted AI somewhere.

These measures should not be averaged. They explain why adoption can look mature in one dataset and early in another.
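One way to keep the denominators separate in practice is to store each figure with its unit attached. The sketch below is illustrative only: `AdoptionMeasure` and `comparable` are hypothetical names, and the values are copied from Figure 1 as text rather than recomputed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdoptionMeasure:
    unit: str      # what is being counted (the denominator)
    value: str     # reported figure, kept as text because some are ranges
    supports: str  # the claim this figure can support


# Rows copied from Figure 1; nothing here is recalculated.
MEASURES = [
    AdoptionMeasure("Survey respondents", "88%", "Regular AI use in at least one business function"),
    AdoptionMeasure("Business function", "70%", "Generative AI has entered at least one function"),
    AdoptionMeasure("BTOS businesses", "17-20%", "Past-two-week current use"),
    AdoptionMeasure("RPS workers", "41%", "Work-related generative AI use"),
    AdoptionMeasure("Employment-weighted firms", "78%", "Workers inside firms that adopted AI somewhere"),
]


def comparable(a: AdoptionMeasure, b: AdoptionMeasure) -> bool:
    """Two figures can be compared or combined only if they count the same unit."""
    return a.unit == b.unit


def units(measures: list[AdoptionMeasure]) -> set[str]:
    """Return the distinct denominators present in a set of measures."""
    return {m.unit for m in measures}
```

Every row in the figure has a different unit, so `comparable` returns `False` for any pair of distinct rows. That is the point of the denominator map: the figures sit side by side, but none of them can be merged.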

Operating absorption

Using AI is not the same as operationalizing it

The most useful McKinsey finding is not just that 88% of respondents report regular AI use. It is the gap between that figure and the smaller set of organizations reporting scaled programs, EBIT impact, and high-performer status. McKinsey

Figure 2 Adoption is broad. Operating absorption is narrower.

The gap opens after access, when work has to be redesigned around review and accountability.

Respondents reporting regular AI use in at least one function | 88%
Respondents scaling AI programs within at least one function | ~33%
Respondents attributing any EBIT impact to AI | 39%
McKinsey AI high performers | 6%

Values are survey-reported shares from one McKinsey survey. They are related maturity signals, not a single conversion funnel.

Interpretation

The bottleneck is not the first draft, answer, summary, or suggestion. The bottleneck is the system around the output: who requested it, what source material it used, what standard applies, who owns the result, who reviews it, and how the organization learns from it.
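The questions above can be written down as a record that travels with the output. This is a minimal sketch, assuming a hypothetical `WorkRecord` structure; the field names are illustrative and are not drawn from any of the cited sources.

```python
from dataclasses import dataclass, field, fields


@dataclass
class WorkRecord:
    """Hypothetical record attached to one piece of AI-assisted work."""
    requested_by: str = ""                                     # who requested it
    source_material: list[str] = field(default_factory=list)  # what it used
    applicable_standard: str = ""                              # what standard applies
    owner: str = ""                                            # who owns the result
    reviewer: str = ""                                         # who reviews it
    lessons: list[str] = field(default_factory=list)          # what was learned


def open_questions(record: WorkRecord) -> list[str]:
    """List the fields that are still empty, in declaration order."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Example: an output with a requester and an owner, but nothing else.
draft = WorkRecord(requested_by="ops lead", owner="ops lead")
print(open_questions(draft))
```

A record like this does not make the output better. It makes visible which parts of the surrounding system are missing before the output is treated as finished work.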

Value capture

Where AI value dies

Value rarely dies at the moment of generation. It usually dies afterward. A pilot can produce a strong answer and still fail if the company cannot place that answer inside the way the business actually runs.

The sources point to five recurring failure modes: value is not measured, governance trails deployment, data context is fragmented, agents are given more ambition than controls, and employees use AI faster than the organization redesigns the work around them. BCG IBM

Figure 3 The hard part starts after the output.

Enterprise AI value fails when useful work cannot move through context, control, and measurement.

Value gap 60%

BCG reports a large group of firms seeing little material value despite AI investment.

Governance gap 21%

Deloitte respondents reporting mature agentic AI governance.

Data gap 45%

IBM-cited business leaders reporting data accuracy or bias as a leading barrier to scaling AI.

Agent risk >40%

Gartner forecast of agentic AI projects canceled by the end of 2027.

Organization gap 67% / 32%

Microsoft's association between reported AI impact and organizational versus individual factors.

These sources measure different populations and methods. Read the figure as a failure-mode map, not a single benchmark.

This is the downside of the current cycle. Companies can spend on licenses, demos, and pilots while still leaving managers with unowned outputs, unclear review standards, uncertain data quality, and no durable way to learn from completed work. The organization may look active without becoming more capable.

IBM's 2025 CEO study points to the same issue from the executive side: surveyed CEOs reported that only 25% of AI initiatives had delivered expected ROI over the previous few years, only 16% had scaled enterprise-wide, and 50% said rapid investment had left them with disconnected technology. Deloitte's 2025 ROI survey found rising investment, but only around one in five surveyed organizations qualified as AI ROI Leaders. These are different measures. They point to the same operating problem: spend is easier to approve than value is to measure and absorb. IBM Deloitte

Task diffusion

AI changes tasks before it changes jobs

Anthropic's Economic Index is valuable because it looks below the org chart. The first report found that roughly 36% of occupations had AI use in at least a quarter of associated tasks, with observed usage leaning toward augmentation at 57% versus automation at 43%. Anthropic

The March 2026 update showed usage becoming less concentrated inside Anthropic's Claude data. The top 10 tasks fell from 24% of Claude.ai traffic in November 2025 to 19% in February 2026, and about 49% of jobs had at least a quarter of tasks observed in Claude usage. Anthropic Economic Index

Figure 4 Usage is spreading across tasks while feasibility and adoption remain uneven.

The most granular evidence in this source set is task-level and Claude-specific.

Claude task coverage 36%

occupations saw AI use in at least a quarter of associated tasks.

Concentration fell 24% -> 19%

share of Claude.ai traffic accounted for by the top 10 tasks from Nov. 2025 to Feb. 2026.

Observed job exposure 49%

jobs with at least a quarter of tasks observed in Claude usage.

Feasibility gap 94% / 33%

theoretical versus observed task coverage in Computer and Math occupations.

Anthropic's data is platform-specific and classifier-mediated. It is useful for direction and task composition, not for total labor-market measurement.

Enterprise AI should be evaluated at the same level: the task. The better question is not whether a job is "automated." It is which tasks have enough context, repetition, evidence, and review discipline to become repeatable AI-supported work.

Productivity evidence

AI works when the workflow is ready

Productivity is a workflow property, not a model property. The evidence is not a single number. It is a set of boundary conditions. A QJE study of 5,172 customer-support agents found that access to a generative AI assistant increased issues resolved per hour by 15% on average, with larger gains for less experienced and lower-skilled workers. QJE

METR found the opposite in a different setting. In a randomized controlled trial with 16 experienced open-source developers working on familiar repositories, access to early 2025 AI tools made completion time 19% longer. Developers had expected a 24% speedup. METR now labels the result as historical evidence because model capability has moved since the study window. METR

Figure 5 AI productivity depends on the work system around the model.

The same model class can help or slow work depending on context, task shape, and review load.

Customer support field study | +15%
Software development studies summarized by Stanford | +26%
Marketing output studies summarized by Stanford | +50%
Historical early-2025 METR developer RCT | -19%

The studies use different tasks and methods. The relevant pattern is variance, not a universal productivity estimate.

The variance is the lesson. AI helps when the task is bounded, the source material is available, the output can be reviewed, and the feedback loop is short. It can slow work down when the user has to reconstruct context, verify too much, or absorb errors that the workflow did not anticipate. The model matters, but the surrounding work system determines whether capability turns into productivity.
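The four conditions in the paragraph above can be turned into a simple pre-check for a candidate task. The sketch below is an assumption-laden illustration: `TaskProfile` and `readiness_gaps` are hypothetical names, and the yes/no answers stand in for judgments a team would have to make itself.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    """Hypothetical yes/no answers about one candidate task."""
    bounded: bool              # is the task clearly scoped?
    sources_available: bool    # is the source material at hand?
    output_reviewable: bool    # can someone check the result?
    short_feedback_loop: bool  # does the team learn quickly if it was wrong?


def readiness_gaps(task: TaskProfile) -> list[str]:
    """Return the conditions the task does not yet meet."""
    checks = {
        "task is not bounded": task.bounded,
        "source material is missing": task.sources_available,
        "output cannot be reviewed": task.output_reviewable,
        "feedback loop is long": task.short_feedback_loop,
    }
    return [gap for gap, ok in checks.items() if not ok]


# Example: a scoped task with sources, but no practical way to review the output.
print(readiness_gaps(TaskProfile(True, True, False, True)))
```

An empty list does not guarantee a productivity gain. It only means none of the slowdown conditions named above is obviously present.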

Agent reality

Agents expose the operating-model gap

Agents matter because they expose what chat can hide. A person can use a model informally and absorb the risk alone. A system that acts on business work needs authorization, source discipline, policy boundaries, exception handling, and a reviewer who can trust the path taken.

The data does not support a claim that agents are already running the enterprise. It supports a more precise claim: agent activity is real, production deployment is early, and weak operating controls are a likely failure mode.

Figure 6 Agent ambition is ahead of production maturity.

Agents convert informal assistance into an operating-model question.

Stanford HAI Single digits

agent deployment across nearly all business functions.

McKinsey 23%

respondents scaling an agentic AI system within at least one function.

McKinsey 39%

experimenting with AI agents.

Gartner >40%

agentic AI projects predicted to be canceled by the end of 2027.

Gartner's figure is a forecast. McKinsey and Stanford use different definitions. Read the figure as a maturity map, not a direct comparison.

This is why agents are not mainly a feature question. They are an operating-model question. If software can act, the organization must define the allowed action, the evidence threshold, the reviewer, the exception path, and the feedback loop.
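The five definitions listed above can be sketched as a small gate that every proposed agent action passes through. This is a minimal illustration under stated assumptions: `AgentPolicy`, its fields, and the example action names are hypothetical, and counting evidence items is only one possible reading of an evidence threshold.

```python
from dataclasses import dataclass, field


@dataclass
class AgentPolicy:
    """Hypothetical policy for one class of agent actions."""
    allowed_actions: set[str]   # the allowed action
    evidence_threshold: int     # minimum number of cited sources
    reviewer: str               # who reviews in-policy proposals
    exception_owner: str        # who handles anything outside policy
    log: list[tuple[str, str]] = field(default_factory=list)  # feedback loop

    def route(self, action: str, evidence: list[str]) -> tuple[str, str]:
        """Send a proposed action to review or to the exception path."""
        in_policy = (
            action in self.allowed_actions
            and len(evidence) >= self.evidence_threshold
        )
        if in_policy:
            decision = ("review", self.reviewer)
        else:
            decision = ("exception", self.exception_owner)
        # Every routing decision is recorded so the policy can be revisited later.
        self.log.append((action, decision[0]))
        return decision


# Example with made-up action names and roles.
policy = AgentPolicy({"draft_refund"}, 2, "claims lead", "ops manager")
print(policy.route("draft_refund", ["ticket", "refund policy"]))
print(policy.route("issue_refund", ["ticket"]))
```

In this sketch nothing executes without a person: in-policy proposals go to the reviewer and everything else goes to the exception owner. Whether some actions should skip review is a risk decision the organization has to make, not something the code can settle.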

Operating-model effect

The bottleneck moves from output to review

Faster work is the first-order effect. More work entering the system is the second-order effect. The third-order effect is that review, approval, governance, and quality control become the bottleneck. The fourth-order effect is a new operating model.

First order Outputs get faster.

Drafts, summaries, analyses, and code suggestions can appear quickly.

Second order Demand increases.

More people ask for more work because the apparent cost of a first draft falls.

Third order Review becomes scarce.

The organization now waits on context, judgment, approval, and trust.

Fourth order The operating layer changes.

The business needs systems that structure work before AI acts and preserve learning afterward.
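The shift from the second order to the third can be shown with simple queue arithmetic. The sketch below uses made-up numbers and a hypothetical `review_backlog` helper: it assumes a fixed number of drafts arriving each day and a fixed daily review capacity, which real teams will not have.

```python
def review_backlog(drafts_per_day: int, reviews_per_day: int, days: int) -> list[int]:
    """Return the number of unreviewed drafts at the end of each day."""
    backlog = 0
    history = []
    for _ in range(days):
        # New drafts arrive, reviewers clear what they can, and the rest waits.
        backlog = max(0, backlog + drafts_per_day - reviews_per_day)
        history.append(backlog)
    return history


# Before: drafting and review are matched, so nothing queues.
print(review_backlog(drafts_per_day=10, reviews_per_day=10, days=3))  # [0, 0, 0]

# After: drafting triples, review capacity is unchanged.
print(review_backlog(drafts_per_day=30, reviews_per_day=10, days=3))  # [20, 40, 60]
```

Under these assumptions, finished work still leaves the system at the review rate, not the drafting rate. Faster generation only lengthens the queue until review capacity or review design changes.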

Implication

The missing layer is structured intelligence

The answer is not to push every process toward autonomy. The answer is to make business knowledge usable before software acts. Documents, policies, research, customer context, decisions, approvals, and expert judgment need to become structured intelligence: managed, reviewable, reusable, and ready for accountable work.

This is unikode's interpretation of the evidence: the next useful software layer will be judged less by whether it adds another prompt box and more by whether it preserves context, binds work to evidence, routes review, keeps human direction intact, and turns completed work into institutional learning.

Context before output

Task, policy, source material, customer record, and review standard should be present up front.

Work object before workflow

The business should define the thing being produced or changed before asking software to act.

Review matched to risk

NIST points organizations toward risk management across design, development, use, and evaluation.

Learning after completion

Microsoft finds organizational factors are more strongly associated with reported AI impact than individual effort alone.

The strategic question is no longer, "Do we have access to AI?" It is, "Can our organization convert intelligence into governed execution without losing context, control, or learning?" The winners will not be the companies with the most AI usage. They will be the companies that can turn intelligence into accountable work.

Limits

What this analysis does not show

The sources use different samples, time windows, definitions, and methods. The charts are evidence signals, not a combined benchmark. Vendor and consulting sources are paired with public-sector and academic evidence because no single source measures the whole market.

This analysis also does not argue that every workflow should become agentic. Some work is better handled by ordinary software, a focused assistant, or direct professional judgment. The narrower claim is that consequential AI work needs structured context, ownership, measurement, and review.

Research basis

Used for field evidence that AI assistance increased customer-support productivity by about 15% on average, with heterogeneous worker effects.


METR: Early-2025 AI and open-source developer productivity

Used as historical early-2025 randomized evidence that AI can slow experienced developers in complex, familiar repositories when review and context burdens are high.
