Abstract
Everyone has AI now. Access is no longer the advantage. The advantage is whether an organization can turn model output into accountable work: work with context, ownership, source records, review paths, measurable outcomes, and a way to learn from what happened.
The market has moved past the simple adoption question. McKinsey reports broad AI use, but much smaller shares report scaled programs, EBIT impact, or high-performer status. BCG finds that many firms still see minimal material value despite substantial investment. The gap is no longer only about who has tools. It is about who can absorb the output into the way the business actually runs.
This is where structured intelligence becomes concrete. It is the operating layer that gives AI work usable context, evidence, controls, ownership, review, and institutional learning. Without that layer, useful output can die before it becomes useful work.
Adoption has become too broad a word. The denominator changes the story.
Many AI programs stall after the demo because the business process around the output is missing.
Productivity is a workflow property, not a model property.
Structured intelligence is the missing layer: context, evidence, controls, ownership, review, and learning.
Measurement discipline
AI adoption numbers hide the real problem
AI adoption is often discussed as a single curve. That framing hides the real problem. Different sources measure different units: organizations, firms, workers, employment-weighted firms, product usage, and tasks.
McKinsey reports that 88% of survey respondents say their organizations use AI regularly in at least one business function. The Census Bureau reports a much lower figure: 17% to 20% of businesses in its Business Trends and Outlook Survey (BTOS) reported current AI use, measured over the past two weeks, between December 2025 and May 2026. The Federal Reserve shows why both can be true: adoption looks different when the unit is the firm, the worker, or the worker inside an AI-adopting firm.
Use these rows as a denominator map, not as a funnel or a benchmark average.
These measures should not be averaged. They explain why adoption can look mature in one dataset and early in another.
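The denominator effect can be made concrete with a small sketch. The firms and headcounts below are invented for illustration and do not come from McKinsey, the Census Bureau, or the Federal Reserve; the `Firm` class and `adoption_by_denominator` function are hypothetical names. The point is only that one economy can yield several legitimate "adoption rates."

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Firm:
    employees: int
    adopted_ai: bool        # the firm itself reports using AI
    workers_using_ai: int   # employees who personally use AI at work


# Hypothetical economy: one large adopter and four small non-adopters.
FIRMS = [
    Firm(employees=800, adopted_ai=True, workers_using_ai=200),
    Firm(employees=50, adopted_ai=False, workers_using_ai=10),
    Firm(employees=50, adopted_ai=False, workers_using_ai=0),
    Firm(employees=50, adopted_ai=False, workers_using_ai=0),
    Firm(employees=50, adopted_ai=False, workers_using_ai=0),
]


def adoption_by_denominator(firms: list[Firm]) -> dict[str, float]:
    """Compute one adoption share per unit of measurement."""
    total_workers = sum(f.employees for f in firms)
    adopters = [f for f in firms if f.adopted_ai]
    adopter_workers = sum(f.employees for f in adopters)
    return {
        # Share of firms that adopted AI.
        "firms": len(adopters) / len(firms),
        # Share of all workers employed at an adopting firm.
        "employment_weighted_firms": adopter_workers / total_workers,
        # Share of all workers who personally use AI.
        "workers": sum(f.workers_using_ai for f in firms) / total_workers,
        # Share of workers using AI inside adopting firms only.
        "workers_in_adopting_firms": (
            sum(f.workers_using_ai for f in adopters) / adopter_workers
        ),
    }


for unit, share in adoption_by_denominator(FIRMS).items():
    print(f"{unit}: {share:.0%}")
```

With these made-up inputs the same economy reports 20% adoption by firm count, 80% by employment-weighted firms, 21% by worker, and 25% by worker inside adopting firms. None of the four is wrong, and averaging them would describe nothing.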
Operating absorption
Using AI is not the same as operationalizing it
The most useful McKinsey finding is not just that 88% of respondents report regular AI use. It is the gap between that figure and the smaller set of organizations reporting scaled programs, EBIT impact, and high-performer status. McKinsey
The gap opens after access, when work has to be redesigned around review and accountability.
Values are survey-reported shares from one McKinsey survey. They are related maturity signals, not a single conversion funnel.
Interpretation
The bottleneck is not the first draft, answer, summary, or suggestion. The bottleneck is the system around the output: who requested it, what source material it used, what standard applies, who owns the result, who reviews it, and how the organization learns from it.
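One way to picture "the system around the output" is as a record that travels with each piece of AI output and answers the six questions above. The sketch below is an illustration under assumptions, not a prescribed schema: `OutputRecord`, its field names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class OutputRecord:
    """Hypothetical record attached to one piece of AI output."""
    output: str
    requested_by: Optional[str] = None                 # who requested it
    sources: list[str] = field(default_factory=list)   # source material used
    standard: Optional[str] = None                     # which standard applies
    owner: Optional[str] = None                        # who owns the result
    reviewer: Optional[str] = None                     # who reviews it
    lesson: Optional[str] = None                       # what the organization learned


def accountability_gaps(record: OutputRecord) -> list[str]:
    """Return the questions this record cannot yet answer."""
    checks = {
        "requester": record.requested_by,
        "sources": record.sources,
        "standard": record.standard,
        "owner": record.owner,
        "reviewer": record.reviewer,
        "lesson": record.lesson,
    }
    # An empty string, empty list, or None all count as "unanswered".
    return [name for name, value in checks.items() if not value]


# Illustrative values only.
draft = OutputRecord(
    output="Q3 churn summary",
    requested_by="ops-lead",
    sources=["crm-export.csv"],
)
print(accountability_gaps(draft))  # ['standard', 'owner', 'reviewer', 'lesson']
```

The draft itself exists in this example; what is missing is everything that would let the business act on it.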
Value capture
Where AI value dies
Value rarely dies at the moment of generation. It usually dies afterward. A pilot can produce a strong answer and still fail if the company cannot place that answer inside the way the business actually runs.
The sources point to five recurring failure modes: value is not measured, governance trails deployment, data context is fragmented, agents are given more ambition than controls, and employees use AI faster than the organization redesigns the work around them. BCG IBM
Enterprise AI value fails when useful work cannot move through context, control, and measurement.
BCG reports a large group of firms seeing little material value despite AI investment.
Governance gap: 21%. Deloitte respondents reporting mature agentic AI governance.
Data gap: 45%. IBM-cited business leaders reporting data accuracy or bias as a leading barrier to scaling AI.
Agent risk: >40%. Gartner forecast of agentic AI projects canceled by the end of 2027.
Organization gap: 67% / 32%. Microsoft's association between reported AI impact and organizational versus individual factors.
These sources measure different populations and methods. Read the figure as a failure-mode map, not a single benchmark.
This is the downside of the current cycle. Companies can spend on licenses, demos, and pilots while still leaving managers with unowned outputs, unclear review standards, uncertain data quality, and no durable way to learn from completed work. The organization may look active without becoming more capable.
IBM's 2025 CEO study points to the same issue from the executive side: surveyed CEOs reported that only 25% of AI initiatives had delivered expected ROI over the previous few years, only 16% had scaled enterprise-wide, and 50% said rapid investment left disconnected technology. Deloitte's 2025 ROI survey found rising investment, but only around one in five surveyed organizations qualified as AI ROI Leaders. These are different measures. They point to the same operating problem: spend is easier to approve than value is to measure and absorb. IBM Deloitte
Task diffusion
AI changes tasks before it changes jobs
Anthropic's Economic Index is valuable because it looks below the org chart. The first report found that roughly 36% of occupations had AI use in at least a quarter of associated tasks, with observed usage leaning toward augmentation at 57% versus automation at 43%. Anthropic
The March 2026 update showed usage becoming less concentrated inside Anthropic's Claude data. The top 10 tasks fell from 24% of Claude.ai traffic in November 2025 to 19% in February 2026, and about 49% of jobs had at least a quarter of tasks observed in Claude usage. Anthropic Economic Index
The most granular evidence in this source set is task-level and Claude-specific.
About 36% of occupations saw AI use in at least a quarter of associated tasks.
Concentration fell: 24% -> 19%. Share of Claude.ai traffic accounted for by the top 10 tasks from Nov. 2025 to Feb. 2026.
Observed job exposure: 49%. Jobs with at least a quarter of tasks observed in Claude usage.
Feasibility gap: 94% / 33%. Theoretical versus observed task coverage in Computer and Math occupations.
Anthropic's data is platform-specific and classifier-mediated. It is useful for direction and task composition, not for total labor-market measurement.
This is how enterprise AI should be evaluated. The better question is not whether a job is "automated." It is which tasks have enough context, repetition, evidence, and review discipline to become repeatable AI-supported work.
Productivity evidence
AI works when the workflow is ready
Productivity is a workflow property, not a model property. The evidence is not a single number. It is a set of boundary conditions. A QJE study of 5,172 customer-support agents found that access to a generative AI assistant increased issues resolved per hour by 15% on average, with larger gains for less experienced and lower-skilled workers. QJE
METR found the opposite in a different setting. In a randomized controlled trial with 16 experienced open-source developers working on familiar repositories, access to early 2025 AI tools made completion time 19% longer. Developers had expected a 24% speedup. METR now labels the result as historical evidence because model capability has moved since the study window. METR
The same model class can help or slow work depending on context, task shape, and review load.
The studies use different tasks and methods. The relevant pattern is variance, not a universal productivity estimate.
The variance is the lesson. AI helps when the task is bounded, the source material is available, the output can be reviewed, and the feedback loop is short. It can slow work down when the user has to reconstruct context, verify too much, or absorb errors that the workflow did not anticipate. The model matters, but the surrounding work system determines whether capability turns into productivity.
Agent reality
Agents expose the operating-model gap
Agents matter because they expose what chat can hide. A person can use a model informally and absorb the risk alone. A system that acts on business work needs authorization, source discipline, policy boundaries, exception handling, and a reviewer who can trust the path taken.
The data does not support a claim that agents are already running the enterprise. It supports a more precise claim: agent activity is real, production deployment is early, and weak operating controls are a likely failure mode.
Agents convert informal assistance into an operating-model question.
agent deployment across nearly all business functions.
McKinsey: 23%. Respondents scaling an agentic AI system within at least one function.
McKinsey: 39%. Respondents experimenting with AI agents.
Gartner: >40%. Agentic AI projects predicted to be canceled by the end of 2027.
Gartner's figure is a forecast. McKinsey and Stanford use different definitions. Read the figure as a maturity map, not a direct comparison.
This is why agents are not mainly a feature question. They are an operating-model question. If software can act, the organization must define the allowed action, the evidence threshold, the reviewer, the exception path, and the feedback loop.
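The five definitions in the paragraph above can be sketched as a small gate that runs before an agent acts. This is a minimal illustration under stated assumptions, not a reference design: `ActionPolicy`, `gate`, the action names, and the source-count threshold are all hypothetical, and a real evidence threshold would likely be richer than a count.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    EXECUTE = "execute"
    REVIEW = "send to reviewer"
    EXCEPTION = "route to exception path"


@dataclass(frozen=True)
class ActionPolicy:
    allowed_actions: frozenset[str]   # the allowed actions
    min_sources: int                  # evidence threshold (simplified to a count)
    review_required: frozenset[str]   # allowed actions that still need a reviewer


def gate(
    action: str,
    source_count: int,
    policy: ActionPolicy,
    log: list[tuple[str, Decision]],
) -> Decision:
    """Decide what happens to a proposed agent action and record the decision."""
    if action not in policy.allowed_actions:
        decision = Decision.EXCEPTION      # outside the allowed action set
    elif source_count < policy.min_sources:
        decision = Decision.EXCEPTION      # not enough evidence to proceed
    elif action in policy.review_required:
        decision = Decision.REVIEW         # allowed, but a person signs off
    else:
        decision = Decision.EXECUTE
    log.append((action, decision))         # feedback loop: every decision is kept
    return decision


# Illustrative policy; the action names are made up.
policy = ActionPolicy(
    allowed_actions=frozenset({"draft_reply", "issue_refund"}),
    min_sources=2,
    review_required=frozenset({"issue_refund"}),
)
decision_log: list[tuple[str, Decision]] = []
print(gate("draft_reply", 2, policy, decision_log))     # Decision.EXECUTE
print(gate("issue_refund", 3, policy, decision_log))    # Decision.REVIEW
print(gate("delete_account", 5, policy, decision_log))  # Decision.EXCEPTION
```

The gate contains no model logic at all. Every branch is an organizational decision that someone has to own before the software is allowed to act.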
Operating-model effect
The bottleneck moves from output to review
Faster work is the first-order effect. More work entering the system is the second-order effect. The third-order effect is that review, approval, governance, and quality control become the bottleneck. The fourth-order effect is a new operating model.
Drafts, summaries, analyses, and code suggestions can appear quickly.
More people ask for more work because the apparent cost of a first draft falls.
The organization now waits on context, judgment, approval, and trust.
The business needs systems that structure work before AI acts and preserve learning afterward.
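The shift from output to review can be illustrated with a toy queue. The numbers are invented and the model is deliberately crude (fixed daily rates, no rework, first-come review); `simulate` is a hypothetical helper, not a measurement of any real team.

```python
def simulate(drafts_per_day: int, review_capacity: int, days: int) -> tuple[int, int]:
    """Return (items approved, items still waiting for review) after `days`."""
    backlog = 0
    approved = 0
    for _ in range(days):
        backlog += drafts_per_day                 # new drafts enter the queue
        reviewed = min(backlog, review_capacity)  # reviewers clear what they can
        approved += reviewed
        backlog -= reviewed
    return approved, backlog


# Hypothetical week before AI: drafting is the constraint.
print(simulate(drafts_per_day=10, review_capacity=12, days=5))  # (50, 0)

# Hypothetical week after AI: drafting triples, review capacity is unchanged.
print(simulate(drafts_per_day=30, review_capacity=12, days=5))  # (60, 90)
```

In this sketch, tripling draft volume raises approved work from 50 to 60 items and leaves 90 items waiting. Generation got faster, but finished work is capped by review capacity.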
Implication
The missing layer is structured intelligence
The answer is not to push every process toward autonomy. The answer is to make business knowledge usable before software acts. Documents, policies, research, customer context, decisions, approvals, and expert judgment need to become structured intelligence: managed, reviewable, reusable, and ready for accountable work.
This is unikode's interpretation of the evidence: the next useful software layer will be judged less by whether it adds another prompt box and more by whether it preserves context, binds work to evidence, routes review, keeps human direction intact, and turns completed work into institutional learning.
Context before output
Task, policy, source material, customer record, and review standard should be present up front.
Work object before workflow
The business should define the thing being produced or changed before asking software to act.
Review matched to risk
NIST points organizations toward risk management across design, development, use, and evaluation.
Learning after completion
Microsoft finds organizational factors are more strongly associated with reported AI impact than individual effort alone.
The strategic question is no longer, "Do we have access to AI?" It is, "Can our organization convert intelligence into governed execution without losing context, control, or learning?" The winners will not be the companies with the most AI usage. They will be the companies that can turn intelligence into accountable work.
Limits
What this analysis does not show
The sources use different samples, time windows, definitions, and methods. The charts are evidence signals, not a combined benchmark. Vendor and consulting sources are paired with public-sector and academic evidence because no single source measures the whole market.
This analysis also does not argue that every workflow should become agentic. Some work is better handled by ordinary software, a focused assistant, or direct professional judgment. The narrower claim is that consequential AI work needs structured context, ownership, measurement, and review.
Research basis
Evidence behind the argument
These sources were selected because they separate adoption, task diffusion, scaling, productivity, and risk. That separation is necessary for a defensible view of enterprise AI maturity.
McKinsey: The State of AI, 2025
Used for adoption, enterprise scaling, EBIT impact, high-performer share, workflow redesign, and early agentic AI adoption indicators.
BCG: The Widening AI Value Gap
Used for the gap between AI investment and material value, including BCG's finding that many firms report minimal revenue and cost gains.
IBM: CEOs double down on AI while navigating enterprise hurdles
Used for CEO-reported ROI, enterprise-wide scaling, and disconnected technology risks from fast AI investment.
Deloitte: AI ROI and elusive returns
Used for investment momentum, delayed payback, ROI leader concentration, and significant measurable ROI rates for generative and agentic AI.
Stanford HAI: 2026 AI Index Report
Used as the annual reference point for corporate AI adoption, labor-market signals, and productivity research summaries.
Stanford HAI: 2026 AI Index, Economy chapter
Used for generative AI business use, agent deployment maturity, productivity findings, and labor-market caveats.
U.S. Census Bureau: AI Use at U.S. Businesses
Used for firm-level U.S. business adoption from the Business Trends and Outlook Survey, including the 17% to 20% past-two-week current-use range from December 2025 to May 2026.
Federal Reserve: Monitoring AI Adoption in the U.S. Economy
Used for the distinction between firm adoption, worker adoption, and employment-weighted exposure through 2025.
Anthropic: Introducing the Economic Index
Used for task-level evidence from Claude conversations, including occupation coverage and the augmentation versus automation split.
Anthropic Economic Index: Learning curves
Used for the March 2026 update on task diversification and the share of jobs with at least a quarter of tasks observed in Claude usage.
Anthropic: Labor market impacts of AI
Used for the distinction between theoretical AI capability and observed task coverage across occupations.
Microsoft: 2026 Work Trend Index
Used for the argument that individual AI use is ahead of the organizational systems needed to support and compound it.
Deloitte: Agentic AI is scaling faster than guardrails
Used for the gap between agentic AI ambition and mature governance practices across surveyed enterprises.
Gartner: Agentic AI project cancellation forecast
Used as a caution that unclear business value, cost, and risk controls can stall agentic AI projects before production.
NIST: AI Risk Management Framework
Used for the risk-management baseline around trustworthiness considerations in the design, development, use, and evaluation of AI systems.
IBM: The True Cost of Poor Data Quality
Used for the data-quality and governance barrier to scaling AI, including concerns about data accuracy and bias.
IBM: The Biggest AI Adoption Challenges for 2026
Used for the broader list of enterprise deployment constraints: fragmented data, governance, security, skills, cost justification, and workflow integration.
QJE: Generative AI at Work
Used for field evidence that AI assistance increased customer-support productivity by about 15% on average, with heterogeneous worker effects.
METR: Early-2025 AI and open-source developer productivity
Used as historical early-2025 randomized evidence that AI can slow experienced developers in complex, familiar repositories when review and context burdens are high.