Designing Accountability into the Agentic Economy

Mar 17, 2026

A procurement AI at a mid-size manufacturer identifies a supply-chain bottleneck. Without waiting for sign-off, it sources an alternative component from a supplier in a different jurisdiction, negotiates a price, executes a purchase order, and reroutes the production schedule. By the time the procurement team reviews the dashboard, the parts are in transit. The production line never stops.

Two things are true about this. The first is that the AI committed the company to a contract in a foreign jurisdiction without legal review, without checking whether the new supplier meets the organisation’s ESG requirements, and without confirming that the alternative component is compatible with downstream quality certifications. No human exercised judgment on any of these questions.

The second is that the production line did not stop. In a manufacturing context where a single day of downtime can cost hundreds of thousands of dollars, the AI did exactly what it was optimised to do, and it did it faster and arguably more effectively than the three-day approval cycle it bypassed. The human process exists for good reasons, but it has costs too: delays, missed windows, escalation fatigue. The AI’s speed is not just a technical feature. It is an economic argument.

This is the agentic economy in miniature. AI systems are no longer just recommending actions or automating internal processes. They are entering markets, signing contracts, selecting counterparties, and allocating resources, acting as economic participants in their own right. NVIDIA’s GTC 2026 this week features sessions on agentic AI architectures (S81584, S82173) that are explicitly designed for this class of autonomous economic activity. The Blackwell platform and NIM microservices provide the inference infrastructure; the agents built on top of them are already transacting.

The governance question raised by the previous article was how to preserve the reliability benefits of autonomous systems without sacrificing accountability. The question raised by this article is different and, in some ways, harder: when an AI system acts as an economic and potentially legal entity, who is liable for what it does?

What “Agent” Actually Means — and What It Conceals

In law, an agent is a person authorised to bind a principal to obligations. Agency law is centuries old and rests on authority, consent, fiduciary duty, and capacity — all of which assume legal personhood. In computer science, an agent is a software system that perceives its environment and takes actions to achieve goals. It does not imply legal capacity or fiduciary obligation.

The agentic AI industry uses the word in the computer science sense but derives its persuasive power from the legal one. When a vendor says its product is “an AI agent that acts on your behalf,” the phrase imports centuries of trust and accountability that the software does not possess. An AI “agent” cannot be sued. It holds no assets against which a judgment can be enforced.

This matters at board level because the language shapes where liability is expected to land. If a human agent enters a contract on your behalf and something goes wrong, the legal framework for allocating responsibility is well established. If an AI “agent” does the same, the framework is not, and the gap between the word’s connotation and the technology’s legal standing is where exposure accumulates. Kolt’s 2025 analysis of AI agent governance is precise on this: the existing liability chain (developer, deployer, user) was designed for a world where AI systems produce outputs that humans then act on.

In the agentic model, the AI acts directly. Developer, deployer, and user all have a claim to not being the one who decided. The counterpoint is that this ambiguity is not entirely new. Vicarious liability, respondeat superior, and product liability have allocated responsibility in distributed-causation situations for decades. The question is whether those doctrines can adapt at the speed the agentic economy demands.

The strategic implication is concrete. An organisation deploying AI agents faces a choice between two postures: treat the agent as a tool (in which case the deployer is responsible for everything it does, as with any other product) or treat it as a delegated decision-maker (in which case the liability chain is ambiguous and the organisation is exposed to the argument that it “should have known” what the agent would do). Neither posture is cost-free. The first constrains what the agent is allowed to do. The second constrains what the organisation can disclaim when it goes wrong.

A Note on Sources

This article uses the AIID/CSET subset (214 incidents, 86 variables) as its primary dataset, supplemented for one finding by Hugging Face model registry counts and OECD AI Policy Observatory regulation counts. Three databases exist at scale — OECD AIM (~13,750 entries), AIAAIC (~2,200+), and the AIID/CSET subset used here — and the choice of the smallest is explained in Article 1: it is the only one that codes each incident for autonomy level, harm severity, sector, and dozens of other variables simultaneously. The cost is sample size. Findings that clear significance thresholds are marked confirmed; those that do not are flagged as directional or not significant.

Five Findings and What They Mean

Finding 1: Complexity is not where the risk concentrates

Multi-entity incidents — those involving multiple developers, deployers, or AI systems — show *lower* average harm severity (1.17) than single-entity incidents (1.42). The difference is not statistically significant (p = 0.897), but the direction is the opposite of what complexity-risk intuition predicts.

**What this means for general counsel and risk committees:** The instinct to focus governance attention on the most complex multi-agent deployments may be misplaced. Incidents involving a single, less-visible operator with fewer safety resources are at least as dangerous in the current record. The risk is not in the number of parties; it is in the absence of institutional capacity behind them.

**The caveat that matters:** Agent-to-agent transactions at scale are recent. The AIID may not yet capture the class of cascading failures that multi-agent commerce introduces. The data says the problem has not arrived. It does not say the problem will not arrive.

**The strategic choice:** Organisations building agentic supply chains can either front-load attribution architecture (logging which agent made which decision at each handoff, before something goes wrong) or rely on forensic reconstruction after an incident. Xu’s 2026 proposal for blockchain-based agent accountability ledgers is one version of front-loading: every agent action recorded on a tamper-proof ledger, creating an audit trail at the infrastructure level. The limitation is that recording what happened and determining whether it was appropriate are different problems. A ledger is an audit tool, not a governance framework. But organisations that build neither will be reconstructing from partial logs when the first multi-agent failure hits litigation.
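To make the front-loading option concrete, here is a minimal sketch of a tamper-evident handoff log. It is not Xu’s design and not a blockchain: it is a single-process, hash-chained list, and every name in it (`HandoffLog`, `agent_id`, the example actions) is an assumption made for illustration. The point it shows is narrow: each entry commits to the one before it, so a later edit to any record is detectable.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the entry body."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


@dataclass
class HandoffLog:
    """Hypothetical append-only log; each entry stores the hash of its predecessor."""
    entries: list = field(default_factory=list)

    def append(self, agent_id: str, action: str, detail: dict) -> dict:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        body = {
            "seq": len(self.entries),
            "agent_id": agent_id,   # which agent acted
            "action": action,       # what it did at this handoff
            "detail": detail,
            "prev_hash": prev_hash,
        }
        entry = dict(body, hash=_digest(body))
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev_hash = GENESIS
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
                return False
            prev_hash = entry["hash"]
        return True
```

A log like this answers "which agent did what, in what order" and flags after-the-fact edits. It says nothing about whether the recorded action was appropriate, which is the limitation noted above.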

Finding 2: Governance is not keeping pace, by any measure

The ratio of publicly available AI models to binding AI regulations was approximately 2.5:1 in 2018. By 2025, it was 29:1. The trend fits an exponential model (R² = 0.898, p = 0.0003). Governance is not closing the gap. The gap is accelerating.
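As a rough sanity check on what "exponential" means here, the two endpoint ratios alone imply a compound annual growth rate. The sketch below uses only the 2018 and 2025 figures quoted above. It is not the fitted model behind the R² value, and the function names are illustrative.

```python
import math


def implied_annual_growth(start_ratio: float, end_ratio: float, years: int) -> float:
    """Compound annual growth rate implied by two endpoint values."""
    return (end_ratio / start_ratio) ** (1 / years) - 1


def doubling_time(annual_growth: float) -> float:
    """Years for the ratio to double at a constant compound growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


growth = implied_annual_growth(start_ratio=2.5, end_ratio=29.0, years=2025 - 2018)
print(f"implied annual growth: {growth:.1%}")
print(f"implied doubling time: {doubling_time(growth):.2f} years")
```

On these endpoints the ratio grows by about 42 per cent a year, which means it doubles roughly every two years. An endpoint calculation is cruder than a regression over all eight years, so treat it as an order-of-magnitude reading rather than a restatement of the finding.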

**Three honest caveats.** Models are not agents — most Hugging Face listings are research artefacts, not autonomous economic actors. Not all models require their own regulation. And binding regulation is not the only form of governance; contractual clauses, insurance requirements, and internal policies exert pressure that the OECD does not count. The 29:1 ratio overstates the precise gap.

**What the caveats do not resolve:** Even at a generous discount, the direction is exponential separation, not convergence. Arbel, Salib, and Goldstein’s 2026 paper exposes a deeper issue: until regulators agree on how to *count* AI systems for legal purposes (is a base model fine-tuned into ten thousand variants one system or ten thousand?), the governance gap cannot even be accurately measured, let alone closed.

**Who this affects most directly:** CFOs and audit committees. The gap means that an organisation deploying AI agents is operating in a space where binding regulatory standards have not yet been set. This is a commercial advantage (move fast, fewer constraints) and a financial risk (when regulation does arrive, retroactive compliance costs tend to exceed proactive compliance costs — the pattern is consistent across financial regulation, data protection, and environmental law).

**The strategic choice:** Build governance infrastructure now, when the organisation has discretion over its design, or build it later under regulatory duress, when the design is dictated from outside. Neither is free. The first costs time and resources on standards that may not match whatever regulation eventually arrives. The second risks retrofitting a framework onto systems that were not designed for it.

Finding 3: Rights violations scale with autonomy — and the exposure is specific

High-autonomy systems (Levels 2 and 3) show a rights violation rate 2.67 times higher than human-controlled systems (Fisher’s exact test: p = 0.030). This is a confirmed finding. The correlation-not-causation caveat is real: higher-autonomy systems tend to be deployed in rights-sensitive domains (content moderation, facial recognition, automated hiring, predictive policing) precisely because the volume of decisions exceeds human capacity. There is also a detection asymmetry: when a Level 3 system violates a right, there is no human intermediary to absorb the attribution, so the AI is named directly. The measured rate may partly reflect clearer visibility rather than higher actual incidence.

**Who this affects most directly:** Boards and ethics committees, but also procurement. The finding means that the autonomy level of an AI system is a material variable in its rights-risk profile. An organisation that treats a Level 1 recommendation engine and a Level 3 autonomous decision-maker as equivalent in its rights-impact assessment is not accounting for a 2.67× differential.

**The strategic choice and its commercial dimension:** Reducing autonomy reduces rights-violation exposure but also reduces the speed and throughput that made the agent valuable. A content moderation system that flags posts for human review (Level 1) is safer but handles a fraction of the volume. The choice is between accepting higher rights-risk at higher autonomy, or accepting lower operational capacity at lower autonomy. The organisations that resolve this well will be the ones that vary autonomy level by decision sensitivity rather than applying a single setting across all use cases — high autonomy for low-stakes, high-volume decisions; human review for rights-sensitive ones.
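One way to picture autonomy that varies by decision sensitivity is a small routing policy. The sketch below is an assumption-laden illustration, not a recommended taxonomy. The rights-sensitive categories are the domains named under this finding; the low-stakes categories and all identifiers are hypothetical. Anything the policy does not recognise defaults to human review.

```python
from enum import Enum


class Autonomy(Enum):
    HUMAN_REVIEW = "agent recommends, a human decides"
    AUTONOMOUS = "agent acts without prior review"


# Domains the article names as rights-sensitive.
RIGHTS_SENSITIVE = {
    "content_moderation",
    "facial_recognition",
    "automated_hiring",
    "predictive_policing",
}

# Hypothetical examples of high-volume, low-stakes decisions.
LOW_STAKES = {"inventory_reorder", "shipment_routing"}


def assign_autonomy(decision_type: str) -> Autonomy:
    """Grant full autonomy only to explicitly listed low-stakes decisions."""
    if decision_type in LOW_STAKES and decision_type not in RIGHTS_SENSITIVE:
        return Autonomy.AUTONOMOUS
    # Rights-sensitive and unrecognised decision types both go to a human.
    return Autonomy.HUMAN_REVIEW


for decision in ("inventory_reorder", "automated_hiring", "novel_case"):
    print(decision, "->", assign_autonomy(decision).name)
```

The design choice worth noting is the default. An allow-list for autonomy fails safe when a novel decision type appears; a deny-list for human review would fail open.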

Chaffer and colleagues’ 2026 proposal for distributed legal infrastructure offers a third path: embed compliance verification into the agent’s transaction layer, so that rights-sensitive decisions are caught in real time rather than reviewed after the fact. This is governance at machine speed rather than litigation speed. Whether it works depends on whether rights-sensitivity can be reliably encoded in rules — straightforward for known categories, fragile for novel situations.

Finding 4: The “who do you sue?” problem is real but not yet a crisis

Across all 214 incidents, 89.7 per cent identify both a deployer and a developer. The “who do you sue?” question, frequently raised as the agentic economy’s central governance challenge, has a clear answer for most incidents in the current record: the organisations are named. But the number shifts with autonomy. At Level 3 (fully autonomous), the unknown-entity rate doubles to 22.6 per cent. This is the structural trend that matters: as systems act more independently, the connection between the action and the accountable organisation becomes harder to establish in public reporting.

**Who this affects most directly:** General counsel, company secretaries, and insurers. An organisation that deploys Level 3 agents has roughly a one-in-five chance of being in a situation where the public record does not clearly link the incident to a responsible party. For the organisation itself, this may seem like protection. It is not. An unattributed incident is a litigation discovery waiting to happen; it is also a regulatory red flag in jurisdictions that are moving toward mandatory AI incident reporting (the serious-incident reporting duty in Article 73 of the EU AI Act, numbered Article 62 in the Commission’s original proposal, for example).

**The forward-looking implication:** The current attribution rate (89.7 per cent) describes a world where most AI incidents involve a single organisation’s system. The agentic economy is moving toward agent-to-agent chains: a procurement AI that contracts with a supplier’s pricing AI, which triggers a logistics AI’s routing decision. When the shipment arrives at the wrong specification, tracing the causal chain through three automated handoffs is a fundamentally different attribution problem. Fagan’s 2026 analysis of autonomous AI and ownership rules identifies the endpoint: when an AI’s outputs are sufficiently distant from any human decision, existing ownership and liability doctrines — which assume a traceable human principal — no longer apply. For boards, this intersects with intellectual property strategy as much as liability: if an agent autonomously generates a valuable output during a transaction, who holds the rights?

**The strategic choice:** Invest in attribution infrastructure (agent identity registries, transaction logging, chain-of-custody documentation for every automated handoff) or accept that attribution gaps will be resolved through litigation, at litigation cost. The insurance market is beginning to price this: organisations with demonstrable agent-governance documentation are likely to face lower premiums than those without, as AI-specific underwriting matures.
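A minimal sketch of what chain-of-custody tracing could look like, assuming each handoff record names the upstream handoff that triggered it and that an agent identity registry maps agents to accountable organisations. The record fields, agent names, and organisation names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Handoff:
    handoff_id: str
    agent_id: str
    triggered_by: Optional[str]  # upstream handoff_id, or None at the origin


def trace_chain(handoffs, last_id, registry):
    """Walk upstream from the final handoff; return (agent, organisation) pairs, origin first."""
    by_id = {h.handoff_id: h for h in handoffs}
    chain, seen, current = [], set(), last_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in handoff records at {current}")
        seen.add(current)
        record = by_id.get(current)
        if record is None:
            # A gap in the logs: attribution stops here.
            chain.append((current, "MISSING RECORD"))
            break
        chain.append((record.agent_id, registry.get(record.agent_id, "UNKNOWN")))
        current = record.triggered_by
    chain.reverse()
    return chain


handoffs = [
    Handoff("h1", "procurement-ai", None),
    Handoff("h2", "pricing-ai", "h1"),
    Handoff("h3", "logistics-ai", "h2"),
]
registry = {"procurement-ai": "Buyer Co", "pricing-ai": "Supplier Co"}  # logistics-ai unregistered
for agent, organisation in trace_chain(handoffs, "h3", registry):
    print(agent, "->", organisation)
```

The two failure modes it surfaces, an unregistered agent and a missing upstream record, are the attribution gaps that would otherwise be settled in litigation.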

Finding 5: The inequality question is not in the incident data — it is ahead of it

High-autonomy systems are not statistically more likely to discriminate than low-autonomy systems (χ² = 0.72, p = 0.397). The hypothesis is not confirmed.

But 47.2 per cent of all AI incidents, regardless of autonomy level, involve vulnerable populations (minors, protected characteristics, or a recognised discrimination basis). This is not an autonomy problem. It is a deployment-context problem: AI systems at every autonomy level are frequently placed in decisions that affect people who are least equipped to challenge them.

**The deeper implication, from the research:** Sharp, Bilgin, Gabriel, and Hammond’s 2025 work on agentic inequality reframes the question. The incident data captures individual AI failures. What it cannot capture is *structural* inequality: the systemic advantage that accrues to those who deploy AI agents over those who are subject to them. An AI agent that negotiates a contract for a sophisticated principal is optimising for that principal’s interests, potentially at the expense of counterparties who lack equivalent capability. Scale this across labour markets, financial services, and public-benefit allocation, and the result is a new axis of inequality between agent-deployers and agent-subjects.

Catalini, Hui, and Wu’s 2026 analysis adds economic precision. The traditional dividing line between automatable and non-automatable work was routine versus non-routine. The agentic economy shifts this to measurable versus non-measurable: tasks whose outputs can be quantified (and therefore safely delegated to agents) versus tasks that resist quantification (and therefore resist safe delegation). The implication for workforce governance is that the roles at greatest risk are not necessarily the lowest-paid or least-skilled. They are the ones whose outputs are most easily measured. Some highly paid analytical roles are more automatable than some lower-paid roles that involve complex, unstructured human interaction.

**Who this affects most directly:** Boards with workforce exposure, remuneration committees, and organisations in public-facing sectors (financial services, healthcare, public administration) where the populations affected by agentic decisions are also the populations least likely to have their own agents advocating for them.

**The strategic choice:** Treat inequality impact as a compliance exercise (assess when required, report when mandated) or treat it as a strategic risk (assess proactively, because the reputational and regulatory cost of being identified as a driver of agentic inequality will arrive faster than the formal mandate). The 47.2 per cent vulnerable-population rate is the evidence that the exposure is already embedded in AI deployment patterns, regardless of autonomy.

The Five Choices This Article Surfaces

Each finding points to a decision that an organisation deploying AI agents is already making, whether explicitly or by default:

**1. Attribution architecture: before or after?** Build agent identity registries, transaction logs, and chain-of-custody documentation now, when design discretion exists, or reconstruct the chain from partial records after the first multi-agent failure reaches litigation.

**2. Governance timing: proactive or retroactive?** The 29:1 governance gap means formal regulation has not arrived for most agentic activity. Organisations can set their own standards now (with the risk of divergence from eventual regulation) or wait for regulatory direction (with the risk of retroactive compliance costs and embedded technical debt).

**3. Autonomy calibration: uniform or variable?** A rights-violation rate 2.67× higher for high-autonomy systems argues against a single autonomy setting across all use cases. The operational question is whether the organisation can match autonomy level to decision sensitivity (high autonomy for high-volume, low-stakes processes; human review for rights-sensitive ones) and whether the infrastructure supports that granularity.

**4. Liability posture: tool or delegate?** If the agent is a tool, the deployer is responsible for everything it does. If it is a delegate, the liability chain is ambiguous. Neither posture is cost-free: the first constrains the agent’s operational scope; the second constrains what the organisation can disclaim when the agent acts outside expectations.

**5. Inequality: compliance or strategic risk?** Nearly half of all AI incidents already involve vulnerable populations. Organisations can treat this as a disclosure obligation or as an early signal that the reputational and regulatory costs of agentic inequality will arrive before the formal mandates.

None of these choices has a self-evidently correct answer. All of them are being made right now, in many cases by default, because the absence of a deliberate choice is itself a choice (to accept ambiguity, to defer governance, to apply uniform autonomy settings, to treat inequality as someone else’s problem). The function of the data is not to prescribe the right answer but to make the choice visible.

What Comes Next

The agentic economy’s governance challenges are real but uneven. Multi-entity complexity has not yet produced worse outcomes in the incident record. Attribution is better than commonly assumed for most incidents. Discrimination patterns do not differ by autonomy level. But the governance gap is widening, rights violations correlate with autonomy, and the structural inequality question (who benefits from agents and who bears their costs) is ahead of the data, not behind it.

The legal system’s capacity to adapt is not zero. It has handled distributed causation, electronic commerce, and corporate agency before. But the pace of adaptation is the central tension. Kolt, Arbel, Chaffer, Xu, and Fagan each propose different architectures (liability reform, individuation standards, distributed legal infrastructure, blockchain ledgers, ownership doctrine revision), and none has been tested at the scale the agentic economy will demand.

The next article turns to a domain where the governance stakes are physical rather than economic: robotics and embodied AI. When an AI system is not just transacting in a market but moving through a warehouse, driving on a road, or assisting in surgery, the consequences of governance failure are measured in injuries and lives, not contracts and dollars.

Tanya Matanda is a governance strategist bridging institutional oversight, AI governance, and fiduciary resilience. Her work supports boards, LPs, and regulators in designing governance systems fit for the AI era.

Rights Reserved. 2026 Matanda Advisory Services

Research and Audio Supported by AI Systems

Methodology and references available on request
