AI Agents and Legal Risk: What Every General Counsel Needs to Know Before Deployment

Agentic Runbook

For most of the last decade, General Counsel could engage with AI at arm’s length. Review the vendor’s terms of service. Add an AI use policy to the employee handbook. Make sure the data processing agreement covers the new SaaS tool. That was enough.

It is no longer enough.

AI agents — systems that don’t just generate text but take autonomous actions, call external tools, read and write data, make decisions that trigger downstream workflows, and operate continuously without a human in the loop — are a categorically different legal surface than a chatbot or a writing assistant. When an agent books a vendor contract, denies a customer request, screens a job applicant, or advises on a regulated matter, the legal questions aren’t hypothetical. They’re immediate and they’re live.

The companies deploying these systems in 2025 and 2026 are doing so faster than their legal functions are catching up. The result is a deployment pattern where the technology decision was made by engineering and the business, the risk was assessed by neither, and the GC is finding out about it when something goes wrong — or when a regulator asks.

This post is a practitioner-level framework for General Counsel, CCOs, and VP Legal at mid-market companies who need to get ahead of that curve. It covers the five legal risk categories that matter most for agentic AI deployment, the human-in-the-loop requirements that are legally mandated versus best practice, what to demand from any AI vendor or consultant before you sign, and a 10-item pre-deployment checklist that legal can own.


Why Agentic AI Is Different: The Judgment-Call Gap

Traditional enterprise software executes defined logic. If the input is X, the output is Y. The legal questions are mostly about what the software does with data and who owns the system. AI agents are different in one critical way: they exercise judgment.

An agent deciding whether to escalate a customer complaint, which vendor to recommend from a shortlist, whether a contract clause is acceptable, or how to respond to a regulatory inquiry is making a judgment call that in a pre-AI world would have been made by a human employee. That human employee was subject to a defined scope of authority, accountable to a manager, subject to employment law, and legally attributable in ways that were well-understood.

The judgment-call gap is the space between what an agent does autonomously and what your legal, compliance, and governance frameworks were designed to govern. Filling that gap before deployment — not after — is the GC’s job.

The five risk categories below are where the gap is widest.


Risk Category 1: Data Privacy and PII

The core exposure

AI agents are data-hungry. They read documents, query databases, search email, process customer records, and in many architectures maintain persistent memory across interactions. Every one of those data flows implicates your privacy obligations under GDPR, CCPA/CPRA, and any applicable sectoral privacy regime.

The specific GDPR provision that most agents trigger is Article 22, which restricts solely automated decision-making that produces legal or similarly significant effects on individuals. If your agent is denying insurance claims, setting credit limits, prioritizing customer service queues, or making any other consequential determination about a natural person without human review, Article 22 applies. Its safeguards (the right to obtain human intervention, to express a point of view, and to contest the decision) must be operationalized in the agent’s architecture, not just acknowledged in a privacy notice.

CCPA/CPRA adds deletion rights and opt-out rights that interact uncomfortably with fine-tuned models. If a consumer exercises their right to deletion of personal data and that data was used to train or fine-tune the agent model, “deletion” requires more than removing the record from your CRM. You need a documented protocol for model-level data excision or retraining — and a clear answer to the question of whether customer data was used for fine-tuning in the first place.

Data Processing Agreements (DPAs) with your AI vendor or hosting provider must be in place before any personal data flows through the agent. This is a contractual prerequisite, not an administrative checkbox. If the vendor’s DPA doesn’t specify data residency, processing purposes, sub-processor disclosure, and deletion obligations, it’s not adequate. Many standard AI vendor DPAs were drafted for SaaS tools and haven’t been updated to address the data handling profile of agentic systems.

What to get right before deployment

  • Conduct a Data Protection Impact Assessment (DPIA) for any agent processing personal data at scale. GDPR Art. 35 requires it for high-risk processing; even where it’s not strictly required, a DPIA forces the documentation discipline that a regulator or plaintiffs’ attorney will ask for.
  • Map every data flow: what PII the agent reads, what it writes, where it stores context, what gets sent to the model provider, and what gets logged.
  • Document your legal basis for each processing purpose. “Legitimate interests” is not a blanket pass — it requires a balancing test that should be documented.
  • Establish a deletion protocol with specific SLAs. At Agentic Runbook, our standard includes a 72-hour SLA for PII deletion from active agent state and a 30-day SLA for cryptographic erasure from cold storage and backups. These timelines need to be in your DPA and technically verifiable.
  • If the agent processes data across jurisdictions, map the data transfer mechanisms. Standard Contractual Clauses (SCCs) remain the primary mechanism for EU-to-third-country transfers; verify that your vendor has executed current SCCs and that their sub-processors are covered.
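
The deletion SLAs above are only useful if someone can check them. A minimal sketch of such a check follows, assuming a hypothetical DeletionRequest record with one timestamp per erasure stage; the record shape and field names are illustrative, and the 72-hour and 30-day windows are simply the figures quoted above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

# SLA windows taken from the figures quoted above; adjust to match your DPA.
ACTIVE_STATE_SLA = timedelta(hours=72)
COLD_STORAGE_SLA = timedelta(days=30)


@dataclass
class DeletionRequest:
    """Hypothetical record of one data-subject deletion request."""
    request_id: str
    received_at: datetime
    active_state_erased_at: Optional[datetime] = None
    cold_storage_erased_at: Optional[datetime] = None


def sla_breaches(req: DeletionRequest, now: datetime) -> list[str]:
    """Return the SLA stages that are overdue or were completed late."""
    breaches = []
    stages = [
        ("active_state", req.active_state_erased_at, ACTIVE_STATE_SLA),
        ("cold_storage", req.cold_storage_erased_at, COLD_STORAGE_SLA),
    ]
    for name, erased_at, window in stages:
        deadline = req.received_at + window
        # A stage that has not finished yet is judged against the current time.
        completed = erased_at if erased_at is not None else now
        if completed > deadline:
            breaches.append(name)
    return breaches


received = datetime(2025, 3, 1, 9, 0, tzinfo=timezone.utc)
req = DeletionRequest(
    request_id="req-001",
    received_at=received,
    active_state_erased_at=received + timedelta(hours=80),  # 8 hours late
)
print(sla_breaches(req, now=received + timedelta(days=10)))
```

Run on day 10, this reports only the late active-state erasure, because the cold-storage window is still open. A report like this, generated on a schedule, is one way to make the SLA "technically verifiable" rather than a contractual promise alone.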

Risk Category 2: Contract and Liability Exposure

When agents make consequential errors

An agent that recommends the wrong vendor contract term, approves a transaction it shouldn’t, commits your company to a course of action based on a hallucinated fact, or sends a communication that creates an unintended legal obligation has created a liability event. The question is: who bears it, and was it foreseeable?

Under contract law, the relevant questions are: Did the agent have actual or apparent authority to bind the company? Was the counterparty’s reliance reasonable? Is the company estopped from denying the agent’s action? These are not novel doctrines — they’re standard agency law applied to a new fact pattern. Courts have been resolving agency authority questions for centuries; agentic AI doesn’t change the doctrine, but it does change how often the question arises and how fast.

Your vendor’s indemnification terms deserve specific scrutiny. Standard AI vendor indemnification provisions typically exclude: (a) outputs that result from customer-provided inputs, (b) use cases outside the documented permitted use, and (c) modifications to the model or system made by the customer. If your agent hallucinates a contract term and your company acts on it, the vendor will argue that the output was a function of your prompting, your fine-tuning, and your deployment configuration — none of which they indemnify. You need to know what your actual indemnification coverage looks like before you’re relying on it.

Decision logging is the contractual and evidentiary foundation for defending against liability claims. Every consequential agent decision — every tool call, every output that triggered a downstream action, every escalation or non-escalation — needs to be logged with sufficient detail to reconstruct what the agent knew, what it decided, and why. Without that audit trail, you cannot defend against a negligence claim, you cannot satisfy a regulatory inquiry, and you cannot diagnose what went wrong to prevent recurrence. At Agentic Runbook, full audit trail on all agent decisions is a non-negotiable architecture requirement, not an optional feature.

Practical steps

  • Review and annotate your AI vendor’s limitation of liability and indemnification provisions before deployment. Identify the gap between what you’re exposed to and what the vendor will cover.
  • Define the agent’s scope of authority explicitly in writing, and ensure that any external-facing communications from the agent are clearly attributed to the AI system, not to a human representative of your company.
  • Implement decision logging at the architecture level — not as an afterthought. Your logging should capture: inputs received, reasoning steps taken, tools called, outputs generated, and the human review step (if any) before action was taken.
  • Consider whether agent errors are covered under your existing E&O or professional liability policies. Most are not. Work with your insurance broker on an AI endorsement or a standalone policy before you need it.
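
To make the logging requirement concrete, here is one possible shape for a decision record, written as a sketch. It covers the five elements listed above: inputs, reasoning steps, tool calls, output, and the human review step. Every class and field name is a hypothetical chosen for illustration, not a standard schema.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class HumanReview:
    """Who reviewed the decision, what they decided, and when."""
    reviewer_id: str
    approved: bool
    reviewed_at: str


@dataclass
class DecisionRecord:
    """One consequential agent decision. All field names are illustrative."""
    decision_id: str
    agent_id: str
    inputs: dict
    reasoning_steps: list
    tool_calls: list
    output: str
    human_review: Optional[HumanReview] = None  # None means no review occurred
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json_line(self) -> str:
        # sort_keys gives a stable layout that is easier to diff and to hash.
        return json.dumps(asdict(self), sort_keys=True)


record = DecisionRecord(
    decision_id="dec-0001",
    agent_id="vendor-intake-agent",
    inputs={"request": "renew contract", "vendor": "Acme"},
    reasoning_steps=["Renewal is within the pre-approved budget"],
    tool_calls=[{"tool": "lookup_budget", "args": {"vendor": "Acme"}}],
    output="Recommend renewal; route to procurement for signature",
    human_review=HumanReview("jdoe", True, "2025-03-01T09:30:00+00:00"),
)
line = record.to_json_line()
print(line)
```

Writing one JSON object per line keeps the log append-only and easy to sample during a legal review. Recording an explicit null for a missing review, instead of omitting the field, makes non-reviewed decisions searchable.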

Risk Category 3: IP Ownership

Who owns what the agent creates

The IP questions around agentic AI are genuinely unsettled in most jurisdictions, and the practical stakes are higher than most companies realize at deployment time.

Copyright in AI outputs under current U.S. Copyright Office guidance requires human authorship. Purely AI-generated outputs are not copyrightable. But the real question for most enterprise deployments isn’t “is this output copyrightable?” — it’s “who owns the code, the trained model, the fine-tuning data, and the agent system itself?” Those questions have answers, but they depend entirely on your vendor and consulting agreements.

At Agentic Runbook, our default is full IP transfer to the client: the agents we build, the code we write, and any fine-tuned model weights are the client’s property at handoff. We document this explicitly in our engagement agreements. That’s not the market default. Many AI consulting engagements retain rights to the underlying architecture, “background IP,” or model improvements — meaning your agent system may not belong to you even after you’ve paid to build it. Read the IP assignment clause before signing.

Fine-tuning data ownership compounds this. If you fine-tune a model on your proprietary documents, transaction records, or customer data, the resulting model weights contain implicit representations of that data. The question of who owns the fine-tuned model — and what rights you retain to your data as embedded in the weights — is not answered by most standard vendor agreements. It needs to be addressed explicitly.

Third-party training data exposure creates the inverse risk: the foundation model underlying your agent was trained on data that may include copyrighted material. Several major lawsuits are currently testing whether model outputs that closely reproduce training data constitute copyright infringement. This risk is concrete, not abstract, and it’s a reason to prefer model providers with documented, licensed training data provenance and indemnification for IP claims arising from their base models.

Practical steps

  • Audit the IP provisions in every AI vendor and consulting agreement before signing. Ask specifically: who owns the agent code, the fine-tuned weights, and any improvements made during the engagement?
  • Document your fine-tuning data: what data was used, what rights you hold in that data, and whether it includes third-party material that could create derivative work issues.
  • Include IP representations and warranties from your AI vendor covering their training data’s provenance. If they won’t provide them, price the risk accordingly.
  • For agents that produce customer-facing outputs (reports, analyses, creative work), include appropriate disclaimers about AI-assisted generation where business context warrants it.

Risk Category 4: Employment Law Implications

Automated employment decisions

If your agent participates in any stage of hiring, promotion, performance management, compensation-setting, or termination decisions, you are operating an automated employment decision tool — and several specific legal regimes apply.

EEOC guidance on AI in employment (the 2023 technical assistance document and subsequent agency coordination) takes the position that disparate impact liability applies to AI-assisted hiring tools. If your agent’s screening criteria produce statistically adverse outcomes for a protected class — regardless of the intent behind the criteria — you have an EEOC exposure. Your obligation to validate the tool for job-relatedness and business necessity doesn’t disappear because the decision-making is automated; if anything, the automation makes the pattern easier for a regulator to identify.

New York City Local Law 144 (enforced since July 2023) requires employers using automated employment decision tools in NYC to conduct annual bias audits by an independent third party, publish a summary of the audit results, and provide notice to candidates and employees. This is not speculative; it’s an active enforcement regime. If you have employees or applicants in NYC and your agent touches hiring or promotion decisions, this law applies. Similar legislation is moving through Illinois and California, and the EU AI Act imposes its own obligations on employment-related AI.
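
A bias audit typically starts from a simple calculation: each group’s selection rate divided by the rate of the most-selected group. The sketch below shows that arithmetic under stated assumptions. The group labels and counts are invented, and the 0.8 cutoff is the "four-fifths" rule of thumb from the EEOC’s Uniform Guidelines, used here only as a screening heuristic. It is not a legal safe harbor, and it does not substitute for an independent audit.

```python
from dataclasses import dataclass

FOUR_FIFTHS_THRESHOLD = 0.8  # screening heuristic only, not a legal safe harbor


@dataclass(frozen=True)
class GroupOutcome:
    """Illustrative screening counts for one demographic category."""
    group: str
    applicants: int
    selected: int

    @property
    def selection_rate(self) -> float:
        return self.selected / self.applicants


def impact_ratios(outcomes: list[GroupOutcome]) -> dict[str, float]:
    """Each group's selection rate divided by the highest group's rate."""
    if any(o.applicants <= 0 for o in outcomes):
        raise ValueError("every group needs at least one applicant")
    best = max(o.selection_rate for o in outcomes)
    if best == 0:
        raise ValueError("no group had any selections")
    return {o.group: o.selection_rate / best for o in outcomes}


def flagged_groups(outcomes: list[GroupOutcome]) -> list[str]:
    """Groups whose impact ratio falls below the four-fifths threshold."""
    ratios = impact_ratios(outcomes)
    return [g for g, r in ratios.items() if r < FOUR_FIFTHS_THRESHOLD]


# Invented numbers: group_a is selected at 30%, group_b at 20%.
outcomes = [
    GroupOutcome("group_a", applicants=200, selected=60),
    GroupOutcome("group_b", applicants=150, selected=30),
]
print(impact_ratios(outcomes))
print(flagged_groups(outcomes))
```

In this invented example group_b’s ratio is roughly 0.67, so it is flagged for closer review. Small samples make ratios like this unstable, which is one reason an auditor will usually pair them with statistical significance testing.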

The EU AI Act classifies employment-related AI systems as “high risk” — a designation that triggers mandatory conformity assessments, human oversight requirements, transparency obligations, and registration in the EU database. If you operate in the EU, high-risk AI in employment is not a future compliance requirement; it’s a live one from August 2026.

Monitoring laws add a third layer. Agents that monitor employee productivity, communications, or behavior implicate state-level electronic monitoring notification statutes (New York, Connecticut, Delaware, others) and, in the EU, works council co-determination rights. If you have a European works council, they may have consultation or co-determination rights before you deploy an agent that monitors employee behavior — even if the monitoring is incidental to the agent’s primary function.

Practical steps

  • Map every agent touchpoint with employment decisions before deployment. Include not just hiring agents but any agent that generates performance data, flags productivity concerns, or routes work assignments.
  • Commission a pre-deployment bias audit for any automated employment decision tool. This is a legal prerequisite in NYC and a best practice everywhere else.
  • Review electronic monitoring notification requirements in every jurisdiction where your employees are located. Issue notices where required before the agent goes live.
  • Engage works councils in EU jurisdictions early — not after the deployment decision is made. The consultation requirement is procedural; getting it right is easier than restarting the process.

Risk Category 5: Sector-Specific Compliance

Financial services

Agents deployed in financial services are subject to a compliance regime that most technology deployments don’t face. SR 11-7 (the Federal Reserve’s model risk management guidance, applicable to banks and their supervised subsidiaries) requires model validation, documentation of model assumptions and limitations, independent review, and ongoing monitoring of model performance, all of which apply to AI agent models used in credit, trading, or risk management decisions.

SEC and FINRA have both issued guidance and proposed rules on AI use in investment advice and broker-dealer operations. The critical issue is the “best interest” standard: if an agent is providing investment recommendations or information that a customer could reasonably rely on as advice, it must meet Regulation Best Interest requirements, and the agent’s decision logic must be documented and auditable.

Healthcare

HIPAA Business Associate Agreements (BAAs) are required before any Protected Health Information (PHI) flows through an AI agent. Your model provider, hosting provider, and any sub-processors handling PHI must be covered under a BAA. Many standard AI vendor agreements don’t include BAA terms — they need to be negotiated specifically.

Software as a Medical Device (SaMD) classification under FDA guidance applies if your agent is making or contributing to clinical decisions. The threshold is lower than many organizations expect: an agent that analyzes patient data and generates a recommendation that a clinician acts on may qualify as SaMD. FDA clearance timelines and requirements are significant; if your healthcare agent might trigger SaMD classification, engage regulatory counsel before the build, not after.

Legal services

Agents providing legal information, drafting legal documents, or operating in contexts where a client might reasonably rely on the output as legal advice create Unauthorized Practice of Law (UPL) risk. The definition of UPL varies by jurisdiction, but the common thread is that legal advice, as distinct from legal information, requires a licensed attorney. An agent that tells a user what their rights are in a specific situation, or that recommends a specific legal course of action, is in UPL territory in most U.S. jurisdictions. Law firms deploying agents for client-facing work need specific ethics review and, in many states, disclosure requirements.

Government contracting

If your company holds federal contracts or is pursuing them, CMMC (Cybersecurity Maturity Model Certification) applies to any system handling Controlled Unclassified Information (CUI) — which increasingly includes AI agent systems processing contract data, technical data, or export-controlled information. The CMMC 2.0 framework’s requirements for access control, audit logging, incident response, and configuration management apply to the agent infrastructure, not just to traditional IT systems. Defense contractors deploying agents need to map their agent architecture to CMMC controls before deployment.


Human-in-the-Loop: Legally Required vs. Best Practice

The question isn’t whether to have human oversight of AI agents — it’s which decisions require it by law versus which require it as sound governance.

Legally mandated human oversight

GDPR Article 22 prohibits solely automated decisions that produce legal or similarly significant effects on data subjects, unless the individual has explicitly consented, the decision is necessary for a contract, or a specific legal authorization applies. Even then, the controller must implement suitable measures, including the right to obtain human intervention, to express a point of view, and to contest the decision. “Solely automated” means no meaningful human review in the decision path. A human who rubber-stamps an agent’s output without reviewing the inputs and reasoning does not provide meaningful human intervention for Article 22 purposes.

The EU AI Act mandates human oversight for all high-risk AI systems, including employment AI, credit scoring, law enforcement, critical infrastructure, and education. The people overseeing a high-risk system must be able to intervene, override, and if necessary disable it. This requires architecture-level interrupt capability, not just a policy that says “humans can override.”

EEOC and employment law don’t mandate human review in terms as explicit as GDPR, but the disparate impact liability framework effectively requires it: if you can’t explain and justify each automated employment decision, and if you haven’t conducted bias validation, your exposure in an EEOC enforcement action or Title VII litigation is significant.

Hard-gating architecture

The technical implementation of human-in-the-loop requirements matters. A policy that says “consequential decisions require human review” is not enforceable unless the system architecture makes it impossible to bypass. This means:

  • Interrupt gates at defined checkpoints in the agent workflow — the agent stops, presents its reasoning and proposed action, and waits for explicit human approval before proceeding. LangGraph’s interrupt() mechanism implements this at the framework level.
  • Immutable audit logs that record the human review step, the identity of the reviewer, the timestamp, and the decision made. A log that can be edited or deleted is not an audit log.
  • Scope-limited tool execution so that even if the interrupt gate fails, the agent cannot take high-consequence actions autonomously. Least-privilege tool permissions and operation-level rate limits are the backstop.
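
The first and third points can be illustrated with a framework-agnostic sketch. It does not use LangGraph’s interrupt(). Instead it puts the approval check inside the only code path that can run a tool, so a caller cannot skip it. The class names, tool names, and the shape of the Approval object are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class ApprovalRequired(Exception):
    """Raised when a high-consequence action is attempted without sign-off."""


class ToolNotPermitted(Exception):
    """Raised when the agent requests a tool outside its allowlist."""


@dataclass(frozen=True)
class Approval:
    """A reviewer's sign-off, bound to one specific proposed action."""
    reviewer_id: str
    action_id: str


@dataclass(frozen=True)
class Tool:
    name: str
    run: Callable[[dict], str]
    high_consequence: bool


class GatedExecutor:
    def __init__(self, tools: list[Tool]):
        # Least privilege: only tools registered here can ever run.
        self._tools = {t.name: t for t in tools}

    def execute(self, action_id: str, tool_name: str, args: dict,
                approval: Optional[Approval] = None) -> str:
        tool = self._tools.get(tool_name)
        if tool is None:
            raise ToolNotPermitted(tool_name)
        if tool.high_consequence:
            # The gate sits in the execution path, so it cannot be bypassed
            # by prompt changes or by a caller forgetting to ask.
            if approval is None or approval.action_id != action_id:
                raise ApprovalRequired(f"{tool_name} needs approval for {action_id}")
        return tool.run(args)


executor = GatedExecutor([
    Tool("draft_reply", lambda a: f"draft for {a['ticket']}", high_consequence=False),
    Tool("deny_claim", lambda a: f"claim {a['claim']} denied", high_consequence=True),
])

print(executor.execute("act-1", "draft_reply", {"ticket": "T-9"}))
try:
    executor.execute("act-2", "deny_claim", {"claim": "C-7"})
except ApprovalRequired as exc:
    print("blocked:", exc)
print(executor.execute("act-2", "deny_claim", {"claim": "C-7"}, Approval("jdoe", "act-2")))
```

Binding the approval to a specific action_id is a deliberate choice: a reviewer’s sign-off on one denial cannot be reused for another. A production system would also need to authenticate the reviewer and write the approval to the audit log, which this sketch leaves out.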

The distinction between a policy and a hard gate matters in litigation and regulatory enforcement: the presence of a policy that was not technically enforced is often worse than the absence of a policy at all.


What to Demand from an AI Vendor or Consultant

Before you sign an engagement agreement or master services agreement with an AI vendor or consulting firm, legal should be able to answer the following questions affirmatively:

Data handling: Does the agreement specify exactly what data the vendor can access, what they can use it for, whether it can be used for model training, and what happens to it at contract termination? If the vendor won’t contractually commit that your data won’t be used to train their models, that’s a data governance problem and potentially a regulatory one.

IP transfer: Does the agreement transfer full ownership of the agent code, any fine-tuned model weights, and all work product to your company upon delivery? Or does the vendor retain rights to “background IP,” “base platform,” or “model improvements” that effectively mean the agent system they built for you can be deployed for your competitors? Full IP transfer should be the baseline demand, not a negotiating luxury.

Liability terms: What does the vendor actually indemnify? Read the exclusions. Most AI vendor indemnification excludes outputs driven by customer inputs, customer-specified configurations, and customer modifications — which together cover almost every real failure scenario. Negotiate for broader indemnification coverage or price the unindemnified risk explicitly.

Audit trail and observability: Can the vendor demonstrate, not just represent, that every agent decision is logged with sufficient granularity to reconstruct the decision path? Ask to see a sample trace from their production systems. If they can’t produce one, their observability is not what they’re claiming.

Security and data residency: Where is the data processed? Where is it stored? Who are the sub-processors? Does the agreement give you the right to audit or require the vendor to provide SOC 2 Type II certification? Is there an incident response SLA?

Deletion obligations: What is the vendor’s documented SLA for data deletion upon request or at contract termination? Cryptographic erasure — not just logical deletion — should be specified for sensitive data, with verification available.


The 10-Item Pre-Deployment Checklist

This checklist is designed for GCs and VP Legal to use as a gate-review before any AI agent goes to production. It does not replace a full legal review, but it establishes a minimum standard.

1. DPIA completed and documented. A Data Protection Impact Assessment has been conducted for every agent that processes personal data. The DPIA documents the data flows, the legal basis for processing, the risks identified, and the mitigations implemented.

2. DPAs in place with all vendors. Data Processing Agreements — including sub-processor disclosures, data residency commitments, and deletion SLAs — are executed with every vendor whose systems handle personal data processed by the agent.

3. GDPR Article 22 analysis complete. If the agent makes decisions with legal or similarly significant effects on individuals, the Art. 22 exceptions have been identified, the human-review mechanism has been implemented in the architecture, and the privacy notice has been updated.

4. IP ownership confirmed in writing. The engagement agreement or vendor MSA has been reviewed and confirms that full IP ownership — code, model weights, fine-tuning data, work product — transfers to your company. No retained background IP that covers the deployed system.

5. Vendor indemnification scope documented. The exclusions from the vendor’s indemnification provision have been read, catalogued, and presented to relevant business stakeholders. Insurance coverage for unindemnified risk has been confirmed or procured.

6. Employment law audit completed. If the agent touches any employment decision, an independent bias audit has been conducted, NYC Local Law 144 requirements have been assessed, and monitoring notification requirements in all relevant jurisdictions have been met.

7. Sector-specific requirements addressed. Finance: SR 11-7 model validation documentation exists. Healthcare: BAA executed, SaMD classification assessed. Legal services: UPL analysis conducted, ethics review completed. Government contracting: CMMC control mapping completed.

8. Hard-gate human-in-the-loop implemented. For all decisions legally or operationally requiring human review, the interrupt gate is implemented at the architecture level — not just as a written policy. The audit log captures reviewer identity, timestamp, and decision.

9. Decision audit trail validated. A sample of agent decision logs has been reviewed by legal and confirmed to contain sufficient detail to reconstruct the decision path for regulatory inquiry or litigation purposes.

10. Deletion and breach response protocols documented. Data deletion SLAs (both for user-request deletion and contract-termination deletion) are documented and technically implemented. An AI-specific incident response plan is in place, with defined notification obligations if an agent-related data breach occurs.
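
Items 8 and 9 can be partly automated. The sketch below, offered as one possible approach, checks that each log entry carries the reviewer identity, timestamp, and decision, and that the entries form an unbroken SHA-256 hash chain. Hash chaining is a technique swapped in here for illustration: it makes edits detectable but does not by itself make a log immutable. The entry layout and field names are assumptions.

```python
import hashlib
import json

# Fields item 8 expects on every reviewed decision (names are illustrative).
REQUIRED_FIELDS = {"decision_id", "reviewer_id", "timestamp", "decision"}
GENESIS_HASH = "0" * 64


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with this entry's content."""
    body = json.dumps(payload, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: list, payload: dict) -> None:
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append({"payload": payload, "hash": entry_hash(prev_hash, payload)})


def validate_log(log: list) -> list:
    """Return a list of problems; an empty list means the log passed."""
    problems = []
    prev_hash = GENESIS_HASH
    for index, entry in enumerate(log):
        missing = REQUIRED_FIELDS - set(entry["payload"])
        if missing:
            problems.append(f"entry {index}: missing {sorted(missing)}")
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            problems.append(f"entry {index}: hash chain broken")
        prev_hash = entry["hash"]
    return problems


log = []
append_entry(log, {"decision_id": "dec-1", "reviewer_id": "jdoe",
                   "timestamp": "2025-03-01T09:30:00Z", "decision": "approved"})
append_entry(log, {"decision_id": "dec-2", "reviewer_id": "asmith",
                   "timestamp": "2025-03-01T10:05:00Z", "decision": "rejected"})
print(validate_log(log))  # untouched log

log[0]["payload"]["decision"] = "rejected"  # simulate an after-the-fact edit
print(validate_log(log))  # the edited entry no longer matches its hash
```

A check like this tells legal whether the sample they are reviewing has been altered since it was written. Whether the recorded reasoning is detailed enough to reconstruct the decision path still requires a human read.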


Closing: The Window to Get This Right Is Now

The regulatory environment for agentic AI is consolidating rapidly. The EU AI Act’s high-risk system requirements take effect in August 2026. State-level employment AI legislation is moving through multiple legislatures simultaneously. The FTC, CFPB, EEOC, and SEC have all issued guidance or enforcement signals on AI use in their respective domains. The question isn’t whether your AI agent deployments will be regulated; it’s whether you’ll have the documentation, the architecture, and the governance in place when they are.

The companies that get this right are the ones where legal was involved before the build, not after. A GC who reviews the vendor agreement after the agent is in production is reviewing a fait accompli. A GC who reviews the architecture before the build shapes it.

At Agentic Runbook, every engagement starts with a structured assessment — our Diagnostic Sprint — that includes legal and compliance surface area mapping alongside the technical and operational assessment. We transfer full IP ownership to clients, maintain documented data privacy ADRs with cryptographic erasure protocols, and build full audit trail capability into every agent architecture. Because the best time to handle these questions is before a single line of code is written.


Frequently Asked Questions

Q: Does GDPR Article 22 apply to every AI agent, or only specific use cases?

Article 22 applies specifically to decisions that (a) are solely automated, (b) produce legal or similarly significant effects on an individual, and (c) concern a natural person. Not every agent decision triggers it: an agent that summarizes documents or routes internal tickets is unlikely to be in scope. An agent that makes credit decisions, insurance determinations, or employment screening decisions, or that determines customer service outcomes affecting the customer’s legal rights or interests, is squarely in scope. The key threshold is “legal or similarly significant effects,” which the EDPB interprets broadly to include decisions that significantly affect an individual’s circumstances, behavior, or choices.
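
As a rough triage aid, the three-part test can be written down as a predicate. This is a sketch for sorting use cases into "needs legal review" and "probably fine", not a legal determination. The DecisionProfile fields are hypothetical, and the treatment of rubber-stamp review as still "solely automated" follows the reading of Article 22 given earlier in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionProfile:
    """Hypothetical description of one kind of decision an agent makes."""
    name: str
    concerns_natural_person: bool
    legal_or_significant_effect: bool
    human_reviews_before_effect: bool
    review_covers_inputs_and_reasoning: bool

    @property
    def solely_automated(self) -> bool:
        # A rubber-stamp review does not take the decision out of scope.
        return not (self.human_reviews_before_effect
                    and self.review_covers_inputs_and_reasoning)


def needs_article_22_review(profile: DecisionProfile) -> bool:
    """True when all three prongs of the test are met."""
    return (profile.solely_automated
            and profile.legal_or_significant_effect
            and profile.concerns_natural_person)


profiles = [
    DecisionProfile("summarize contract", False, False, False, False),
    DecisionProfile("set credit limit", True, True, False, False),
    DecisionProfile("screen applicant (rubber stamp)", True, True, True, False),
    DecisionProfile("deny claim (full review)", True, True, True, True),
]
for p in profiles:
    print(p.name, "->", needs_article_22_review(p))
```

The judgment calls live in how each field is filled in, particularly whether an effect is "similarly significant". That classification belongs to counsel; the code only makes the resulting inventory consistent.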

Q: If an AI agent makes a mistake that causes financial harm, who is liable?

The legal analysis depends on several factors: the agent’s scope of authority, whether the counterparty’s reliance was reasonable, the applicable standard of care, and the terms of the vendor agreement. Under current law, liability generally flows to the company deploying the agent — the vendor’s limitations of liability typically disclaim responsibility for outputs. The company deploying the agent is treated as its principal. This means your company is liable for the agent’s actions within the scope of the authority you’ve given it, subject to whatever contractual limitations you’ve negotiated with counterparties and vendors.

Q: What is the EU AI Act’s “high-risk” classification, and does it apply to my company?

The EU AI Act Annex III lists the categories of AI systems classified as high-risk, which include: biometric identification systems, critical infrastructure management, education and vocational training systems, employment and worker management, access to essential services (credit, insurance, benefits), law enforcement, migration and border control, and administration of justice. Any company operating in the EU or offering products and services to EU residents that deploys AI systems in these categories must comply with the high-risk requirements — regardless of where the company is headquartered. For mid-market U.S. companies with EU customers or employees, the high-risk classification is very likely to apply to at least some AI agent use cases.

Q: Can we use a standard vendor NDA to protect the IP in our AI agent system?

No. An NDA protects confidential information from disclosure — it doesn’t transfer IP ownership. You need an explicit IP assignment clause (or “work made for hire” provision where applicable) in the services agreement. The IP assignment needs to cover: the agent source code, any fine-tuned model weights, all documentation and training materials, and all work product produced during the engagement. If the agreement doesn’t have this language, the default rule in most jurisdictions is that the vendor retains ownership of what they created.

Q: What should be in an AI-specific incident response plan?

An AI-specific incident response plan should address: (1) detection — how you identify that an agent has malfunctioned, made systematically incorrect decisions, or been compromised; (2) containment — the ability to disable or roll back the agent without disrupting dependent systems; (3) notification — which incidents trigger breach notification obligations under GDPR (72-hour supervisory authority notification), state breach notification laws, and sector-specific requirements; (4) root cause analysis — the obligation to investigate what went wrong and document findings; and (5) remediation — the corrective actions taken and the changes to agent architecture or governance to prevent recurrence. The plan should be tested, not just written.


Get Legal Clarity Before You Build

Our Diagnostic Sprint maps the legal, compliance, and technical surface area of your AI agent deployment before a single line of code is written — covering data privacy, IP ownership, employment law obligations, and sector-specific requirements. Four weeks. Fixed scope. You own everything.

Agentic Runbook designs, builds, and transfers agentic AI systems for mid-market engineering, finance, and operations teams.