What is HIPAA-compliant AI?

66% of US physicians now use AI tools in clinical practice, yet only 23% of health systems have a signed Business Associate Agreement with their AI vendors.
That gap is not an infrastructure problem. The servers may be encrypted. The access logs may exist. The compliance failure happens in the data flow: an AI vendor processing patient records without a signed agreement, a memory store retaining diagnostic history without a retention policy, a scheduling agent receiving a full clinical record when it needed only a calendar slot.
HIPAA-compliant AI is not a certification, a product category, or a vendor designation. It is a property of how an organisation builds, deploys, and governs any AI system that touches protected health information. This article defines what that means in practice, what the law actually requires, and where most implementations go wrong.
What “HIPAA-compliant AI” actually means
No AI tool is inherently HIPAA-compliant. The US Department of Health and Human Services does not certify or approve AI products. HIPAA is a federal law enforced by the HHS Office for Civil Rights. It sets requirements for how covered entities and business associates must safeguard protected health information. Whether an AI system meets those requirements depends entirely on how the organisation implementing it governs the data that flows through it.
Vendors may obtain independent security attestations, such as SOC 2 Type II reports or HITRUST certification, to demonstrate the effectiveness of their security controls. These are valuable assurance frameworks. They are not substitutes for HIPAA compliance and do not resolve the organisation’s own obligations.
The distinction between “HIPAA-eligible” and “HIPAA-compliant” is where most organisations lose ground. A vendor being HIPAA-eligible means they will sign a Business Associate Agreement. It does not mean that using their API automatically makes an implementation compliant. That responsibility remains with the covered entity. Configuration, access management, data minimisation, and audit controls are all the organisation’s obligation, regardless of what the vendor’s marketing materials state.
The three HIPAA rules that apply to AI systems
HIPAA is not a single rule. Three distinct rules govern how AI systems that touch patient data must be built and operated.
The Privacy Rule
The Privacy Rule governs the use and disclosure of PHI. It establishes the minimum necessary standard: an AI system should only receive, process, and transmit the specific data fields required for its function. A triage agent does not need payment history. A billing agent does not need clinical notes. A scheduling agent does not need a diagnosis.
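In practice, the minimum necessary standard can be enforced in code as a per-agent field allowlist applied before any data leaves the system of record. A minimal sketch, assuming hypothetical agent names and field sets (nothing here reflects a specific EHR schema):

```python
# Hypothetical per-agent field allowlists enforcing the minimum
# necessary standard. Field names are illustrative, not an EHR schema.
AGENT_FIELD_SCOPES = {
    "scheduling_agent": {"patient_id", "appointment_type", "preferred_times"},
    "billing_agent": {"patient_id", "account_number", "payer_id"},
    "triage_agent": {"patient_id", "stated_symptoms", "acuity_flags"},
}

def scope_payload(agent_name: str, record: dict) -> dict:
    """Return only the fields this agent is provisioned to receive.

    Raises if the agent has no declared scope, so an unscoped agent
    fails closed rather than receiving the full record.
    """
    try:
        allowed = AGENT_FIELD_SCOPES[agent_name]
    except KeyError:
        raise PermissionError(f"No field scope declared for {agent_name}")
    return {k: v for k, v in record.items() if k in allowed}

full_record = {
    "patient_id": "MRN-0001",
    "appointment_type": "follow-up",
    "preferred_times": ["2025-03-04T09:00"],
    "diagnosis_code": "E11.65",   # never reaches the scheduling agent
    "insurance_notes": "...",
}
print(scope_payload("scheduling_agent", full_record))
# only the three scoped fields survive the filter
```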
The Privacy Rule also imposes purpose limitation. If a patient provides information to book an appointment, that data cannot be used to train a commercial AI model without proper authorisation or de-identification. This is a point of frequent non-compliance in healthcare AI deployments, where data collected for one operational purpose is subsequently used to fine-tune or improve vendor models without patient consent (HHS Privacy Rule guidance).
The Security Rule
The Security Rule mandates administrative, physical, and technical safeguards for electronic PHI. Technical safeguards include access controls, audit controls, integrity controls, and transmission security.
HHS OCR proposed the first major update to the Security Rule in January 2025. As of publication, the final rule is pending; HHS has listed finalisation on its regulatory agenda for May 2026, though the timeline is subject to change. If enacted as proposed, the update eliminates the longstanding distinction between “required” and “addressable” safeguards. Encryption, multi-factor authentication, network segmentation, annual penetration testing, and 72-hour incident recovery would all become mandatory for every covered entity and business associate, with no option to document an alternative approach. Organisations should be designing for these requirements regardless of the final publication date (Alston & Bird, November 2025; Healthcare Law Insights, February 2026).
The Breach Notification Rule
The Breach Notification Rule requires covered entities to notify affected individuals, HHS, and in some cases media outlets following a breach of unsecured PHI. AI systems must include breach detection and response mechanisms capable of meeting these notification timelines. Penalties under current rules reach up to $1.5 million per violation category per year (HHS enforcement overview).
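The notification clock itself is mechanical enough to encode. Under the rule, affected individuals must be notified without unreasonable delay and no later than 60 days after discovery; a sketch of that deadline logic, with the HHS and media thresholds simplified:

```python
from datetime import date, timedelta

def notification_deadlines(discovered: date, affected_count: int) -> dict:
    """Deadlines triggered by breach discovery under the Breach Notification Rule.

    Simplified: for breaches affecting 500+ individuals, HHS (and, where 500+
    residents of one state or jurisdiction are affected, the media) must be
    notified on the same 60-day clock; smaller breaches go into an annual
    log submitted to HHS after the calendar year ends.
    """
    sixty_days = discovered + timedelta(days=60)
    deadlines = {"individuals": sixty_days}
    if affected_count >= 500:
        deadlines["hhs"] = sixty_days
        deadlines["media"] = sixty_days
    else:
        deadlines["hhs"] = "within 60 days after end of calendar year"
    return deadlines

print(notification_deadlines(date(2025, 3, 1), affected_count=750))
# {'individuals': datetime.date(2025, 4, 30), 'hhs': ..., 'media': ...}
```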
What counts as protected health information in an AI context
PHI is more expansive than most engineering teams assume, and the compliance risk in AI systems is not from individual identifiers but from their combination.
The HIPAA Safe Harbor method identifies 18 categories of information that, when linked to a patient, constitute protected health information. These range from names and dates to IP addresses, device identifiers, and biometric data. The compliance risk in an AI context is not from individual identifiers in isolation: a diagnosis code alone (for example, E11.65, type 2 diabetes with hyperglycaemia) is not PHI. The same code combined with a date of service and a five-digit ZIP code creates a uniquely identifiable record.
AI agents create this combination risk at every inference call. A scheduling agent that receives a full patient record to confirm an appointment has transmitted PHI to every system that record passed through, including the LLM API, the memory store, the tool call log, and any downstream service the agent invoked. Each of those transmission points creates a compliance obligation (HHS de-identification guidance).
De-identification offers an alternative compliance pathway. Data from which all 18 identifiers have been removed (Safe Harbor method) or which has been certified as de-identified by a qualified expert (Expert Determination method) falls outside the definition of PHI entirely. There are no HIPAA restrictions on the use or disclosure of properly de-identified data. In practice, de-identification is most viable for AI model training and evaluation; it is not always practical for real-time agentic workflows where the agent’s task depends on knowing who the patient is.
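As a rough illustration of the Safe Harbor approach, a de-identification pass drops the identifier fields before data is used for training or evaluation. This sketch covers only a subset of the 18 categories and assumes flat, well-labelled records; real de-identification also has to handle identifiers embedded in free text and verify residual re-identification risk:

```python
# Illustrative subset of Safe Harbor identifier fields; a real
# implementation must cover all 18 categories, including identifiers
# buried in free text, not just well-labelled structured fields.
SAFE_HARBOR_FIELDS = {
    "name", "email", "phone", "ssn", "mrn", "ip_address",
    "device_id", "account_number", "url", "zip_code",
}

def deidentify(record: dict) -> dict:
    """Drop identifier fields and generalise dates to year only."""
    clean = {}
    for key, value in record.items():
        if key in SAFE_HARBOR_FIELDS:
            continue
        if key.endswith("_date"):   # Safe Harbor: keep the year only
            clean[key] = str(value)[:4]
        else:
            clean[key] = value
    return clean

print(deidentify({
    "name": "Jane Doe",
    "zip_code": "02139",
    "service_date": "2024-11-02",
    "diagnosis_code": "E11.65",
}))
# {'service_date': '2024', 'diagnosis_code': 'E11.65'}
```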
The table below shows the 18 PHI identifier categories and how each surfaces in a typical AI agent deployment.
| PHI identifier category | How it appears in an AI agent context |
| --- | --- |
| Names | Patient greeting in a conversational interface or chat confirmation message |
| Dates (except year) | Appointment confirmation payload; discharge date in a clinical summary prompt |
| Geographic data smaller than a state | ZIP code passed to a scheduling agent to find nearby facilities |
| Phone numbers | Patient contact field retrieved for an automated outreach workflow |
| Email addresses | Included in intake form data passed to an LLM for triage routing |
| Social Security numbers | Pulled from EHR during benefits verification by a billing agent |
| Medical record numbers | Used as a lookup key in a multi-system agent workflow |
| IP addresses | Captured in web-based intake form logs; stored in agent session state |
| Biometric identifiers | Voice prints from a voice agent interaction recorded for quality review |
| Full-face photographs | Patient ID image attached to an intake document processed by a document agent |
| Device identifiers and serial numbers | Device ID from a patient portal session passed through to an LLM in a debugging log |
| URLs | Patient portal URLs containing embedded session tokens or record identifiers |
| Account numbers | Insurance account number retrieved by an eligibility verification agent |
| Certificate and licence numbers | Provider licence number included in a referral routing payload |
| Vehicle identifiers | Rarely surfaced in AI workflows; possible in transport coordination systems for patient logistics |
| Fax numbers | Included in referral documents processed by a document parsing agent |
| Health plan beneficiary numbers | Passed to an insurance verification tool call as part of a claims workflow |
| Any other unique identifying number or code | Any custom internal patient ID used as a key in cross-system agent queries |
How agentic AI changes the compliance picture
Before agentic AI entered healthcare workflows, PHI governance was primarily a function of user identity. Access control systems were built around roles: a nurse could access the scheduling module, a physician could access the clinical record, a billing administrator could access payment data. Those role boundaries were enforced at the system level, and compliance audits focused on who had logged in and what they had viewed.
Agentic AI operates differently. An agent does not log in. It traverses multiple systems within a single workflow, calling APIs, querying databases, and passing data between tools autonomously. A claim processing agent might access the EHR to retrieve a diagnosis code, query the insurance platform to check coverage rules, call a scheduling system to book a follow-up, and write a summary to a communication platform, all within one orchestrated sequence. Each of those actions touches PHI. Most of the role-based governance structures in place today were never designed to govern data at the level of individual tool calls within an automated workflow.
The compliance failures we observe in production healthcare AI deployments do not originate at the infrastructure layer. They originate in the data flow. The three most common failure modes are: a memory store that retains patient conversation history without a defined retention or deletion policy; a tool call that reaches an external API without a signed Business Associate Agreement covering that specific vendor; and a prompt context that includes a full patient record when the agent function required only a single field from it.
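Each of those failure modes can be caught before production with a mechanical check over the workflow's declared data flows. A minimal sketch, assuming a hypothetical workflow config format (the schema and field names here are ours, not a standard):

```python
# Hypothetical workflow declaration; the schema is illustrative.
workflow = {
    "memory_stores": [
        {"name": "patient_chat_history", "retention_days": None},      # violation
    ],
    "external_calls": [
        {"endpoint": "llm_api", "baa_on_file": True},
        {"endpoint": "notification_service", "baa_on_file": False},    # violation
    ],
    "prompt_contexts": [
        {"agent": "scheduling_agent", "field_allowlist": None},        # violation
    ],
}

def audit_data_flows(wf: dict) -> list[str]:
    """Flag the three most common PHI data-flow failures."""
    findings = []
    for store in wf["memory_stores"]:
        if store["retention_days"] is None:
            findings.append(f"{store['name']}: no retention policy defined")
    for call in wf["external_calls"]:
        if not call["baa_on_file"]:
            findings.append(f"{call['endpoint']}: no BAA covering this vendor")
    for ctx in wf["prompt_contexts"]:
        if ctx["field_allowlist"] is None:
            findings.append(f"{ctx['agent']}: prompt context not field-scoped")
    return findings

for finding in audit_data_flows(workflow):
    print("FAIL:", finding)
```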
The global AI in healthcare market was valued at USD 36.67 billion in 2025 and is projected to reach USD 505.59 billion by 2033, according to Grand View Research. The speed of that growth makes the governance gap more consequential, not less. Every new AI workflow introduced into a healthcare environment creates new data flow points that must be assessed, governed, and documented.
There is no single default architecture for a HIPAA-compliant AI agent. The correct pattern depends on one question: where is PHI allowed to flow, and under what operational controls?
Wojciech Achtelik, PhD(c), AI Tech Lead, Vstorm
For a detailed treatment of architecture patterns and a pre-deployment checklist grounded in production deployments, see our article on how to build a HIPAA-compliant AI agent.
The four non-negotiable requirements
Regardless of the AI system, the vendor, or the use case, four requirements apply whenever PHI enters an AI workflow. None of them is optional, and none of them can be delegated entirely to a vendor.
- Business Associate Agreement. Any AI vendor that receives, processes, or stores PHI on behalf of a covered entity is a business associate under HIPAA and must sign a BAA. This includes LLM API providers, cloud storage vendors, logging platforms, and any third-party tool an agent calls if that call involves PHI. A vendor being “HIPAA-eligible” means only that they will sign a BAA. The existence of a signed BAA does not make an implementation compliant. It is a legal prerequisite, not a compliance outcome.
- Encryption. PHI must be encrypted at rest and in transit. The current standard is AES-256 for data at rest and TLS 1.2 or higher for data in transit. Under the proposed 2026 Security Rule update, encryption would be a mandatory requirement for all covered entities, with no option to document an equivalent alternative. This applies to PHI stored in databases, in agent memory stores, in log files, and in any intermediate data cache an agentic workflow creates.
- Audit logging. Every request that touches PHI must be logged with sufficient detail to demonstrate to an auditor that the controls described in the organisation’s policies were actually operating at the time. Logs must record what data was accessed, by which system component, at what time, and under what authorisation. The logs themselves must be encrypted, access-controlled, and retained for a minimum of six years (HHS Security Rule audit controls).
- Access controls. HIPAA requires role-based access control that enforces the minimum necessary standard. In an agentic context, this means access controls must be applied at the tool call level, not only at the user or session level. An appointment confirmation agent must be provisioned with access to the scheduling system only. If that agent can also query the clinical record or the pharmacy system, the access control is non-compliant regardless of whether it ever exercises that access (Aptible HIPAA-compliant AI guide).
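Applied at the tool call level, the last two requirements combine into a single gate: every call is checked against the agent's provisioned tools and logged before it executes. A minimal sketch with hypothetical agent and tool names; a production version would write to an encrypted, append-only log store rather than the standard logger:

```python
import json
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("phi_audit")  # stand-in for an encrypted log store

# Hypothetical per-agent tool scopes: the appointment agent can reach
# the scheduling system and nothing else.
AGENT_TOOL_SCOPES = {
    "appointment_agent": {"scheduling.find_slots", "scheduling.book"},
}

def gated_tool_call(agent: str, tool: str, args: dict):
    """Enforce tool-level scope and record an audit entry for every attempt."""
    allowed = tool in AGENT_TOOL_SCOPES.get(agent, set())
    audit_log.info(json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        "tool": tool,
        "arg_fields": sorted(args),  # log field names, never PHI values
        "allowed": allowed,
    }))
    if not allowed:
        raise PermissionError(f"{agent} is not provisioned for {tool}")
    return dispatch(tool, args)  # hypothetical dispatcher to the real tool

def dispatch(tool: str, args: dict):
    return {"tool": tool, "status": "ok"}

gated_tool_call("appointment_agent", "scheduling.book", {"patient_id": "MRN-0001"})
try:
    gated_tool_call("appointment_agent", "ehr.read_chart", {})
except PermissionError as err:
    print(err)  # denied and audit-logged: out of scope for this agent
```

Note that the denied call is still logged: an auditor needs evidence of attempted access, not only of granted access.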
What HIPAA-compliant AI looks like in production
Vstorm has worked with a US healthcare provider serving more than 100,000 members to build agentic systems that operate within HIPAA constraints in production. The architectural decisions in that engagement illustrate the difference between compliant and non-compliant implementation in concrete terms.
The most consequential decisions were not about infrastructure. They were about data scoping at every agent tool call. A scheduling agent that handles appointment booking received only the fields required to complete that function: a patient identifier, a requested appointment type, and available time slots. It did not receive diagnostic codes, insurance details, or clinical notes, even though all of those were accessible in the underlying EHR. Enforcing that boundary required explicit scoping at the API integration layer, not reliance on the EHR’s default data model.
Memory management was the second critical area. Conversational agents that interact with patients over multiple sessions have a default tendency to retain context. In a compliant architecture, agent memory must be bounded: what is retained, for how long, and under what deletion schedule must be defined and enforced. Retaining a patient’s stated symptoms in an agent memory store without a documented retention policy is a PHI governance failure, regardless of whether that data is encrypted.
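A bounded memory store of this kind can be expressed directly: every entry is encrypted at rest (AES-256-GCM here, via the `cryptography` package) and carries an expiry, and expired entries are purged rather than silently retained. This is a sketch of the pattern, not a production store; key management, durable storage, and deletion evidence are all out of scope:

```python
import os
import time
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

class BoundedMemoryStore:
    """Agent memory with AES-256-GCM at rest and an enforced retention window."""

    def __init__(self, retention_seconds: int, key: bytes | None = None):
        self.retention_seconds = retention_seconds
        self._aead = AESGCM(key or AESGCM.generate_key(bit_length=256))
        self._entries: dict[str, tuple[float, bytes, bytes]] = {}

    def put(self, session_id: str, text: str) -> None:
        nonce = os.urandom(12)  # must be unique per encryption under GCM
        ciphertext = self._aead.encrypt(nonce, text.encode(), None)
        self._entries[session_id] = (time.time(), nonce, ciphertext)

    def get(self, session_id: str) -> str | None:
        self.purge_expired()  # retention is enforced on every read
        entry = self._entries.get(session_id)
        if entry is None:
            return None
        _, nonce, ciphertext = entry
        return self._aead.decrypt(nonce, ciphertext, None).decode()

    def purge_expired(self) -> None:
        cutoff = time.time() - self.retention_seconds
        expired = [sid for sid, (ts, _, _) in self._entries.items() if ts < cutoff]
        for sid in expired:
            del self._entries[sid]  # a real store would also log the deletion

store = BoundedMemoryStore(retention_seconds=3600)
store.put("session-42", "patient reported intermittent chest pain")
print(store.get("session-42"))
```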
The third area was vendor coverage. Every external API the agent called, including the LLM endpoint, the calendar integration, and the notification service, required a signed BAA before the system went to production. This is operationally straightforward. It is frequently skipped.
For a detailed breakdown of the three architecture patterns we deploy in healthcare AI compliance contexts, including HIPAA-eligible cloud endpoints, on-premise local models, and PHI de-identification layers, see the full architecture article and the multi-channel AI agent for healthcare appointments case study.
Three common misconceptions
- “Our cloud provider is HIPAA-compliant, so we are.” Cloud providers that offer HIPAA-eligible services accept responsibility for the physical security of the servers, the underlying network, and the infrastructure they control. They do not accept responsibility for how an organisation configures the services it deploys on that infrastructure, who it grants access to, what data it inputs, or how it manages PHI within the application layer. Selecting AWS, Azure, or GCP as an infrastructure provider does not transfer compliance responsibility. The shared responsibility model places configuration, access management, and protected health information governance firmly with the covered entity.
- “We do not store PHI, so HIPAA does not apply.” Transmission to an LLM API constitutes disclosure under HIPAA. If a prompt sent to an external model includes PHI, a BAA is required with that model provider, regardless of whether the response is stored. Healthcare organisations that use consumer-grade or developer-tier LLM APIs without enterprise BAA coverage are in violation from the first API call that includes patient data, even if no data is ever written to disk (a pre-call guard sketch follows this list).
- “HIPAA compliance is a one-time project.” HIPAA requires ongoing risk assessments. The Security Rule’s risk analysis requirement is not satisfied by a single audit at deployment. Every new AI tool, model update, API integration, or change to a data flow constitutes a change to the organisation’s technical environment and triggers a reassessment obligation. In practice, this means that healthcare AI compliance is an operational function, not a project with a completion date.
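On the second misconception, one defensive layer is a pre-call guard that scans an outbound prompt for obvious identifiers before it reaches any endpoint without BAA coverage. The patterns below are illustrative and deliberately incomplete; regex detection misses most PHI and is a backstop, never a substitute for scoping the data out of the prompt in the first place:

```python
import re

# Deliberately narrow patterns: this catches only obvious identifiers.
# Real PHI detection needs named-entity recognition and context, and
# even then data minimisation remains the primary control.
PHI_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "mrn": re.compile(r"\bMRN-?\d{4,}\b", re.IGNORECASE),
}

def assert_no_obvious_phi(prompt: str) -> None:
    """Raise before the API call if the prompt matches any known pattern."""
    hits = [name for name, pat in PHI_PATTERNS.items() if pat.search(prompt)]
    if hits:
        raise ValueError(f"Prompt blocked: possible PHI detected ({', '.join(hits)})")

try:
    assert_no_obvious_phi("Summarise the visit for MRN-200417, callback 555-010-7788.")
except ValueError as err:
    print(err)  # Prompt blocked: possible PHI detected (phone, mrn)
```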
HIPAA-compliant AI is not a certification a vendor ships. It is not resolved by selecting a HIPAA-eligible cloud provider. It is not a project with a completion date. It is the outcome of applying a consistent set of architectural and governance decisions at every point in a system where PHI moves: the BAA signed before the first API call, the access scope defined for each agent tool, the retention policy applied to every memory store, and the audit log that records every PHI-touching request.
As healthcare organisations move from simple chatbots to multi-step agentic systems, the compliance surface grows with every tool call an agent makes. The organisations that get this right do not treat compliance as a constraint to work around. They treat it as an architectural discipline applied from the start.
For the deployment patterns, vendor decisions, and pre-deployment checklist drawn from Vstorm’s production healthcare work, see How to build a HIPAA-compliant AI agent. For an overview of how we approach healthcare AI compliance across the full project lifecycle, visit our healthcare page.
Ready to see how agentic AI transforms business workflows?
Meet directly with our founders and PhD AI engineers. We will demonstrate real implementations from 30+ agentic projects and show you the practical steps to integrate them into your specific workflows—no hypotheticals, just proven approaches.