The Core Distinction FDA Is Trying to Draw
There is a line in clinical AI that determines whether your software is a wellness tool, a clinical decision support system, or a regulated medical device. Getting that line wrong does not just create legal exposure. It creates patient safety risk, liability for your clinical partners, and potentially a regulatory enforcement action that shuts down a product that was genuinely helping people.
At TheraPetic®.AI, our clinical informatics team works directly with the software that powers mental health screening, support animal verification, and AI-assisted intake across the TheraPetic® Healthcare Provider Group. In that operational context, we read FDA's Software as a Medical Device framework not as an abstract compliance exercise, but as a live engineering constraint that shapes architecture decisions in real time.
This article breaks down the FDA SaMD framework as it applies to AI clinical tools in 2026: what triggers 510(k) clearance, what stays inside the Clinical Decision Support safe harbor, and how the Predetermined Change Control Plan guidance changes the compliance calculus for teams shipping adaptive ML models.
What Qualifies as Software as a Medical Device
FDA adopted the International Medical Device Regulators Forum (IMDRF) definition of SaMD in its 2019 framework document and has built 2026 guidance on top of that foundation. The core test is whether software is intended to perform a medical purpose without being part of a hardware medical device.
"Medical purpose" is the operative phrase. Software that analyzes ECG waveforms to detect atrial fibrillation is clearly SaMD. Software that reminds a patient to take medication is clearly not. The difficult cases sit in between, and AI clinical tools almost always land there.
FDA uses a two-axis risk matrix drawn from the IMDRF framework. The first axis is the significance of the information the software provides to clinical care: treating or diagnosing directly, driving clinical management, or informing clinical management. The second axis is the state of the healthcare situation: critical, serious, or non-serious.
A tool that directly drives diagnosis of a critical condition sits at the highest risk level and faces the most rigorous regulatory pathway. A tool that informs non-serious clinical management sits at the lowest risk level and may qualify as non-device CDS. AI mental health screening tools typically land in the "informing clinical management" row, which is where regulatory classification gets genuinely complex.
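The two-axis matrix described above can be written down as a small lookup table. The sketch below is illustrative only: the category labels I through IV follow the IMDRF risk categorization document, while the enum and function names are our own assumptions. An IMDRF category is a risk signal for design planning, not an FDA device class.

```python
from enum import Enum


class Significance(Enum):
    TREAT_OR_DIAGNOSE = "treat or diagnose"
    DRIVE_MANAGEMENT = "drive clinical management"
    INFORM_MANAGEMENT = "inform clinical management"


class Situation(Enum):
    CRITICAL = "critical"
    SERIOUS = "serious"
    NON_SERIOUS = "non-serious"


# IMDRF SaMD categories, from I (lowest impact) to IV (highest impact).
IMDRF_CATEGORY = {
    (Situation.CRITICAL, Significance.TREAT_OR_DIAGNOSE): "IV",
    (Situation.CRITICAL, Significance.DRIVE_MANAGEMENT): "III",
    (Situation.CRITICAL, Significance.INFORM_MANAGEMENT): "II",
    (Situation.SERIOUS, Significance.TREAT_OR_DIAGNOSE): "III",
    (Situation.SERIOUS, Significance.DRIVE_MANAGEMENT): "II",
    (Situation.SERIOUS, Significance.INFORM_MANAGEMENT): "I",
    (Situation.NON_SERIOUS, Significance.TREAT_OR_DIAGNOSE): "II",
    (Situation.NON_SERIOUS, Significance.DRIVE_MANAGEMENT): "I",
    (Situation.NON_SERIOUS, Significance.INFORM_MANAGEMENT): "I",
}


def imdrf_category(situation: Situation, significance: Significance) -> str:
    """Return the IMDRF category for one cell of the risk matrix."""
    return IMDRF_CATEGORY[(situation, significance)]


if __name__ == "__main__":
    # A screening aid that informs management of a serious condition.
    print(imdrf_category(Situation.SERIOUS, Significance.INFORM_MANAGEMENT))
```

Recording the category next to the intended use statement makes it easier to notice when a new feature moves the product into a different cell of the matrix.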
The CDS Safe Harbor: Four Conditions That Keep You Out of FDA Jurisdiction
The 21st Century Cures Act added Section 520(o) to the Federal Food, Drug, and Cosmetic Act, creating an explicit carve-out for clinical decision support software that meets four cumulative conditions. Miss any one of them and you fall back into SaMD territory.
The four conditions are:
- The software is not intended to acquire, process, or analyze a medical image, a signal from an in vitro diagnostic device, or a pattern or signal from a signal acquisition system.
- The software is intended to display, analyze, or print medical information about a patient or other medical information, such as peer-reviewed clinical studies and clinical practice guidelines.
- The software is intended to support or provide recommendations to a healthcare professional about prevention, diagnosis, or treatment of a disease or condition.
- The software is intended to enable the healthcare professional to independently review the basis for the recommendations so that they do not rely primarily on the software recommendation.
That fourth condition is where most AI systems trip up. If your model is a black box and the clinician cannot trace how it reached its output, you cannot satisfy the independent review requirement. Interpretability is not just a nice engineering property. It is a regulatory necessity for staying inside the CDS safe harbor and out of FDA device jurisdiction.
FDA's final guidance on Clinical Decision Support software, published under the framework established by the Cures Act, makes clear that the agency interprets "independently review the basis" to mean meaningful transparency, not a disclaimer screen. The clinician must actually be able to evaluate the supporting evidence, references, or logic chain the software used.
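One way to use the four conditions as a design input is to encode them as an explicit checklist that runs against a feature description. The following is a minimal sketch under stated assumptions, not a legal determination: the `CdsDesign` fields and the function name are hypothetical, and each boolean stands in for a judgment that regulatory counsel and clinical leadership would actually have to make.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CdsDesign:
    """Hypothetical summary of a feature's intended use."""
    processes_image_or_device_signal: bool
    displays_or_analyzes_medical_information: bool
    recommends_to_healthcare_professional: bool
    basis_independently_reviewable: bool


def failed_cds_conditions(design: CdsDesign) -> list[str]:
    """Return the conditions the design fails; an empty list means all four are met."""
    failures = []
    if design.processes_image_or_device_signal:
        failures.append("condition 1: processes a medical image or device signal")
    if not design.displays_or_analyzes_medical_information:
        failures.append("condition 2: does not display, analyze, or print medical information")
    if not design.recommends_to_healthcare_professional:
        failures.append("condition 3: output is not a recommendation to a healthcare professional")
    if not design.basis_independently_reviewable:
        failures.append("condition 4: clinician cannot independently review the basis")
    return failures


if __name__ == "__main__":
    black_box_screener = CdsDesign(
        processes_image_or_device_signal=False,
        displays_or_analyzes_medical_information=True,
        recommends_to_healthcare_professional=True,
        basis_independently_reviewable=False,  # opaque model, no visible basis
    )
    for failure in failed_cds_conditions(black_box_screener):
        print(failure)
```

Because the conditions are cumulative, any non-empty result means the feature should be planned as SaMD until the failing condition is designed out.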
For LLM-based clinical tools, this creates a structural challenge. Transformer models do not produce audit-ready reasoning chains by default. Teams at TheraPetic®.AI address this through retrieval-augmented generation architectures where every clinical recommendation is tied to a retrievable source document, a specific DSM-5 criterion, or a documented clinical protocol. The RAG layer provides the independent review pathway FDA requires.
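The source-attribution idea can be sketched as a data structure that refuses to produce a recommendation without at least one retrievable source. This is a simplified illustration, not a description of any production system: the class names, the `build_recommendation` function, and the placeholder source are all hypothetical, and the retrieval step itself is assumed to happen elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    source_id: str   # identifier the clinician can use to open the document
    title: str
    excerpt: str     # the passage the recommendation relies on


@dataclass(frozen=True)
class Recommendation:
    text: str
    sources: tuple[Source, ...]


def build_recommendation(text: str, retrieved: list[Source]) -> Recommendation:
    """Attach retrieved sources to a recommendation; reject unsourced output."""
    if not retrieved:
        raise ValueError("no supporting source retrieved; recommendation withheld")
    return Recommendation(text=text, sources=tuple(retrieved))


def render_for_clinician(rec: Recommendation) -> str:
    """Show the recommendation together with the basis the clinician can review."""
    lines = [rec.text, "Basis:"]
    for index, source in enumerate(rec.sources, start=1):
        lines.append(f"  [{index}] {source.title} ({source.source_id}): {source.excerpt}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder source for illustration only.
    protocol = Source("protocol-001", "Example intake protocol", "Placeholder excerpt.")
    rec = build_recommendation("Recommend clinician evaluation.", [protocol])
    print(render_for_clinician(rec))
```

Attaching sources does not by itself prove that the model's output follows from them, so a design like this still needs validation that the cited passages actually support each recommendation.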
What Triggers 510(k) Clearance for AI Clinical Tools
If your software crosses out of the CDS safe harbor, the regulatory pathway depends on device classification. Most AI clinical tools that are not highest-risk will pursue 510(k) clearance by demonstrating substantial equivalence to a legally marketed predicate device. Where no suitable predicate exists, the De Novo classification pathway is the usual alternative.
Three common triggers push AI clinical software into 510(k) territory:
Direct diagnostic output without clinician mediation. If the software produces a diagnosis rather than a recommendation for clinical consideration, that direct output bypasses the independent review condition. An algorithm that outputs "Major Depressive Disorder" rather than "PHQ-9 equivalent score consistent with moderate depression, recommend clinician evaluation" is making a diagnostic claim that requires clearance.
Opaque model architecture that prevents source review. As discussed above, black-box models structurally fail the Cures Act safe harbor test. FDA has been explicit in its Q-submission feedback that vendors cannot simply add a disclaimer to a black-box output and claim the independent review condition is satisfied.
Integration with diagnostic hardware or signal acquisition. The first CDS condition excludes software intended to process signals from hardware devices. If your AI clinical tool ingests wearable data, connected diagnostic equipment output, or in vitro diagnostic test signals, the CDS exemption does not apply regardless of how the output is framed.
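The first trigger above is largely a question of how output is framed. As a hedged illustration, the sketch below scores a standard nine-item PHQ-9 (each item 0 to 3, using the published severity bands) and phrases the result as a finding for clinician consideration rather than a diagnosis. The function name and the exact wording of the output string are assumptions, and careful wording alone does not settle regulatory status.

```python
# Published PHQ-9 severity bands as (upper bound of total score, label).
PHQ9_BANDS = [
    (4, "minimal"),
    (9, "mild"),
    (14, "moderate"),
    (19, "moderately severe"),
    (27, "severe"),
]


def phq9_summary(item_scores: list[int]) -> str:
    """Summarize a PHQ-9 as a score-based finding, never as a diagnosis."""
    if len(item_scores) != 9:
        raise ValueError("PHQ-9 requires exactly 9 item scores")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each PHQ-9 item must be scored 0, 1, 2, or 3")

    total = sum(item_scores)
    for upper_bound, label in PHQ9_BANDS:
        if total <= upper_bound:
            return (
                f"PHQ-9 total {total}, consistent with {label} depressive "
                "symptoms; recommend clinician evaluation."
            )
    raise AssertionError("unreachable: total cannot exceed 27")


if __name__ == "__main__":
    print(phq9_summary([2, 2, 1, 1, 1, 1, 1, 1, 1]))
```

The output names the instrument, the score, and the next step, which leaves the diagnostic determination with the clinician.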
For teams building AI-assisted mental health intake, the practical implication is design specificity. The system architecture, the user interface, and the clinical workflow documentation all need to reflect that a Licensed Clinical Doctor makes the clinical determination and the software supports that process. That is not just a legal formality. It reflects how good clinical AI should work anyway.
The Predetermined Change Control Plan and Why It Matters Now
One of the most operationally significant developments in FDA's AI/ML SaMD framework is the Predetermined Change Control Plan, or PCCP. FDA issued its final guidance on PCCPs for AI/ML-Based SaMD, and it fundamentally changes how teams should think about model updates after clearance.
The traditional 510(k) model was designed for static devices. A cleared device is the device that was cleared. Any significant change triggers a new submission. For adaptive ML models that continuously retrain on new data, that framework creates an impossible compliance loop. Every significant weight update would theoretically require a new submission.
The PCCP resolves this by allowing manufacturers to define, in advance, the types of changes they anticipate making, the procedures they will use to implement those changes, and the performance evaluation methods they will apply to validate that changes stay within the cleared device's intended use.
A well-constructed PCCP submitted as part of a 510(k) allows the cleared device to evolve within pre-specified boundaries without triggering a new submission for each iteration. This is transformative for clinical AI teams. It means the regulatory pathway can accommodate continuous learning architectures as long as the change control boundaries are specified upfront and the validation methodology is documented.
From TheraPetic®.AI's perspective, the PCCP is not just a regulatory convenience. It is a patient safety framework. The discipline of specifying acceptable change boundaries before training forces engineering teams to define what the model is for, what populations it serves, what fairness metrics it must maintain, and what performance thresholds trigger a clinical review halt. Those are exactly the questions that prevent algorithmic harm in clinical deployment.
FDA guidance specifies that a PCCP should include a description of the planned modifications, a modification protocol covering the methods used to develop, validate, and implement those changes (including performance evaluation), and an impact assessment of their benefits and risks. Teams should treat the PCCP as a living engineering document that sits alongside model cards and datasheets, not as a one-time regulatory filing.
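In engineering terms, pre-specified change boundaries can be expressed as a gate that every candidate model update must pass before release. The sketch below assumes three hypothetical boundaries (minimum sensitivity, minimum specificity, maximum subgroup gap) with placeholder threshold values. It is not FDA's format for a PCCP, and real boundaries would come from the cleared submission.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeBoundaries:
    """Hypothetical acceptance limits fixed before any retraining."""
    min_sensitivity: float
    min_specificity: float
    max_subgroup_gap: float


@dataclass(frozen=True)
class CandidateMetrics:
    """Validation results for one candidate model update."""
    sensitivity: float
    specificity: float
    subgroup_gap: float


def boundary_violations(limits: ChangeBoundaries, metrics: CandidateMetrics) -> list[str]:
    """Return every boundary the candidate breaks; empty means it stays in bounds."""
    violations = []
    if metrics.sensitivity < limits.min_sensitivity:
        violations.append("sensitivity below pre-specified minimum")
    if metrics.specificity < limits.min_specificity:
        violations.append("specificity below pre-specified minimum")
    if metrics.subgroup_gap > limits.max_subgroup_gap:
        violations.append("subgroup performance gap above pre-specified maximum")
    return violations


if __name__ == "__main__":
    # Placeholder thresholds for illustration only.
    limits = ChangeBoundaries(min_sensitivity=0.85, min_specificity=0.80, max_subgroup_gap=0.05)
    candidate = CandidateMetrics(sensitivity=0.88, specificity=0.78, subgroup_gap=0.03)
    problems = boundary_violations(limits, candidate)
    if problems:
        print("Halt release and route to clinical review:", problems)
    else:
        print("Update stays within pre-specified boundaries.")
```

Returning the full list of violations, rather than stopping at the first, gives the reviewing team a complete picture of why an update was halted.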
Where Clinical NLP and LLM-Based Screening Fall on the Risk Spectrum
The rise of GPT-class and Med-PaLM-class models in clinical intake has introduced a new category of regulatory question that FDA's existing guidance is still catching up to. Large language models used for clinical screening present a distinct risk profile from earlier rule-based clinical decision support.
Rule-based CDS is predictable. Its reasoning chain is auditable. Its failure modes are enumerable. LLMs are none of those things by default. They hallucinate. They exhibit demographic performance disparities that are difficult to detect through standard clinical validation. They respond inconsistently to paraphrased inputs. For mental health screening, where the clinical stakes include suicidal ideation assessment and psychosis screening, these properties create genuine patient safety risk.
Research published in NEJM AI and JAMA Psychiatry has begun documenting performance gaps in LLM-based clinical reasoning, particularly across demographic subgroups and non-standard linguistic inputs. FDA's AI/ML framework, and specifically the PCCP guidance, expects manufacturers to specify the intended patient population and to define how performance will be evaluated, which in practice can include fairness dimensions such as equalized odds across demographic groups.
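Equalized odds asks that true positive and false positive rates be similar across groups. A minimal way to measure it, assuming binary labels and predictions plus one group label per patient, is to compute both rates per group and report the largest difference. The function names below are our own, and the toy data is invented purely to exercise the code.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Return {group: (true_positive_rate, false_positive_rate)}."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            counts[group]["tp" if pred == 1 else "fn"] += 1
        else:
            counts[group]["fp" if pred == 1 else "tn"] += 1

    rates = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["fp"] + c["tn"]
        if positives == 0 or negatives == 0:
            raise ValueError(f"group {group!r} needs both positive and negative cases")
        rates[group] = (c["tp"] / positives, c["fp"] / negatives)
    return rates


def equalized_odds_gap(y_true, y_pred, groups):
    """Largest between-group difference in TPR or FPR (0.0 means perfectly equal)."""
    rates = group_rates(y_true, y_pred, groups)
    tprs = [tpr for tpr, _ in rates.values()]
    fprs = [fpr for _, fpr in rates.values()]
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))


if __name__ == "__main__":
    # Invented toy data: group "b" has one missed positive case.
    y_true = [1, 1, 0, 0, 1, 1, 0, 0]
    y_pred = [1, 1, 0, 0, 1, 0, 0, 0]
    groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
    print(equalized_odds_gap(y_true, y_pred, groups))
```

With small subgroups the per-group rates are noisy, so a real evaluation would pair a gap like this with confidence intervals and a minimum sample size per group.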
For clinical NLP tools that process patient-generated text during mental health intake, the regulatory question turns on whether the software is providing a recommendation a clinician reviews, or producing a triage classification the system acts on. Automated triage with no clinician in the loop almost certainly crosses the line into SaMD territory. Automated language processing that surfaces structured information for Licensed Clinical Doctor review has a stronger CDS safe harbor argument, provided the source transparency condition is met.
At TheraPetic®.AI, our HANK AI screening infrastructure is architected specifically around this distinction. Intake responses are processed to surface structured clinical indicators, flagged language patterns, and validated scale scores. A Licensed Clinical Doctor reviews those outputs before any clinical determination is made. The software informs. The clinician decides. That is the line FDA is drawing, and it is the right clinical model regardless of regulation.
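The "software informs, clinician decides" pattern can be enforced in code as well as in policy. The sketch below is a generic illustration, not the HANK AI implementation: the `ScreeningCase` class and both functions are hypothetical. It blocks release of any determination that has not been recorded by a named clinician.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ScreeningCase:
    case_id: str
    # Structured indicators surfaced by the software for review.
    indicators: dict = field(default_factory=dict)
    reviewed_by: Optional[str] = None
    determination: Optional[str] = None


def record_determination(case: ScreeningCase, clinician_id: str, determination: str) -> ScreeningCase:
    """Record the clinician's decision; the software never sets this itself."""
    if not clinician_id.strip():
        raise ValueError("a clinician identifier is required")
    if not determination.strip():
        raise ValueError("a determination is required")
    case.reviewed_by = clinician_id
    case.determination = determination
    return case


def release(case: ScreeningCase) -> str:
    """Return the determination only if a clinician has reviewed the case."""
    if case.reviewed_by is None or case.determination is None:
        raise PermissionError(f"case {case.case_id} has no clinician determination")
    return case.determination


if __name__ == "__main__":
    case = ScreeningCase("case-001", indicators={"phq9_total": 11})
    record_determination(case, "clinician-42", "Follow-up evaluation scheduled.")
    print(release(case))
```

Keeping the reviewer's identifier on the record also produces the workflow documentation that serves as regulatory evidence.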
Practical Guidance for Teams Building AI-Assisted Clinical Software
The FDA SaMD framework can feel like an obstacle to teams moving fast on clinical AI. The more accurate framing is that it is a forcing function for the engineering rigor that clinical AI requires anyway.
Several operational principles apply across teams working in this space:
Classify before you build. Use the IMDRF risk matrix and the four CDS conditions as design inputs, not post-hoc compliance checks. If your intended feature set does not satisfy all four CDS conditions, assume SaMD jurisdiction and plan accordingly.
Build interpretability into the RAG layer. For LLM-based tools, retrieval-augmented generation with explicit source attribution is the most practical mechanism for satisfying the independent review condition. Every clinical output should trace to a source document the clinician can evaluate.
Document intended use with clinical specificity. FDA's clearance decisions hinge on intended use. Vague intended use statements create regulatory ambiguity that is difficult to resolve later. Define the specific clinical population, the specific clinical context, and the specific role of the software in the clinical workflow.
Design the PCCP before you clear. If you anticipate model updates after launch, the PCCP framework is the path to sustainable compliance. Define your performance thresholds, your fairness metrics, your change boundaries, and your validation methodology as first-order engineering artifacts.
Maintain the clinician-in-the-loop explicitly. Not as a liability hedge, but as a structural design constraint. The software informs. The Licensed Clinical Doctor decides. Documentation of that workflow is both regulatory evidence and clinical ethics alignment.
The teams building the most responsible clinical AI right now are not the ones who have found clever arguments for staying out of FDA jurisdiction. They are the ones who have built to a standard that would survive regulatory scrutiny whether or not it is technically required. That is the standard TheraPetic®.AI applies to every clinical system we publish, and it is the standard this regulatory framework is designed to enforce.
For deeper context on the verification infrastructure we use to operationalize these principles, see the HANK AI architecture overview at servicedog.ai and the data governance framework at mydatakey.org. For the flagship clinical network context, see mypsd.org.
