The backend for an AI medical scribe

The scribe is the easy part. Where the transcript goes, how the draft becomes a signed note, who's recorded as the author, and which vendor signed a BAA — that's the backend. Here's how to model it honestly.

An AI medical scribe listens to a visit and writes a SOAP note. The demo takes a weekend. The real work is everything underneath: storing audio and transcripts as PHI, turning an AI draft into a clinician-signed, immutable note, recording who actually authored each version, and proving all of it to an auditor. This guide covers the data model an AI scribe actually needs — and the parts you can't fake.

What backend should an AI medical scribe use?

An AI medical scribe should use a clinical backend that stores data as FHIR R4 — transcripts and notes as DocumentReference, audio as Media/Binary, structured findings as Observation — behind a signed BAA, with a tamper-evident audit trail and provenance that distinguishes AI-generated drafts from clinician-signed notes. You vibe-code the scribe UI; you do not vibe-code that data layer.

Where transcripts and SOAP notes actually go

A scribe produces three kinds of artifact, and each has a correct FHIR home. Putting them in the right resource is what makes the data portable, queryable, and defensible later.

  • The audio recording → a Media resource (with the bytes held as a Binary), linked to the encounter. It is PHI the moment it is captured, so it lives encrypted in your cloud — never in a non-BAA bucket or a third-party transcription service that hasn't signed one.
  • The raw transcript → a DocumentReference with a transcript type and status: current, pointing at the encounter. Keep it; it is your evidence for how the note was generated.
  • The SOAP/DAP note → a DocumentReference for the document itself, or a Composition when you want the note structured into sections (Subjective, Objective, Assessment, Plan) the way a clinician thinks. A Composition is the FHIR way to say "this is an authored, sectioned clinical document with an author and a date."
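
As a rough sketch of the mapping above, the audio and the transcript might look like the following FHIR R4 resources. The field names follow the R4 Media and DocumentReference definitions, trimmed to what the sketch uses. The helper names (audioMedia, transcriptDocRef), the ids, the Binary URLs, the audio/wav content type, and the free-text type label are illustrative assumptions, not bonfireDB's actual output.

```typescript
// Minimal FHIR R4 shapes, trimmed to the fields this sketch touches.
interface Reference { reference: string }
interface Attachment { contentType: string; url: string; title?: string }

interface Media {
  resourceType: "Media";
  status: "completed";
  subject: Reference;
  encounter: Reference;
  createdDateTime: string;
  content: Attachment;
}

interface DocumentReference {
  resourceType: "DocumentReference";
  status: "current";
  type: { text: string };
  subject: Reference;
  date: string;
  content: { attachment: Attachment }[];
  context: { encounter: Reference[] };
}

// Audio: a Media resource whose bytes live in a separate Binary.
function audioMedia(
  patientId: string, encounterId: string, binaryId: string, recordedAt: string,
): Media {
  return {
    resourceType: "Media",
    status: "completed",
    subject: { reference: `Patient/${patientId}` },
    encounter: { reference: `Encounter/${encounterId}` },
    createdDateTime: recordedAt,
    content: { contentType: "audio/wav", url: `Binary/${binaryId}`, title: "Visit audio" },
  };
}

// Transcript: a DocumentReference that points at the same encounter.
function transcriptDocRef(
  patientId: string, encounterId: string, binaryId: string, createdAt: string,
): DocumentReference {
  return {
    resourceType: "DocumentReference",
    status: "current",
    type: { text: "Visit transcript" }, // placeholder label; a real system would use a coded type
    subject: { reference: `Patient/${patientId}` },
    date: createdAt,
    content: [
      { attachment: { contentType: "text/plain", url: `Binary/${binaryId}`, title: "Raw transcript" } },
    ],
    context: { encounter: [{ reference: `Encounter/${encounterId}` }] },
  };
}
```

Both resources reference the same Encounter, which is what lets a later query reassemble everything one visit produced.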

With bonfireDB the scribe writes a draft note in one call, and the FHIR mapping happens underneath:

// AI scribe finishes a visit: transcript in, draft note out.
const draft = await clinical.notes.create({
  patientId,
  encounterId,
  format: "SOAP",
  text: aiGeneratedNote,        // becomes a DocumentReference / Composition
  source: {
    transcriptId,               // links back to the stored transcript
    model: "scribe-v2",         // recorded as provenance, NOT as the author
    status: "draft",            // not a signed clinical note yet
  },
});
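
For intuition, here is one way the draft could look as a sectioned FHIR R4 Composition. This is a sketch under stated assumptions: draftComposition, SoapText, and the free-text type label are hypothetical, and bonfireDB's real mapping may differ. Note that the responsible clinician is listed as author and the status is preliminary; the model is deliberately absent from the author field and would be recorded in a separate Provenance resource.

```typescript
interface Reference { reference: string }
interface Section { title: string; text: { status: "generated"; div: string } }

interface Composition {
  resourceType: "Composition";
  status: "preliminary" | "final" | "amended";
  type: { text: string };
  subject: Reference;
  encounter: Reference;
  date: string;
  author: Reference[];
  title: string;
  section: Section[];
}

interface SoapText { subjective: string; objective: string; assessment: string; plan: string }

// Narrative text is XHTML, so escape anything the model produced.
function escapeXhtml(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function section(title: string, body: string): Section {
  const div = `<div xmlns="http://www.w3.org/1999/xhtml">${escapeXhtml(body)}</div>`;
  return { title, text: { status: "generated", div } };
}

// Build an unsigned draft: status stays "preliminary" until a clinician signs.
function draftComposition(
  patientId: string, encounterId: string, clinicianId: string, date: string, soap: SoapText,
): Composition {
  return {
    resourceType: "Composition",
    status: "preliminary",
    type: { text: "SOAP note" }, // placeholder label; a real system would use a coded type
    subject: { reference: `Patient/${patientId}` },
    encounter: { reference: `Encounter/${encounterId}` },
    date,
    author: [{ reference: `Practitioner/${clinicianId}` }],
    title: "Visit note (draft)",
    section: [
      section("Subjective", soap.subjective),
      section("Objective", soap.objective),
      section("Assessment", soap.assessment),
      section("Plan", soap.plan),
    ],
  };
}
```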

The data model a scribe actually needs

A working scribe is not a notes table. It is a small graph of clinical resources, and the relationships are what give the note meaning. The minimum:

  • Patient — the subject of the visit, with a stable identifier the rest of the graph references.
  • Encounter — the visit itself (date, type, location, clinician). Every artifact the scribe produces hangs off the encounter, not loosely off the patient.
  • Transcript (and audio) — the source material, stored as above and linked to the encounter.
  • DraftNote → SignedNote — two states of the same document. The draft is editable and AI-generated; the signed note is immutable and clinician-authored. The transition is the whole game (next section).
  • Structured findings — vitals or scores the scribe extracts (e.g. a blood pressure, a PHQ-9 result) land as Observations, not buried in note text, so they're queryable and trendable.
  • Provenance + AuditEvent — the record of who and what: which model drafted, which human edited, who signed, who later read it. This is not optional metadata; it is the audit trail.
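
The graph above can be sketched as plain TypeScript types. Everything here is a hypothetical application-level model for illustration (the type names, field names, and the two helpers are assumptions, not a bonfireDB schema). The point is that the draft and the signed note are distinct states, and that every artifact carries an encounterId.

```typescript
type Id = string;

interface Patient { id: Id }
interface Encounter { id: Id; patientId: Id; clinicianId: Id; startedAt: string }
interface Transcript { id: Id; encounterId: Id; audioId?: Id }

interface NoteBase {
  id: Id;
  encounterId: Id;
  transcriptId: Id;
  draftedByModel: string; // source of the draft, never the author
  text: string;
}
interface DraftNote extends NoteBase { state: "draft" }
interface SignedNote extends NoteBase { state: "signed"; signedBy: Id; signedAt: string }

// A discriminated union: code must check `state` before touching signing fields.
type Note = DraftNote | SignedNote;

// Only a signed note has an author of record.
function authorOfRecord(note: Note): Id | null {
  return note.state === "signed" ? note.signedBy : null;
}

// Find artifacts that do not hang off a known encounter.
function orphanedArtifacts(
  encounters: Encounter[],
  artifacts: { id: Id; encounterId: Id }[],
): Id[] {
  const known = new Set(encounters.map((e) => e.id));
  return artifacts.filter((a) => !known.has(a.encounterId)).map((a) => a.id);
}
```

Modeling the two note states as a union means the compiler rejects code that reads signedBy from a draft, which is a cheap guard against treating an unsigned draft as a record.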

Recording a finding the scribe pulled out of the conversation is one call, and it becomes a real, searchable Observation:

await clinical.observations.record({
  patientId, encounterId,
  code: "blood-pressure",
  value: { systolic: 128, diastolic: 82, unit: "mmHg" },
});

// A screener the scribe administered, scored as a FHIR Observation:
await clinical.assessments.record("PHQ-9", {
  patientId, encounterId, score: 11,
});
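
To make "a real, searchable Observation" concrete, the two calls above might map to FHIR R4 resources shaped roughly like this. The LOINC codes are the standard ones for a blood pressure panel (85354-9, with 8480-6 systolic and 8462-4 diastolic) and the PHQ-9 total score (44261-6). The helper names, the preliminary status default, and the exact set of fields are assumptions for illustration, not the SDK's documented output.

```typescript
const LOINC = "http://loinc.org";
const UCUM = "http://unitsofmeasure.org";
const OBS_CATEGORY = "http://terminology.hl7.org/CodeSystem/observation-category";

interface Coding { system: string; code: string; display?: string }
interface CodeableConcept { coding: Coding[] }
interface Quantity { value: number; unit: string; system: string; code: string }
interface Reference { reference: string }

interface Observation {
  resourceType: "Observation";
  status: "preliminary" | "final";
  category: CodeableConcept[];
  code: CodeableConcept;
  subject: Reference;
  encounter: Reference;
  valueInteger?: number;
  component?: { code: CodeableConcept; valueQuantity: Quantity }[];
}

function concept(system: string, code: string, display?: string): CodeableConcept {
  return { coding: [{ system, code, display }] };
}

function mmHg(value: number): Quantity {
  return { value, unit: "mmHg", system: UCUM, code: "mm[Hg]" };
}

// Blood pressure: one panel Observation with two components.
function bloodPressure(
  patientId: string, encounterId: string, systolic: number, diastolic: number,
): Observation {
  return {
    resourceType: "Observation",
    status: "preliminary", // assumption: scribe-extracted values await clinician review
    category: [concept(OBS_CATEGORY, "vital-signs")],
    code: concept(LOINC, "85354-9", "Blood pressure panel"),
    subject: { reference: `Patient/${patientId}` },
    encounter: { reference: `Encounter/${encounterId}` },
    component: [
      { code: concept(LOINC, "8480-6", "Systolic blood pressure"), valueQuantity: mmHg(systolic) },
      { code: concept(LOINC, "8462-4", "Diastolic blood pressure"), valueQuantity: mmHg(diastolic) },
    ],
  };
}

// PHQ-9 total score: an integer from 0 to 27.
function phq9Score(patientId: string, encounterId: string, score: number): Observation {
  if (!Number.isInteger(score) || score < 0 || score > 27) {
    throw new RangeError("PHQ-9 total score must be an integer from 0 to 27");
  }
  return {
    resourceType: "Observation",
    status: "preliminary",
    category: [concept(OBS_CATEGORY, "survey")],
    code: concept(LOINC, "44261-6", "PHQ-9 total score"),
    subject: { reference: `Patient/${patientId}` },
    encounter: { reference: `Encounter/${encounterId}` },
    valueInteger: score,
  };
}
```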

The part you can't fake: provenance, signing, and audit

This is where most scribe MVPs are quietly non-compliant. An AI draft is not a medical record. It becomes one only when a licensed clinician reviews and signs it — and the system has to record that truthfully.

  • AI-vs-clinician provenance. The model is the source of a draft; it is never the author of a clinical note. The clinician who signs is the author of record. Store both: a Provenance resource that says "drafted by scribe-v2 from transcript X, edited by Dr. Lee, signed by Dr. Lee." Conflating the two — listing the AI as author, or hiding that AI was involved — is exactly the kind of thing an audit will surface.
  • Signed-note immutability. Once signed, a note cannot be silently edited. Corrections happen as an amendment: a new version that links back to the original (in FHIR R4 terms, a relatesTo link such as replaces or appends, with the document status set to amended), not an in-place overwrite. Your backend has to enforce this, because "let the user edit the note" is the default an AI tool will generate.
  • Audit trail. Every read, write, sign, and amend produces an AuditEvent — automatically, not because someone remembered to log it. When a patient or regulator asks "who saw this and when," the answer has to already exist.
  • A BAA on every PHI vendor. The transcription API, the LLM, the database, the storage bucket, the logging pipe: each one that touches the audio, transcript, or note needs its own signed BAA. A BAA on your coding tool covers the tool handling your code, not your running app's data. We go deeper on this in "Can you vibe-code a HIPAA-compliant app".
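
The "drafted by the model, signed by the clinician" split can be sketched as a FHIR R4 Provenance resource. The agent type codes come from the standard provenance-participant-type code system; choosing assembler for the model (rather than, say, composer) is a judgment call made for this sketch, and the function names and ids are hypothetical.

```typescript
const PARTICIPANT_TYPE = "http://terminology.hl7.org/CodeSystem/provenance-participant-type";

interface Reference { reference: string }
interface Coding { system: string; code: string }
interface ProvenanceAgent { type: { coding: Coding[] }; who: Reference }

interface Provenance {
  resourceType: "Provenance";
  target: Reference[];
  recorded: string;
  agent: ProvenanceAgent[];
  entity: { role: "source"; what: Reference }[];
}

function agent(code: string, who: string): ProvenanceAgent {
  return { type: { coding: [{ system: PARTICIPANT_TYPE, code }] }, who: { reference: who } };
}

interface NoteProvenanceInput {
  noteId: string;
  transcriptId: string;
  modelDeviceId: string;
  clinicianId: string;
  recorded: string;
}

// One record that names both the model (as a Device) and the clinician.
function noteProvenance(input: NoteProvenanceInput): Provenance {
  const clinician = `Practitioner/${input.clinicianId}`;
  return {
    resourceType: "Provenance",
    target: [{ reference: `DocumentReference/${input.noteId}` }],
    recorded: input.recorded,
    agent: [
      agent("assembler", `Device/${input.modelDeviceId}`), // generated the draft
      agent("author", clinician),                          // author of record
      agent("legal", clinician),                           // signed and accepted responsibility
    ],
    entity: [{ role: "source", what: { reference: `DocumentReference/${input.transcriptId}` } }],
  };
}

// Guard: a Device must never appear with the author role.
function modelListedAsAuthor(p: Provenance): boolean {
  return p.agent.some(
    (a) => a.who.reference.startsWith("Device/") && a.type.coding.some((c) => c.code === "author"),
  );
}
```

A check like modelListedAsAuthor is cheap to run before every write, and it turns the rule in the first bullet into something the backend can refuse rather than something reviewers have to notice.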

The signing transition, done right, is one explicit call that flips the document to immutable and writes the provenance and audit entries for you:

// Clinician reviews the AI draft, edits, and signs.
const signed = await clinical.notes.sign(draft.id, {
  signedBy: clinicianId,        // author of record
  // -> note becomes immutable
  // -> Provenance: drafted-by(model) + signed-by(clinician)
  // -> AuditEvent emitted automatically
});

The trap with scribes isn't bad transcription — models are good now. It's that the draft-to-signed-note transition, the provenance, and the audit trail all look optional in a demo and are mandatory in a medical record. That's the part to build on infrastructure that does it for you, not the part to ask an AI to scaffold.
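
To show what "enforced by the backend" means for the draft-to-signed transition, here is a minimal in-memory sketch. NoteStore and its methods are hypothetical and are not the bonfireDB implementation; a real store would persist to a database and emit FHIR AuditEvent resources. The rules it illustrates are the ones described above: drafts are editable, signed notes are not, corrections create a new linked version, and every transition is logged without the caller asking.

```typescript
type AuditAction = "create" | "edit" | "sign" | "amend";

interface AuditEntry { action: AuditAction; noteId: string; actor: string; at: string }

interface Note {
  id: string;
  text: string;
  status: "draft" | "signed";
  draftedByModel?: string; // source of the draft, not the author
  signedBy?: string;       // author of record once signed
  amends?: string;         // id of the signed note this version corrects
}

class NoteStore {
  private notes = new Map<string, Note>();
  private nextId = 1;
  readonly audit: AuditEntry[] = [];

  private log(action: AuditAction, noteId: string, actor: string): void {
    this.audit.push({ action, noteId, actor, at: new Date().toISOString() });
  }

  private mustGet(id: string): Note {
    const note = this.notes.get(id);
    if (!note) throw new Error(`Unknown note ${id}`);
    return note;
  }

  // Callers only ever receive copies, so they cannot mutate stored state.
  get(id: string): Note {
    return { ...this.mustGet(id) };
  }

  createDraft(text: string, model: string): Note {
    const note: Note = { id: `note-${this.nextId++}`, text, status: "draft", draftedByModel: model };
    this.notes.set(note.id, note);
    this.log("create", note.id, model);
    return { ...note };
  }

  edit(id: string, text: string, actor: string): Note {
    const note = this.mustGet(id);
    if (note.status === "signed") throw new Error("Signed notes are immutable; amend instead");
    note.text = text;
    this.log("edit", id, actor);
    return { ...note };
  }

  sign(id: string, clinicianId: string): Note {
    const note = this.mustGet(id);
    if (note.status === "signed") throw new Error("Note is already signed");
    note.status = "signed";
    note.signedBy = clinicianId;
    this.log("sign", id, clinicianId);
    return { ...note };
  }

  // A correction is a new signed version that links to the original.
  amend(id: string, text: string, clinicianId: string): Note {
    const original = this.mustGet(id);
    if (original.status !== "signed") throw new Error("Only signed notes can be amended");
    const amendment: Note = {
      id: `note-${this.nextId++}`, text, status: "signed",
      signedBy: clinicianId, amends: original.id,
    };
    this.notes.set(amendment.id, amendment);
    this.log("amend", amendment.id, clinicianId);
    return { ...amendment };
  }
}
```

In this sketch the original signed note is never touched by amend, so the record of what the clinician actually signed survives every later correction.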

Reading it back, prepping the next visit, and getting out

A scribe isn't write-only. Clinicians read prior notes, the AI needs context for the next encounter, and you eventually need your data in a portable format. Three calls cover the common cases:

// Live-updating note view in the UI:
const { data: notes } = useClinicalQuery(
  clinical.notes.list, { patientId }
);

// Give the model grounded context before the next visit:
const context = await clinical.agent.sessionPrep({ patientId });

// Portability is not optional — export real FHIR R4:
const bundle = await clinical.fhir.export(patientId);

sessionPrep matters specifically for scribes: a good draft depends on the model knowing the patient's history, current problems, and prior plan — and pulling that context through an authorization-aware, BAA-covered path instead of dumping the whole chart into a prompt. That path is exactly what bonfire's agent layer (MCP + clean projections) is designed to provide, and why it's the part worth measuring: agents working over raw FHIR cap out around 50% on real tasks (FHIR-AgentBench), so the context call has to do better than the raw resource graph. FHIR export matters because a clinical record you can't take with you is a liability, not an asset.
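
The difference between "dumping the whole chart into a prompt" and an authorization-aware projection can be sketched in a few lines. This is not how sessionPrep is implemented; buildSessionContext, ChartItem, and Caller are hypothetical names, and the recency-based selection is an assumption chosen to keep the example small.

```typescript
interface ChartItem {
  patientId: string;
  kind: "problem" | "plan" | "note" | "observation";
  date: string; // ISO 8601, so string comparison orders chronologically
  summary: string;
}

interface Caller { clinicianId: string; authorizedPatientIds: string[] }

// Return a small, ordered projection of one patient's chart,
// and refuse outright if the caller has no access to that patient.
function buildSessionContext(
  caller: Caller,
  patientId: string,
  chart: ChartItem[],
  maxItems = 5,
): string[] {
  if (!caller.authorizedPatientIds.includes(patientId)) {
    throw new Error(`Clinician ${caller.clinicianId} is not authorized for patient ${patientId}`);
  }
  return chart
    .filter((item) => item.patientId === patientId) // never leak another patient's data
    .sort((a, b) => (a.date < b.date ? 1 : a.date > b.date ? -1 : 0)) // newest first
    .slice(0, maxItems)
    .map((item) => `${item.date} [${item.kind}] ${item.summary}`);
}
```

The authorization check runs before any chart data is read into the result, so a misrouted request fails closed instead of producing a prompt that contains the wrong patient's history.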

Build vs. buy the data layer

You will build the scribe — the capture, the prompt, the review UI. That's your product. The question is whether you build the clinical data layer beneath it, and at pre-seed the honest answer is usually no.

  • Building it yourself means hand-modeling FHIR resources, implementing signed-note immutability and amendments, wiring provenance and an audit trail, getting per-patient authorization right, and signing a BAA with every PHI vendor — before you've validated the scribe. Months of regulated infrastructure, none of it your differentiator.
  • AWS HealthLake is a real, mature FHIR datastore and the sensible enterprise answer. It is also priced and shaped for enterprise: you still build the scribe-specific model (drafts, signing, provenance) on top, and the integration weight is real for a two-person team. See the honest bonfireDB vs. AWS HealthLake comparison.
  • bonfireDB is the pre-seed/indie answer: an open-source clinical backend (TypeScript + Postgres + pgvector, FHIR R4 underneath, no Redis), BAA-from-day-one via the managed option, and agent-native SDK calls like the ones above. The data model a scribe needs — Encounter, transcript, draft-to-signed note, provenance, audit, FHIR export — is the product, not a project.

The framing is deliberate: build the workflow, buy the regulated data layer. Other reframes of code-first FHIR exist (Medblocks has done strong work here) — pick whoever fits, but don't build the layer from scratch to learn what FHIR signing and audit cost.

TL;DR

  • An AI scribe's backend is FHIR underneath: transcript + note as DocumentReference/Composition, audio as Media/Binary, findings as Observation.
  • The data model is a graph — Patient → Encounter → transcript → DraftNote → SignedNote — not a notes table.
  • The parts you can't fake: AI-vs-clinician provenance, signed-note immutability (corrections as amendments), an automatic audit trail, a BAA on every PHI vendor, and FHIR export.
  • HealthLake is the real enterprise option; bonfireDB is the pre-seed/indie one — BAA-from-day-one and agent-native. Build the scribe, buy the clinical data layer.
  • Start with the scribe backend overview and the security & HIPAA model; compare options in comparisons.

Frequently asked questions

What backend should an AI medical scribe use?

Use a clinical backend that stores data as FHIR R4: transcripts and notes as DocumentReference, audio as Media/Binary, structured findings as Observation. It should sit behind a signed BAA, keep a tamper-evident audit trail, and record provenance that separates AI drafts from clinician-signed notes. Vibe-code the scribe UI; do not vibe-code that data layer.

Where do the audio, transcript, and SOAP note actually get stored in FHIR?

Each artifact has a correct FHIR home. The audio recording is a Media resource backed by a Binary, linked to the encounter. The raw transcript is a DocumentReference of transcript type. The SOAP/DAP note is a DocumentReference, or a Composition when you want it sectioned into Subjective, Objective, Assessment, and Plan.

How should an AI scribe record provenance — is the AI the author of the note?

No. The model is the source of a draft; it is never the author of a clinical note. The licensed clinician who reviews and signs is the author of record. Store both in a Provenance resource: drafted by the model from a specific transcript, edited by the clinician, signed by the clinician. Conflating them is exactly what an audit surfaces.

How does a signed clinical note stay immutable — can you edit it later?

Once signed, a note cannot be silently edited. Corrections happen as an amendment: a new version that links back to the original (a relatesTo link in FHIR R4), never an in-place overwrite. The backend has to enforce this, because "let the user edit the note" is the default an AI tool will generate. Every sign and amend also emits an AuditEvent.

Does a BAA on my coding tool cover the transcription API and the LLM?

No. A BAA on your coding tool covers that tool handling your code, not your running app's data. Every vendor that touches the audio, transcript, or note — the transcription API, the LLM, the database, the storage bucket, the logging pipe — needs its own signed BAA. Default coding-tool tiers do not cover PHI; some enterprise tiers do sign one.

Should I build the scribe data layer myself, use HealthLake, or use bonfireDB?

Building it yourself means months of regulated infrastructure before validating the scribe. AWS HealthLake is a mature, enterprise-shaped FHIR datastore; you still build drafts, signing, and provenance on top. bonfireDB is the pre-seed answer in early access: an open-source clinical backend with that model and a BAA via the managed option. Build the workflow, buy the data layer.

You build the app. Bonfire is the clinical data layer underneath.

bonfireDB is the clinical backend for an AI medical scribe — Encounter, transcript, draft-to-signed note, provenance, audit, and FHIR export, in your AWS under your BAA. Open source, early access.