Semantic Search for Clinical Notes

why it's different

You search the meaning, not the schema

"Find when this kid's anxiety started spiking around school" is a clinical question, not a FHIR query. bonfireDB embeds your clean operational views — notes, assessments, timelines — and runs hybrid search over them. Keyword recall catches the codes and exact phrases; vector recall catches the way a clinician actually wrote it.

Hybrid, not just vectors

Keyword + vector together. Exact-match precision where it matters, semantic recall where the wording varies.

Clean views, not raw FHIR

Search reads the same readable projections your app does, never raw FHIR Bundle JSON that agents can't parse.

ABAC at retrieval

Permissions enforced per-hit at query time. Filtering happens before results leave the database, never in the prompt.

Cited to the source

Every result points back to the record it came from. No floating snippets, no "trust me" answers.
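To make "keyword + vector together" concrete, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to merge a keyword ranking and a vector ranking into a single list. This is an illustration, not bonfireDB's documented ranking formula: the function name, the record ids, and the constant k = 60 are all assumptions.

```typescript
// Reciprocal rank fusion: a generic technique for merging two ranked lists.
// Assumption: bonfireDB's actual scoring is not specified here; this only
// illustrates how exact-match and semantic rankings can be combined.
function reciprocalRankFusion(
  keywordRanked: string[],
  vectorRanked: string[],
  k: number = 60 // conventional damping constant; tune for your data
): Array<{ id: string; score: number }> {
  const scores: Record<string, number> = {};
  for (const list of [keywordRanked, vectorRanked]) {
    list.forEach((id, index) => {
      // Ranks are 1-based, so the top hit in a list contributes 1 / (k + 1).
      scores[id] = (scores[id] ?? 0) + 1 / (k + index + 1);
    });
  }
  return Object.keys(scores)
    .map((id) => ({ id, score: scores[id] }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical record ids: the keyword list favors the exact code match,
// the vector list favors the note that paraphrases the question.
const fused = reciprocalRankFusion(
  ["note-icd-f41", "note-school", "note-sleep"],
  ["note-school", "note-worry", "note-icd-f41"]
);
```

A record that ranks well in both lists rises to the top, while a record found by only one method still makes the list with a lower score.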

One Postgres. Vectors next to the data.

bonfireDB stores embeddings in pgvector inside the same Postgres that holds the source of truth. No second vector store to sync, no copy of PHI shipped to a third-party index, no drift between "what's in the DB" and "what's searchable."

✓ Vectors stored beside the rows they describe — same database, same backups, same boundary.

✓ Embeddings generated under your BAA — Bedrock running in your VPC, PHI never leaves the perimeter.

✓ Filtering, scoping, and similarity in one query — no fan-out to an external service.

pgvector + Postgres + Bedrock-in-VPC. Nothing new to operate.
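As a rough picture of "filtering, scoping, and similarity in one query", the sketch below builds a single parameterized statement that scopes by tenant and patient in the WHERE clause and ranks by pgvector cosine distance (the `<=>` operator). The table `note_chunks` and its columns are hypothetical, and bonfireDB's real schema and query are not shown on this page. For brevity it orders by vector distance only and returns the keyword score alongside.

```typescript
// Hypothetical schema (an assumption for illustration):
//   note_chunks(record_id, tenant_id, patient_id, body, tsv tsvector, embedding vector)
interface ScopedQuery {
  text: string;
  values: unknown[];
}

function buildScopedSearch(params: {
  tenantId: string;
  patientId: string;
  queryText: string;
  queryEmbedding: number[];
  limit?: number;
}): ScopedQuery {
  // Scope filters and similarity live in the same statement, so rows outside
  // the caller's tenant or patient are never candidates for ranking.
  const text = `
    SELECT record_id,
           body,
           ts_rank(tsv, plainto_tsquery('english', $3)) AS keyword_score,
           1 - (embedding <=> $4::vector)               AS vector_score
    FROM note_chunks
    WHERE tenant_id = $1
      AND patient_id = $2
    ORDER BY embedding <=> $4::vector
    LIMIT $5`;
  return {
    text,
    values: [
      params.tenantId,
      params.patientId,
      params.queryText,
      // pgvector accepts the text form '[x,y,z]' cast to vector
      `[${params.queryEmbedding.join(",")}]`,
      params.limit ?? 20,
    ],
  };
}

const scoped = buildScopedSearch({
  tenantId: "clinic-a",
  patientId: "pat-1",
  queryText: "worsening anxiety around school stress",
  queryEmbedding: [0.1, 0.2, 0.3], // toy 3-dimensional embedding
});
```

The `{ text, values }` shape matches the query config object that node-postgres accepts. Caller-supplied values travel as parameters and never as interpolated SQL.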

search.ts

const hits = await clinical.search.hybrid({
  patientId,
  query: "worsening anxiety around school stress",
  citations: true
})

// hits are ABAC-scoped to this caller,
// ranked by keyword + vector relevance,
// each carrying a citation to its source record
const cited = hits.map(h => ({
  view: h.source.view,        // notesByPatient | timeline | ...
  score: h.score,
  cite: h.source.recordId,    // the record this came from
  snippet: h.snippet
}))

retrieval safety

The permission check is the query, not the prompt

Putting "only show records this clinician can see" in the system prompt is a hope, not a control. bonfireDB applies ABAC at retrieval — the search itself never returns a row the caller isn't entitled to. The model only ever sees what it's already allowed to see.

Search-then-filter-in-prompt

Index returns everything, the prompt is told to "be careful," and one jailbreak or a sloppy chain leaks another patient's record into the context window.

Filter-at-retrieval (bonfireDB)

Every hit is patient- and tenant-scoped before it leaves Postgres. An unauthorized row never reaches the prompt in the first place.
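The difference between the two approaches can be shown with a small in-memory sketch. The entitlement rule here (same tenant, patient on the caller's list) is a deliberately simple stand-in for a real ABAC policy, and every type and function name is hypothetical.

```typescript
interface ClinicalRow {
  recordId: string;
  tenantId: string;
  patientId: string;
  text: string;
}

interface CallerContext {
  tenantId: string;
  patientIds: string[];
}

// Stand-in for an ABAC policy. A real policy would also weigh roles,
// care relationships, consent, and similar attributes.
function isEntitled(caller: CallerContext, row: ClinicalRow): boolean {
  return row.tenantId === caller.tenantId && caller.patientIds.indexOf(row.patientId) !== -1;
}

// Filter-at-retrieval: entitlement is checked before relevance, so an
// unauthorized row never becomes a candidate.
function retrieveScoped(
  caller: CallerContext,
  rows: ClinicalRow[],
  matches: (row: ClinicalRow) => boolean
): ClinicalRow[] {
  return rows.filter((row) => isEntitled(caller, row) && matches(row));
}

// The prompt is built only from rows that survived retrieval.
function buildPrompt(question: string, hits: ClinicalRow[]): string {
  const context = hits.map((h) => `[${h.recordId}] ${h.text}`).join("\n");
  return `Answer using only the cited records below.\n${context}\n\nQuestion: ${question}`;
}

const rows: ClinicalRow[] = [
  { recordId: "r1", tenantId: "clinic-a", patientId: "p1", text: "Anxiety spikes before school." },
  { recordId: "r2", tenantId: "clinic-a", patientId: "p2", text: "Anxiety after exams." },
  { recordId: "r3", tenantId: "clinic-b", patientId: "p1", text: "Anxiety noted at intake." },
  { recordId: "r4", tenantId: "clinic-a", patientId: "p1", text: "Sleep improving." },
];
const caller: CallerContext = { tenantId: "clinic-a", patientIds: ["p1"] };

const scopedHits = retrieveScoped(caller, rows, (row) => /anxiety/i.test(row.text));
const promptText = buildPrompt("When did the anxiety start?", scopedHits);
```

Three rows mention anxiety, but only `r1` belongs to this caller's tenant and patient, so it is the only row the prompt can contain. No instruction in the prompt is needed to keep `r2` and `r3` out.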

honest comparison

The closest shipping thing is Google's — and it's sunsetting

Credit where it's due: Google Agent Search for healthcare already ships FHIR-native semantic search, and it's a serious managed offering. The catch is that Google has announced an end-of-life date of 2027-05-15, so building your retrieval layer on it means building on a clock. bonfireDB makes a different bet: keep semantic search in the same Postgres you already run, so retrieval isn't a separate managed service that can be deprecated out from under you.

Agent Search for healthcare

FHIR-native semantic search, managed by Google — genuinely the closest shipping competitor. Announced end of life 2027-05-15. Re-verify the date and migration path against Google's current docs before you commit.

bonfireDB

pgvector colocated with your source of truth, ABAC at retrieval, cited hits, embeddings under your own BAA. Pre-launch and early-access — fewer batteries included than a hyperscaler service, by design: one boundary, nothing extra to operate.

how it stays fresh

A sidecar projection, kept current by the async index lane

Semantic search is a sidecar projection fed from the store — not a blocking step on your writes. When you save a note, the operational read models are fresh on commit; the embedding job runs on the async index lane and reports its own status. No write is held hostage to an embedding call.

Write commits

clinical.notes.create(...) returns immediately. Operational views are fresh on commit.

Index lane picks it up

The async index lane embeds the new content into pgvector beside the data — heavy work, off the write path.

Freshness is reported

The write's freshness object reports indexes.semanticSearch as pending, then fresh once the embedding lands, so you always know what's searchable.
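The three steps above can be pictured with a toy in-memory model. It is a sketch under stated assumptions, not bonfireDB's implementation: the class, its methods, and the synchronous "lane" are hypothetical stand-ins for a real store and a real background worker.

```typescript
type LaneState = "pending" | "fresh";

interface StoredNote {
  id: string;
  text: string;
  semanticSearch: LaneState;
}

class InMemoryStore {
  private notes: Record<string, StoredNote> = {};
  private queue: string[] = [];

  // Write path: commit the row and enqueue indexing. No embedding call here.
  createNote(id: string, text: string): StoredNote {
    const note: StoredNote = { id, text, semanticSearch: "pending" };
    this.notes[id] = note;
    this.queue.push(id);
    return { ...note }; // return a copy so callers see the state at commit time
  }

  // Index lane: runs later, off the write path. Returns how many notes it embedded.
  runIndexLane(embed: (text: string) => number[]): number {
    let processed = 0;
    while (this.queue.length > 0) {
      const id = this.queue.shift() as string;
      const note = this.notes[id];
      if (!note) continue;
      embed(note.text); // a real lane would persist this vector beside the row
      note.semanticSearch = "fresh";
      processed += 1;
    }
    return processed;
  }

  stateOf(id: string): LaneState | undefined {
    return this.notes[id]?.semanticSearch;
  }
}

const store = new InMemoryStore();
const saved = store.createNote("note-1", "Worsening anxiety before school.");
const stateBefore = store.stateOf("note-1");
const embeddedCount = store.runIndexLane((text) => [text.length]); // toy embedding
const stateAfter = store.stateOf("note-1");
```

The write returns with the index marked pending, and the state only changes to fresh after the lane has run. Nothing on the write path waits for the embedding function.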

Every write tells you what's searchable

No guessing whether the index caught up. The freshness lifecycle distinguishes what's committed and queryable now from what's still being indexed.

✓ Operational views fresh on commit — your list screens never lag.

✓ Semantic index reports pending until embedded — no silent rot.

freshness

{
  status: "committed",
  views: {
    notesByPatient: "fresh",
    timeline: "fresh"
  },
  indexes: {
    semanticSearch: "pending",
    agentContext: "pending"
  }
}
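A caller might consume that object along these lines. The type below models only the fields and the two states shown in the example above (other states may exist), and both helper names are hypothetical.

```typescript
// Only "fresh" and "pending" appear in the example; treat any other value
// as not yet searchable.
type ProjectionState = "fresh" | "pending";

interface Freshness {
  status: string;
  views: Record<string, ProjectionState>;
  indexes: Record<string, ProjectionState>;
}

// Names of indexes that have not caught up with this write yet.
function pendingIndexes(freshness: Freshness): string[] {
  return Object.keys(freshness.indexes).filter(
    (indexName) => freshness.indexes[indexName] !== "fresh"
  );
}

// True once the write is visible to semantic search.
function isSearchable(freshness: Freshness): boolean {
  return freshness.indexes["semanticSearch"] === "fresh";
}

const afterWrite: Freshness = {
  status: "committed",
  views: { notesByPatient: "fresh", timeline: "fresh" },
  indexes: { semanticSearch: "pending", agentContext: "pending" },
};

const stillPending = pendingIndexes(afterWrite);
const searchableNow = isSearchable(afterWrite);
```

A UI could use `stillPending` to show an "indexing" badge on a freshly saved note while list screens, which read the already-fresh views, render it immediately.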

feeds the agents

The same clean retrieval your agents run on

Hybrid search isn't a separate stack from your AI layer — it's the retrieval primitive underneath it. Agents read these same cited, permission-aware projections, never raw FHIR by default.
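One way to picture hybrid search as an agent tool is the sketch below. It uses the general shape of an MCP tool definition (name, description, JSON Schema input) and a handler that takes the patient scope from the server-side session and never from model-supplied arguments. The tool name, the handler, and the synchronous search stand-in are assumptions; this is not the output of the Custom MCP builder.

```typescript
interface CitedHit {
  snippet: string;
  source: { view: string; recordId: string };
}

// Synchronous stand-in for a hybrid search call, so the sketch stays runnable.
type HybridSearchFn = (args: {
  patientId: string;
  query: string;
  citations: boolean;
}) => CitedHit[];

// Tool definition the agent sees. Note that patientId is NOT an input:
// the model can choose the question, not whose records it reads.
const searchNotesTool = {
  name: "search_clinical_notes",
  description: "Hybrid search over one patient's clinical notes; every result carries a citation.",
  inputSchema: {
    type: "object",
    properties: {
      query: { type: "string", description: "Clinical question in plain language" },
    },
    required: ["query"],
  },
};

function runSearchTool(
  session: { patientId: string },
  args: { query: string },
  search: HybridSearchFn
): string {
  const hits = search({ patientId: session.patientId, query: args.query, citations: true });
  // Each line keeps its citation attached to the snippet.
  return hits
    .map((h) => `${h.snippet} (source: ${h.source.view}/${h.source.recordId})`)
    .join("\n");
}

// Fake search for demonstration; it records which patient scope it was given.
let seenPatientId = "";
const fakeSearch: HybridSearchFn = (args) => {
  seenPatientId = args.patientId;
  return [
    {
      snippet: "Anxiety spikes before school.",
      source: { view: "notesByPatient", recordId: "rec-42" },
    },
  ];
};

const toolOutput = runSearchTool({ patientId: "p1" }, { query: "anxiety around school" }, fakeSearch);
```

Keeping the scope out of the input schema means a prompt injection can change the query text but cannot point the tool at another patient.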

Custom MCP builder

Expose hybrid search as a scoped, cited MCP tool — safe by default.


App-native primitives

Notes, assessments, observations — the typed views search reads from.


Always fresh

The freshness lifecycle that tells you what's indexed and what's pending.


Clinical authorization & audit

The ABAC engine enforced at retrieval, every read audited.


FAQ

Frequently asked questions

Do I need a separate vector DB for clinical notes?

No. bonfireDB stores embeddings in pgvector inside the same Postgres that holds your source of truth, so there's no second vector store to sync, no copy of PHI shipped to a third-party index, and no drift between what's in the DB and what's searchable. It's designed so filtering, scoping, and similarity all happen in one query.

How does bonfireDB do clinical RAG without leaking other patients' records?

ABAC is applied at retrieval, not in the prompt. Every hit is patient- and tenant-scoped before it leaves Postgres, so an unauthorized row never reaches the model's context in the first place. The model only ever sees what the caller is already allowed to see.

Is this keyword search or vector search?

Both. bonfireDB runs hybrid search — keyword recall catches exact codes and phrases, while vector recall catches the way a clinician actually wrote it. Results are ranked by combined keyword + vector relevance, and every hit cites the source record it came from.

Does semantic search run over raw FHIR® JSON?

No. Search reads the same clean, readable projections your app does — notes, assessments, timelines — never raw FHIR Bundle JSON that agents can't parse. FHIR® is a registered trademark of HL7, used descriptively here.

How does the index stay fresh after I save a note?

Semantic search is a sidecar projection fed by an async index lane, so no write is blocked on an embedding call. Operational views are fresh on commit, and each write's freshness object marks the semantic index as pending until it's embedded — so you always know what's searchable, with no silent rot.

🔎 Semantic search over clean clinical views