Pilot Healthcare Software Modelling LLMOps/MLOps Privacy/Explainability

Privacy-preserving clinical decision support

A privacy-preserving, explainable decision support system for a Dutch geriatric EHR platform. The challenge was extracting reliable signal from messy clinical notes and embedding it into existing clinician workflows, without introducing new screens, new logins, or new reasons to distrust the system.

Named: GeriMedica, a geriatric EHR platform (NL)

Milestone

Pilot running

workflow integrated

The blocker

Symptom

Clinicians had useful information in their notes but no way to surface patterns across patients at decision time.

Root cause

Clinical narratives are unstructured, messy, and context-dependent. Standard NLP pipelines failed on edge cases common in geriatric care.

Why it persisted

Existing tools required structured input; the EHR was rich in free text. The cost of annotation was too high to create a training set from scratch.

What was built

System-level. What it actually is: inputs, outputs, users.

  • Feature extraction pipeline: LLM-assisted extraction of clinical signals from narrative notes, with validation against structured EHR fields.

  • Evaluation harness: offline validation framework using annotated gold standard; failure mode analysis for common geriatric documentation patterns.

  • Explainability surfaces: per-prediction explanation designed to fit clinical workflow, 'why this patient, why now, what contributed'.

  • Human-in-the-loop design: override mechanism and confidence thresholds that trigger review rather than suppress output.

  • Privacy architecture: data minimisation, pseudonymisation, no raw note storage downstream.

  • Interfaces: inputs are EHR notes and structured patient data; outputs are flagged risk signals, each with an explanation; users are geriatric clinicians working in the existing EHR interface.
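The human-in-the-loop behaviour described above, where low confidence triggers review instead of hiding a flag, could be sketched roughly as follows. This is a minimal illustration, not the pilot's implementation: the `Signal` fields, the `Route` names, and the `REVIEW_THRESHOLD` value are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    SHOW = "show"                        # display the flag with its explanation
    SHOW_FOR_REVIEW = "show_for_review"  # display it, and ask the clinician to confirm


@dataclass(frozen=True)
class Signal:
    patient_ref: str   # pseudonymised reference, never a raw identifier
    label: str         # hypothetical label, e.g. "fall risk"
    confidence: float  # model confidence in [0, 1]
    explanation: str   # short "what contributed" text shown to the clinician


# Hypothetical threshold; a real value would come out of the pilot evaluation.
REVIEW_THRESHOLD = 0.75


def route(signal: Signal, review_threshold: float = REVIEW_THRESHOLD) -> Route:
    """Decide how a signal is surfaced. No branch suppresses the signal."""
    if not 0.0 <= signal.confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if signal.confidence < review_threshold:
        return Route.SHOW_FOR_REVIEW
    return Route.SHOW


low = Signal("a1b2", "fall risk", 0.40, "two falls mentioned in recent notes")
high = Signal("a1b2", "fall risk", 0.90, "two falls mentioned in recent notes")
print(route(low).value)   # show_for_review
print(route(high).value)  # show
```

The point of the sketch is that the function has no "drop" outcome: uncertainty changes how a flag is presented, not whether the clinician sees it.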

Architecture diagram

[Diagram. Components shown: EHR data, Feature extraction, Explainability surfaces, Clinician UI, Audit log (privacy-preserving).]

How we evaluated it

What "working" meant: baselines, metrics, guardrails, failure modes.

Definition of working

Extraction precision and recall on gold-standard annotated notes meet the clinical acceptance threshold, and workflow integration doesn't add friction.

Metrics tracked

  • Extraction precision/recall on annotated sample

  • Clinician acceptance score (pilot feedback)

  • False positive rate on decision-relevant flags
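As a rough illustration of how the first and third metrics might be computed, the sketch below scores extractions as sets of `(note_id, signal)` pairs against a gold standard, and computes a false positive rate over binary flags. The pair representation, the function names, and the sample data are assumptions for the example; the pilot's actual annotation schema is not described here.

```python
def precision_recall(predicted: set, gold: set) -> tuple[float, float]:
    """Set-based precision and recall over (note_id, signal) pairs."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


def false_positive_rate(flagged: list[bool], relevant: list[bool]) -> float:
    """FP / (FP + TN) over cases where a flag could have been raised."""
    if len(flagged) != len(relevant):
        raise ValueError("flagged and relevant must have the same length")
    pairs = list(zip(flagged, relevant))
    false_pos = sum(1 for f, r in pairs if f and not r)
    true_neg = sum(1 for f, r in pairs if not f and not r)
    negatives = false_pos + true_neg
    return false_pos / negatives if negatives else 0.0


# Made-up example data, for illustration only.
gold = {("n1", "fall"), ("n1", "delirium"), ("n2", "weight loss")}
predicted = {("n1", "fall"), ("n2", "weight loss"), ("n2", "fall")}

p, r = precision_recall(predicted, gold)
print(round(p, 2), round(r, 2))  # 0.67 0.67
```

The clinician acceptance score is a survey-style measure and is not covered by this sketch.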

Failure modes checked

  • Confabulation: LLM hallucinating clinical facts not in notes

  • Documentation variance: clinicians writing the same thing differently

  • Edge cases: atypical presentations common in elderly patients
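One simple guard against the confabulation failure mode is to require every extracted fact to carry a quoted evidence snippet, and to reject any extraction whose snippet cannot be found in the source note. The sketch below assumes that design; the function names are hypothetical and this is not presented as the check used in the pilot.

```python
import re


def normalise(text: str) -> str:
    """Collapse whitespace and lowercase, so line breaks and casing don't matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def is_grounded(evidence: str, note: str) -> bool:
    """True only if the non-empty evidence snippet occurs verbatim in the note."""
    snippet = normalise(evidence)
    return bool(snippet) and snippet in normalise(note)


# Made-up note text, for illustration only.
note = "Pt fell twice last week.\nUses  walker indoors."

print(is_grounded("fell twice last week", note))  # True
print(is_grounded("hip fracture", note))          # False
```

A verbatim check like this only catches fabricated quotes. It does not catch a correct quote attached to a wrong conclusion, and it does nothing for documentation variance, which is why those are listed as separate failure modes.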

Milestone

Feature extraction pipeline validated. Explainability surfaces prototype tested with clinical input. Integration path into GeriMedica EHR defined. Pilot evaluation in progress.

Why it was hard

Constraints that shaped every decision.

Documentation heterogeneity

Geriatric clinical notes follow no consistent structure; extraction needs to handle abbreviations, typos, and domain-specific shorthand.

Adoption constraint

Any AI signal must appear inside the existing EHR interface, not in a separate tool, or clinicians won't use it.

Explainability requirement

Clinical context demands a 'why' at a granularity that most model explanations don't naturally produce.

Privacy and security

The system handles protected health data; the GenAI integration had to account for prompt injection risk and data minimisation obligations.

EU AI Act relevance

A high-risk AI classification is likely; documentation, human oversight, and traceability were designed in from the start.
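To make the data minimisation and pseudonymisation constraints concrete, here is a minimal sketch of one common pattern: replace the patient identifier with a keyed hash and pass only an allow-list of fields downstream, so the raw note never leaves the extraction step. The field names, the allow-list, and the key handling are assumptions for illustration; the pilot's actual privacy architecture is not specified at this level of detail.

```python
import hashlib
import hmac

# Hypothetical allow-list: only these fields may travel downstream.
ALLOWED_FIELDS = ("label", "confidence")


def pseudonymise(patient_id: str, secret_key: bytes) -> str:
    """Keyed hash (HMAC-SHA256): stable per patient, not reversible without the key."""
    return hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()


def minimise(record: dict, secret_key: bytes) -> dict:
    """Keep allow-listed fields only; drop the raw note and the raw identifier."""
    reduced = {key: record[key] for key in ALLOWED_FIELDS if key in record}
    reduced["patient_ref"] = pseudonymise(record["patient_id"], secret_key)
    return reduced


# Made-up record; in practice the key would come from a secrets manager.
record = {
    "patient_id": "12345",
    "note_text": "Pt fell twice last week.",
    "label": "fall risk",
    "confidence": 0.8,
}
reduced = minimise(record, secret_key=b"example-key-not-for-production")
print(sorted(reduced))  # ['confidence', 'label', 'patient_ref']
```

A keyed hash is used here rather than a plain hash because patient identifiers have a small, guessable value space; without the key, an attacker could hash every candidate identifier and match the results.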

What comes next

If continuing: next hypotheses, next system increment, next risk gate.

  1. Prospective validation: run the extraction pipeline on incoming notes rather than historical records to test real-time performance.

  2. Clinician feedback loop: build a structured feedback mechanism so clinicians can flag wrong extractions without disrupting workflow.

  3. Subsidy pathway: pursue Dutch ZonMw / NWO health AI funding to formalise the evaluation study design.

Built with EU traceability + oversight expectations in mind.

Security-aware GenAI integration patterns. (ISPE)
