What HIPAA-Compliant Robot Training Data Actually Looks Like

Why this is a regulatory gate, not a hardware one

Almost every publicly available physical-AI training corpus was collected somewhere a hospital lawyer would never have allowed it. Open X-Embodiment pooled data from 22 robot embodiments contributed by academic and industrial labs. Bridge V2 covers kitchen and tabletop manipulation. Build AI's Egocentric-1M (April 2026 release on Hugging Face) is sneaker resale, baristas, cooking, household chores. Mecka's EgoVerse paper (arXiv 2604.07607; 1,362 hours, 80,000 episodes, 1,965 tasks; co-authored with Georgia Tech, Stanford, UCSD, ETH Zurich, MIT CSAIL, Meta Reality Labs and Scale AI) sits squarely in everyday non-clinical environments. Lightwheel's EgoSuite collects 20,000 hours per week with the same gig-worker model. None of it touches a sterile field.

Surgical robotics is now the highest-value vertical for the next generation of foundation models. Intuitive Surgical's da Vinci 5 platform has staked out the assist-and-automate roadmap. Medtronic Hugo crossed an FDA clearance milestone this cycle. CMR Surgical contributed surgical primitives to NVIDIA's Open-H reference dataset on March 16, 2026 — the first publicly disclosed surgical contribution to an open foundation-model corpus. The hardware is converging. The trained behaviors are not, because the demonstration data is locked behind HIPAA, 45 CFR 46 (the Common Rule for human subjects research), institutional review boards, and the credentialing posture of every operating room in the country. The gate is regulatory.

This post describes what we have designed to clear that gate, and what we learned designing it. We are not deployed. We have not enrolled a patient. We do not have an IRB approval letter to wave around. The design itself is the artifact we are publishing today.

The protocol design

An IRB-acceptable surgical capture protocol has at least five load-bearing pieces, and getting any one of them wrong is a six-month restart.

Informed consent has two surfaces, not one. The patient consents to having their procedure recorded and used for model training under defined retention and re-identification controls. The OR staff — surgeon, scrub tech, circulating nurse, anesthesiologist, perfusionist where present — each consent separately to being captured as an identifiable performer in egocentric video. Our draft protocol treats these as independent consent documents under independent withdrawal rights, because the case law and the bioethics literature both point that direction. A single combined consent risks a finding that one party's consent coerced the other.

The BAA scope has to be wider than people expect. A Business Associate Agreement with the hospital covers us as a downstream processor of Protected Health Information, but the BAA has to enumerate every PHI class we touch in a surgical egocentric feed — patient body habitus, identifying tattoos and surgical scars, visible operative-site anatomy that could re-identify in a rare-disease context, audible mentions of patient name or MRN, visible wristband barcodes, visible monitor readouts that include MRN or DOB, and the date-and-time metadata that combines with a procedure schedule to re-identify. The Safe Harbor de-identification standard (the 18 HIPAA identifiers) is the floor, not the ceiling, for multimodal capture — surgical video produces re-identification vectors that don't exist in tabular EHR data.

De-identification happens at capture time wherever possible. Face redaction, tattoo masking, and monitor-screen blurring run on-rig before anything leaves the OR's on-prem buffer. Audio is captured in two channels — one redacted in near-real-time for surgeon/patient name mentions using a finite-state model trained on the case's pre-loaded participant list, and one raw channel that stays inside the hospital's controlled environment and is destroyed on a defined schedule if not specifically retained for adjudication. Equipment serial numbers and asset tags are blurred at capture. Post-capture redaction is a second pass, not a primary defense.
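As a rough illustration of the name-redaction channel, here is a minimal sketch in Python. It assumes an upstream speech-to-text step that emits word-level timestamps, and it uses a plain longest-match phrase lookup as a stand-in for the finite-state model; the `Word` type, the function names, and the example names are all hypothetical, not part of any existing system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One transcribed word with its time span in seconds (hypothetical type)."""
    text: str
    start: float
    end: float


def normalize(token: str) -> str:
    """Lowercase and strip punctuation so 'Alvarez,' matches 'alvarez'."""
    return "".join(ch for ch in token.lower() if ch.isalnum())


def build_phrases(participants):
    """Index every full name and every individual name token from the
    case's pre-loaded participant list."""
    phrases = set()
    for name in participants:
        tokens = tuple(t for t in (normalize(p) for p in name.split()) if t)
        if tokens:
            phrases.add(tokens)
            for tok in tokens:
                phrases.add((tok,))
    return phrases


def redaction_spans(words, phrases):
    """Return (start, end) spans to mute, preferring the longest match."""
    max_len = max((len(p) for p in phrases), default=0)
    spans = []
    i = 0
    while i < len(words):
        matched = 0
        for n in range(min(max_len, len(words) - i), 0, -1):
            window = tuple(normalize(w.text) for w in words[i:i + n])
            if window in phrases:
                matched = n
                break
        if matched:
            spans.append((words[i].start, words[i + matched - 1].end))
            i += matched
        else:
            i += 1
    return spans
```

Matching on single name tokens deliberately over-redacts (a surname that is also a common word will be muted), which fits a capture-time pass that errs toward removal and leaves finer judgment to the second pass.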

The consent workflow is opt-in at the point of care with named withdrawal rights. Patients sign during the pre-op encounter, not in the immediate pre-procedural period where consent quality is contested. OR staff consent annually with per-case opt-outs. Withdrawal is honored on a documented timeline, and that timeline becomes a contractual term with the hospital — we're designing the propagation infrastructure to make it tractable. This is harder than it sounds; we'll come back to it.
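The two consent surfaces can be read as a gate that must pass before a rig records anything. The sketch below is one hedged way to express that gate; the record types, the 365-day reading of "annually", and the rule that patient consent must predate the day of the procedure are illustrative assumptions, not terms of the draft protocol.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Optional

# Assumption: "annual" staff consent is modeled as 365 days from signature.
ANNUAL_VALIDITY = timedelta(days=365)


@dataclass
class PatientConsent:
    signed_on: date
    withdrawn_on: Optional[date] = None


@dataclass
class StaffConsent:
    signed_on: date
    opted_out_cases: set = field(default_factory=set)


def capture_blockers(case_id, case_date, patient, staff_present, staff_consents):
    """Return the reasons capture must NOT start; an empty list means go."""
    blockers = []
    if patient is None:
        blockers.append("patient: no consent on file")
    elif patient.withdrawn_on is not None:
        blockers.append("patient: consent withdrawn")
    elif patient.signed_on >= case_date:
        # Assumption: a pre-op encounter means a day before the procedure.
        blockers.append("patient: not signed before the day of the procedure")

    for staff_id in staff_present:
        consent = staff_consents.get(staff_id)
        if consent is None:
            blockers.append(f"{staff_id}: no consent on file")
        elif not (consent.signed_on <= case_date < consent.signed_on + ANNUAL_VALIDITY):
            blockers.append(f"{staff_id}: annual consent not current")
        elif case_id in consent.opted_out_cases:
            blockers.append(f"{staff_id}: opted out of this case")
    return blockers
```

Returning reasons rather than a boolean keeps the gate auditable: the list can be logged next to the case record whether or not capture proceeds.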

The data flow is on-prem buffer first, then encrypted transfer to a controlled environment, then ingestion. Keys are split-custody between Kindly and the hospital partner so neither party can unilaterally decrypt a raw shard. Every read, every annotation, every model-training inclusion event is logged with the consent-document version that authorized it. The Minimum Necessary standard (45 CFR 164.502(b)) means we capture only what the protocol enumerates — no ambient OR video of unrelated cases, no whole-floor B-roll, no take-home rushes.
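To make the split-custody property concrete, here is a minimal sketch using a two-share XOR split of a shard key. This is an illustration of the property only: the post does not specify the actual mechanism, the function names are hypothetical, and a production design would more plausibly sit on an HSM or managed key service than on raw byte shares.

```python
import secrets


def split_key(key: bytes) -> tuple:
    """Split a key into two shares; each share alone is uniformly random
    and reveals nothing about the key."""
    share_a = secrets.token_bytes(len(key))
    share_b = bytes(k ^ a for k, a in zip(key, share_a))
    return share_a, share_b


def combine_shares(share_a: bytes, share_b: bytes) -> bytes:
    """Recover the key; requires both custodians to contribute their share."""
    if len(share_a) != len(share_b):
        raise ValueError("shares must be the same length")
    return bytes(a ^ b for a, b in zip(share_a, share_b))
```

In this sketch one share would be held by the vendor and the other by the hospital partner, so decrypting a raw shard requires a deliberate act by both.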

This is the protocol we have written. We have not yet submitted it to an IRB. The next step is partnering with a single institutional review board for a pilot scope. We do not have that partner under signature today.

What changes for the data pipeline

When the data carries PHI, the labeling and lineage stack has to be rebuilt around two things general-purpose annotation platforms never had to handle: BAA-covered annotator status and per-record consent-version lineage.

Annotators must be BAA-covered individuals. That eliminates the standard Scale AI / Surge AI / general-crowdwork model for the clinical portion of the corpus. Our design uses a small bench of clinically credentialed annotators (perioperative RNs and surgical techs) hired under a BAA-extended employment agreement, working in a controlled environment with no local storage rights and no ability to export frames. Throughput will be lower than gig-scale annotation by roughly an order of magnitude. The trade is necessary.

Every annotation event carries a consent-version pointer. When a patient withdraws consent, we have to be able to compute the set of training shards that included any frame from their procedure, the set of derived labels that referenced those frames, the set of skill checkpoints trained on those derived labels, and the downstream evaluation runs that used those checkpoints. We're extending the lineage system documented in docs/INTEGRATED_LOOP.md — the same Project → Artifact → LabelingTask → LabelSubmission chain we built for FoodforThought — with a consent-version field at every node and a withdrawal-propagation worker that walks the derived-artifact graph. The FoodforThought lineage system already handles parent/derived relationships; the new work is the consent overlay and the deletion guarantees.
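The withdrawal-propagation walk described above is, at its core, a traversal of the derived-artifact graph. The sketch below shows that traversal under stated assumptions: the `Node` type, its `kind` values, and the identifiers are hypothetical and are not the schema in docs/INTEGRATED_LOOP.md.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    """One artifact in the lineage graph (hypothetical shape)."""
    node_id: str
    kind: str             # e.g. "frames", "shard", "label", "checkpoint", "eval_run"
    consent_version: str  # consent document version that authorized this node
    derived: list = field(default_factory=list)  # ids of directly derived nodes


def affected_by_withdrawal(graph, root_ids):
    """Breadth-first walk from the withdrawn sources to everything derived
    from them, visiting each node once even when lineage paths converge."""
    seen = set(root_ids)
    queue = deque(root_ids)
    order = []
    while queue:
        node_id = queue.popleft()
        order.append(node_id)
        for child_id in graph[node_id].derived:
            if child_id not in seen:
                seen.add(child_id)
                queue.append(child_id)
    return order
```

The `seen` set matters because a checkpoint is typically reachable both through its training shard and through the labels on that shard; without it the same artifact would be ticketed twice.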

Right-to-deletion is a first-class pipeline operation. When the propagation worker resolves a withdrawal, it produces a manifest of every downstream artifact that must be regenerated. Skill checkpoints get a remove-and-retrain ticket. Eval runs are flagged as referencing-deleted-source. Anyone licensing the dataset would receive a delta notification so their own derived models can be re-trained against the post-withdrawal corpus. We do not yet know how much of a retraining tax this implies in practice. We will know after the first withdrawal we honor.
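A deletion manifest of the kind described here could be as simple as a mapping from artifact kind to required action. The following is a minimal sketch; the action strings, the `ManifestEntry` type, and the input shape (a list of artifact id and kind pairs) are assumptions made for illustration.

```python
from dataclasses import asdict, dataclass

# Hypothetical action names, following the three cases named in the text.
ACTIONS = {
    "checkpoint": "remove_and_retrain",
    "eval_run": "flag_referencing_deleted_source",
}
DEFAULT_ACTION = "regenerate"  # shards, labels, and other derived artifacts


@dataclass(frozen=True)
class ManifestEntry:
    artifact_id: str
    kind: str
    action: str


def build_manifest(withdrawal_id, affected, licensees):
    """Turn a resolved withdrawal into per-artifact actions plus the list of
    licensees owed a delta notification."""
    entries = [
        ManifestEntry(artifact_id, kind, ACTIONS.get(kind, DEFAULT_ACTION))
        for artifact_id, kind in affected
    ]
    return {
        "withdrawal_id": withdrawal_id,
        "entries": [asdict(e) for e in entries],
        "notify_licensees": sorted(licensees),
    }
```

Keeping the manifest as plain data means it can be stored alongside the withdrawal record as evidence that the documented timeline was met.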

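The InferAll AI gateway we operate (api.inferall.ai) gives us a chokepoint for any LLM-based labeling assist (entity extraction, action segmentation prompting, narrative summarization) to route through a logged, auditable inference path rather than direct provider APIs. That matters for the BAA chain: any provider that handles PHI on our behalf is a subcontractor that needs its own business associate agreement, and a single gateway lets us enforce and audit which providers receive which data.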

What the credentialing actually looks like

OR access is a three-layer credential. Hospital privileges come first — issued by the institution after background check, drug screen, immunization audit, OSHA training, and sponsor sign-off. Vendor credentialing layers on top, typically through Reptrax, Vendormate, or symplr Vendor Connect, and includes the same checks plus procedure-area competency. Procedure-specific sign-offs — robotic surgery for da Vinci, ortho-specific for Mako, neuro-specific for Mazor — are the third gate and require named training hours with a system rep.

The reason zero of the sixteen funded physical-AI data-engine startups we tracked in our Mecka AI Alternatives map — Mecka, Lightwheel, Encord, Luel, Sensei, Cortex AI, Build AI, Human Archive, and the rest — can stand a rig up in an OR is not technology. It is identity. None of them have a clinician with active hospital privileges and an active vendor credential on staff. Hiring one is not a sprint; the credentialing trail takes 90 to 180 days per institution and the clinician has to keep clinical hours to retain privileges.

Our CXO Carmina Chua is a working surgical nurse with current OR time. Our COO Thomas Ray Lopez de Leon is a hospital-employed robotics coordinator on a live surgical floor. They are the credential. They are not future hires we will recruit after raising. This is the only structural advantage we are claiming in this post, and it is the one that maps directly to whether the protocol above can be executed at all.

What's in this for hospitals

The data-partnership economics are not yet standardized in clinical robotics the way they are in AI imaging. Our design assumes three levers: a revenue share or research-fund contribution paid per consented case scaled to procedure rarity, first-look licensing for skills developed from the partner's contributed corpus with a defined exclusivity window, and co-authorship on resulting publications.

The Cleveland Clinic, Mayo Clinic, and NYU Langone innovation arms already operate this thesis with AI imaging vendors: contributed de-identified corpora in exchange for revenue participation and first-look on downstream models. Clinical robotics data is the same shape of asset with a higher unit value per consented case and a thinner vendor field competing for the partnership. We are pattern-guessing here — none of these deals exist in surgical-robotics data yet, to our knowledge.

Closing

Kindly Robotics is pre-IRB, pre-contract, pre-real-OR-capture. The protocol described above is a design document, not an operating manual, and we have published it in this state on purpose. The credibility of a clinical data engine in 2026 turns on whether you can be honest about the things you have not yet done. The full pitch is at /pitch. The neutral map of the sixteen funded competitors we are not — and why their model does not extend into this vertical — is at /blog/2026-06-08-mecka-ai-alternatives. If you are evaluating clinical training-data vendors and we have gotten something in this post wrong, we want to hear it.
