- Total Size: 68.53 GB
- Total Records: 886,521,320
- Concept-index – 21.0 Million records, exposing lab results and medicine details.
- Patient-index – 422 Million records. Note: Although the patients’ names were not in plain text, the index provides a clear understanding of where this information is stored. What was exposed, however, were internal patient logging and tracking processes.
- Provider-index – 89 Thousand records, exposing physician names, internal patient ID numbers (these are internal tracking numbers and show the logging format), document locations and .CSV files, and other potentially sensitive information.
- The files also show where data is stored and references to “Production” data.
- The database was at risk of a ransomware attack and was publicly accessible to anyone with an internet connection.
Report: Medical AI Company Exposed Millions of Records Online
Security researcher Jeremiah Fowler, together with the Website Planet research team, discovered a non-password-protected database that contained 886,521,320 records. The dataset totaled 68.53 GB and contained medical-related data. Upon further research, we found multiple references to Deep6.AI, including internal emails and usernames. We immediately sent a responsible disclosure notice, and public access was restricted shortly after. The records appear to contain data on individuals based in the United States.
The types of data collected were divided into the following sections:
Date, document type, physician note, encounter IDs (an encounter is an interaction between a patient and healthcare provider(s) for the purpose of providing healthcare service(s)), patient ID, note, uuid, patient type, noteId, date of service, note type (for example, Nursing/other), and detailed note text. Some of this information was encrypted, but the notes and physician information were in plain text. The danger is that if a patient ID were decrypted and the patient’s identity exposed, their medical issues or diagnoses would be plain to see.
Deep6 takes raw medical data and tries to manage or organize it.
According to their website: “Deep6 AI’s software also identifies patients with conditions not explicitly mentioned in medical records. As a result, Deep6 AI’s software finds more patients who better match trial criteria in a fraction of the time”. Deep6 is located in Pasadena, California, USA.
The exposed records revealed Physician Notes that provided intimate details of patient illness, treatment, medication, family, social and even emotional issues. These were very complete descriptions and it was surprising just how many small details were included in these notes. It is a rare look behind the scenes of how these notes look and the kind of information that is collected by medical workers.
Example of Physician Note:
“Sobbing and unable to stop, sat with pt at long intervals. Pt has never spoke to a therapist or been prescribed medications. Social Service Consult visited with pt and will follow during this admission. Received ativan 2mg IVB x2 during procedure with fair effect. Her home pain control for lupus takes Vicodin 2 tabs and Tylenol #3 2 tabs at HS.\nCV-VSS awaiting for PA catheter insertion before Flolan starts by MICU service”.
In a sampling of 10k records, the word “patient” appeared 8,672 times, and some of these occurrences were in the doctors’ notes. The word “Note” appeared 5,914 times. There were references to .csv documents, and we can only speculate that these might have contained additional information. In theory, if someone gained access to these .csv documents, they could potentially match the detailed notes with the patient data, diagnoses, medicines, and treatments.
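A keyword tally like the one described above can be sketched in a few lines of Python. This is only an illustration of the general technique and not the method the research team used; the function name `count_terms` and the sample strings are hypothetical, and the strings are made up for the example rather than taken from the exposed data.

```python
import re
from collections import Counter

def count_terms(records, terms):
    """Count case-insensitive whole-word occurrences of each term across all records."""
    counts = Counter({term: 0 for term in terms})
    for text in records:
        for term in terms:
            # \b keeps "note" from matching inside a longer word such as "notes".
            pattern = r"\b" + re.escape(term) + r"\b"
            counts[term] += len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts

# Made-up sample strings, standing in for a sampled batch of records.
sample = [
    "Patient resting comfortably; note entered by nursing.",
    "Discharge note: patient advised to follow up.",
    "No change since last visit.",
]

result = count_terms(sample, ["patient", "note"])
print(result["patient"], result["note"])
```

Whole-word matching is a design choice here; a plain substring count would also pick up plurals and compound field names, which may or may not be what a given tally intends.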
Details of the Discovery: