
De-identification service in Azure Health Data Services

Securely anonymize clinical data while preserving its clinical relevance and adhering to the strict standards of the HIPAA privacy rule.

The de-identification service in Azure Health Data Services enables healthcare organizations to anonymize clinical data so that the resulting data retains its clinical relevance and distribution while also adhering to the Health Insurance Portability and Accountability Act of 1996 (HIPAA) Privacy Rule.

The service uses state-of-the-art machine learning models to automatically extract, redact or surrogate 28 entity types, including the 18 HIPAA Protected Health Information (PHI) identifiers. Covering more than the 18 identifiers supports stronger privacy protection and more fine-grained distinctions between entity types, such as distinguishing between doctor and patient.

De-identification service in Azure Health Data Services is a third-party product: By requesting information, you consent to your information being shared with Microsoft.

PHI Anonymization

The de-identification service is designed for protected health information (PHI). It uses machine learning to identify PHI entities, including HIPAA’s 18 identifiers, through the “TAG” operation. The redaction operation replaces each identified PHI value with a tag naming its entity type, and the surrogation operation replaces it with a surrogate, or pseudonym. The service also meets all regional compliance requirements including HIPAA, GDPR, and the California Consumer Privacy Act (CCPA).
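To make the difference between tagging, redaction and surrogation concrete, here is a minimal local sketch in Python. It does not call the service or reproduce its API: the `Entity` class, the `tag_known` stand-in for the machine learning step, the bracketed tag format and the surrogate values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Entity:
    """A tagged PHI span (hypothetical shape, not the service's schema)."""
    category: str  # entity type, e.g. "Doctor", "Patient", "Date"
    offset: int    # start index of the span in the text
    length: int    # number of characters in the span


def tag_known(text: str, known: Dict[str, str]) -> List[Entity]:
    """Stand-in for tagging: locate hard-coded phrases instead of using a model."""
    return [Entity(category, text.index(phrase), len(phrase))
            for phrase, category in known.items()]


def _replace(text: str, entities: List[Entity],
             make: Callable[[Entity, str], str]) -> str:
    # Work right to left so earlier offsets stay valid after each substitution.
    out = text
    for e in sorted(entities, key=lambda e: e.offset, reverse=True):
        original = text[e.offset:e.offset + e.length]
        out = out[:e.offset] + make(e, original) + out[e.offset + e.length:]
    return out


def redact(text: str, entities: List[Entity]) -> str:
    """Replace each span with a tag naming its entity type."""
    return _replace(text, entities, lambda e, _: f"[{e.category.lower()}]")


def surrogate(text: str, entities: List[Entity], values: Dict[str, str]) -> str:
    """Replace each span with a plausible substitute value."""
    return _replace(text, entities, lambda _, original: values[original])


note = "Dr. Lee saw John Smith on 2024-03-01."
entities = tag_known(note, {"Lee": "Doctor",
                            "John Smith": "Patient",
                            "2024-03-01": "Date"})

redacted = redact(note, entities)
surrogated = surrogate(note, entities, {"Lee": "Patel",
                                        "John Smith": "Alan Reyes",
                                        "2024-03-01": "2024-04-12"})

print(redacted)    # Dr. [doctor] saw [patient] on [date].
print(surrogated)  # Dr. Patel saw Alan Reyes on 2024-04-12.
```

The redacted note makes it obvious where PHI was removed, while the surrogated note still reads like a clinical sentence.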

The service does not guarantee compliance with HIPAA’s Safe Harbor method or any other privacy methods. We encourage you to obtain appropriate legal review of your solution, particularly for sensitive or high-risk applications.

Security

The de-identification service is a stateless service. Customer data stays within the customer’s tenant.

Role-based Access Control (RBAC)

Azure role-based access control (RBAC) enables you to manage how your organization's data is processed, stored and accessed. You determine who has access to de-identify datasets based on roles you define for your environment.

Developer resources

De-identification service in Azure Health Data Services

Related resources
Article

Announcing a de-identification service for health and life sciences

The de-identification service within Azure Health Data Services helps healthcare professionals de-identify their unstructured health data using state-of-the-art PHI detection, surrogation and industry best practices to protect patient data. The service maintains entity and temporal relationships in the resulting data, which maximizes the utility of the de-identified data for many downstream use cases including machine learning, real-world evidence and longitudinal research.

Article

Revolutionizing healthcare: De-identification service in Azure Health Data Services

Organizations across the healthcare spectrum can benefit from the de-identification service, with early adopters already planning to leverage the service to help advance some of their most prominent use cases.

Frequently Asked Questions

Can I use the de-identification service in Azure Health Data Services while I wait for it to become available on Optum Marketplace?

Yes. If you are an Azure customer, you can try a demo by following these instructions.

What are the key features of the de-identification service in Azure Health Data Services?

The de-identification service offers many benefits, including:

  • Surrogation: Surrogation, or replacement, is a best practice for PHI protection. The service can replace PHI elements with plausible replacement values, resulting in data that is most representative of the source data. Surrogation also strengthens privacy protection: any PHI values the model misses (false negatives) are hard to tell apart from the surrogate values around them in the document.
  • Consistent replacement: Consistent surrogation results enable organizations to retain relationships occurring in the underlying dataset, which is critical for research, analytics and machine learning. When data is submitted in the same batch, the service replaces entities consistently across that batch and preserves the relative temporal relationships between events.
  • Expanded PHI coverage: The service expands beyond the 18 HIPAA Identifiers to provide stronger privacy protections and more fine-grained distinctions between entity types, such as distinguishing between doctor and patient.
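The consistent replacement behavior described above can be illustrated with a short Python sketch. This is an assumption-laden illustration, not the service's method, which this page does not describe: the `BatchSurrogator` class, the name pool and the 1 to 45 day shift range are all hypothetical. The idea is that one mapping and one date offset are shared across a batch, so the same original always gets the same surrogate and the gaps between dates are unchanged.

```python
import random
from datetime import date, timedelta
from typing import Dict, Iterable


class BatchSurrogator:
    """Hypothetical per-batch state: one name mapping and one date shift."""

    def __init__(self, name_pool: Iterable[str], seed: int = 0) -> None:
        rng = random.Random(seed)
        self._pool = list(name_pool)
        rng.shuffle(self._pool)
        self._names: Dict[str, str] = {}
        # A single offset for the whole batch keeps intervals between events intact.
        self._shift = timedelta(days=rng.randint(1, 45))

    def name(self, original: str) -> str:
        """Return the same surrogate every time the same original name appears."""
        if original not in self._names:
            if not self._pool:
                raise ValueError("name pool exhausted")
            self._names[original] = self._pool.pop()
        return self._names[original]

    def shift_date(self, original: date) -> date:
        """Move a date by the batch-wide offset."""
        return original + self._shift


batch = BatchSurrogator(["Alan Reyes", "Maria Chen", "Omar Haddad"])

# The same patient maps to the same surrogate across documents in the batch.
first_mention = batch.name("John Smith")
later_mention = batch.name("John Smith")
other_patient = batch.name("Jane Doe")

# Admission and discharge both move, but the length of stay is preserved.
admitted, discharged = date(2024, 3, 1), date(2024, 3, 5)
stay_before = discharged - admitted
stay_after = batch.shift_date(discharged) - batch.shift_date(admitted)
```

A new `BatchSurrogator` would draw its own mapping and offset, which mirrors why data needs to be in the same batch for replacements to line up.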

Do I need an Azure subscription to use the de-identification service in Azure Health Data Services?

No. The de-identification service in Azure Health Data Services is hosted on your behalf under the Optum Azure subscription.

What types of data can the de-identification service in Azure Health Data Services process?

The de-identification service in Azure Health Data Services processes unstructured clinical, medical or health-related texts such as doctors' notes, discharge summaries, clinical documents, and electronic health records.

What languages is the de-identification service in Azure Health Data Services available in?

The de-identification service in Azure Health Data Services is currently only available in English.

How does the de-identification service in Azure Health Data Services ensure data privacy and security?

The de-identification service in Azure Health Data Services leverages Azure AI Language services. Microsoft publishes details on data, privacy and security for Azure AI Language services.
