Data Deduplication

Overview

Duplicate clinical data is a prevalent issue in the healthcare industry. Throughout their interactions with the healthcare system, patients often consult multiple physicians, resulting in repetitive logging of information in electronic medical records (EMRs). For instance, consider the numerous occasions when you have had to recount your own allergies and medications prior to an appointment!

To tackle this challenge, Particle provides a data deduplication solution that streamlines your clinical records retrieved from various networks. While some organizations may choose to manage deduplication independently, opting for Particle's deduplication service can expedite integration with our product and alleviate the operational burden associated with sorting through redundant data.

📘
Data Deduplication is an opt-in service provided by Particle. Reach out to your Particle Health representative to enable it for your organization.

Please find below a list of frequently asked questions regarding Particle's deduplication approach and the options available to you.

What does deduplication do?

Deduplication consolidates identical data elements across multiple sources so that you receive each unique data element only once. Removing redundant data reduces noise and makes our output datasets more immediately actionable and easier to ingest.

Particle deduplication is available for both our FHIR R4 and Flat APIs.

How does it work?

Particle retrieves C-CDAs for a given patient from the network. Those C-CDAs are ingested, transformed into FHIR resources, and then evaluated for duplicative data. The evaluation is done resource by resource by comparing discrete data elements. Two resources are considered identical only if every non-UUID field has an identical value, and when they are, the duplicate is removed.
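The comparison rule described above can be illustrated with a minimal Python sketch. This is not Particle's implementation. The helper names (is_uuid, strip_uuids, fingerprint, deduplicate) are hypothetical, and the way a UUID field is recognized here (any string value that parses as a UUID) is an assumption made for illustration only.

```python
import json
import uuid


def is_uuid(value):
    """Return True if value is a string that parses as a UUID (assumed rule)."""
    if not isinstance(value, str):
        return False
    try:
        uuid.UUID(value)
        return True
    except ValueError:
        return False


def strip_uuids(node):
    """Recursively drop UUID-valued fields so they do not affect comparison."""
    if isinstance(node, dict):
        return {k: strip_uuids(v) for k, v in node.items() if not is_uuid(v)}
    if isinstance(node, list):
        return [strip_uuids(v) for v in node if not is_uuid(v)]
    return node


def fingerprint(resource):
    """Serialize the non-UUID content with sorted keys for a stable comparison."""
    return json.dumps(strip_uuids(resource), sort_keys=True)


def deduplicate(resources):
    """Keep the first resource seen for each distinct non-UUID fingerprint."""
    seen = set()
    unique = []
    for resource in resources:
        key = fingerprint(resource)
        if key not in seen:
            seen.add(key)
            unique.append(resource)
    return unique


# Hypothetical FHIR-like resources: the first two differ only in their UUID id.
resources = [
    {"resourceType": "AllergyIntolerance",
     "id": "11111111-1111-4111-8111-111111111111",
     "code": {"text": "Penicillin"}},
    {"resourceType": "AllergyIntolerance",
     "id": "22222222-2222-4222-8222-222222222222",
     "code": {"text": "Penicillin"}},
    {"resourceType": "AllergyIntolerance",
     "id": "33333333-3333-4333-8333-333333333333",
     "code": {"text": "Latex"}},
]
unique = deduplicate(resources)
```

In this sketch the two Penicillin entries collapse into one because they differ only in their UUID, while the Latex entry is kept. Any difference in a non-UUID field, however small, keeps both resources.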

Do I have visibility into Particle's deduplication workflow?

Yes! You will be able to pull Provenance records for complete visibility. Provenance provides a historical account of an entity. It is metadata that provides a trail of the activity that has occurred on the data, such as who created it, when it was created, and what changes have been made. Provenance is about tracking the origin and the history of data, which is vital in healthcare to ensure data integrity and reliability. Read more about Provenance and other data quality processing capabilities Particle has!

Am I required to use Particle’s deduplication workflow?

Particle's enhanced deduplication workflow is available to all customers. Customers are not required to use it unless they want certain capabilities that depend on it, namely deltas or data normalization.

What data elements are deduplicated?

All FHIR resources supported by Particle are deduplicated, which means that all Flat datasets are deduplicated as well.

Which data formats does deduplication apply to?

If you use our FHIR or Flat outputs, the data sent by Particle will be deduplicated and end users will see less redundant data. This speeds up (1) data cleaning and integration efforts for engineers and (2) clinical data reviews, enabling providers and clinicians to spend less time reviewing data and more time on direct patient care.

C-CDA documents are source data and, as such, are not deduplicated by Particle.

Can I test deduplication in Sandbox?

Yes! Simply add -dedupe to the Family Name of a Sandbox patient to see the deduplicated dataset for that patient.

If you are querying for Kam Quark in our Flat data flow, for example, your query payload should look like this, with -dedupe appended to the family_name value:

{
    "address_city": "Brooklyn",
    "address_lines": [
        "999 Dev Drive"
    ],
    "address_state": "NY",
    "date_of_birth": "1954-12-01",
    "family_name": "Quark-dedupe",
    "given_name": "Kam",
    "gender": "Male",
    "postal_code": "11111",
    "purpose_of_use": "TREATMENT"
}
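If you build Sandbox queries in code, the suffix can be added programmatically. The following Python sketch is one way to do it. The helper name with_dedupe_suffix is hypothetical, and the sketch only constructs the payload shown above; it does not call any Particle endpoint.

```python
import json

DEDUPE_SUFFIX = "-dedupe"


def with_dedupe_suffix(demographics):
    """Return a copy of the payload with -dedupe appended to family_name."""
    payload = dict(demographics)  # shallow copy; the input dict is not modified
    if not payload["family_name"].endswith(DEDUPE_SUFFIX):
        payload["family_name"] += DEDUPE_SUFFIX
    return payload


# Sandbox patient demographics taken from the example above.
kam_quark = {
    "address_city": "Brooklyn",
    "address_lines": ["999 Dev Drive"],
    "address_state": "NY",
    "date_of_birth": "1954-12-01",
    "family_name": "Quark",
    "given_name": "Kam",
    "gender": "Male",
    "postal_code": "11111",
    "purpose_of_use": "TREATMENT",
}

payload = with_dedupe_suffix(kam_quark)
print(json.dumps(payload, indent=4))
```

The helper leaves a family name that already ends in -dedupe unchanged, so applying it twice does not double the suffix.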

Who is deduplication for?

Any customer who pulls the FHIR or Flat payload formats from Particle.
