Data Deduplication

Duplicate clinical data is a prevalent issue in the healthcare industry. Throughout their interactions with the healthcare system, patients often consult multiple physicians, resulting in repetitive logging of information in electronic medical records (EMRs). For instance, consider the numerous occasions when you have had to recount your allergies and medications.

To tackle this challenge, Particle provides a data deduplication solution that streamlines your clinical records retrieved from various networks. While some organizations may choose to manage deduplication independently, opting for Particle's deduplication service can expedite integration with our product and alleviate the operational burden associated with sorting through redundant data.

Please find below a list of frequently asked questions regarding Particle's deduplication approach and the options available to you.

What does deduplication do?

Deduplication consolidates identical data elements across multiple C-CDA documents so that you receive each unique data element only once.

Particle deduplication applies to both our FHIR R4 and Flat APIs. It removes redundant FHIR resources and Flat data to reduce noise and make our output datasets more immediately actionable and easier to ingest.

How does it work?

Particle retrieves C-CDAs for a given patient from the network. Those C-CDAs are ingested and evaluated together, as a bundle of documents, for duplicative data. Particle compares discrete data elements, and when all of the elements are identical, the duplicative data are removed. Two resources are considered identical only if every non-UUID field has the same value.
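The comparison rule above can be sketched in Python. This is a minimal illustration, not Particle's implementation: the function names are hypothetical, and it assumes that a "UUID field" is any field whose value is formatted as a UUID (optionally prefixed with urn:uuid:).

```python
import json
import re

# Assumption: a "UUID field" is any field whose string value looks like a UUID.
UUID_RE = re.compile(
    r"(urn:uuid:)?[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def is_uuid(value):
    """True if value is a string formatted as a UUID."""
    return isinstance(value, str) and UUID_RE.fullmatch(value) is not None


def strip_uuids(value):
    """Return a copy of value with UUID-valued dict fields removed, recursively."""
    if isinstance(value, dict):
        return {k: strip_uuids(v) for k, v in value.items() if not is_uuid(v)}
    if isinstance(value, list):
        return [strip_uuids(v) for v in value]
    return value


def fingerprint(resource):
    """Canonical string built from every non-UUID field of a resource."""
    return json.dumps(strip_uuids(resource), sort_keys=True)


def deduplicate(resources):
    """Keep the first occurrence of each resource whose non-UUID fields all match."""
    seen = set()
    unique = []
    for resource in resources:
        key = fingerprint(resource)
        if key not in seen:
            seen.add(key)
            unique.append(resource)
    return unique
```

Under these assumptions, two allergy records that differ only in their UUID identifiers collapse into one, while a record that differs in any other field is kept.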

Do I have visibility into Particle's deduplication workflow?

Yes! You will be able to pull Provenance records for complete visibility. Provenance provides a historical account of an entity. It is metadata that provides a trail of the activity that has occurred on the data, such as who created it, when it was created, and what changes have been made. Provenance is about tracking the origin and the history of data, which is vital in healthcare to ensure data integrity and reliability.
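As an illustration of how Provenance records might be consumed, the sketch below groups them by the resource they describe. It assumes the records arrive as standard FHIR R4 Provenance resources inside a Bundle, which this page does not confirm; the function name is hypothetical.

```python
from collections import defaultdict


def provenance_by_target(bundle):
    """Map each target reference to the Provenance resources that describe it.

    Assumes `bundle` is a FHIR R4 Bundle already parsed into a dict.
    """
    index = defaultdict(list)
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") != "Provenance":
            continue
        # Provenance.target is a list of References to the described resources.
        for target in resource.get("target", []):
            reference = target.get("reference")
            if reference:
                index[reference].append(resource)
    return dict(index)
```

Looking up a reference such as "AllergyIntolerance/1" in the returned mapping would then list every Provenance record that names it as a target.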

Am I required to use Particle’s deduplication workflow?

Particle's enhanced deduplication workflow becomes available for all customers to enable in January 2024. Customers will receive detailed communication from their Particle representative on how to use this functionality and how it affects the query workflow.

Particle's deduplication workflow is required in order to use additional Particle Platform capabilities (e.g., data normalization, data deltas).

What data elements are deduplicated?

Customers can expect all FHIR resources supported by Particle to be deduplicated.

This means that all Flat datasets are deduplicated as well.

Which data formats does deduplication apply to?

If you use our FHIR or Flat outputs, the data sent by Particle will be deduplicated, and end users will see less redundant data. This speeds up (1) data cleaning and integration efforts for engineers and (2) clinical data reviews, so providers and clinicians can spend less time reviewing data and more time on direct patient care.

C-CDA documents are source data and, as such, are not deduplicated by Particle.

Can I test deduplication in Sandbox?

Yes! Simply append -dedupe to the Family Name of a Sandbox patient to see the deduplicated dataset for that patient.

If you are querying for Kam Quark in our Flat data flow, for example, your query payload should look like this (note the -dedupe suffix on the family_name field):

{
    "address_city": "Brooklyn",
    "address_lines": [
        "999 Dev Drive"
    ],
    "address_state": "NY",
    "date_of_birth": "1954-12-01",
    "family_name": "Quark-dedupe",
    "given_name": "Kam",
    "gender": "Male",
    "postal_code": "11111",
    "purpose_of_use": "TREATMENT"
}
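If you build Sandbox queries programmatically, the suffix can be added with a small helper. The following Python sketch is illustrative only: the helper name is hypothetical, and it assumes the payload is a dict with the same keys as the example above.

```python
import json

DEDUPE_SUFFIX = "-dedupe"


def with_dedupe(patient):
    """Return a copy of a Sandbox patient query with -dedupe appended to family_name."""
    updated = dict(patient)  # shallow copy so the caller's dict is left unchanged
    if not updated["family_name"].endswith(DEDUPE_SUFFIX):
        updated["family_name"] += DEDUPE_SUFFIX
    return updated


kam = {
    "address_city": "Brooklyn",
    "address_lines": ["999 Dev Drive"],
    "address_state": "NY",
    "date_of_birth": "1954-12-01",
    "family_name": "Quark",
    "given_name": "Kam",
    "gender": "Male",
    "postal_code": "11111",
    "purpose_of_use": "TREATMENT",
}

# Serialize the modified query as the JSON request body.
payload = json.dumps(with_dedupe(kam), indent=4)
```

The helper leaves a name that already ends in -dedupe unchanged, so calling it twice does not double the suffix.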

Who is deduplication for?

Any customer who pulls the FHIR or Flat payload formats from Particle.