deduplicate.it

Automated deduplication of literature searches for systematic reviews, scoping reviews, and meta-analyses. Upload files from multiple databases and receive a single deduplicated file.

Upload files to remove duplicates

What you will receive

Deduplicated file — downloadable as RIS, CSV, MEDLINE, or XML
Exclusion log — all excluded references paired with the reference they were merged into
Flowchart — editable, pre-filled PRISMA-style flowchart

Methods text

⚠ Submitted for peer review. This tool is not yet formally published. Use it at your own discretion; the source code is openly available at github.com/dpurkarthofer/deduplicate.it.

Automated deduplication of literature search results based on normalised digital object identifier (DOI) and title was performed using deduplicate.it (citation to be added; currently submitted for peer review). Subsequently, all remaining references and the deduplication log were reviewed manually.

Deduplication requires an exact match on both the normalised DOI and the normalised title. References without a valid DOI, or whose title does not match exactly, pass through unchanged; this is by design. Manual review of both the deduplicated output and the exclusion file is required.

Frequently asked questions

How does it work?

References are matched by exact equality of both their normalised DOI and normalised title. DOI normalisation follows the DOI Handbook (ISO 26324 §3.4–3.8). Title normalisation applies a reproducible pipeline: HTML entity decoding, trademark removal, NFC, extended Latin transliteration (ä→a, æ→ae, ß→ss…), NFD with diacritic stripping, Greek expansion (α→alpha…), lowercase, and reduction to [a-z0-9 ].
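The normalisation steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the tool's actual code: the function names, the set of trademark symbols, the β entry in the Greek table, the accepted DOI prefixes, the choice to lowercase DOIs, and the choice to replace disallowed characters with a space are all hypothetical.

```python
import html
import re
import unicodedata
from typing import Optional

# Illustrative subsets only; the tool's real tables are assumed to be larger.
LATIN_MAP = {"ä": "a", "æ": "ae", "ß": "ss"}
GREEK_MAP = {"α": "alpha", "β": "beta"}
TRADEMARK_CHARS = {ord(c): None for c in "™®©"}  # assumed symbol set
# Assumed prefixes that may precede a DOI in exported records.
DOI_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:\s*)", re.IGNORECASE)


def normalise_title(title: str) -> str:
    """Reduce a title to a comparison key over [a-z0-9 ]."""
    text = html.unescape(title)                  # HTML entity decoding
    text = text.translate(TRADEMARK_CHARS)       # trademark removal
    text = unicodedata.normalize("NFC", text)    # compose, so "a" + diaeresis becomes "ä"
    text = "".join(LATIN_MAP.get(ch.lower(), ch) for ch in text)
    text = unicodedata.normalize("NFD", text)    # decompose, then drop combining marks
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = "".join(GREEK_MAP.get(ch.lower(), ch) for ch in text)
    text = text.lower()
    # Assumption: disallowed characters become spaces and runs of spaces collapse.
    text = re.sub(r"[^a-z0-9 ]", " ", text)
    return " ".join(text.split())


def normalise_doi(raw: str) -> Optional[str]:
    """Strip a resolver or 'doi:' prefix and lowercase; None if it does not look like a DOI."""
    doi = DOI_PREFIX.sub("", raw.strip()).lower()
    return doi if re.match(r"^10\.\d{4,9}/\S+$", doi) else None
```

Under these assumptions, "Straße" and "Strasse" yield the same title key, and a DOI given as a resolver URL matches the same DOI given bare.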

When several references share a DOI and normalised title, a reference with an abstract is preferred over one without. If that does not decide it, the reference from the earliest uploaded file wins, with abstract length as the final tiebreaker.
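One reading of this selection rule can be expressed as a single sort key. The sketch below is an interpretation, not the tool's confirmed logic: the `Reference` class, its fields, and `pick_keeper` are hypothetical names, and the ordering (has abstract, then upload order, then longer abstract) is assumed from the description.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Reference:
    title: str
    abstract: str    # empty string when the record has no abstract
    file_index: int  # 0 for the first uploaded file


def pick_keeper(group: List[Reference]) -> Reference:
    """Choose which reference in a duplicate group is kept (assumed priority order)."""
    return min(
        group,
        key=lambda r: (
            not r.abstract.strip(),  # references with an abstract sort first
            r.file_index,            # then the earliest uploaded file
            -len(r.abstract),        # then the longest abstract
        ),
    )
```

All other members of the group would then go to the exclusion log, each paired with the reference returned here.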

When the same DOI maps to references with different normalised titles (a DOI collision, common with conference abstract supplements), those references are not merged and appear at the top of the output for manual review.
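Duplicate grouping and collision detection can be sketched together, since both follow from grouping by DOI and title key. This is an assumption-laden illustration: `group_references` and the `(ref_id, doi, title_key)` tuple layout are hypothetical, and the inputs are taken to be already normalised.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Set, Tuple

Record = Tuple[int, Optional[str], str]  # (ref_id, normalised DOI or None, title key)


def group_references(
    records: Iterable[Record],
) -> Tuple[Dict[Tuple[str, str], List[int]], Set[str]]:
    """Return duplicate groups keyed by (doi, title key) and the set of colliding DOIs."""
    groups: Dict[Tuple[str, str], List[int]] = defaultdict(list)
    titles_per_doi: Dict[str, Set[str]] = defaultdict(set)
    for ref_id, doi, title_key in records:
        if doi is None:
            continue  # no valid DOI: the reference passes through unchanged
        groups[(doi, title_key)].append(ref_id)
        titles_per_doi[doi].add(title_key)
    # A group is a duplicate set only if DOI and title key both match.
    duplicates = {key: ids for key, ids in groups.items() if len(ids) > 1}
    # A DOI seen with more than one distinct title key is a collision.
    collisions = {doi for doi, titles in titles_per_doi.items() if len(titles) > 1}
    return duplicates, collisions
```

References whose DOI appears in the collision set would be left unmerged and surfaced for manual review rather than deduplicated.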

How can I edit my flowchart?

You can edit the flowchart directly in your browser by clicking on it and then selecting the pen icon in the context menu that appears at the bottom of the diagram. Alternatively, download the flowchart as an HTML file using the download button above and open it with the draw.io web app or the draw.io desktop application.

Where can I learn more?

The algorithm and validation are described in the accompanying methodology paper, currently submitted for peer review (citation to be added upon publication). The full source code is openly available for inspection and reuse at github.com/dpurkarthofer/deduplicate.it.
