deduplicate.it

Automated duplicate removal for literature searches in systematic reviews, scoping reviews, and meta-analyses. Upload exports from multiple databases, remove duplicate references by DOI, and get a deduplicated RIS file ready for Rayyan, Covidence, or EndNote.

Upload export files to remove duplicates

Format is auto-detected from file content (extensions do not matter). Select all files at once; the first file in the list has the highest tie-break priority for keeping the best record when duplicates are removed.

.txt / .nbib: PubMed MEDLINE
.txt: PubMed Summary
.ris: Embase / Cochrane / CINAHL / Scopus / WoS
.ciw: Web of Science tagged
.bib: BibTeX
.csv / .tsv: Scopus CSV / WoS CSV

Drop files here or click to browse

Select as many export files as you need

File order = tie-break priority (first file preferred when records have equal quality).

How to use

deduplicate.it handles the first step of deduplicating literature search results in systematic reviews, scoping reviews, and meta-analyses: an auditable, DOI-based pass that removes records whose valid DOIs match exactly. Because both the source code and the validation are public and the algorithm is intentionally simple and human-readable, you can report this step transparently in your methods section, for example:

Preliminary deduplication based on digital object identifier (DOI) was performed with deduplicate.it, followed by manual audit and deduplication of records without a valid DOI. (Purkarthofer, Labenbacher, Bornemann-Cimenti, Landoni et al., XXXX, https://doi.org/10.XXXX/XXXXXXX)

How duplicate removal works

deduplicate.it performs a preliminary, DOI-based duplicate removal: records are matched by exact DOI equality after normalisation per the DOI Handbook (ISO 26324 §3.4–3.8): URL-decoding, prefix stripping, Basic Latin case fold only, and structural validation (^10.\d{4,}/).
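The four normalisation steps can be illustrated with a minimal Python sketch. It is not the tool's own source code: the function name normalise_doi, the list of stripped prefixes, and folding to lower case (rather than upper case) are assumptions made for illustration, and the validation pattern escapes the dot after "10", which is also an assumption.

```python
import re
import string
from typing import Optional
from urllib.parse import unquote

# Hypothetical prefix list; the text does not say which prefixes the tool strips.
_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:\s*)", re.IGNORECASE)

# Structural check from the text, with the dot escaped (an assumption).
_VALID = re.compile(r"^10\.\d{4,}/", re.ASCII)

# Fold only Basic Latin A-Z to a-z; non-ASCII letters are left untouched.
_ASCII_FOLD = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def normalise_doi(raw: str) -> Optional[str]:
    """Return a comparable DOI string, or None if the value is not a valid DOI."""
    doi = unquote(raw.strip())          # URL-decoding, e.g. %2F becomes /
    doi = _PREFIX.sub("", doi).strip()  # prefix stripping
    doi = doi.translate(_ASCII_FOLD)    # Basic Latin case fold only
    return doi if _VALID.match(doi) else None
```

Under these assumptions, normalise_doi("https://doi.org/10.1000/ABC%2F123") and normalise_doi("doi:10.1000/abc/123") both yield 10.1000/abc/123, so the two records would be treated as sharing a DOI, while a value such as "not-a-doi" yields None and the record would pass through untouched.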

When several records share a DOI, the algorithm keeps one of them using three criteria in order: a record with an abstract is preferred over one without; if that does not decide it, the record from the earliest uploaded file wins; a longer abstract serves as the final tiebreaker. Records without a valid DOI are always retained unchanged.
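The selection rule can be sketched as a sort key. This is a hedged illustration, not the tool's implementation: the Record class, its field names, and the deduplicate function are hypothetical, and each record is assumed to carry an already normalised DOI (None when it has no valid DOI).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Record:
    doi: Optional[str]   # normalised DOI, or None if missing or invalid
    abstract: str        # empty string when the export has no abstract
    file_index: int      # 0 for the first uploaded file, 1 for the second, ...


def preference_key(r: Record) -> Tuple[int, int, int]:
    """Smaller is better: has abstract, then earlier file, then longer abstract."""
    return (0 if r.abstract.strip() else 1, r.file_index, -len(r.abstract))


def deduplicate(records: List[Record]) -> Tuple[List[Record], List[Tuple[Record, Record]]]:
    """Return (kept records, list of (removed, retained) pairs)."""
    groups: Dict[str, List[Record]] = {}
    for r in records:
        if r.doi is not None:
            groups.setdefault(r.doi, []).append(r)
    # min() returns the first best record, so input order breaks exact ties.
    winners = {doi: min(group, key=preference_key) for doi, group in groups.items()}
    kept = [r for r in records if r.doi is None or winners[r.doi] is r]
    removed = [(r, winners[r.doi]) for r in records
               if r.doi is not None and winners[r.doi] is not r]
    return kept, removed
```

In this sketch, records whose doi is None never enter a group, so they are always kept, including any duplicates among them. The (removed, retained) pairs mirror the kind of information an audit trail needs.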

Important limitation: this tool only deduplicates records that carry a valid DOI. Records without a valid DOI (and any duplicates among them) pass through untouched. In most modern literature exports the DOI coverage is high, so this step will remove the large majority of duplicates, but manual deduplication of the remaining output (e.g. in Rayyan, EndNote, or by hand) is still needed before proceeding to title/abstract screening.

The Deduplication Log (.csv) records every removed duplicate alongside the retained record, providing a full audit trail.