How to remove duplicates in a BibTex (.bib) file

So you want to reuse some of your existing bib files, but the BibTex processor complains that they contain duplicate entries. How can you remove them?

Introducing bibtex-tidy

To remove duplicates in BibTex, we will use bibtex-tidy, a web app that formats BibTex source. Open it at https://flamingtempura.github.io/bibtex-tidy/.

To remove duplicates in a BibTex file, we will use bibtex-tidy.

Once you open it, bibtex-tidy shows two parts. The left side is the editor window, where you paste your BibTex code. The right side has a few options to format your BibTex code:

  • Whitespace options
    • convert tabs to spaces and vice versa in your BibTex source
    • align values: align the equal signs in your BibTex entries
  • Values options
    • Enclose BibTex values in curly braces
      • “BibTex Guide” becomes {BibTex Guide}
    • Use numeric values where possible
      • “1942” becomes 1942
    • Strip double-braced values
      • {{March}} becomes {March}
    • Drop all caps: convert all-caps values to title case
      • {MARCH} becomes {March}
    • Escape special characters: so that no unexpected characters appear in your LaTex output
    • Encode URLs with percent-encoded values
  • Sorting
    • Sort bibliography entries
      • You can then add the field to sort by, such as BibTex key
      • Multiple fields are separated by a space, such as key year
      • For descending order, prefix it with a dash, such as -key
    • Sort bibliography entry fields
      • You can then add the field order
      • Multiple fields are also separated by a space, such as title author
      • The default list is title, shorttitle, author, year, month, day, journal, booktitle, location, on, publisher, address, series, volume, number, pages, doi, isbn, issn, url, urldate, copyright, category, note, metadata.
  • Duplicates
    • As this is our focus here, we will discuss this in the next section
  • Clean up
    • Remove fields within BibTex entries
      • You can type in the fields that you would like to remove from your BibTex entries
    • Remove comments in your BibTex code
    • Tidy comments: remove whitespace surrounding comments
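
The sort specification described under “Sorting” (space-separated fields, with a dash prefix for descending order) can be illustrated with a short Python sketch. This is not bibtex-tidy's own code: the function name sort_entries, the dict-based entry format, and the treatment of missing fields as empty strings are assumptions made for illustration.

```python
def sort_entries(entries, spec):
    """Sort entry dicts by a space-separated spec such as "key -year".

    Hypothetical helper: each entry is a plain dict of field name to value.
    """
    fields = spec.split()
    # Apply the least significant field first. Python's sort is stable,
    # so fields listed earlier in the spec end up taking priority.
    for field in reversed(fields):
        descending = field.startswith("-")
        name = field.lstrip("-")
        entries = sorted(
            entries,
            # Assumption: a missing field sorts as an empty string.
            key=lambda entry: str(entry.get(name, "")).lower(),
            reverse=descending,
        )
    return entries


if __name__ == "__main__":
    sample = [
        {"key": "b", "year": "2001"},
        {"key": "a", "year": "1999"},
        {"key": "a", "year": "2005"},
    ]
    for entry in sort_entries(sample, "key -year"):
        print(entry["key"], entry["year"])
```

With the spec "key -year", entries are ordered by key first, and entries sharing a key are listed newest first.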

Use bibtex-tidy to remove duplicates in BibTex

First, copy & paste your BibTex (.bib) file content to the left side of the editor.

In the sidebar on the right, scroll down and look for the “DUPLICATES” option.

The sidebar where you can remove duplicates in your pasted BibTex file.

You can check for duplicates using any of four criteria:

  • Matching Keys: this is the preferred option if you copy and paste BibTex entries from Google Scholar
  • Matching DOIs: this is useful only if your BibTex entries carry Digital Object Identifiers (DOIs). Most publishers, such as ACM and IEEE, include the DOI field, but Google Scholar does not
  • Similar author and title: use with caution, as two entries that look similar may not be true duplicates
  • Similar abstracts: use with caution, as it is uncommon to have abstracts in your BibTex source
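
To make the criteria above concrete, here is a minimal Python sketch of how duplicates could be detected under each of them. It is an illustration only and not how bibtex-tidy works internally: the function names, the criterion labels "key", "doi" and "citation", and the dict-based entry format are all hypothetical, and the author/title check is simplified to an exact match after normalization rather than a real similarity measure.

```python
import re


def normalize(text):
    """Lowercase the text and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def find_duplicates(entries, criterion):
    """Return (first_index, duplicate_index) pairs for one criterion.

    criterion is "key", "doi", or "citation" (labels made up for this sketch).
    """
    def fingerprint(entry):
        if criterion == "key":
            # Assumption: keys are compared case-insensitively.
            return entry.get("key", "").lower() or None
        if criterion == "doi":
            return entry.get("doi", "").lower() or None
        if criterion == "citation":
            author = normalize(entry.get("author", ""))
            title = normalize(entry.get("title", ""))
            return (author, title) if author and title else None
        raise ValueError(f"unknown criterion: {criterion}")

    seen = {}
    pairs = []
    for index, entry in enumerate(entries):
        mark = fingerprint(entry)
        if mark is None:
            continue  # an entry lacking the field cannot match on it
        if mark in seen:
            pairs.append((seen[mark], index))
        else:
            seen[mark] = index
    return pairs


if __name__ == "__main__":
    sample = [
        {"key": "smith2020", "doi": "10.1/abc", "author": "Smith, J.", "title": "A Study"},
        {"key": "smith2020a", "doi": "10.1/ABC", "author": "Smith, J.", "title": "A study."},
        {"key": "Smith2020", "author": "Doe", "title": "Other"},
    ]
    for name in ("key", "doi", "citation"):
        print(name, find_duplicates(sample, name))
```

The sample shows why the choice of criterion matters: the third entry has no DOI, so it can only be caught by its key, while the first two entries share a DOI but have different keys.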

Then, click the blue “Tidy” button, which removes all duplicates that match the selected criteria.

Copy the tidied source and paste it back into your BibTex (.bib) file. It is always a good idea to check the LaTex output for any new errors. You can also check the generated PDF document to see whether the tool removed any entries that were not duplicates.
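
As a complement to checking the PDF by eye, the before-and-after comparison can be sketched in Python. The snippet below is a rough check under stated assumptions, not a BibTex parser: it finds entry keys with a simple regular expression, and the function names entry_keys, repeated_keys and lost_keys are made up for this example.

```python
import re
from collections import Counter

# Matches the start of an entry such as "@article{smith2020," and
# captures the entry type and the citation key.
ENTRY_START = re.compile(r"@(\w+)\s*\{\s*([^,\s}]+)\s*,")

# Block types that are not bibliography entries.
NON_ENTRIES = {"comment", "string", "preamble"}


def entry_keys(bib_source):
    """Return the citation keys found in the source, in file order."""
    return [
        key
        for kind, key in ENTRY_START.findall(bib_source)
        if kind.lower() not in NON_ENTRIES
    ]


def repeated_keys(bib_source):
    """Keys that occur more than once (compared case-insensitively)."""
    counts = Counter(key.lower() for key in entry_keys(bib_source))
    return sorted(key for key, count in counts.items() if count > 1)


def lost_keys(before, after):
    """Keys present in the original source but absent from the tidied one."""
    return sorted(set(entry_keys(before)) - set(entry_keys(after)))


if __name__ == "__main__":
    original = (
        "@article{a1,\n  title={X}\n}\n"
        "@book{b2,\n  title={Y}\n}\n"
        "@article{a1,\n  title={X}\n}\n"
    )
    tidied = "@article{a1,\n  title={X}\n}\n@book{b2,\n  title={Y}\n}\n"
    print("repeated before:", repeated_keys(original))
    print("repeated after:", repeated_keys(tidied))
    print("lost:", lost_keys(original, tidied))
```

An empty "lost" list means every key from the original file is still present. A non-empty list is not necessarily an error: when duplicates are matched by DOI or by author and title, the dropped entry's key disappears, so any citation that used that key in your document needs to be updated.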

In A Nutshell

How to remove duplicates in BibTex

  1. Open the bibtex-tidy web app: https://flamingtempura.github.io/bibtex-tidy/

  2. Paste your BibTex source into the editor pane

  3. Scroll down the right pane to “DUPLICATES”

  4. Select the duplicate criteria

  5. Click the blue button “Tidy” at bottom right

  6. Copy the tidied BibTex source from the editor pane and paste it back into your .bib file


By VarHowto Editor

1 reply on “How to remove duplicates in BibTex (.bib) file”

Thanks for creating this. This is an extremely useful tool, since I have many bib files that I want to consolidate. It would be very helpful if one could save it directly to a bib file rather than having to do the screen capture.
