How to Clean Up Duplicate Data in Excel?

Beverly McNally
4 min readFeb 1, 2020

The data outsourcing companies often press to capitalise on shortcut methods of data entry. The data cleansing and extraction projects can be simplified using tools or formulae. But, it’s a challenge to clean up amassed data of millions of records. If you manually do, it seems a Herculean task.

This article assists you over how to deal with duplicate data. It includes finding replica entries using formulas and automatic functions and deleting them to prepare a clean list.

How to identify duplicates in an Excel database?

Manually, sorting a dataset in an order is a very boring job. Also, spending many hours on this trivial task can disturb other core jobs. So, here are a few simple steps to sort, identify and then, remove replica data entries automatically.

a. Identifying duplicates in one column:

· Open the excel sheet containing that data to be cleansed.

· Go to the data menu.

· Select ‘Remove Duplicates’ from Data Tools.

· A dialog box of ‘Remove Duplicates’ will open.

· Select check box of the column wherein the copied data exist.

· Also select the check box in the top right of that dialog box, reading ‘My Data Has Headers’. It would highlight the caption of the list.

· Click ‘Ok’. Another popup will display the count of duplicate records. Click ok to close it.

· The revamped list will appear with genuine entries only.

b. Identifying duplicates in multiple columns:

· Go to the spreadsheet containing the data that you need to clean up.

· Move to ‘Data’ menu.

· Click ‘Remove Duplicates’ from data tools.

· Tick the check boxes of the columns that you want to keep.

· A column may have cloned product names, but their unique values may differ in a separate column. So, you should select the columns carefully.

· Click ok.

· The cells carrying unique values will appear, despite having duplicate names in a column.

· You can sort them in an ascending or descending order before removing replica entries.

· Use data menu> sort & filter> sorting option for organizing data in an intended order.

How to use formula to find duplicates in Excel?

Duplicate data lead to wrong intelligence. You can’t evaluate appropriately, if a spreadsheet contains many entries with the same name or caption. So! Which formula will you execute to find duplicates in the excel file?

Let’s go through these examples that use an excel formula to identify multiple entries with similar data.

  1. Identifying duplicate in a cell:

This example uses the formula given in the formula cell, i.e. =If (B3=B2, “Duplicate”, “”).

The formula says if you B3 is equal to B2, or if B3 carries the same value as B2, show ‘Duplicate’ (as a result). The subsequent cells B4 and B5 have similar data. Therefore, the adjacent cells show ‘Duplicate’ as a result.

2. Identifying duplicates using Countif function:

This example contains formula to determine which cell carries duplicate entries. I used this formula:

=if (Countif (A2:A8)>1, “Duplicate”, “”)

While giving values and reference, you should keep it in the mind that countif function determines the repeated data. If a cell contains duplicate values, it counts and shows the value that you give in the double quotes (“”) if it matches the condition.

The aforementioned example has duplicate data entries with these captions- beans, oats and grams. As you press the Enter key, the value gets printed as duplicate before the names that are subsequently repeated.

After spotting replica entries, you can follow the steps that have been mentioned above for removing duplicates.

How to remove duplicates in Excel?

It requires a database wherefrom the duplicate entries will be deleted. Excel has some advanced features that ensure quick removal of duplicates. By following these steps, you can do so:

· Go to the database that has a lot of duplicates.

· Mouse over the data menu and select ‘Advanced’ from the ‘Sort & Filter’ option.

· A dialog box will pop up, viz. Advanced Filter. Hit ‘Copy to Another Location’ radio button.

· The ‘List Range’ will automatically fetch the range of that data.

· You need to define the address in the file where you want to copy it. For it, go to a blank cell on the same spreadsheet. Its range will be automatically selected.

· Now, there is a checkbox at the bottom with the caption ‘Unique Records Only’. Click it.

· Press Ok.

This is how the unique entries will be copied in the new cells on the same sheet.

End Note:

You should practice various excel formulae while accomplishing a data entry or cleansing or extraction project. Duplicity emerges as a big roadblock in your entry work. You can use shortcut keys, duplicate entry identification methods and formula to comb any size of data in a short span.

--

--