Microsoft Excel is a popular tool, especially for business analysis, but finding and deleting duplicate records in it can be difficult. Removing duplicates is a common task for anyone working with large datasets.
Duplicate data can creep into a spreadsheet when you merge multiple tables or when several people edit the same document. Duplicates make the data less reliable and can skew counts and totals.
The likelihood of encountering duplicate records grows with the size of the dataset. Without proper diagnosis and management, they can become a major headache.
Find and Remove Duplicates
While duplicates can be helpful at times, they usually only cloud the picture. It’s preferable to find, highlight, and review duplicates before deleting them altogether.
- Select the range of cells containing the duplicates you wish to delete. Remove any outlines and subtotals from the data before you try to remove duplicates.
- Go to Data > Remove Duplicates and then choose the columns you want to check for duplicates.
- When you’re ready, select the OK button.
How to Remove Duplicate Values in Excel?
Excel has a built-in feature that helps you remove duplicate data points. Let’s check out how to use it.
1. Select the cell or range of cells to work with. If you select just one cell, Excel will figure out the range for you.
2. On the DATA tab, in the Data Tools section, click “Remove Duplicates.”
3. A dialog box appears in which you can choose which columns to examine for duplicates.
If your data has column names, select the ‘My data has headers’ option and then click OK.
With this option selected, the top row is ignored when searching for duplicates to remove.
4. Excel displays a message summarizing the result: the number of duplicate values found and removed, and the number of unique values that remain.
5. Click OK to dismiss the message. The duplicates have now been removed from the dataset.
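The steps above can be approximated in a few lines of Python. This is only a sketch of the idea, not Excel's actual implementation: the function name `remove_duplicates`, the sample rows, and the athlete names are made up for illustration, and the comparison here is an exact match, which may differ from Excel's own comparison rules (for example, in how text case is treated).

```python
def remove_duplicates(rows, key_columns):
    """Keep the first row for each distinct combination of the chosen columns.

    rows        -- list of tuples, one tuple per worksheet row (hypothetical data)
    key_columns -- indexes of the columns to check for duplicates
    Returns the kept rows and the number of rows removed.
    """
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    removed = len(rows) - len(kept)
    return kept, removed


# Made-up sample data: sport, athlete, medals
rows = [
    ("Tennis", "Ana", 3),
    ("Golf", "Ben", 1),
    ("Tennis", "Ana", 3),
    ("Tennis", "Ana", 5),
]

# Check all three columns: only the exact repeat is removed.
unique_rows, removed = remove_duplicates(rows, key_columns=[0, 1, 2])
print(removed, len(unique_rows))  # 1 3

# Check only sport and athlete: the row with 5 medals is removed as well.
by_name, removed_by_name = remove_duplicates(rows, key_columns=[0, 1])
print(removed_by_name, len(by_name))  # 2 2
```

The second call shows why the column choice in the dialog box matters: checking fewer columns treats more rows as duplicates.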
Next, let’s look at filtering for unique values and at how the Advanced Filter can be used to get rid of duplicates in Excel.
Understand Filtering for Unique Values or Removing Duplicate Values
To get a list of distinct values, you can either filter for unique values or remove duplicate values. Both approaches produce a list of unique values.
There is, however, a major distinction. Filtering for unique values temporarily hides duplicate values, while the remove duplicate values option deletes them permanently.
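It’s also important to realize that duplicates are compared by the value displayed in a cell, not by the underlying value stored in it. For example, the same date formatted as 3/8/2006 in one cell and as Mar 8, 2006 in another is treated as two unique values.
Therefore, it is recommended that you always double-check before deleting anything. Try filtering for unique values or conditionally formatting them first to confirm that the results are what you expect.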
Filter for Unique Values
Follow these instructions to filter for unique values:
- To get started, select the range of cells to examine, or make sure the active cell is located within a table.
- On the Data tab, in the “Sort & Filter” section, click “Advanced.”
- A new window, titled “Advanced Filter,” will appear on your screen. One of the following options is available to you:
  - Select “Filter the list in place” if you want to filter the range of cells or table where it is.
  - Follow these steps if you need to save the filtered output somewhere else:
    - Select “Copy to another location.”
    - Type the cell reference into the “Copy to” box where the values should be copied.
    - Alternatively, click “Collapse Dialog” to temporarily hide the pop-up window, pick a cell on the worksheet, and then click “Expand.”
- Mark the box labeled “Unique records only,” and then press the OK button.
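To make the "filter in place" behavior more concrete, here is a small Python sketch under stated assumptions. The function name `unique_row_mask` and the sample rows are hypothetical; the sketch models a filter as a list of visible/hidden flags and is not a description of how Excel works internally.

```python
def unique_row_mask(rows):
    """Return one flag per row: True if the row stays visible, False if hidden.

    The first occurrence of each record stays visible; later repeats are hidden.
    The input list itself is never modified.
    """
    seen = set()
    mask = []
    for row in rows:
        mask.append(row not in seen)
        seen.add(row)
    return mask


# Made-up sample data: sport, athlete, medals
rows = [
    ("Tennis", "Ana", 3),
    ("Golf", "Ben", 1),
    ("Tennis", "Ana", 3),
]

mask = unique_row_mask(rows)
visible_rows = [row for row, visible in zip(rows, mask) if visible]

print(mask)       # [True, True, False]
print(len(rows))  # 3 (the source data is untouched; the repeat is only hidden)
```

Because only the flags change, clearing the filter brings every row back, which is the key difference from deleting duplicates.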
Using the Advanced Filter Option
Excel’s Advanced Filter feature lets you eliminate duplicates by copying only the unique records to a new location. Follow the steps below to use it.
Select the cell or range of cells in the dataset from which you wish to eliminate duplicates. When you choose a single cell and then click Advanced Filter, Excel will figure out the range for you.
Next, open the Advanced Filter options:
- Go to the DATA tab and, in the Sort & Filter section, click the Advanced button.
- The Advanced Filter dialog box will pop up.
- To save the unique values in a new location, choose the option to “Copy to another location.”
- Verify that the range of records displayed in the ‘List Range’ field corresponds to the range you selected.
- In the ‘Copy to:’ box, specify the cell or range to which the unique values should be copied.
- Mark the box labeled “Unique records only.” This is the most important step.
- To continue, select the OK button.
- The unique values are copied to the new location, starting at the cell given in the ‘Copy to:’ box (cell G1 in this example).
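The "Copy to another location" steps can be sketched as follows. Everything here is an illustrative assumption: the worksheet is modeled as a dictionary keyed by (row, column) numbers, the function name `copy_unique_records` is hypothetical, and the sample data is made up. In this sketch the header row is simply copied along as the first unique record.

```python
def copy_unique_records(sheet, list_range, copy_to):
    """Copy the unique records of list_range to the block starting at copy_to.

    sheet      -- dict mapping (row, col) to a cell value, 1-based
    list_range -- (first_row, first_col, last_row, last_col) of the source list
    copy_to    -- (row, col) of the top-left destination cell
    Returns the number of records written. The source range is left unchanged;
    the destination is assumed not to overlap it.
    """
    first_row, first_col, last_row, last_col = list_range
    dest_row, dest_col = copy_to
    seen = set()
    written = 0
    for r in range(first_row, last_row + 1):
        record = tuple(sheet.get((r, c)) for c in range(first_col, last_col + 1))
        if record in seen:
            continue  # a repeat of an earlier record is skipped
        seen.add(record)
        for offset, value in enumerate(record):
            sheet[(dest_row + written, dest_col + offset)] = value
        written += 1
    return written


# Made-up list in A1:C4 (columns 1 to 3), including a header row
data = [
    ("Sport", "Athlete", "Medals"),
    ("Tennis", "Ana", 3),
    ("Golf", "Ben", 1),
    ("Tennis", "Ana", 3),
]
sheet = {}
for r, record in enumerate(data, start=1):
    for c, value in enumerate(record, start=1):
        sheet[(r, c)] = value

# Copy the unique records to G1, that is, row 1, column 7
written = copy_unique_records(sheet, list_range=(1, 1, 4, 3), copy_to=(1, 7))
print(written)        # 3
print(sheet[(3, 7)])  # Golf
```

The original list in columns A to C keeps all four rows, so nothing is lost if the copied result turns out to be wrong.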
These are Excel’s built-in features for finding and deleting duplicates. Let’s move forward and find out how to get a similar result with formulas.
Formulas for Eliminating Duplicates in Excel
To illustrate, we’ll use a basic example with three columns: sport, athlete’s name, and medals won.
- Using an Excel formula, we can combine the columns into a single value for each row. Next, we’ll count how often each combined value occurs and remove any rows whose count is larger than one.
The “&” is a concatenation operator, so let’s use it to join the data in columns A, B, and C. In Excel, this would look like =A2&B2&C2.
- After entering the formula in cell D2, copy it down through each row.
We need a new column labeled “Count” to identify the multiple entries in Column D. To do this, enter the COUNTIF function in cell E2 and copy it down. =COUNTIF($D$2:D2,D2) is the formula to use.
Because the range $D$2:D2 grows as the formula is copied down, it counts how many times each value in column D has occurred up to and including the current row.
When Count equals 1, the value is appearing for the first time and is therefore kept as the unique record. A count of 2 or above marks the row as a duplicate.
- Selecting the Filter button now allows you to apply a filter to the Count column.
- The filter is located in the Sort & Filter section of the DATA tab.
- Use the filter in Column E and choose “1” to show only the first occurrence of each record.
With the filter applied, the table shows only the unique records, which can then be copied and pasted to a new location.
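The helper-column approach can also be sketched in Python to show what each column holds. The function names `helper_keys` and `running_counts` and the sample rows are hypothetical. `helper_keys` plays the role of joining the columns with “&”, and `running_counts` mirrors a count that only looks at the rows up to the current one.

```python
def helper_keys(rows, sep=""):
    """Join the values of each row into one text key (like the helper column D)."""
    return [sep.join(str(value) for value in row) for row in rows]


def running_counts(keys):
    """For each key, count its occurrences up to and including that position."""
    totals = {}
    counts = []
    for key in keys:
        totals[key] = totals.get(key, 0) + 1
        counts.append(totals[key])
    return counts


# Made-up sample data: sport, athlete, medals
rows = [
    ("Tennis", "Ana", 3),
    ("Golf", "Ben", 1),
    ("Tennis", "Ana", 3),
    ("Golf", "Ben", 2),
]

keys = helper_keys(rows)
counts = running_counts(keys)
unique_rows = [row for row, count in zip(rows, counts) if count == 1]

print(keys)         # ['TennisAna3', 'GolfBen1', 'TennisAna3', 'GolfBen2']
print(counts)       # [1, 1, 2, 1]
print(unique_rows)  # the three rows whose count is 1

# Joining without a separator can make different rows look identical.
print(helper_keys([("ab", "c"), ("a", "bc")]))           # ['abc', 'abc']
print(helper_keys([("ab", "c"), ("a", "bc")], sep="|"))  # ['ab|c', 'a|bc']
```

The last two lines point to a design choice worth considering: placing a separator that does not occur in the data between the joined values (in Excel, for example, =A2&"|"&B2&"|"&C2) keeps rows such as "ab" + "c" and "a" + "bc" from being mistaken for duplicates.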