The Coding Pad

Programming blog, discussions, tutorials, resources





Subscribe to get updates via email:

Delivered by FeedBurner

Finding and Deleting Duplicates in Excel

Posted in August 12th, 2007

Sometimes when working with large excel spreadsheets, you run into the problem of duplicates, and it helps to be able to find them and delete them without having to manually go through the spreadsheet. If you have two or more columns of data, and you need to find and delete the duplicates, this is one way to do it. Say you have all your data in columns A and B:

  1. Create a column that concatenates the information in A and B. You can either use the “CONCATENATE” function or use the ampersand (&) by entering the formula A1&B1 to cell C1. Remember, as with all formulae, you have to use the = sign before the formula, thus =A1&B1.
  2. Copy and paste the formula down the column to the end of your data. For example, if your data runs from rows 1 to 120, copy and paste your data down to cell C120. An easy way to do this is, in Windows, select from the cell where your formula is to the last cell, and press Ctrl+D, or Edit -> Fill Down. For a mac, I’m not sure there’s a short-cut, except using Edit -> Fill Down.
  3. The next step is to find and mark the duplicate entries. Go to column D, and in the first cell enter the formula enter the formula =IF(COUNTIF($C$1:C1,C1)>1,”Duplicate”,”Unique”
  4. Copy and paste this formula down the column to the end of your data using the same process as in step 2.
  5. This column now shows you what rows have duplicates and which have uniques. You now want to delete, or do something else to just the duplicates, so you want to sort your spreadsheet by duplicates and uniques. You can’t do this with your spreadsheet as it is now because of the formula, so copy all the data to a new worksheet, but you only want to paste values, not the formulas, so that you can sort them by column C. Use Edit -> Paste Special -> Values only.
  6. Sort your spreadsheet using Column C, so that you all the duplicates are at the end of your spreadsheet, and once you delete them, or do whatever you want to with them, you can then delete column C.

This leaves you with only unique entries in your worksheet. You can extend this method to multiple columns for more complex spreadsheets and it should work fine.

Happy Coding!





Related Posts

No User Commented In " Finding and Deleting Duplicates in Excel "

Subscribes to this post Comment RSS or TrackBack URL

Leave Your Reply Below

 Username

 Email Address

 Website

Sticky note: Please double check your comments before submit Please Note: The comment moderation maybe active so there is no need to resubmit your comment
 

 Subscribe in a reader

About The Site

It is what it is, my coding pad, a programming blog, yet another place for my restless mind.