I have to figure this out in Excel.
I have 4 columns with names in a spreadsheet. I am trying to find the duplicates between columns and count them.
Example
A B C D
Steve John Chris John
Sarah Steve John Sarah
Carl Chris Paul Paul
Jenna Logan Sarah Jenna
Moe Norman Joe Logan
I am trying to identify the names that show up in multiple columns and count the times they show up, all in Excel.
Which Excel functions can I use to find the duplicates between columns?
This example teaches you how to find duplicates (or triplicates) in Excel.
1. Select the range A1:C10.
2. On the Home tab, in the Styles group, click Conditional Formatting.
3. Click Highlight Cells Rules, Duplicate Values.
4. Select a formatting style and click OK.
Result. Excel highlights the duplicate names.
Note: select Unique from the first drop-down list to highlight the unique names.
In this example, Excel highlights duplicates (Juliet, Delta), triplicates (Sierra), quadruplicates (if there are any), and so on. Execute the following steps to highlight triplicates only.
5. First, clear the previous conditional formatting rule.
6. Select the range A1:C10.
7. On the Home tab, in the Styles group, click Conditional Formatting.
8. Click New Rule.
9. Select 'Use a formula to determine which cells to format'.
10. Enter the formula =COUNTIF($A$1:$C$10,A1)=3
11. Select a formatting style and click OK.
Result. Excel highlights the triplicate names.
Explanation: =COUNTIF($A$1:$C$10,A1) counts the number of names in the range A1:C10 that are equal to the name in cell A1. If COUNTIF($A$1:$C$10,A1) = 3, Excel formats the cell. Because we selected the range A1:C10 before we clicked on Conditional Formatting, Excel automatically applies the formula to the other cells in the selection, shifting the relative reference A1 as it goes. Thus, cell A2 is tested with =COUNTIF($A$1:$C$10,A2)=3, cell A3 with =COUNTIF($A$1:$C$10,A3)=3, etc. Notice the absolute reference ($A$1:$C$10): it keeps the counted range fixed for every cell, while the relative reference (A1) moves from cell to cell.
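To check what the rule should highlight before building it, the counting logic can be sketched outside Excel. The Python below is only an illustration, not Excel's implementation: `countif` is a hypothetical helper that mimics COUNTIF's case-insensitive equality match (wildcards are not modeled), and the grid holds the names from the question's four columns rather than the A1:C10 range used in the steps.

```python
def countif(cells, criterion):
    """Count cells equal to criterion, ignoring case (wildcards are not modeled)."""
    target = criterion.casefold()
    return sum(1 for cell in cells if cell.casefold() == target)


# Names from the question; each inner list is one spreadsheet row (columns A to D).
grid = [
    ["Steve", "John", "Chris", "John"],
    ["Sarah", "Steve", "John", "Sarah"],
    ["Carl", "Chris", "Paul", "Paul"],
    ["Jenna", "Logan", "Sarah", "Jenna"],
    ["Moe", "Norman", "Joe", "Logan"],
]

# The fixed range every cell is compared against (the absolute reference).
flat = [name for row in grid for name in row]

# One True/False per cell, like the conditional formatting rule COUNTIF(range, cell) = 3.
triplicates = [[countif(flat, name) == 3 for name in row] for row in grid]

for row, flags in zip(grid, triplicates):
    print([name if flag else "-" for name, flag in zip(row, flags)])
```

Each cell is compared against the same fixed list, which plays the role of the absolute reference, while the name being tested changes per cell, like the relative reference.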
Note: you can use any formula you like. For example, use this formula =COUNTIF($A$1:$C$10,A1)>3 to highlight the names that occur more than 3 times.
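For the question as asked (which names appear in more than one column, and how often), the same tally can be sketched as follows. This is a rough cross-check in Python under stated assumptions, not an Excel feature: the column labels A to D and the variable names are illustrative, and names are compared exactly as typed.

```python
from collections import Counter, defaultdict

# The four columns from the question, keyed by column letter.
columns = {
    "A": ["Steve", "Sarah", "Carl", "Jenna", "Moe"],
    "B": ["John", "Steve", "Chris", "Logan", "Norman"],
    "C": ["Chris", "John", "Paul", "Sarah", "Joe"],
    "D": ["John", "Sarah", "Paul", "Jenna", "Logan"],
}

totals = Counter()           # how many times each name occurs overall
seen_in = defaultdict(set)   # which columns each name occurs in

for label, names in columns.items():
    for name in names:
        totals[name] += 1
        seen_in[name].add(label)

# Keep only names found in at least two different columns.
repeated = {
    name: (totals[name], sorted(seen_in[name]))
    for name in totals
    if len(seen_in[name]) > 1
}

for name, (count, cols) in sorted(repeated.items()):
    print(f"{name}: {count} times in columns {', '.join(cols)}")
```

The overall count per name corresponds to what =COUNTIF over the whole four-column range returns for that name; the set of columns is extra information that the highlighting rule alone does not show.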