Tips in T-SQL
This page is a reminder and how-to for my best tips in TSQL
How to delete duplicate records from a SQL Server table
In the table below we can identify various duplicate records. The pourpose of the query is to delete the duplicates
Table Anagrafiche
| FirstName |
Surname |
age |
| Mauro |
Rossi |
34 |
| Simone |
Casara |
34 |
| Mauro |
Rossi |
34 |
| Alessandro |
Chiappin |
30 |
| Alessandro |
Chiappin |
30 |
| Alessio |
Trevisan |
31 |
1) Create an identity on the table if it doesn't have one yet
| ALTER TABLE Anagrafiche ADD IDAnag int IDENTITY(1,1) |
Now the table has a column to identify every single record
| IDAnag |
FirstName |
Surname |
age |
| 1 |
Mauro |
Rossi |
34 |
| 2 |
Simone |
Casara |
34 |
| 3 |
Mauro |
Rossi |
34 |
| 4 |
Alessandro |
Chiappin |
30 |
| 5 |
Alessandro |
Chiappin |
30 |
| 6 |
Alessio |
Trevisan |
31 |
2) Join the table on itself. The keys of join have to be the fields from witch we have to delete duplications. In the example we delete duplicate
first names and surnames and we don't care about age.
DELETE A1
FROM #Anagrafiche A1 INNER JOIN #Anagrafiche A2
ON A1.Nome = A2.Nome AND A1.Cognome = A2.Cognome
AND A1.IDAnag > A2.IDAnag |
| IDAnag |
FirstName |
Surname |
age |
| 1 |
Mauro |
Rossi |
34 |
| 2 |
Simone |
Casara |
34 |
| 4 |
Alessandro |
Chiappin |
30 |
| 6 |
Alessio |
Trevisan |
31 |
The records deleted are the ones with higher IDAnag because the condition was
A1.IDAnag > A2.IDAnag