PC-filter: a Robust filtering technique for duplicate record detection in large databases
Paper
| Paper/Presentation Title | PC-filter: a Robust filtering technique for duplicate record detection in large databases |
|---|---|
| Presentation Type | Paper |
| Authors | Zhang, Ji (Author), Ling, Tok Wang (Author), Bruckner, Robert (Author) and Liu, Han (Author) |
| Editors | Galindo, Fernando, Takizawa, Makoto and Traunmuller, Roland |
| Journal or Proceedings Title | Lecture Notes in Computer Science (Book series) |
| Journal Citation | 3180, pp. 486-496 |
| Number of Pages | 11 |
| Year | 2004 |
| Publisher | Springer |
| Place of Publication | Germany |
| ISSN | 1611-3349, 0302-9743 |
| ISBN | 9783540229360 |
| Digital Object Identifier (DOI) | https://doi.org/10.1007/978-3-540-30075-5_47 |
| Web Address (URL) of Paper | https://link.springer.com/chapter/10.1007/978-3-540-30075-5_47 |
| Conference/Event | 15th International Conference on Database and Expert Systems Applications (DEXA'04) |
| Event Details | 15th International Conference on Database and Expert Systems Applications (DEXA'04). Event Date: 30 Aug 2004 to 03 Sep 2004. Event Location: Zaragoza, Spain |
| Abstract | In this paper, we propose PC-Filter (PC stands for Partition Comparison), a robust data filter for approximately duplicate record detection in large databases. PC-Filter distinguishes itself from all existing methods by using the notion of partition in duplicate detection. It first sorts the whole database and splits the sorted database into a number of record partitions. The Partition Comparison Graph (PCG) is then constructed by performing fast partition |
| Keywords | PC-filter; partition comparison filter; duplicate record detection |
| ANZSRC Field of Research 2020 | 469999. Other information and computing sciences not elsewhere classified |
| Public Notes | File reproduced in accordance with the copyright policy of the publisher/author. |
| Byline Affiliations | University of Toronto, Canada; National University of Singapore; Microsoft, United States |
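
The first step named in the abstract (sort the whole database, then split the sorted records into partitions) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sort_and_partition`, the sort key, and the fixed `partition_size` are assumptions made for illustration, and the construction of the Partition Comparison Graph is left out because the abstract shown here is truncated before it describes that step.

```python
from typing import Callable, List, Sequence, TypeVar

R = TypeVar("R")


def sort_and_partition(
    records: Sequence[R],
    key: Callable[[R], str],
    partition_size: int,
) -> List[List[R]]:
    """Sort records by a key, then cut the sorted list into consecutive partitions.

    Hypothetical helper: the key function and the fixed partition size are
    illustrative assumptions, not details taken from the paper.
    """
    if partition_size <= 0:
        raise ValueError("partition_size must be positive")
    # Sorting places records with similar keys next to each other.
    ordered = sorted(records, key=key)
    # Each slice of the sorted list becomes one record partition.
    return [
        ordered[i:i + partition_size]
        for i in range(0, len(ordered), partition_size)
    ]


if __name__ == "__main__":
    names = ["delta", "alpha", "charlie", "bravo", "echo"]
    for partition in sort_and_partition(names, key=str.lower, partition_size=2):
        print(partition)
```

With five records and a partition size of 2, the sketch yields two full partitions and one final partition holding the remaining record.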
https://research.usq.edu.au/item/9z2q7/pc-filter-a-robust-filtering-technique-for-duplicate-record-detection-in-large-databases