Data Cleaning: Problems and Current Approaches. Erhard Rahm and Hong-Hai Do. IEEE Bulletin of the Technical Committee on Data Engineering, Vol. 23, No. 4, December 2000.
Data Quality Mining: Making a Virtue of Necessity. Jochen Hipp, Ulrich Güntzer, and Udo Grimmer. Proceedings of the 6th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD 2001).
A Framework for Analysis of Data Quality Research. R. Y. Wang, V. C. Storey, and C. P. Firth. IEEE Transactions on Knowledge and Data Engineering, 7(4), 1995, pp. 623-640.
Monitoring Data Quality Problems in Network Traffic Databases. F. Korn, S. Muthukrishnan, and Y. Zhu. VLDB 2003.
Mining Database Structure; Or, How to Build a Data Quality Browser. T. Dasu, T. Johnson, S. Muthukrishnan, and V. Shkapenyuk. In Proceedings of the ACM Conf. on Management of Data (SIGMOD), 2002.
Systematic Development of Data Mining-Based Data Quality Tools. Dominik Luebbers, Udo Grimmer, and Matthias Jarke. VLDB 2003.
Potter's Wheel: An Interactive Data Cleaning System. V. Raman and J. Hellerstein. In Proc. VLDB (Roma, Italy, 2001), pp. 381-390.
Schema Mapping as Query Discovery. R. J. Miller, L. M. Haas, and M. Hernandez. In Proceedings of the International Conference on Very Large Data Bases (VLDB), pp. 77-88, 2000.
Real-World Data is Dirty: Data Cleansing and the Merge/Purge Problem. Mauricio Hernandez and Salvatore Stolfo. Data Mining and Knowledge Discovery, 2(1), 1998.
An Extensible Framework for Data Cleaning. Helena Galhardas, Daniela Florescu, Dennis Shasha, and Eric Simon. ICDE 2000.
Declarative Data Cleaning: Language, Model, and Algorithms. Helena Galhardas, Daniela Florescu, Eric Simon, Cristian-Augustin Saita, and Dennis Shasha. Proc. of the Int. Conf. on Very Large Data Bases (VLDB), Rome, Italy, September 2001.
Data Cleansing: Beyond Integrity Analysis. Jonathan I. Maletic and Andrian Marcus. In Proceedings of the Conference on Information Quality (IQ2000).
Cleansing Data for Mining and Warehousing. Mong-Li Lee, Tok Wang Ling, Hongjun Lu, and Yee Teng Ko. In Proceedings of the International Conference on Database and Expert Systems Applications (DEXA), volume 1677 of LNCS, pp. 751-760, Florence, Italy, 1999.
ARKTOS: A Tool for Data Cleaning and Transformation in Data Warehouse Environments. Panos Vassiliadis, Zografoula Vagena, Spiros Skiadopoulos, Nikos Karayannidis, and Timos Sellis. Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, vol. 23, no. 4, pp. 42-47, December 2000.
Data Quality Research at AT&T Labs
The MIT Total Data Quality Management Program
Data Cleaning and Information Quality (Drexel)
Data Cleaning at Microsoft
Univ. of Toronto DB Group
Automated Data Cleansing (SDML)
Server log cleaning
Dagstuhl Seminar "Data Quality on the Web"
A Reading List
Data Cleaning Projects and Commercial Tools:
AJAX: An Extensible Data Cleaning Tool
ARKTOS II (2002-2004)
Data Cleaning and Integration (a list of commercial tools and papers)
IBM DataJoiner
WinPure ListCleaner
Business Advantage
Practical Analysis of Nutritional Data (PANDA): Data Cleaning
Data Providers and Data Cleaning (KDnuggets)
SAS Data Quality-Cleanse
Dataquality.com
Datacleaning.com
SIGMOD, VLDB, ICDE'04, ICDT'05, EDBT
SIGKDD, CIKM, ICDM'04, SDM'04
ACM TODS, IEEE TKDE, J. VLDB, J. DMKD
Data Cleaning, Record Linkage, and Object Consolidation
DIMACS Workshop on Data Quality, Data Cleaning and Treatment of Noisy Data
International Workshop on Data Quality in Cooperative Information Systems
ICDE 2000: Special Issue on Data Cleaning, all-in-one papers [PDF, PS]
Data Quality and Data Cleaning: An Overview
Data Warehousing Systems: Design & Research Issues
Renée J. Miller, Professor at U. Toronto
Jiawei Han, Professor at UIUC
Helena Galhardas, Professor at IST and Researcher at INESC
Ahmed K. Elmagarmid, Professor at Purdue U.
Mohamed Galal Elfeky, Ph.D. Student, Department of Computer Sciences, Purdue U.
Panos Vassiliadis, University of Ioannina, Greece
H. Galhardas: Data Cleaning: Model, Language and Algorithms, PhD thesis, University of Versailles, September 2001. [ps] [pdf]
Alvaro E. Monge: Adaptive Detection of Approximately Duplicate Database Records and the Database Integration Approach to Information Discovery, PhD thesis, University of California, San Diego, 1997. [pdf]
Edwin M. Knorr: Outliers and Data Mining: Finding Exceptions in Data, PhD thesis, University of British Columbia, April 2002. [pdf]
Exploratory Data Mining and Data Cleaning, by Tamraparni Dasu and Theodore Johnson
SUGI 27: Data Cleaning 101
Data Quality on the Web
Courses in Data Cleaning and Analysis