

Data Quality Measurement in Wide-Column Stores / submitted by Julia Hilber, BSc
Author: Hilber, Julia
Assessor: Wöß, Wolfram
Published: Linz, 2018
Extent: 95 pages : illustrations
Thesis: Universität Linz, Master's thesis, 2018
Keywords (DE): Datenqualität / NoSQL / Cassandra
Keywords (EN): data quality / NoSQL / Cassandra
Keywords (GND): Datenqualität / NoSQL-Datenbanksystem / Apache Cassandra / MySQL
URN (persistent identifier): urn:nbn:at:at-ubl:1-22748
The work is available in accordance with the "Notes for Users".
Abstract (English)

Many companies and organizations base their decisions on data stored in databases. It is therefore important to know the quality of that data, because poor data quality leads to bad decisions. Most data issues stem from human error during data acquisition, faulty processes, poorly designed architectures, inconsistent definitions, and incorrect usage of data. NoSQL stores have gained considerable popularity, driven by the unstructured data on the web, so assessing the quality of NoSQL stores is important as well. This thesis therefore provides an approach for assessing the data quality of a Cassandra store at the schema level only. The schema is considered more important than the instances because changing a schema may not be possible when dependent applications do not allow it, whereas correcting and assessing instances is easier. The assessment targets Cassandra because it is the second most used database among the NoSQL stores. The thesis describes and implements an extension of the tool QuaIIe that assesses Cassandra schemas. This builds on a detailed analysis of the existing program, in particular its connector for MySQL databases. The next step is the transformation of the Cassandra schema into the DSD vocabulary, followed by a direct comparison of the data quality assessment of a Cassandra store with that of a MySQL database. Finally, the assessment of the Cassandra schema is performed and evaluated.
