Using superimposed multidimensional schemas and OLAP patterns for RDF data analysis

Authors
M. Hilal, C. Schütz, M. Schrefl
Paper
Hila18a (2018)
Citation
Open Computer Science Journal, Vol. 8, No. 1, pp. 18-37, Editor: Egon van den Broek, De Gruyter, ISSN 2299-1093, DOI: 10.1515/comp-2018-0003, 2018. Download PDF: https://doi.org/10.1515/comp-2018-0003
Resources
Copy (to obtain a copy, please send an email with the subject Hila18a to dke.win@jku.at)

Abstract (English)

The foundations for traditional data analysis are Online Analytical Processing (OLAP) systems that operate on multidimensional (MD) data. The Resource Description Framework (RDF) serves as the foundation for the publication of a growing amount of semantic web data still largely untapped by companies for data analysis. Most RDF data sources, however, do not correspond to the MD modeling paradigm and, as a consequence, elude traditional OLAP. The complexity of RDF data in terms of structure, semantics, and query languages renders RDF data analysis challenging for a typical analyst not familiar with the underlying data model or the SPARQL query language. Hence, conducting RDF data analysis is not a straightforward task. We propose an approach for the definition of superimposed MD schemas over arbitrary RDF datasets and show how to represent the superimposed MD schemas using well-known semantic web technologies. On top of that, we introduce OLAP patterns for RDF data analysis, which are recurring, domain-independent elements of data analysis. Analysts may compose queries by instantiating a pattern using only the MD concepts and business terms. Upon pattern instantiation, the corresponding SPARQL query over the source data can be automatically generated, sparing analysts from technical details and fostering self-service capabilities.

Keywords: Linked Open Data; Self-Service Business Intelligence; Multidimensional Modeling
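
Illustrative sketch: the abstract describes analysts instantiating an OLAP pattern with MD concepts and business terms, from which a SPARQL query is generated. The Python fragment below is only a minimal sketch of that idea and not the authors' implementation. The class MDSchema, the function instantiate_aggregation, the example.org IRIs, and the business terms "revenue" and "country" are all hypothetical names chosen for illustration. It assumes that a superimposed schema can be reduced to a mapping from business terms to SPARQL property paths that start at a fact resource.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class MDSchema:
    """Hypothetical superimposed MD schema: business terms mapped to RDF paths."""
    fact_class: str            # IRI (in angle brackets) of the class of facts
    measures: Dict[str, str]   # business term -> property path from a fact
    levels: Dict[str, str]     # business term -> property path from a fact


ALLOWED_AGGREGATES = {"SUM", "AVG", "MIN", "MAX", "COUNT"}


def instantiate_aggregation(schema: MDSchema, measure: str, level: str,
                            agg: str = "SUM") -> str:
    """Fill an 'aggregate a measure by a dimension level' pattern.

    The analyst supplies only business terms; the RDF paths come from the schema.
    """
    if agg not in ALLOWED_AGGREGATES:
        raise ValueError(f"unsupported aggregate: {agg}")
    measure_path = schema.measures[measure]   # KeyError for unknown terms
    level_path = schema.levels[level]
    return (
        f"SELECT ?level ({agg}(?value) AS ?result)\n"
        f"WHERE {{\n"
        f"  ?fact a {schema.fact_class} ;\n"
        f"        {measure_path} ?value ;\n"
        f"        {level_path} ?level .\n"
        f"}}\n"
        f"GROUP BY ?level"
    )


# Hypothetical schema over an arbitrary RDF dataset (IRIs are placeholders).
sales_schema = MDSchema(
    fact_class="<http://example.org/Sale>",
    measures={"revenue": "<http://example.org/amount>"},
    levels={"country": "<http://example.org/soldIn>/<http://example.org/country>"},
)

query = instantiate_aggregation(sales_schema, "revenue", "country")
print(query)
```

In this sketch the analyst-facing call mentions only "revenue" and "country"; the property paths, including the two-step path to the country level, stay hidden in the schema mapping. A real system would also need to handle roll-up hierarchies, filters, and namespace prefixes, which this fragment leaves out.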