Keys and Foreign Keys for XML Design and Reasoning

M. Karlinger
PT1001 (2010)

Abstract (English)

Integrity constraints are the primary means to ensure that data is an accurate representation
of reality, which is vital to organizational success in today's economy. The most fundamental
types of integrity constraints are keys and foreign keys, irrespective of the data model used.
Keys and foreign keys establish meaningful connections between real world entities and their
representations in data (data entities) on the basis of entity properties.

With the adoption of the eXtensible Markup Language (XML) as the standard for data
exchange over the internet and its increasing usage as a format for the permanent storage
of data, the importance of studying keys and foreign keys in the XML data model has
increased in recent years. The design of XML integrity constraints is challenging because of
the hierarchical and semi-structured nature of XML data, which allows data entities to have
multiple or absent values for an entity property. In previous proposals for XML integrity
constraints, multiple or absent property values lead to counter-intuitive results when checking
whether an XML document satisfies a constraint. In contrast, XML keys (XKeys) and
foreign keys (XFKeys) as proposed in this thesis handle multiple or absent property values
in a way intuitively expected by the application developer.
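
To make the notion of multiple or absent property values concrete, the following minimal sketch counts, for each data entity in a small XML document, how many values it carries for a given property. The document, the element names (`employee`, `phone`), and the helper `count_property_values` are hypothetical and are not taken from the thesis; the sketch only illustrates the phenomenon and does not implement XKey or XFKey semantics.

```python
import xml.etree.ElementTree as ET

# Hypothetical example document: three data entities (employees) with
# one, two, and zero values for the property "phone".
DOC = """
<company>
  <employee><name>Ann</name><phone>111</phone></employee>
  <employee><name>Bob</name><phone>222</phone><phone>333</phone></employee>
  <employee><name>Cid</name></employee>
</company>
""".strip()


def count_property_values(xml_text, entity_tag, property_tag):
    """Return, in document order, the number of direct child elements
    named property_tag under each element named entity_tag."""
    root = ET.fromstring(xml_text)
    return [len(entity.findall(property_tag)) for entity in root.iter(entity_tag)]


counts = count_property_values(DOC, "employee", "phone")
print(counts)
```

In a relation every row has exactly one entry per column, so the corresponding counts would all be 1. In the XML document above, a key defined on `phone` has to take a position on the second employee (two values) and the third employee (no value), which is the situation the paragraph above describes.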

It is shown that XKeys and XFKeys preserve the semantics of relational keys and foreign
keys when relational data is mapped to XML, as frequently required in data exchange scenarios.
Moreover, the consistency and implication problems related to XKeys and XFKeys
are discussed in the context of 'complete' XML documents, which generalize complete relations.
It is shown that every set of XKeys or XFKeys is consistent, and that there are sound
and complete sets of inference rules for both XKey and XFKey implication.