Johan ter Bekke developed a new data modeling approach based on semantic principles, resulting in inherently specified data structures. A single word or data item can hardly convey meaning to humans on its own; a word gains meaning in combination with its context. In a database environment, the context of data items is mainly defined by structure: a data item or object can have properties ("horizontal structure"), but it can also have relationships ("vertical structure") with other objects. In the relational approach, vertical structure is defined by explicit referential constraints. In the semantic approach, structure is defined inherently: a property itself may coincide with a reference to another object. This has important consequences for the semantic data manipulation language.
Semantic Data Modeling Principles
The objective of data modeling is to design a data structure for a database that fits as well as possible with some relevant world, often related to an organization with some information need. In general a data model relates to a part of the existing world, but it may also relate to an imaginary or abstract world. According to the seminal work of Smith and Smith (1977), three abstractions are very important for data modeling:
Classification is used to model instance_of relations, aggregation to model has_a relations and generalization to model is_a relations.
In semantic data modelling all three abstractions lead to a type definition (which can be base or composite). In contrast to many other data models, the semantic data model is based on only one fundamental notion: the type concept.
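The single-notion idea can be illustrated with a minimal Python sketch. This is not Xplain syntax. The `TypeDef` class, the example types (`employee`, `department`, and so on) and the helper functions are all hypothetical, chosen only to show that a composite type is an aggregation of attributes, each of which refers to another type by name. Generalization is left out of the sketch, and the instance check is a deliberate simplification.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TypeDef:
    """One type definition: a name plus the names of its attributes."""
    name: str
    attributes: Tuple[str, ...] = ()  # empty tuple means a base type

    @property
    def is_base(self) -> bool:
        return not self.attributes


# Hypothetical model: every attribute name is also the name of a type.
TYPES = {t.name: t for t in (
    TypeDef("name"),
    TypeDef("town"),
    TypeDef("salary"),
    TypeDef("department", ("name", "town")),              # aggregation
    TypeDef("employee", ("name", "salary", "department")),
)}


def referenced_types(type_name):
    """Aggregation (has_a): the types that a composite type is built from."""
    return [TYPES[attr] for attr in TYPES[type_name].attributes]


def is_instance_of(instance, type_name):
    """Classification (instance_of), simplified to an attribute-name match."""
    return set(instance) == set(TYPES[type_name].attributes)


# A hypothetical instance; "department" holds a reference to another object.
employee_1 = {"name": "Smith", "salary": 3000, "department": "d1"}
```

In this sketch the `department` attribute of `employee` is at the same time a property and a reference to the `department` type, so no separate referential constraint has to be declared.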
Notably, the Xplain meta data model itself also requires all three abstractions. Many other data modelling techniques (such as the relational model) do not support these three abstractions and are therefore limited in their modelling capabilities.
A semantic data model can be represented graphically in an Abstraction Hierarchy diagram showing the types (as boxes) and their inter-relations (as lines). See warehouse example. It is hierarchical in the sense that a type which references other types is always drawn above the types it references. This simple notation principle makes the diagrams very easy to read and understand, even for non-data modellers.
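The layering rule can be sketched in Python as follows. The `MODEL` dictionary is a hypothetical, warehouse-like model invented for illustration and is not the warehouse example referred to in the text. The sketch assumes the references form no cycles, and it includes base types in the bottom layer for simplicity.

```python
from functools import lru_cache

# Hypothetical model: each type maps to the types it references.
MODEL = {
    "name": [],
    "price": [],
    "quantity": [],
    "article": ["name", "price"],
    "customer": ["name"],
    "order": ["customer"],
    "order_line": ["order", "article", "quantity"],
}


@lru_cache(maxsize=None)
def level(type_name):
    """Height of a type: one more than the highest type it references."""
    refs = MODEL[type_name]
    return 0 if not refs else 1 + max(level(ref) for ref in refs)


def layers(model):
    """Group types per level, listing the referencing types first (on top)."""
    by_level = {}
    for type_name in model:
        by_level.setdefault(level(type_name), []).append(type_name)
    return [sorted(by_level[lvl]) for lvl in sorted(by_level, reverse=True)]


for row in layers(MODEL):
    print("  ".join(row))
```

Every type ends up strictly above all types it references, which is the property that makes the diagram readable from top (most composite) to bottom (most elementary).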
Data model specifications imply validity of certain integrity rules. Two inherent integrity rules are recognized for type definitions in a semantic data model:
- relatability: Each attribute in a type definition is related to one and only one equally named type, while each type may correspond with various attributes in other types.
- convertibility: Each type definition is unique: there are no type definitions carrying the same name or the same collection of attributes.
It is important to realize that these two integrity rules require neither separate specification nor declaration by procedures - they are inherent in the type definitions in a semantic data model.
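To make the two rules concrete, the following Python sketch checks them explicitly on a list of type definitions. In a semantic data model no such check needs to be written, since the rules are inherent. The code only spells out what they mean. The function names and example definitions are hypothetical, and the sketch assumes that attribute names are exactly equal to type names and that base types have no attributes.

```python
from collections import Counter


def relatability_violations(definitions):
    """Attributes that do not match exactly one equally named type."""
    name_counts = Counter(name for name, _ in definitions)
    return [
        (name, attr)
        for name, attrs in definitions
        for attr in attrs
        if name_counts[attr] != 1
    ]


def convertibility_violations(definitions):
    """Type definitions that repeat a name or a collection of attributes."""
    problems = []
    seen_names = set()
    seen_attribute_sets = {}
    for name, attrs in definitions:
        if name in seen_names:
            problems.append(f"duplicate type name: {name}")
        seen_names.add(name)
        if attrs:  # only composite types are compared on their attributes
            key = frozenset(attrs)
            if key in seen_attribute_sets:
                problems.append(
                    f"{name} has the same attributes as {seen_attribute_sets[key]}"
                )
            else:
                seen_attribute_sets[key] = name
    return problems


# Hypothetical definitions as (type name, attribute names) pairs.
valid = [
    ("name", ()),
    ("town", ()),
    ("department", ("name", "town")),
    ("employee", ("name", "department")),
]
invalid = valid + [("branch", ("town", "name")), ("project", ("budget",))]
```

Here `branch` violates convertibility because it carries the same collection of attributes as `department`, and `project` violates relatability because no type named `budget` exists.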
Additional Data Restrictions
In addition to the restrictions inherent in the data model, there is often a need for additional, more complex restrictions on data states that cannot be specified in the data model itself. These additional restrictions can be specified as
Declarative Data Derivations
The assert command - as explained in static restriction - can be used to specify derivable attributes. This is extremely useful for complex data derivations, such as those needed for user applications (e.g. total order amount), reports (e.g. grouping per x, Year-To-Date and totals) and data analysis (e.g. top 10 products).
An assert command derives only one attribute or a single variable at a time. For complex derivations one has to define multiple assertions building on each other. This principle ensures modularity (thus easy to understand, test and maintain) and re-usability (of intermediate derived data) for other queries.
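The principle of stacking single-attribute derivations can be mimicked in plain Python. This is an analogy and not Xplain's assert syntax. The data, the `derive_total` helper and the attribute names are hypothetical. Each step derives exactly one attribute, and the second step reuses the result of the first.

```python
# Hypothetical data: orders reference customers, order lines reference orders.
customers = {"c1": {}, "c2": {}}
orders = {1: {"customer": "c1"}, 2: {"customer": "c1"}, 3: {"customer": "c2"}}
order_lines = [
    {"order": 1, "quantity": 2, "price": 10.0},
    {"order": 1, "quantity": 1, "price": 5.0},
    {"order": 2, "quantity": 3, "price": 4.0},
    {"order": 3, "quantity": 1, "price": 7.5},
]


def derive_total(targets, rows, reference, value):
    """Derive one attribute: the sum of value(row) per referenced target."""
    totals = {key: 0 for key in targets}  # targets without rows get 0
    for row in rows:
        totals[row[reference]] += value(row)
    return totals


# Derivation 1: the amount of each order, summed over its order lines.
order_amount = derive_total(
    orders, order_lines, "order", lambda line: line["quantity"] * line["price"]
)

# Derivation 2: the total amount per customer, building on derivation 1.
order_rows = [
    {"customer": order["customer"], "amount": order_amount[key]}
    for key, order in orders.items()
]
customer_total = derive_total(
    customers, order_rows, "customer", lambda row: row["amount"]
)

print(order_amount)
print(customer_total)
```

Because `order_amount` is an intermediate result in its own right, it can be tested separately and reused by other queries, which is the modularity argument made above.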