Comparison of Relational and Multi Dimensional Database structures

Published on 2003-11-27 by John Collins. Video: YouTube|Rumble|Patreon Audio: Spotify|Amazon Music|Apple Podcast

2. Comparison of Relational and Multi-Dimensional Database structures

2.1 Relational Databases

The relational database model uses a two-dimensional structure of rows and columns to store data, in tables of records corresponding to real-world entities. Tables can be linked by common key values. E.F. Codd first designed this model in 1970, while working for IBM, and it's simplicity revolutionized database usage at the time. Codd's work was in many ways ahead of it's time, as computing power could not support the overheads of his database system (Hasan 1999).

In the 1980s the power of computers had grown to the point where these overheads were no longer a problem, and today relational database management systems (DBMS) are available on local desktops, as well as large organizational database management servers.

The techniques of entity-relationship (ER) modelling and the structuring of data in normalized tables have become popular with trained database administrators and designers, who routinely use relational DBMS to store huge volumes of organizational data with very high transaction rates.

Although deceptively simple to design and operate, relational database simplicity for the end-user does fall down when it comes to running queries. Accessing data from relational databases may require complex joins of many tables and is distinctly non-trivial for untrained end-users, who may be forced to hire IT professionals to structure such queries in a query language, such as SQL. When queries of a writing nature are run, such as INSERT, DELETE and ALTER TABLE, the consequences of getting it wrong are greatly increased when they are employed on a live system environment.

Figure 1 A typical 3D hypercube

2.2 Multi-Dimensional Databases

In a multi-dimension database system, the data is presented to the user in such a way as to represent a hypercube, or multi-dimensional array, where each individual data value is contained within a cell accessible by multiple indexes. A simple example is given in the previous diagram, Figure 1, where a fictional student exam result database is presented. This database contains three dimensions, namely Result, Student Name and Exam. In this example, an individual student (represented by Student Name) may have their exam results for several exams compared over a period of time, for example a four-year undergraduate course. This ability to present data in such a top level view is unique to multi-dimensional systems, and shows just how powerful these systems can be. Of course a multi-dimensional system is not limited to three dimensions as in the previous example, but when we go beyond that amount, it becomes more difficult to present such structures in a pictorial view. If we stick with the previous example presented in Figure 1; let us now add a fourth dimension called Subject. Let us assume our students study computer science, with subjects in Databases, Programming and Software Engineering. If we imagine this new dimension as being a box containing our previous three dimensions, then we would have three such boxes, namely one for each subject that our students were tested on, as shown in Figure 2.

Figure 2 A four-dimensional database structure

Of course this model can be extended to a fifth dimension, and a sixth and so on, until all requirements are met. But what are the advantages of such a system over a traditional relationship system? We will look at some of these in the next section.

2.3 Advantages of Multi-Dimensional Databases over Relational Databases

If we look again at our example of a student exam result dataset, there are many reasons why it is more efficient to represent our dataset with a multi-dimensional array rather than a relational table. For example, all similar information is lined up in a single dimension, like Results, so that they can be very quickly summed up to a total or quickly compared to get an immediate idea of how student results are fairing this semester.

The multi-dimensional array structure represents a higher level of organization than the relational table. The structure itself represents a more intelligent view of the data it contains, because our perspectives of this data are embedded directly into the structure as dimensions, as opposed to being placed into fields. For example, if we were to design a fictional relational table for our student results, it might look something like the following diagram:

Student Name	Exam	Result
John Collins	Databases	70
John Collins	Programming	72
John Collins	Operating Systems	60
Larry Wall	Databases	80
Larry Wall	Programming	99
Larry Wall	Operating Systems	70
Linus Torvalds	Databases	80
Linus Torvalds	Programming	90
Linus Torvalds	Operating Systems	99

The structure of this relationship table can tell us nothing of the nature of the contents of these fields, only that there are three fields Student Name, Exam and Result, and there are nine records. If we were to present a three-dimensional view of this data, while adding a third dimension called Semester, it might look something like the following:

Figure 3

As you can see from Figure 3, there is no need to have Result as a dimension, because the exam results are going to be contained within the cells of the database structure. Another obvious advantage is the removal of the duplication in the relational table, where each student name was repeated three times for each exam that they participated in. In the multi-dimensional view, the Student Name and the Exam become dimensions, or in effect indexes into that data, so having duplicates does not make any sense.

Notice how all related information neatly lines up in the multi-dimensional view, for example all programming results for John Collins over all three semesters lines up along the z-axis (i.e. from the diagram view perspective, the Semester dimension), while all exam results for John Collins in all subjects line up on the x-axis (the Exam dimension). Programming results for all students line up on the y-axis (the Student Name dimension).

From this example, it is clear to see the inherent intelligence in this database structure; in the ER table such views of specific data would not be possible without writing complex SQL queries.

2.4 Additional Multi-Dimensional Database advantages

Apart from the inherent advantages of using a multi-dimensional array structure, multi-dimensional databases also contain the following advantages (Kenan):

Enhanced Data Presentation and Navigation: Intuitive spreadsheet-like views of the data are the output of multi-dimensional databases. Such views are difficult to generate in relational systems without the use of complex SQL queries, while others cannot be performed by SQL at all, e.g. top ten exam results.
Ease of Maintenance: Multi-dimensional databases are very easy to maintain, because data is stored in the same way as it is viewed, that is according to its fundamental attributes, so no additional computational overhead is required for queries of the database. To compare this to relational systems, where complex indexing and joins may be used that require significant maintenance and overhead.
Increased Performance: Multi-dimensional database achieve performance levels that are well in excess of that of relational systems performing similar data storage requirements. These high performance levels encourage and enable OLAP applications. Performance can be improved in relational systems through database tuning, but the database cannot be tuned for every possible on-the-fly query. In relational systems, tuning is quite specific, therefore decreasing flexibility, and also requires expensive database specialists.

Previous - Contents - Next

Updated 2020 : note that this post was originally published in 2003, but is left here for archival purposes.