Strengths
and weaknesses of relational DBMSs
Relational database
systems are by far the most important type of database around today. There has
been a widespread acceptance of this model for the traditional business
applications, including payroll, order processing and airline reservations
systems. There has also been a widespread acceptance for applications which
access databases over the Internet. Well-designed relational databases can
provide appropriate data storage and retrieval facilities over a long timescale.
Because of
the dominance of relational databases, this book is largely concerned with the
relational model, although the object-oriented and object-relational models are
discussed in chapter 7. In this section we look at a number of advantages of
relational database systems over other methods (including file-based systems and
other types of DBMSs) and also several weaknesses that are cited by the
proponents of object-oriented database systems:
Strengths
·
Relational databases support a simple data structure, namely tables
·
Relational
databases limit redundancy or duplication of data. As all the data
pertaining to a particular object is stored together, and then linked to related
objects, there is no need to store data about the original object in more than
one place. In practice, not all redundancy is eliminated - as we shall see (in
chapter 2), the process of normalisation will lead to some redundancy, although
controlled
Weaknesses
·
Poor
representation of 'real world' entities
and their relationships. When
relational databases are designed, entities are fragmented into smaller
relations through the process of normalisation (chapter 2). This fragmentation
into many relations leads to relations which do not exist or correspond to
entities in the real world. Such a design is inefficient, as many joins may be
required to recover data about that entity.
The model is said
to suffer from semantic overloading in that one construct (the relation)
is used to represent two different things (entities and relationships): there is
no mechanism for differentiation between entities and relationships. The
normalised component, the relation, is not sufficient to represent both the data
and data relationships. With relational databases, an entity is broken up into
several relations, thus querying becomes cumbersome since project, select and
join operations have to be used frequently to reconstruct the entities.
Also it is
difficult to represent hierarchies of data. For example, both students and
lectures share some of the same data (name, address etc). In a relational
database we would have to define three relations: one for the 'super class'
person as well as for student and lecturer, and to retrieve the information may
well require a join
The limited data
types in the relational model cannot represent 'real word' objects that have a
complex structure, and this leads to unnatural joins, which are inefficient. Any
item in the tuple, i.e. intersection of a row and column, must be an atomic
data type since it assumes both horizontal and vertical homogeneity.
Horizontal homogeneity means that each tuple of a relation must be composed
of the same attributes. This can be a disadvantage in that it is not possible to
store closely related objects in the same category if they differ slightly in
the attributes they possess. For example, there may be adult and child aeroplane
tickets, with children having the additional attribute 'guardian'. Vertical
homogeneity means that the values in a particular column of a relation must
all come from the same domain.
Relational
databases support a small, fixed collection of data types and do not
permit users to define new types. In many applications the attributes' domains
require far more complex data types. Many relational DBMSs allow the storage of
binary large objects (BLOBs) - a data value that contains binary information
representing images, digitised videos or audios, or any large and unstructured
object. Typically, it is held as an attribute in a relational database which
references to a file. Storing the data in external files is not a good way of
manipulating this data: the DBMS has no knowledge of the structure of this data
and cannot perform queries or operations on it. In addition, BLOBs cannot
contain other BLOBs. As an example, a picture can be stored as a BLOB by an
RDBMS. The complete picture can be displayed but not part of it, and the
internal structure is not known to the RDBMS.
Because SQL is
computationally incomplete and cannot provide all of-the operations
provided by most programming languages, the SQL standard provides embedded SQL
to help develop more complex database applications. However, this leads to the
so-called impedance mismatch problem because we are mixing different
programming paradigms (or models).
SQL is a language
handling data relations either in rows or columns. The main difference of such a
declarative language from high-level languages of procedural or functional
programming is that the latter can handle single data relations each time. In
other words, SQL handles more than one row of data at the same time.
Furthermore, SQL and highlevel programming languages represent data in
different ways. Especially when attributes of date and time are concerned, SQL
has the corresponding data types that automatically 'translate' the value to a
meaningful format. High-level programming languages need to convert integer
values to the date and time formats. Such type conversion is responsible for as
much as a 30% increase in programming effort and use of resources