January 2005, Copyright Tony Drewry
Logical Data Structures (LDSs) - Getting started.

A logical data structure is:

... a high level model of a system's operating data and the relationships within it ...

Logical data structures are data models, and are sometimes called entity-relationship (ER) models or even entity-attribute-relationship models.

Modelling

Databases mirror aspects of the real world.
In effect databases  are computerised models - simple-structure models - of more complex processes.  Logical data structures are models which look both ways - at the real world and at the database world.   LDSs show the underlying structure of data in the real world. Once we have derived an LDS we then have a map of the data structures which must exist in the database.

An LDS is a model of some part of the real world, but it is also a formalized model which defines the structure of the database system which will model that part of the real world - a database system can be described as an implementation of a Logical Data Structure.

Data modelling

Data modelling (logical data structuring) is based on the identification of:

the entities, their attributes, and  the relationships between the entities,
within the system being modelled.
Entity  - Something about which an enterprise needs to keep data
Attributes  - the properties of an entity;
Relationships  - the connections between  entities

Entities
An Entity may be physical
                           e.g.  an Employee; a Part; a Machine
  or conceptual
                           e.g.   a Project; an Order; a Course
Attributes

Attributes are the properties of an entity - data which describes or is 'owned' by an entity.
Attributes (data) equate to facts - specific details about entities - details of interest.

For example:

Entity Details of Interest
Customer Customer Name, Address, Telephone No., Fax, Credit Status, etc.
Employee Age, Salary, Address, etc.
Project Project Leader, Start-Date, Cost, etc.

See diagrams below for examples of attributes in an LDS.

Entity Types

In LDSs the name given to an entity in a model refers to an entity or object  type;
- that is to say that the name given to an entity describes a set or group of a particular type - a set of entity 'instances' or 'occurrences'.
e.g. 'Employee' is a set of employees - any individual employee is an 'entity occurrence' - an instance of 'Employee' entity type.

Identifying entities

An Entity is:
  •  a thing about which the system holds information
  •  a type of thing with many particular instances
  • an 'owner' of a set of attributes (attributes can 'point' to other entities)

 Each instance of an entity is different from all others - one or more attributes will typically form a 'primary key' attribute - unique to a particular instance.
For example,  an Employee has an Employee Number; a Project has a Project Code.

Using the above bulleted points as a guide, think about any organisation with which you are familiar.
 What entities can you identify?
Do any of the entities have attributes which refer or point to other entities?
Can you think of some uniquely identifying attribute?
(Note that a person's name is not necessarily unique.  Usually  in a database, if some candidate key attribute does not exist, or is not adequate, a new attribute is created.  In the case of Person, we would allocate - typically - a unique number ID to the Person entity type).



Given some system or subsystem as the target of data analysis, it is usually quite easy to identify the principle entity types  to be included within a data model.  Typically, the information and data which an organisation needs to store already exists - as files and documents which an analyst will study and interpret in terms of a logical data structure (LDS).

The next step in modelling the system is to understand the links or connections between the entities.

Relationships

Relationships link entities.

an Employee works on  a Project
a Part  is made on a Machine
a Student attends a Course

  In the real world, objects do not exist in isolation.  Our understanding of real world objects is in terms of their relationships with other objects; for example,  'the earth circles the sun'; 'he is a carpenter' ; etc.
Any real world object which we are going to include in a data model as an entity type must have some relationship with at least one other entity within the model (even if we are not going to implement that relationship within our database system).

Typically, real-world relationships are 'many-to-many'.
In the above examples, each instance (occurrence) of Employee may work on a number of Projects, and  each Project may be worked on by many Employees;  a Part may be made using a number of Machines, and a Machine may be used to make different Parts; a Student may attend a number of Courses and a Course may be attended by a number of Students.


Terminology
LDS Filing system RDBMS
Entity File Table
Entity occurrence  Record Row
Attribute Field/data item  Column heading/entry
Entity identifier Record no./ID Primary key
(RDBMS = Relational DataBase Management System)

Describing Entities and Relationships
Master - Detail relationship (1:many)
Masters can also be Details

Entities are shown as boxes.

Relationships are shown by lines connecting entity boxes.

All relationships must be named.
Relationships are named using verbs.
 


Relationships and Entities

A Factory makes Components.

Given a factory, we can tell which components it makes.

 


A Component is made in a Factory.

Given a particular component, we can tell which factory it is made in.
 

(Note that, in a complete model, relationships are named from both directions. However, it is generally acceptable to only show the name from one direction in working models.)

One-to-Many Relationships
 

A formal description 
of the relationship shown in the diagram above is:
    • One Factory may make zero or more Components.
    • One Component is made in one (and only one) Factory.
 
This formal description can be expanded as:
    • One instance of the Factory entity may be related to  zero or more instances of the Component entity.
    • One instance of the Component entity may be related to one (and only one) instance of the Factory entity .
What this means in a database system is that:
one record in a table called Factory may be related to a number of records in a Component table
but
a record in the Component table can only be related to one record in the Factory table. 

( Note that the relationship is supported by the use of 'keys'.)

One-to-Many Relationships summarised.


For any occurrence of A, there may be 0, 1, or many, occurrences of B.

For any occurrence of B, there can only be one occurrence of A.

From another perspective:

If an 'A' record exists there may be zero or more related 'B' records.
Any 'B' record can only be related to a single 'A' record.

Note that in a finished LDS (i.e. one which will be used as the blueprint for a database) ALL the relationships will almost invariably be one-to-many.

One-to-One Relationships
1:1 relationships always suggest need for further analysis.

Usually one entity is really an  attribute of the other or the 1:1 relationship indicates a 1:many relationship - often 'over time'. 

For example: a Factory may have many Managers during its lifetime; a Manager might be in charge of different Factories during his career. An analyst would have to decide whether keeping track of management changes is required or whether it is sufficient to know who is the current manager ( in which case Manager would be an attribute of Factory).


Many-to-Many Relationships
M:N relationships 
hide missing entities. 

Minimally, a many-many relationship will require insertion of a 'link entity'.
Further analysis may show that the link entity has attributes of its own - often qualifiers in respect of quantity or time.

 This diagram shows the above diagram with attributes.

Note that the Primary Key attributes are underlined.

(To reduce clutter, relationship names have been omitted).




Another example of 'breaking down' a many-to-many relationship. 

Observe that in each of the examples here the 'link' entity has a double barrelled primary key - made up from the primary key attributes of the two entities being linked. This is standard practice.

In the link entity, each part of the primary key is a foreign key - pointing to some record in the related entity.
 


In the image below, 'Borrower' has been renamed as 'Member' and a condition has been added to the re;lationship between Member and Book; i.e. that a member can borrow a maximum of 4 books at a time.


Optionality - Must and May
Two notations which were traditionally used in LDSs. 

The diagram here says: 
One A record may have a relationship with zero or more B records. 
If a B record exists there must be one (and only one) related A record.

However, this may-must type of relationship is almost always the case in 1:m (one-to-many, Master-Detail) relationships.
Typically the existence of a detail record is dependent on the prior existence of the related master record - and since the master record in the relationship has to have prior existence, it must exist without a related detail record.
The may-must of a one-to-many relationship is usually supported in database applications by enforcing Key Integriy and Relational Integrity.
 
The circle ('may') notation is used rarely.
In this example: 
A Sales Person may place an Order. (There is no 'bar' to say 'must').
An Order may be placed by a Sales Person. (perhaps Customers place orders directly.

(This is not a very good example - further analysis might suggest that a Customer entity is required; and that an Order record must show  some source. Probably the model would be amended along the lines of the next diagram - with an exclusive relationship.

 The bar ('must') is ocasionally used - mostly in an 'arc' form as shown below.
Exclusivity

The diagram here says: 

A Part may be bought from a Supplier OR  be made at a Factory, but it must come from one or the other.

(The exclusive arc is also used in SSADM to describe 'IS A' relationships. See below)

 

The diagram below says that:
An Order may be for Parts OR  for Work but not for both.
 
(Imagine a situation where the factory provided products and also provides services - but wishes to have separate ordering systems.)

Creating a Model
 

The steps in development of a Logical Data Structure 
    • Identify initial entities (object types)
    • Identify direct relationships between the entities
    • Construct an initial diagram
    • Identify additional characteristics
    • Rationalize and validate the model.

Identify initial entities

So far in the above document we have identified a number of Entity types, some of which we will use to create an LDS.
The entities are:

Division
Factory
Part/Component
Order
Order Line
Sales Person
Supplier

Identify direct relationships between the entities

We have identified the following relationships:
 

We will make one assumption, that we can call a 'Component' a 'Part', and then:

Construct an initial diagram
 

Note
The relationships have only been named in one direction. In each case I have chosen to name from the master end.

Initial diagram with attributes
 

The above diagram, sometimes called an entity-attribute-relationship model, is starting to look like a useful starting point for creating a database system. Some attributes are missing, but all primary and foreign keys have been included.
Note that all relationships are 1:many.

Rationalize and validate the model
The above model is okay and needs no further rationalization or validatation.
Some examples of cases in which rationalization could be required are shown below.


More on Relationships

 Multiple Relationships


A factory makes a number of components. Some factories have  components repair shops. Every component is repaired by one factory only, which is not necessarily where it was made.

1. For any Factory record, there may be a number of related Component records referencing the Factory as producer.

2. For any Factory record, there may be a number of related Component records referencing the Factory as maintainer/repairer.
 

Keep multiple relationships if they are different.  Give them names.

 Triangle Relationships
Is the relationship A to B to C the same as A to C ?
Each division manufactures a variety of components

is the same as

A division has a number of factories, each of which produces a range of components

The 'manufactures' relationship in this model can be removed.

Necessary triangulation

Each division has a QA (quality assurance) function for a variety of components - no division QAs its own products.

The relationship

Division to Factory to Component
is different from
Division to Component

Recursive Relationships
Each large factory provides technical assistance and support for several smaller factories.
Each entity occurrence will contain a reference to the factory's supporting factory.
Recursive M:N Relationships
A component may be assembled from many subcomponents. 
Each subcomponent may be used in several higher level components.

The M:N (many-to-many) relationship is resolved in the standard way by creating a link entity.

Sometimes a M:N Relationship is really two 1:M relationships
A person may be parent to a number of children.
A child has more than one parent!

The M:N relationship is resolved by creating 2 one-to-many links.


Further Reading and Bibliography

Britton, Carol & Doake, Jill (1993)Software System Development - a gentle introduction, by McGraw-Hill
Beynon-Davies, Paul (1991)Relational Database Systems, Blackwell
Birrell, N.D. & Ould, M.A, (1986)A Practical Handbook for Software Development, C.U.P.


Some example LDSs
 
My Home Page