After understanding the fundamental ideas of data modeling (perhaps through my previous blog Data Modeling: The Basics), you may be wondering how this actually comes together in practice to build a functional database. So, let’s now look at the general process that modelers use.

 

Conceptual, Logical & Physical Models

To effectively understand how a business’ data should be modeled, it is imperative to begin from a high level view and work your way down. The first stage is to build a conceptual data model, establishing the nature of the business, their key processes and any data requirements. The goal is to understand each entity, their important attributes and how they relate to one another.

Next is to build a logical data model, formalising the entities as tables and their attributes as columns and accommodating further complexity by creating extra entity tables. Suppose that our conceptual data model reflected that Customers purchase Products. To accurately model this relationship, a new Orders table is required to connect the Customers and Products tables such that Customers place Orders which contain Products.

Finally, the data modeler will build a physical data model, specifying all the necessary details for each table to be able to instantiate the model in a chosen DBMS. This entails assigning datatypes, lengths and other characteristics for each attribute, creating primary and foreign keys to establish relationships between tables, and generating bridge tables to represent any many-to-many relationship between two tables as two one-to-many relationships via the intermediary of the bridge.

One Order may contain many Products and one Product may be contained in many Orders so an Orders-Products table can help simplify this relationship. This table would have at least two columns Order ID (the PK of Orders) and Product ID (the PK of Products), with each row representing each instance a distinct product was part of a distinct order. Maybe Order 001 had Apples and Bananas and Order 002 had Apples and Cherries; the Orders-Product table will thus have four rows: Order 001, Apples; Order 001, Bananas; Order 002, Apples; Order 002, Cherries. Converting many-to-many relationships into two one-to-many relationships enables a strict series of parent-child dependencies which avoids data redundancy.

 

Database Normalisation

One of the most important sub-processes in building a logical data model is database normalisation. This involves minimising the number of places where a particular piece of data appears, known as data redundancy. The fewer places in which the same piece of data exists, the easier it is to ensure updates to this piece of data are universal and consistent and hence the more reliable queries will be. It is important to be familiar to be with at least the first the first three stages of normalisation, known as normal forms.

  • A table is in first normal form (1NF) when each row only contains a singular value in each column, and when each row is unique so that there are no duplicates.
  • A table is in second normal form (2NF) when it is in 1NF and each column depends on the entire primary key so that it cannot be determined by just a part of it.
  • A table is in third normal form (3NF) when it is in 2NF and each non-key column is independent of each other so that each row’s value is only determined according to the primary key value.

It should be noted that it is often more complicated than identifying the same string in different locations as there may be different strings used to indicate the same piece of data. For instance, the country Australia is commonly referred to as Australia, Aus, AU. Each of these entries should be standardised by using a universal value so that the same information is always referred to with the same string.

 

By grasping these main concepts, it should become a little easier to see how important data modeling is in maximising data quality. Sticking to these techniques will help you effectively and reliably store your data. Happy modeling!

Cover image by Mohamed Hassan from Pixabay

Hunter Iceton
Author: Hunter Iceton

Hunter Iceton is an enthusiastic and positive individual. He graduated from Sydney Uni in 2017 with a Bachelor of Commerce (Liberal Studies) majoring in Finance, Marketing and Quantitative Business Analytics. For the next few years, Hunter spent his time creating and releasing music, while tutoring primary and high school students in Mathematics and Business Studies. Hunter is now excited to be joining The Data School, looking forward to approaching analytics with a creative perspective. In his spare time, Hunter enjoys continuing to create music, reading philosophy and cooking plant-based dishes. Otherwise, he can usually be found at a restaurant, a bar or an art gallery.