Database design typically involves three main levels: conceptual, logical and physical. At the end of the conceptual stage, an Entity-Relationship Diagram (ERD) can be created, illustrating all identified entities along with their relationships. Next at the logical level, we need to rationalize what data we should store in which entity. This involves examining how one grouping of attributes is better than another in providing sufficient relevant information about the entity with minimum redundancy.
Top-down design method
In the preceding stage of analyzing business requirements, we aim to collect as much information as possible regarding the data needed for specific business operations or transactions pertaining to each entity. This raw information may take the form of sales orders, invoices, or reports, then decompose into individual attributes. Since these attributes exist naturally together in a business operation or transaction, they will become our starting point for logical database design. Our goal now is to analyze those set of attributes, perform techniques to reduce the redundancy of data as much as possible, and to organize data so that they utilize less storage space.
Measures of a good database design
In this part, we will establish the criteria for what constitutes a well-designed database.
- Database tables have clear meaning for their existence
Each database table or relation only represents one entity or one relationship between entity. For example, in Employee table, each row represents an employee with values of attributes such as employee’s name, address, etc. If a database table represents more than one entity or relationship, it will create ambiguities in storing data and can lead to higher redundancy and complexities when inserting new, modifying or deleting existing data.
A good database design should minimize data duplication and ensure that each piece of data is stored in one place, does not affect other part of the database when it is changed or updated.
Data anomalies and repeating groups, which are the most common causes of data redundancies, can be controlled by a proper normalization process.
2. Saving storage space by avoiding attributes whose values can be frequently null
NULL values will create ambiguities in understanding the data. For instance, a NULL value in the ‘Grade’ attribute of a row in the Student database table could signify either that the grade has not been updated for that student or that it is missing. This will create problem later in aggregate operations such as SUM or AVERAGE, and in JOIN operations which require comparisons.
3. Database tables should be designed so that no spurious rows are generated
Spurious rows ( tuples) are invalid rows that are unwantedly added in the result of a join operation. They provide spurious information that is not the data in the original tables. They are the results of either incorrect join condition or poorly designed database.
In order to avoid spurious information, we need to maintain proper primary keys and foreign keys for each table. Using keys in join operations ensure integrity of relationships between tables.
When decomposing information in raw format into database relations , it is important to preserve functional dependencies of attributes to maintain the original relationships and constraints of the data in the decomposed tables. Functional dependency is a fundamental concept in database theory, denoting a relationship between attributes in a relation (table) where the value of one or more attributes uniquely determines the value of another attribute. In other words, if attribute A functionally determines attribute B, every value of A in the relation is associated with exactly one value of B. This concept is expressed symbolically as A → B, indicating that the value of A determines the value of B. Decomposed tables while still enforcing these dependencies ensure a lossless decomposition. This means when we re-join two decomposed tables, we can still get the original data without any spurious rows.
In this sense, identifying functional dependencies is one of the fundamental step which helps us decompose tables. Each functional dependency can be placed in separate tables during the decomposing process for careful consideration. This is because each of our ultimate database table should have all of its attributes fully dependent on its Primary Key.