In general, the normalization process includes:
- Eliminating repeating groups.
- Removing partially-dependent columns.
- Removing transitively-dependent columns.
Eliminating repeating groups
When a field in a given row contains more than one value for each occurrence of the primary key, then that group of data items is called a repeating group. This is a violation of the first normal form, which does not allow multi-valued attributes.
Refer to the DEPARTMENT table. For any occurrence of a given primary key, if a column can have more than one value, then this set of values is a repeating group. Therefore, the first row, where DEPT_NO = "100," contains a repeating group in the DEPT_LOCATIONS column.
DEPT_NO | DEPARTMENT | HEAD_DEPT | BUDGET | DEPT_LOCATIONS |
| 100 | Sales | 000 | 1000000 | Monterey, Santa Cruz, Salinas |
| 600 | Engineering | 120 | 1100000 | San Francisco |
| 900 | Finance | 000 | 400000 | Monterey |
In the next example, even if you change the attribute to represent only one location, for every occurrence of the primary key "100," all of the columns contain repeating information except for DEPT_LOCATION, so this is still a repeating group.
DEPT_NO | DEPARTMENT | HEAD_DEPT | BUDGET | DEPT_LOCATIONS |
| 100 | Sales | 000 | 1000000 | Monterey |
| 100 | Sales | 000 | 1000000 | Santa Cruz |
| 100 | Sales | 000 | 1000000 | Salinas |
| 600 | Engineering | 120 | 1100000 | San Francisco |
| 900 | Finance | 000 | 400000 | Monterey |
To normalize this table, we could eliminate the DEPT_LOCATION attribute from the DEPARTMENT table, and create another table called DEPT_LOCATIONS. We could then create a primary key that is a combination of DEPT_NO and DEPT_LOCATION. Now a distinct row exists for each location of the department, and we have eliminated the repeating groups.
| DEPT_NO | DEPT_LOCATION |
| 100 | Monterey |
| 100 | Santa Cruz |
| 600 | San Francisco |
| 100 | Salinas |
Removing partially-dependent columns
Another important step in the normalization process is to remove any non-key columns that are dependent on only part of a composite key. Such columns are said to have a partial key dependency. Non-key columns provide information about the subject, but do not uniquely define it.
For example, suppose you wanted to locate an employee by project, and you created the PROJECT table with a composite primary key of EMP_NO and PROJ_ID.
| EMP_NO | PROJ_ID | LAST_NAME | PROJ_NAME | PROJ_DESC | PRODUCT |
| 44 | DGPII | Smith | Automap | blob data | hardware |
| 47 | VBASE | Jenner | Video database | blob data | software |
| 24 | HWRII | Stevens | Translator upgrade | blob data | software |
The problem with this table is that PROJ_NAME, PROJ_DESC, and PRODUCT are attributes of PROJ_ID, but not EMP_NO, and are therefore only partially dependent on the EMP_NO/PROJ_ID primary key. This is also true for LAST_NAME because it is an attribute of EMP_NO, but does not relate to PROJ_ID. To normalize this table, we would remove the EMP_NO and LAST_NAME columns from the PROJECT table, and create another table called EMPLOYEE_PROJECT that has EMP_NO and PROJ_ID as a composite primary key. Now a unique row exists for every project that an employee is assigned to.
Removing transitively-dependent columns
The third step in the normalization process is to remove any non-key columns that depend upon other non-key columns. Each non-key column must be a fact about the primary key column. For example, suppose we added TEAM_LEADER_ID and PHONE_EXT to the PROJECT table, and made PROJ_ID the primary key. PHONE_EXT is a fact about TEAM_LEADER_ID, a non-key column, not about PROJ_ID, the primary key column.
| PROJ_ID | TEAM_LEADER_ID | PHONE_EXT | PROJ_NAME | PROJ_DESC | PRODUCT |
| DGPII | 44 | 4929 | Automap | blob data | hardware |
| VBASE | 47 | 4967 | Video database | blob data | software |
| HWRII | 24 | 4668 | Translator upgrade | blob data | software |
To normalize this table, we would remove PHONE_EXT, change TEAM_LEADER_ID to TEAM_LEADER, and make TEAM_LEADER a foreign key referencing EMP_NO in the EMPLOYEE table.
PROJ_ID | TEAM_LEADER | PROJ_NAME | PROJ_DESC | PRODUCT |
| DGPII | 44 | Automap | blob data | hardware |
| VBASE | 47 | Video database | blob data | software |
| HWRII | 24 | Trans. Upgrade | blob data | software |
| EMP_NO | LAST_NAME | FIRST_NAME | DEPT_NO | JOB_CODE | PHONE_EXT | SALARY |
| 24 | Smith | John | 100 | Eng | 4968 | 64000 |
| 48 | Carter | Catherine | 900 | Sales | 4967 | 72500 |
| 36 | Smith | Jane | 600 | Admin | 4800 | 37500 |