A table in Boyce-Codd normal form (created by Raymond Boyce and Edgar Codd) follows all the rules of third normal form, plus one additional rule:
This only comes into play when there are multiple possible primary key combinations that overlap. Another name for this is "overlapping candidate keys."
Only in rare cases does a table in third normal form not meet the requirements of Boyce-Codd normal form!
| release_year | release_date | sales | name |
|---|---|---|---|
| 2001 | 2001-01-02 | 100 | Kiss me tender |
| 2001 | 2001-02-04 | 200 | Bloody Mary |
| 2002 | 2002-04-14 | 100 | I wanna be them |
| 2002 | 2002-06-24 | 200 | He got me |
The interesting thing here is that there are three possible primary keys:
release_year + salesrelease_datenameThis means that by definition, this table is in the second and third normal forms because those forms only restrict how dependent a column that is not part of a primary key can be.
However, this table is not in Boyce-Codd's normal form because release_year is entirely dependent on release_date.
The easiest way to fix the table in our example is to remove the duplicate data from release_date. Let's make that column release_day_and_month.
| release_year | release_day_and_month | sales | name |
|---|---|---|---|
| 2001 | 01-02 | 100 | Kiss me tender |
| 2001 | 02-04 | 200 | Bloody Mary |
| 2002 | 04-14 | 100 | I wanna be them |
| 2002 | 06-24 | 200 | He got me |
The same exact rule of thumb applies to the second, third, and Boyce-Codd normal forms. That said, it's unlikely you'll see BCNF-specific issues in practice.
Optimize for data integrity and data de-duplication first by adhering to Boyce-Codd normal form. If you have speed issues, de-normalize accordingly.