A database management system (DBMS) is software that organizes everything that concerns a database. It provides us with a database, manages the insert, update, delete, read, and various other operations performed on the database.
Some common DBMS are MySQL, Oracle, Postgres, and MongoDB.
When setting up an SQL database table/relation, there is a need to add a unique identifier to each row of the table. This unique identifier is known as a primary key.
A primary key uniquely identifies every row in a relation. These identifiers are seen as unique because no two rows in a table can have the same identifier.
student_id | name | class |
1 | Smith Whyte | HumanitiesG7 |
2 | Viktor Monday | ScienceG5 |
3 | John Doe | HumanitiesG7 |
4 | Janet Joe | ScienceG5 |
In the table above, we can easily choose student_id
as the primary key. This is because amongst every attribute or column in the relation, only the student_id
column has the possibility of never belonging to more than one student. Two or more students can bear exactly the same name, and can be in the same class, but can’t have the same id.
Choosing the primary key in a relation like the one above is considerably easy, but the same cannot be said of a table where we have a pool of fully qualified unique identifiers which we can select from. This brings us to the concept of candidate keys.
A candidate key is a set of single-column unique identifiers that can, on their own, uniquely identify the tuples of a table.
They are single column in that they are not made by combining multiple columns. From this set of candidate keys, a primary key can be chosen, and doing this can prove difficult. Candidate keys are a subset of the super key superset.
Some examples will come in handy here. In the table below, the Customers
relation has several keys that can be used to uniquely identify the table rows.
c_id | name | phone | receipt_no | |
1 | Becky Queens | 59994889 | becky@example.com | 4543g |
2 | Timothy fosah | 92989303 | fosah@another.com | 8479b |
2 | Allan Peter | 84993030 | allan@email.com | 7649y |
From the table above, there is a host of attributes from the Customer
relation table that can be used as the primary key. This list of eligible attributes forms the candidate key set. From the table above, the candidate keys are:
Candidates |
c_id |
phone |
receipt_no |
Some things to point out here are as follows:
In a properly normalized table, the receipt_no
attribute won’t be in that table. Instead it should be from, e.g., a Receipts
table, and then its primary key can be placed as a foreign key in the Customers
table.
Some Database systems automatically choose the auto-increment column as the primary key. If you have some other attribute in mind to use rather than the auto-increment attribute, then avoid having an auto-increment attribute.