We all are aware of the importance of primary key and foreign key constraints in relational database system. In this blog post, I will illustrate the importance of creating an index on Foreign key constraint and how it can impact us adversely if not created initially.
Primary key constraint basically guarantees data uniqueness, when you specify a primary key constraint for a table, the Database Engine enforces data uniqueness by automatically creating a unique index for the primary key columns.
Foreign Key constraint enforces referential integrity by guaranteeing that changes cannot be made to data in the primary key table if those changes invalidate the link to data in the foreign key table. If an attempt is made to delete a row in a primary key table or to change a primary key value, the action will fail when the deleted or changed primary key value corresponds to a value in the foreign key constraint of another table. To successfully change or delete a row in a foreign key constraint, you must first either delete the foreign key data in the foreign key table or change the foreign key data in the foreign key table, which links the foreign key to different primary key data.
This also explains that if you are trying to delete the data from primary key table that a reference (lookup) will be made with child table to maintain the referential integrity.
Let’s take an example where we will delete and update the data from primary key table and assess its impact on performance. In this example, Person.Person table is primary key table and BusinessEntityID column is primary key. Person.PersonPhone is child table referencing Person.Person as it’s primary key table (BusinessEntityID is Foreign Key). Conceptually we would need to delete/update data from child table first and then delete/update from parent table to maintain the referential integrity . Let’s see the performance impact if foreign key is not indexed in Person.PersonPhone.
DELETE FROM Person.Person WHERE BusinessEntityID=1 GO UPDATE p SET BusinessEntityID = 1 FROM Person.Person p WHERE BusinessEntityID = 2 GO
Now, let’s check the execution plan:
We can see, delete/update statements are looking up its child table (Person.PersonPhone) to verify whether BusinessEntityID “1” exists or not. It is currently scanning the clustered index, which has increased the “subtree cost” so let’s check (you can hover your mouse over clustered Index scan operator to check the reason):
We can see in the predicate that it is joining both the tables on the basis of BusinessEntityID which is not indexed in Person.PersonPhone table, hence SQL chose a scan. Let’s create a non-clustered index on BusinessEntityID colum in child table.
CREATE NONCLUSTERED INDEX IX_NC_BusinessEntityID_PersonPhone ON Person.PersonPhone (BusinessEntityID)
Foreign key is now indexed so I will delete/Update statement again to see if it has improved the performance or not.
We can see “Estimated subtree cost” has been improved in case of Delete/Update both the statements. This will improve your query performance massively if you have large tables. Also these improvements make larger impact if you are working in busy OLTP or table size is massive so do not forget to index your foreign key.
Please leave comments if you have any feedback or query.