Register Login

Difference between Primary Index, Secondary Index and Clustered Index

Updated Jul 31, 2024

Primary Index vs. Secondary Index vs. Clustered Index

Indexing is a data structure technique to properly retrieve records from the database files based on some attributes on which the indexing has been done. Here's a comprehensive look at the different types of indexing:

Primary Index

Definition: The primary index is defined on an ordered data file. The data file is ordered on a key field. The key field is generally the primary key of the relation.

Characteristics:

  • Unique: Makes sure every value in the index is distinct.
  • Sparse Index: Each data block has a single index entry.

Example:

Consider a table with a primary key column EmployeeID. The primary index will sort the table based on EmployeeID.

Advantages:

  • Quick access to documents.
  • Preserves the main key's uniqueness.

Disadvantages:

  • Primary index can only be created on unique keys.
  • Slightly slower for insertions and deletions as the index needs to be updated.

Secondary Index

Definition: A secondary index can be created on any column, including non-key columns. It can be on a candidate key (unique) or non-key field (with duplicate values).

Characteristics:

  • Non-Unique: Allows the index to contain duplicate values.
  • Dense Index: Contains an index entry for each table record.

Example:

If a table has a LastName column, a secondary index on LastName will create an entry for each record, allowing quick searches by last name.

Advantages:

  • Enhances non-primary key column query performance.
  • Flexible indexing across several columns.

Disadvantages:

  • Needs more storage space.
  • Write operations (insert, update, delete) may become slower as a result.

Clustering Index

Definition: A clustering index is defined on an ordered data file. The data file is ordered on a non-key field.

Characteristics:

  • Non-Unique Key: Indexes non-unique fields in which a clustering field has arranged the data.
  • Physical Data Order: The order in which the index and actual data rows are stored is the same.

Example:

In a table with a Department column, a clustering index on Department would arrange the table so that all employees in the same department are stored together.

Advantages:

  • Effective for sorting and range queries.
  • Minimizes I/O operations by grouping records that are similar together.

Disadvantages:

  • Each table may only have one clustering index created.
  • Not appropriate for columns with lots of distinct values.

Comparison Table: Primary Index vs Secondary Index vs Clustered Index

Feature Primary Index Secondary Index Clustering Index
Uniqueness Unique Unique or Non-Unique Non-Unique
Data Order Key Field Order Not Ordered Non-Key Field Order
Type of Index Sparse Dense Physical Data Order
Creation Automatic with PK Manual Manual
Storage Requirements Less More Moderate
Use Case Fast PK lookups Optimizing non-PK queries Range queries, sorting

Practical Examples

Primary Index

Scenario: Consider a table Employees with columns EmployeeID (primary key), FirstName, and LastName.

When a query searches for a specific employee using EmployeeID, the primary index facilitates fast retrieval because the table is ordered by EmployeeID.

Example Query: SELECT * FROM Employees WHERE EmployeeID = 123;

Secondary Index 

Scenario: The same Employees the table has a column LastName.

To optimize queries searching by LastName, create a secondary index on this column.

Example Query: SELECT * FROM Employees WHERE LastName = 'Smith';

Clustering Index 

Scenario: In a table Employees, there is a column Department.

A clustering index on Department orders the actual data rows by the department.

Example Query: SELECT * FROM Employees WHERE Department = 'Sales';

Conclusion

In order to maximize query performance in databases, indexes are essential. Comprehending the distinctions among primary, secondary, and clustering indexes facilitates the development of effective and adaptable database architectures. Utilizing the right kind of index for your particular requirements can greatly improve the speed at which data is retrieved as well as the overall functionality of your application.


Comments

  • 03 Jun 2016 3:59 pm Abhijeet Mudgal Helpful Answer

    Indexing is a data structure technique to properly retrieve records from the database files based on some attributes on which the indexing has been done.

    Indexing has following types:

    Primary Index − Primary index is defined on an ordered data file. The data file is ordered on a key field. The key field is generally the primary key of the relation.

    Secondary Index − Secondary index may be generated from a field which is a candidate key and has a unique value in every record, or a non-key with duplicate values.

    Clustering Index − Clustering index is defined on an ordered data file. The data file is ordered on a non-key field.


×