Indexes in SQL Server are created on columns in tables or views. An index
provides a faster way to look up data based on the values in those columns.
For example, if you create an index on the primary key and then search for a
row of data based on one of the primary key values, SQL Server first finds
that value in the index, and then uses the index to quickly locate the entire
row of data. Without the index, a table scan would have to be performed in
order to locate the row, which can have a significant effect on performance.
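As a minimal sketch of the lookup described above, assuming a hypothetical
dbo.Employee table (the table and column names are illustrative, not from a
real schema), a primary key lookup might look like this:

```sql
-- Hypothetical table for illustration only.
CREATE TABLE dbo.Employee
(
    EmpId   INT          NOT NULL,
    EmpName VARCHAR(100) NOT NULL,
    -- By default, a PRIMARY KEY constraint is backed by a unique clustered
    -- index when the table has no clustered index yet.
    CONSTRAINT PK_Employee PRIMARY KEY (EmpId)
);

-- With the primary key index in place, this lookup can seek to the row
-- through the index instead of scanning the whole table.
SELECT EmpId, EmpName
FROM dbo.Employee
WHERE EmpId = 123;
```

On a very small table the optimizer may still choose a scan, so treat the
seek as the expected behavior on a table of realistic size rather than a
guarantee.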
An index is made up of a set of pages (index nodes) that are organized in a
B-tree structure. This structure is hierarchical in nature, with the root node
at the top of the hierarchy and the leaf nodes at the bottom.
When a query is issued against an indexed column, the query engine starts at
the root node and navigates down through the intermediate nodes, with each
layer of the intermediate level more granular than the one above. The query
engine continues down through the index nodes until it reaches the leaf node.
For example, if you’re searching for the value 123 in an indexed column, the
query engine would first look in the root level to determine which page to
reference in the top intermediate level. In this example, the first page points
to the values 1-100, and the second page to the values 101-200, so the query
engine would go to the second page on that level. The query engine would then
determine that it must go to the third page at the next intermediate level.
From there, the query engine would navigate to the leaf node for value 123. The
leaf node will contain either the entire row of data or a pointer to that row,
depending on whether the index is clustered or nonclustered.
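One way to see the levels of the B-tree is to query the
sys.dm_db_index_physical_stats dynamic management function in DETAILED mode.
The sketch below assumes a hypothetical dbo.BTreeDemo table and uses
sys.all_objects only as a convenient row source for test data:

```sql
-- Hypothetical table for illustration only.
CREATE TABLE dbo.BTreeDemo
(
    Id      INT       NOT NULL PRIMARY KEY CLUSTERED,
    Payload CHAR(200) NOT NULL
);

-- Load enough rows that the index needs more than one level.
INSERT INTO dbo.BTreeDemo (Id, Payload)
SELECT TOP (100000)
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
       'x'
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;

-- DETAILED mode returns one row per level of the B-tree:
-- index_level 0 is the leaf level, and the highest index_level is the root.
SELECT index_level, page_count, record_count
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.BTreeDemo'), NULL, NULL, 'DETAILED')
ORDER BY index_level DESC;
```

The root level should show a single page, with the page count growing toward
the leaf level. The exact counts depend on row size and fill, so none are
quoted here.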
Indexes in SQL Server 2012:
Clustered
Nonclustered
Covering
Filtered
Columnstore
Clustered Indexes:
A clustered index can be compared to a dictionary, where the entries are stored
in sorted order. Because the data is sorted, you can find any word very
quickly. In essence, a clustered index is the table itself, stored in sorted
order based on one or more columns.
A clustered index stores the actual data rows at the leaf level of the
index. An important characteristic of the clustered index is that the indexed
values are sorted in either ascending or descending order. Because the data
rows themselves can be stored in only one order, there can be only one
clustered index on a table or view. A table that has no clustered index is
referred to as a heap.
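To make the heap versus clustered table distinction concrete, here is a small
sketch using a hypothetical dbo.Word table (names are assumptions chosen to
match the dictionary analogy):

```sql
-- Hypothetical table, created without a clustered index, so it starts as a heap.
CREATE TABLE dbo.Word
(
    WordText   VARCHAR(50)  NOT NULL,
    Definition VARCHAR(400) NULL
);

-- While the table is a heap, sys.indexes reports type_desc = 'HEAP'.
SELECT name, index_id, type_desc
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.Word');

-- Store the rows in WordText order, like the entries in a dictionary.
CREATE CLUSTERED INDEX CIX_Word_WordText ON dbo.Word (WordText ASC);

-- A second clustered index on the same table would be rejected, because
-- the rows can be stored in only one order:
-- CREATE CLUSTERED INDEX CIX_Word_Definition ON dbo.Word (Definition);
```

Running the sys.indexes query again after the CREATE CLUSTERED INDEX statement
should show the row change from HEAP to CLUSTERED.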
Nonclustered Indexes:
You can compare a nonclustered index to the index pages at the back of a
book. They don’t store the data itself, but point you to the place where the
actual data is. You can have multiple nonclustered indexes on a table. Of
course, the more nonclustered indexes you create on a table, the more storage
they take.
Unlike a clustered index, the leaf nodes of a nonclustered index contain the
index key values and row locators that point to the actual data rows, rather
than the data rows themselves.
A row locator’s structure depends on whether it points to a clustered table or
to a heap. If referencing a clustered table, the row locator is the clustered
index key, which is used to navigate the clustered index to the correct data
row. If referencing a heap, the row locator points directly to the actual data
row.
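A short sketch of several nonclustered indexes on one clustered table, assuming
a hypothetical dbo.Customer table (all names are illustrative):

```sql
-- Hypothetical table for illustration only. The clustered primary key means
-- each nonclustered index uses CustomerId as its row locator.
CREATE TABLE dbo.Customer
(
    CustomerId INT         NOT NULL PRIMARY KEY CLUSTERED,
    LastName   VARCHAR(50) NOT NULL,
    City       VARCHAR(50) NOT NULL
);

-- A table can have several nonclustered indexes; each one takes extra storage.
CREATE NONCLUSTERED INDEX IX_Customer_LastName ON dbo.Customer (LastName);
CREATE NONCLUSTERED INDEX IX_Customer_City     ON dbo.Customer (City);

-- This query may seek on IX_Customer_LastName and then follow the row
-- locator into the clustered index to fetch City, which is not part of
-- that nonclustered index.
SELECT CustomerId, LastName, City
FROM dbo.Customer
WHERE LastName = 'Smith';
```

Whether the optimizer actually uses the nonclustered index depends on table
size and statistics, so the plan described in the comment is a likely outcome,
not a certainty.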
Covering Index:
A nonclustered index that contains all the columns needed to satisfy a
query is known as a covering index. Covering indexes let the database
administrator add non-key columns to the leaf pages of a nonclustered index, so
the query avoids having to look up the row in the clustered index.
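As an illustration, the INCLUDE clause of CREATE INDEX is one way to build a
covering index. The dbo.SalesOrder table below is hypothetical, and the query
is just one example of a query this index would cover:

```sql
-- Hypothetical table for illustration only.
CREATE TABLE dbo.SalesOrder
(
    OrderId    INT            NOT NULL PRIMARY KEY CLUSTERED,
    CustomerId INT            NOT NULL,
    OrderDate  DATE           NOT NULL,
    TotalDue   DECIMAL(10, 2) NOT NULL
);

-- CustomerId is the index key; OrderDate and TotalDue are stored only at the
-- leaf level as included (non-key) columns.
CREATE NONCLUSTERED INDEX IX_SalesOrder_CustomerId
    ON dbo.SalesOrder (CustomerId)
    INCLUDE (OrderDate, TotalDue);

-- Every column this query references is in the index, so it can be answered
-- without a lookup into the clustered index.
SELECT CustomerId, OrderDate, TotalDue
FROM dbo.SalesOrder
WHERE CustomerId = 42;
```

Included columns are not part of the index key, so they do not count toward
the key size limit.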
An index key can contain more than one column, as long as the key doesn’t
exceed 900 bytes for a clustered index or 1,700 bytes for a nonclustered index
(as of SQL Server 2016).
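-- Example showing the 900-byte limit on a clustered index key
CREATE TABLE IndexLimit (Empid INT, EmpDesc VARCHAR(1000));
-- Succeeds, but warns that the key can be up to 1004 bytes (4 + 1000)
CREATE CLUSTERED INDEX idx ON IndexLimit (Empid, EmpDesc);
-- Insert fails: the key is 4 + 1000 = 1004 bytes, over the 900-byte limit
INSERT INTO IndexLimit VALUES (12, REPLICATE('a', 1000));
-- Insert succeeds: the key is 4 + 896 = 900 bytes, exactly at the limit
INSERT INTO IndexLimit VALUES (12, REPLICATE('a', 896));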
As beneficial as indexes can be, they must be designed carefully. Because
they can take up significant disk space, you don’t want to implement more
indexes than necessary. In addition, indexes are automatically updated when the
data rows themselves are updated, which can lead to additional overhead and can
affect performance.
You should consider the following guidelines when planning your
indexing strategy:
For tables that are heavily updated, use as few columns as possible in the
index, and don’t over-index the tables.
If a table contains a lot of data but data modifications are low, use as many
indexes as necessary to improve query performance. However, use indexes
judiciously on small tables because the query engine might take longer to
navigate the index than to perform a table scan.
For clustered indexes, try to keep the length of the indexed columns as short
as possible. Ideally, try to implement your clustered indexes on unique columns
that do not permit null values. This is why the primary key is often used for
the table’s clustered index, although query considerations should also be taken
into account when determining which columns should participate in the clustered
index.
The uniqueness of values in a column affects index performance. In general, the
more duplicate values you have in a column, the more poorly the index performs.
On the other hand, the more unique each value, the better the performance. When
possible, implement unique indexes.
For composite indexes, take into consideration the order of the columns in the
index definition. Columns that will be used in comparison expressions in the
WHERE clause (such as WHERE FirstName = 'Charlie') should be listed first.
Subsequent columns should be listed based on the uniqueness of their values,
with the most unique listed first.
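Two of the guidelines above, unique indexes and column order in composite
indexes, can be sketched as follows. The dbo.Person table and its columns are
hypothetical and exist only for this illustration:

```sql
-- Hypothetical table for illustration only.
CREATE TABLE dbo.Person
(
    PersonId  INT          NOT NULL PRIMARY KEY CLUSTERED,
    FirstName VARCHAR(50)  NOT NULL,
    LastName  VARCHAR(50)  NOT NULL,
    Email     VARCHAR(254) NOT NULL
);

-- Unique index: every Email value is distinct, so a seek on Email matches
-- at most one row.
CREATE UNIQUE NONCLUSTERED INDEX UX_Person_Email
    ON dbo.Person (Email);

-- Composite index: FirstName is listed first because it is the column
-- compared in the WHERE clause below.
CREATE NONCLUSTERED INDEX IX_Person_FirstName_LastName
    ON dbo.Person (FirstName, LastName);

-- Filters on the leading column, so the composite index can be used for a seek.
SELECT PersonId, FirstName, LastName
FROM dbo.Person
WHERE FirstName = 'Charlie';

-- Filters only on the second column, so a seek on the leading column is not
-- possible; the engine would have to scan this index or use another access path.
SELECT PersonId, FirstName, LastName
FROM dbo.Person
WHERE LastName = 'Brown';
```

If most queries filtered on LastName instead, reversing the column order in
the composite index would be the more sensible design under the same
guideline.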