Saturday, 30 November 2013

SQL Clustered and Non Clustered indexes

Indexes are associated with tables or views to speed up the retrieval of rows. It contains keys built from one or more columns. The structure that these keys are stored is called a ‘B-tree which allows the server to find each row quickly and effectively.

There are two types of index:-

Clustered

1.       Clustered indexes sort and store data rows based on key values.
2.       These are the columns included in the index definition.
3.       There can be only one clustered index per table, because the data rows themselves can be sorted in only one order.
4.       The only time data rows in a table are stored in a sorted order is when the table contains a clustered index.
5.       When a table has a clustered index, the table is called a clustered table.
6.       If a table has no clustered index, its data rows are stored unordered and is called a heap.
7.       When you create a PRIMARY KEY constraint, a unique index on the column or columns is automatically created. By default, this index is clustered; however, you can specify a non clustered index when you create the unique constraint.
a.       The primary key column will not allow NULL values.
8.       If a clustered index is not created with the unique property, the Database Engine automatically adds a 4 byte uniqueifier column to the table.
a.       When it is required, the database Engine automatically adds a uniqueifier value to a row to make each key unique.
b.      This column and it’s values are used internally and cannot be seen or accessed by users.
9.       Generally, you should define the clustered index key with as few columns as possible.
10.   Consider columns that have one or more of the following attributes:-
a.       Are unique or contain many distinct values.
b.      Are accessed sequentially.
c.       Defined as IDENTITY because the column is guaranteed to be unique within the table.
d.      Used frequently to sort data retrieved from the table.
11.   Clustered indexes can become typically quite large and so you should consider the following options:-
a.       SORT_IN_TEMPDB
                                                               i.      If ON, the sort results are stored in the tempdb else they are stored in the filegroup.
b.      DROP_EXISTING
                                                               i.      Indicates if the existing index should be dropped and recreated. Consider using this if you are recreating the CI as this will also update the Non Clustered indexes which use the clustered index at the same time.
c.       FILLFACTOR
                                                               i.      Sets the percentage of free space in the leaf level of each index page during index creation.
d.      ONLINE
                                                               i.      Determines whether concurrent user access to the underlying table or clustered index data and any associated non clustered indexes is allowed during index operations. When you create indexes on existing tables long term table locks are not held which allows queries and updates to continue.
12.   If a clustered index is created on a heap table that already contains several non clustered indexes then on creation of the clustered index all the non clustered indexes will need to be rebuilt so that they contain the clustered key value instead of the row identifier (RID).
13.   If a clustered index is dropped on a table that has several non clustered indexes, the non clustered indexes are all rebuilt as part of the DROP operation. This may take significant time on large tables.
a.       The preferred way to build indexes on large tables is to start with the clustered index and then build any non clustered indexes.

Non Clustered

1.       Non clustered indexes have a separate structure from the data rows.
2.       Contains the non clustered index key values and each key value entry has a pointer to the data row that contains the key value.
3.       The pointer from an index row of a non clustered index to a data row is called a row locator.
4.       The structure of the row locator depends on whether the data pages are stored in a heap or a clustered table.
5.       For a heap, a row locator is a pointer to the row.
6.       For a clustered table, the row locator is the clustered index.
7.       You can add nonkey columns to the leaf level of the non clustered index to by-pass existing index key limits, 900 bytes and 16 key columns.

Both clustered and non clustered indexes can be unique. This means that no two rows can have the same value for the index key. Otherwise, the index is not unique and multiple rows can share the same key value.

Indexes are automatically maintained for a table or view whenever the data is modified.


No comments:

Post a Comment