In some ways, you can think of an index as another table. Of course, it's not, but it some times helps to think of it that way.
Each row in the index contains a pointer to the row location in the actual table, and the data for the columns that make up the index. This way, SQL can use the index to quickly find the rows in the actual table.
When you write a query that ONLY uses data from an index, SQL server can recognize this and NOT use the data in the table because the data is already in the index.
For example, let's think about the TrackingDate. If you wanted to know how many rows in your table have a TrackingDate for today, you could write a query like this:
[tt]
Select Count(*)
From YourTable
Where TrackingDate >= '20100707'
And TrackingDate < '20100708'
[/tt]
If there is an index on the TrackingDate column, SQL can use the data in the index to determine the count without actually going to the table to get the data. When this happens, it's called a "covering index".
Before I continue, I need to talk about multi-column indexes. Single column indexes are rarely useful. I mean... they can help you by speeding up certain queries, but multi-column indexes can do the same thing, but also help with more queries.
There are 2 ways to make a multi-column index, and it's important that you know what both of them are, because they behave differently.
By definition, a multi-column index contains the data for more than 1 column. The order in which the columns appear in the list is critically important. Imagine you had a people table that stored PersonId, ShoeSize and EyeColor. Now imagine you had index on ShoeSize,EyeColor. The data for this index might look like this:
[tt]
ShoeSize EyeColor
9 Blue
9 Green
9.5 Blue
10 Brown
10 Green
10.5 Brown
[/tt]
Notice that the data is sorted first by ShoeSize and then by EyeColor. Looking at the list, tell me the number of people that have a ShoeSize = 10. Easy, right. You quickly find the first 10 and then continue counting rows until you get something <> 10. Now, with the same index, tell me how many people have Brown eyes. Since the EyeColor column is not the first column in this index, you would need to start at the first row and look at every row in order to count the Brown's.
Now, imagine there was another index on the table with EyeColor as the first column in the index. This time, SQL can use that index to quickly get your count.
INCLUDED COLUMNS.
Like I said earlier, there are multiple ways of creating multi-column indexes. Multi-column indexes existed in all versions of SQL Server, but new to SQL2005, you can have included columns. In the example I showed earlier, ShoeSize was the first column and EyeColor was the second column. So, in the index, the data is sorted first by ShoeSize and then ALSO sorted by EyeColor. With SQL2005, you could have the EyeColor column be an INCLUDE column. When you do this, the ShoeSize column would still be sorted, but EyeColor would not be. Why is this important? Well... think about inserts and updates on the table. If EyeColor is sorted, it will take (slightly) longer for SQL to determine where (in the index) the data belongs. If EyeColor is an included column, SQL only need to look at the ShoeSize column to determine where the row belongs. This may not seem like much, but with a very large table, it could make a considerable difference.
What would be the difference between the two?
It looks like you have 4 single column indexes right now. No doubt that these indexes are helping to speed up some queries. In the new scenario, it looks like you only have 2 indexes. Basically, queries that were originally using the index on FileNumber and TrackingDate would probably slow down (probably by a lot). By including FileNumber and TrackingDate in the index for BoxNumber, some queries may speed up a little because SQL may be able to use the index data entirely without having to go to the table to get the data.
Personally, I wouldn't remove the other two indexes unless I was sure they were not being used for anything. It wouldn't bother me to add additional columns to them. Doing this will slow down the index slightly, but may speed up certain queries by a lot.
I strongly encourage you test this stuff on a development database. This development database should be an exact copy of the production database (backup and restore to another SQL instance). Then you can modify the indexes and run all the queries that use this table to determine the indexes impact on performance.
I know this was a bit long-winded, but indexes are a LARGE topic, and a very important one at that. If you have any questions, now's the time to ask.
-
George
"The great things about standards is that there are so many to choose from." - Fortune Cookie Wisdom