
advice on creating index 2


tsuable

Programmer
Feb 2, 2005
Hi,

I would like to get some advice on creating an index for a varchar data type.

Will doing this make the queries slower because it's a varchar?

Can anyone point me to a best-practices document?

Thanks,
 
In answer to your first question (about it being slower), the answer should almost always be "NO".

There is no penalty for it being varchar (unless the rows are almost always full), and in fact it should provide faster access than char.

Nonclustered indexes are like miniature tables. If you use char(300) in an index, that row will always take up 300 bytes of space. A varchar entry only takes up the length of the data plus another 2 bytes or so to record how long the data is.

So if on average each varchar(300) row has only 150 bytes of data in it, you will speed up your searches by a factor of about 2, since you will get twice as much data per 64 KB read (an extent).
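
A quick sketch of what Rob is describing (the table and column names here are made up purely for illustration, not from the original post):

-- Hypothetical table just to show the storage difference
CREATE TABLE Products
(
    ProductID   int          NOT NULL,
    ProductName varchar(300) NOT NULL  -- stores only the actual length plus ~2 bytes of overhead
    -- a char(300) column here would always take the full 300 bytes,
    -- both in the table and in every index that includes it
)

-- Nonclustered index on the varchar column: each index entry is only as wide
-- as the data it holds, so more entries fit into each 64 KB extent that is read
CREATE NONCLUSTERED INDEX IX_Products_ProductName ON Products (ProductName)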

Hope that helps


Rob

 
Thanks Rob..

Would it be slower if you create a clustered index containing both varchar and numeric data types?

thanks,
 
(VERY SHORT LESSON IN INDEXING)

Ok.. Clustered indexes are interesting beasts.

When you create a clustered index on a table, you force the table to be sorted on the column or columns you build the index on.

I.e., all rows are kept in that order in the table.

This can be very useful if you are searching for ranges of data based on that key, or if you bring back rows in that sort order. The flip side is that each column you include in the clustered index is also added to every nonclustered index. This tends to make multi-column clustered indexes, and clustered indexes built on large varchar/char data, expensive (for all nonclustered indexes), so selecting the right column is critical.

If you are only ever going to look for one or two rows of data, a non-clustered index will be almost as good as a clustered index.

I tend to look for columns that return more than one row to cluster on, and then I pick numeric or smaller datatypes... int (foreign key columns), datetime (all rows from this day to that day).

And when I am looking for specific data, e.g. Column = 'widget1', then a nonclustered index is the choice.
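
A small sketch of that rule of thumb (the Orders table and its columns are hypothetical, just to show the shape):

-- Cluster on a small column you scan by range, e.g. a date
CREATE CLUSTERED INDEX IX_Orders_OrderDate ON Orders (OrderDate)

-- Nonclustered index for "find this one specific value" lookups
CREATE NONCLUSTERED INDEX IX_Orders_WidgetName ON Orders (WidgetName)

-- Range query that benefits from the clustered index
SELECT * FROM Orders WHERE OrderDate BETWEEN '20050101' AND '20050131'

-- Point lookup that the nonclustered index handles well
SELECT * FROM Orders WHERE WidgetName = 'widget1'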

HTH

Rob
 
Indexes in general can be imagined as a very tall stack of hierarchical GREATER-THAN/LESS-THAN branches. To find a particular node, you start at the top and do a GREATER/LESS plunge down the tree.

What happens if the tree is originally built in perfect alphabetical order? Then every node has a GREATER branch and an empty LESSER branch. To find anything, you'll need to scan the entire stack!

Clustered indexes attempt to address this problem, not only by frequent rebalancing, but also because each node contains dozens of downward pointers, so each "read" grabs dozens of branches. The key to clustered indexes is that each node is kept balanced and sorted... even if the database needs to stop for a second and rebuild an entire branch to keep the index balanced.

With clustered indexes, statistical analysis suggests that a single record, in a database of a billion records, can usually be found in SEVEN or fewer reads.
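
(To put rough numbers behind that "seven reads" figure, as my own back-of-envelope rather than the poster's: finding one row costs roughly one read per level of the tree, about ceiling(log-base-f of N), where f is the number of keys that fit on one index page and N is the row count. Even with a modest f of 20 keys per page, log-base-20 of 1,000,000,000 is about 7; with the hundreds of keys that usually fit on an 8 KB page, the depth drops to 4 or 5, so seven reads is a comfortable upper bound.)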

Clustered indexes are harder for the database to build and harder to keep maintained, but far superior in speed of access.

Well, that's my 2 cents worth. (smile)
 
So if I always use the columns below in all the queries (line items for a report):

Where CenterID = @CenterID --> numeric
AND RegionID = @RegionID --> numeric
AND CorpID = @CorpID --> numeric
AND CenterList = @CenterList --> varchar


Is it safe to say that creating a clustered index on these 4 columns would be the best way to go?

thanks,
 
They are numeric datatypes that are always whole numbers.
 
Convert them to int and you will save some storage space = more rows per page = less IO = better performance = happy customers ;-)

And if you know that, for example, RegionID will always be less than 255, use tinyint (1 byte of storage); you will save 3 bytes per row just by doing that.
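
Something along these lines, assuming the table is called Centers (the name is made up) and after checking that the existing values are whole numbers within range:

-- Convert the numeric columns to integer types
ALTER TABLE Centers ALTER COLUMN CenterID int NOT NULL
ALTER TABLE Centers ALTER COLUMN CorpID int NOT NULL
-- RegionID never exceeds 255, so a 1-byte tinyint is enough
ALTER TABLE Centers ALTER COLUMN RegionID tinyint NOT NULL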



Denis The SQL Menace
 
Thanks Denis,

How about creating the clustered index for them? Is it a good idea? Will it be the way to go, or should I do a non-clustered index instead?
 
A table can only have one clustered index; however, you can include multiple columns in a clustered index.

That said, I would try both ways and see which gives the best performance. Also try changing the order of the columns in the index to see which order performs best.

And one more thing (are you ready?)


The maximum number of bytes in any key cannot exceed 900 in SQL Server 2000. You can define a key using variable-length columns whose maximum sizes add up to more than 900, provided no row is ever inserted with more than 900 bytes of data in those columns.
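
As a starting point for that experiment (the Centers table name is hypothetical; put the most selective column first and compare the query plans for both layouts):

-- Option 1: cluster on the three integer columns, index the varchar separately
CREATE CLUSTERED INDEX IX_Centers_Corp_Region_Center ON Centers (CorpID, RegionID, CenterID)
CREATE NONCLUSTERED INDEX IX_Centers_CenterList ON Centers (CenterList)

-- Option 2: one nonclustered composite index on all four columns
-- (keep the combined key size under the 900-byte limit)
CREATE NONCLUSTERED INDEX IX_Centers_Report ON Centers (CorpID, RegionID, CenterID, CenterList)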


Denis The SQL Menace
 