Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations TouchToneTommy on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Opinion On Identity Columns as the Key

Status
Not open for further replies.

shaunk

Programmer
Aug 20, 2001
402
AU
The table I have looks like this :
Code:
CAxF_Client_Id
CAxF_Referral_Id
CAxF_Creation_Date
CAxF_Collecting_Person
CAxF_Collecting_Unit

The data in the table, as it relates to the business requirement, does not uniquely identify the row. Nor does any combination of data identify the row. I could possibly use Creation_Date as part of a composite key on Client and Referral, but I am linking to foreign keys in other tables that don't need to carry this data. I am therefore forced to create an artificial key.
The choice as I see it is an Identity column or a field of type 'uniqueidentifier'.
I have been scared off using an Identity column as the primary key for a host of reasons, none of which seemed relevant in this situation.
Does the Guid type always advance in value i.e the method used to calculate it will always produce a value that is an increment on any previous value ? If it does, I can use this as the key or are there a whole host of other reasons why GUID should not be used.

 
By default guid has no "auto increment" feature. You have to make one by yourself. And one frequently used method for auto-generation of guid values - NEWID() - produces random sequence.

------
"There's a man... He's bald and wears a short-sleeved shirt, and somehow he's very important to me. I think his name is Homer."
(Jack O'Neill, Stargate)
[banghead]
 
Random Sequence ? That definitely places the spanner where I don't want it !
I intended to key the table on Newid() and later retrieve the latest Assessment for the Client by using a max(Key)
operation.
e.g
Code:
select max CAxF_Key as Max_Assessment_Id 
From Caxf
Group by Client

Back to using an Identity column !

 
create a identity column on the table. That guarantees uniqueness/auto increment for each record, which is what you are looking for I think.

Code:
alter table Test add RowID int identity

Regards,
AA
 
I almost always use an identity column in every table except lookup tables. If I have a natural key, I may create a unique index for it, but joins on integers are faster. Also there are some people who insist on using bizarre schemes for creating new ID numbers but those almost always slow down the insert and if improperly done create a data integrity problem. Unless you are inserting too many records to use an identity field or are separting the data over multiple servers, I see no valid reason not to use them.

AA, I don't believe you can insert an identity aftwards using alter table. You need to create a new table with the identity field, then move the data, then drop the original table, then rename the new one. Or use Enterprise Manager to add the identity field, but it will follow the steps I sjust described. Becasue of this it is better to always design with an identy field before records are added. Better to drop one you don' need than try to add one to a table with many existing records.

Questions about posting. See faq183-874
Click here to learn Ways to help with Tsunami Relief
 
I have to disagree with you SQLSister on that.

Sql Server allows you to add an identity column to a table that already has data.

Code:
create table Test (EmpNo int not null, 
		   LastName varchar(10) null,
		   FirstName varchar(10) null)

insert into Test Values (1, 'Doe', 'John')

insert into Test Values (71, 'Ricky', 'James')

select * from Test

alter table Test add RowID int identity

select * from Test

Regards,
AA

 
Given my experiences with users, I never assign "natural" values as keys. I've been burned too many times by situations like where they changed printers, and their invoice numbers now start at 000001 (again).

I always use identity columns now.

Chip H.


____________________________________________________________________
If you want to get the best response to a question, please read FAQ222-2244 first
 
Use Identity (auto-increment) column, yes.

But in my dept, the developers tend to also make the ID col the PK and unique + clustered. This is just an old habit I think.

The unique attribute is not necessary because SQL Server will insure the numbers are unique. And to assign it the clustered attribute means you cannot cluster any other index - because a table may have only 1 clustered index.

Others may disagree with me on this.
John
 
shaunk, i'm surprised that you know enough to even ask the question about surrogate versus natural keys, and yet at the same time not know enough to avoid the dreaded MAX(ID) mistake :)

"latest" is best implented using a DATETIME column

also, be careful of just assigning identity keys out of habit

natural keys do change, yes, and that's what ON UPDATE CASCADE is for

that said, composite foreign keys can be a bit cumbersome, so i'm not totally against surrogate keys, and identity columns make real swell ones

r937.com | rudy.ca
 
I'd second chiph's opinion on natural keys. What works today as a natural key (or composite key) often breaks down when the next change request arrives. Surrogate keys separate the plumbing (how the tables are linked together) from the data.
 
r937 said:
also, be careful of just assigning identity keys out of habit
IMHO identity PK is as closer to record number feature (ol' dBase RECNO()/LASTREC()) as it takes so...They also don't guard against logically duplicated data, and people often forget about that. Guess why most of sites (including TT) are vulnerable to re-POSTs?

------
"There's a man... He's bald and wears a short-sleeved shirt, and somehow he's very important to me. I think his name is Homer."
(Jack O'Neill, Stargate)
[banghead]
 
R937 - Light hearted aspect noted in your comment. I have found an excellent article on Natural vs Surrogate keys at this link:
To complete the picture (would have been a good idea earlier I hear you say - but it's lengthy)
1. Uniqueness from the business perspective is guaranteed by a combination of Client-Id and a DateTime field of type
'datetime' i.e accuracy is needed down to the milliseconds. This excludes smalldatetime.
1. The above table will serve as the parent table of many child tables, some of which are in one to one, some one to many. The Surrogate identity PK is still looking good. Just pick a way to determine the next surrogate key value for the child tables.
2. This is a form processing system. The data arrives on a paper form, prepopulated with Client information, fed thru a scanner, from which accompanying software loads it into the D/B. One or several pages of the form are associated with a particular table. I have no control over the load software other than to configure which form field loads what D/B field in what table. I cannot guarantee that forms arrive complete or will be processed complete. In a business sense, the parent has to be linked with all the children and inserted in one go. The Surrogate identity PK is no longer looking good as I will be faced with the real possibility of missing surrogate_Id rows in the child tables...and the possibility of child tables being linked to the wrong parent record. I think I can guarantee that a violation of the business rile will occur.

So...back to an intelligent composite Key of Client_id and Creation_Date which will be populated in all tables. But wait, Creation_Date printed on the form will be in YYMMDD with no time component and therefore not guarantee uniqueness down to the millisecond.
As I see it now, I am left with two options:
1) Construct a bizarre surrogate composite PK that will be a string consisting of the Converted Client_Id and Datetime with the necessary precision. Print that on the appropriate form pages and populate all tables with it.
2) Adopt a natural composite PK model, of Client_Id and the converted datetime string with the necessary precision. This produces a mixed composite key of char and numeric datatypes. The fields will also both come from the form.

I'm leaning towards the bizarre surrogate PK as only one column will have to be added to the child tables.
Thanks
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top