×
INTELLIGENT WORK FORUMS
FOR COMPUTER PROFESSIONALS

Log In

Come Join Us!

Are you a
Computer / IT professional?
Join Tek-Tips Forums!
  • Talk With Other Members
  • Be Notified Of Responses
    To Your Posts
  • Keyword Search
  • One-Click Access To Your
    Favorite Forums
  • Automated Signatures
    On Your Posts
  • Best Of All, It's Free!
  • Students Click Here

*Tek-Tips's functionality depends on members receiving e-mail. By joining you are opting in to receive e-mail.

Posting Guidelines

Promoting, selling, recruiting, coursework and thesis posting is forbidden.

Students Click Here

Cross table insert duplicate key issue

Cross table insert duplicate key issue

Cross table insert duplicate key issue

(OP)
Hi,

I have 2 tables, wce_contact and PC_WiredOrder.

I want to copy data from PC_WiredOrder into wce_contact, that is not difficult. The issue comes when the PC_WiredOrder table has duplicates in it, it then conflicts with the unique primary key in the wce_contact table. That is because i'm using the customerid from PC_WiredOrderand for the uniqueid in wce_contact. So the duplicate rows customerid '333' in the example below, can't insert into the uniqueid of wce_contact table it attempts the first instance then the second and conflicts.

Here is the error:

Msg 2627, Level 14, State 1, Line 1
Violation of PRIMARY KEY constraint 'PK__wce_cont__AA522ACA7F60ED59'. Cannot insert duplicate key in object 'dbo.wce_contact'.
The statement has been terminated.

Below is some script to create the tables, insert some sample data and the attempt to insert script. This replicates the issue.

If anyone can help me to get round this I would be grateful, I have tried loads of options and can't get it! Thanks for your help

CODE

CREATE DATABASE temp3

GO

USE temp3

CREATE TABLE wce_contact
(
uniqueid VARCHAR(50) NOT NULL PRIMARY KEY,
emailaddress VARCHAR(150) NOT NULL,
)
 
INSERT INTO wce_contact VALUES (1, '1@1.com'), (2, '2@2.com');

CREATE TABLE PC_WiredOrder
(
id VARCHAR(50) NOT NULL PRIMARY KEY,
customerid VARCHAR(150) NOT NULL,
billingemail VARCHAR(150)  NULL,
)
 
INSERT INTO PC_WiredOrder VALUES ('a', 333, '3@3.com'), ('b', 333, '3@3.com'), ('c', 444, '4@4.com'); 

Script causing the issue.

CODE

INSERT INTO wce_contact
  (UNIQUEID, EMAILADDRESS)
SELECT t1.customerid,
       t1.billingemail
  FROM PC_WiredOrder t1
 WHERE NOT EXISTS(SELECT EMAILADDRESS
                    FROM wce_contact t2
                   WHERE t2.EMAILADDRESS = t1.BillingEmail) 

RE: Cross table insert duplicate key issue


CODE

INSERT INTO wce_contact
  (UNIQUEID, EMAILADDRESS)
SELECT distinct
       t1.customerid
     , t1.billingemail
  FROM PC_WiredOrder t1
 WHERE NOT EXISTS(SELECT EMAILADDRESS
                    FROM wce_contact t2
                   WHERE t2.EMAILADDRESS = t1.BillingEmail) 

Regards

Frederico Fonseca
SysSoft Integrated Ltd
www.syssoft-int.com

FAQ219-2884: How Do I Get Great Answers To my Tek-Tips Questions?
FAQ181-2886: How can I maximize my chances of getting an answer?

RE: Cross table insert duplicate key issue

(OP)
Thank you fredericofonseca, does Distinct not look at all columns in the row to make sure each value is distinct? I need to add another 10 columns into the query that are not sample code.

Thanks again.

RE: Cross table insert duplicate key issue

It's the continuation of thread183-1790851: SQL Merge Into Help, right?

Well, as said you should group by customerid. In this example using a numeric id for a varchar(50) field is questionable, you don't get far when relying on implicit type conversions.

You can stay with MERGE, your Source just has to be a GROUP BY query.

CODE

MERGE INTO wce_contact AS Target
using (select customerid, min(field2), min(field3),... from pc_wiredorder group by customerid where addedtowired = '0') AS Source ON Target.emailaddress = Source.BillingEmail
WHEN NOT MATCHED THEN
    INSERT ([uniqueid], [createuser], [edituser], [recordmanager], [createtime], [edittime], [COMPANY], [contact], [FirstName], [lastname], [emailaddress], [Phone], [Address1], [Address2], [Address3], [City], 
    [County], [Postalcode], [Customer_Type], [customerpipeline], [account_manager], [Referred_By], record_type, OP_Gone_Ahead) 
    VALUES (source.[customerid], 'ykn3t293i7ckyat7', 'ykn3t293i7ckyat7', 'ykn3t293i7ckyat7', getdate(), getdate(), source.[FirstName]+' '+ source.[Surname], source.[FirstName]+ ' ' + source.[Surname], source.[FirstName], source.[Surname], source.[BillingEmail], source.[Phone], 
    source.[AddressLine1], source.[AddressLine2], source.[AddressLine3], source.[TownCity], source.[County], source.[Postcode], 
    source.[CustomerType], source.[CustomerPipeline], source.[AccountManager], 'Web Integration', 'Individual', 'Confirmed'); 

The usage of MIN() for all other fields aside of customerid should be ok, as all those fields should have the same value for the same customerid. GROUP BY still needs an aggregation of all fields aside of customerid. MAX() would work too, and stating all fields from a customer would also work and not create smaller groups. If you list all of them DISTINCT would also work, just not with *. Anytime you use * outside of simple single table you must ask yourself, if that's a sane choice, here it's not, as you don't use all fields.

All this points out your database accepts orders with customerid that don't exist. The normal order must be creating customers first and then orders. So I assume you DO have these customer records too.

So pull the data from there:

CODE

MERGE INTO wce_contact AS Target
using (select * from customers where customerid in (Select distinct customerid from pc_wiredorder where addedtowired = '0')) AS Source ON Target.emailaddress = Source.BillingEmail
WHEN NOT MATCHED THEN
    INSERT ([uniqueid], [createuser], [edituser], [recordmanager], [createtime], [edittime], [COMPANY], [contact], [FirstName], [lastname], [emailaddress], [Phone], [Address1], [Address2], [Address3], [City], 
    [County], [Postalcode], [Customer_Type], [customerpipeline], [account_manager], [Referred_By], record_type, OP_Gone_Ahead) 
    VALUES (source.[customerid], 'ykn3t293i7ckyat7', 'ykn3t293i7ckyat7', 'ykn3t293i7ckyat7', getdate(), getdate(), source.[FirstName]+' '+ source.[Surname], source.[FirstName]+ ' ' + source.[Surname], source.[FirstName], source.[Surname], source.[BillingEmail], source.[Phone], 
    source.[AddressLine1], source.[AddressLine2], source.[AddressLine3], source.[TownCity], source.[County], source.[Postcode], 
    source.[CustomerType], source.[CustomerPipeline], source.[AccountManager], 'Web Integration', 'Individual', 'Confirmed'); 

Of course, that relies on all the fields you need for the merge target to be in a customers table.

Overall it's questionable you have both a foreign key and copy over data. I can guess this is because customer data can change over time and you want the data as it was when ordering. There are other ways to do that and still only needing the customerid and datetime of order to lookup the customerdata as it was. Then storing customers in a temporal table allowing queries like SELECT * FROM customers where customerid=X AS OF orderdatetime.

Temporal tables were introduced in 2016 and are ideal to prevent such redundant copying and spreading of record information across all database just for that historic approach, the history of data is kept at the single place it happens, your customer table extends with updates and keeps the older state, too, you can easily select just the current states/records a normal table would have but also can get older states, AS OF is just on possibility.

Even if I guess wrong about this in detail, the fact you use a foreign key AND detail data shows me, you need some more training and consultation to fix design flaws of your database.

Bye, Olaf.

Olaf Doschke Software Engineering
https://www.doschke.name

RE: Cross table insert duplicate key issue

(OP)
Thank you Olaf, that was a great explanation. I understand all you said there it makes good sense. The numeric id for a varchar(50) was just in the test table in the live system this field in numeric, not sure why I didn't correct this in the test example above.


I will run some tests and this has really freshened my knowledge it's been a while since I worked with advanced SQL.


Thanks again.

RE: Cross table insert duplicate key issue

(OP)
Could I ask, do you rate this method of selecting the distinct records? I was playing with it but will use the method using the merge but curious to get your opinion. Thanks.


CODE

SELECT customerid, billingemail, rk
  FROM (
     SELECT
       customerid, kk.billingemail,
          ROW_NUMBER() OVER (PARTITION BY kk.billingemail
                        ORDER BY kk.customerid DESC) AS rk
     FROM PC_WiredOrder kk
     WHERE  kk.billingemail IS NOT NULL) t
  WHERE rk = 1 

RE: Cross table insert duplicate key issue

Quote:


Thank you fredericofonseca, does Distinct not look at all columns in the row to make sure each value is distinct? I need to add another 10 columns into the query that are not sample code.

Thanks again.
you got a solution for the question you posted - next time put the full code, not just a snippet.

group by or row_number method are the other options - row_number normally faster

Regards

Frederico Fonseca
SysSoft Integrated Ltd
www.syssoft-int.com

FAQ219-2884: How Do I Get Great Answers To my Tek-Tips Questions?
FAQ181-2886: How can I maximize my chances of getting an answer?

RE: Cross table insert duplicate key issue

Indeed picking out the first row after ROW_NUMBERing will help to cope without all the MIN()/MAX() or other aggregations. With just a few field I'd be pragmatic and use DISTINCT, but I also can see why picking the rk=1 record could be fastest, there is only the need to find a first row. Since you have multiple rows per customerid, thar will just not really give an order making sense, i.e. the record picked per kk.customerid is random. I don't know if Fredericos assertion this is faster would hold true.

I'd still prefer to only take the customerid out and join in the data you need via this ID, especially if all records of PC_WiredOrder kk with same customerid have the same values in billingemail, the SQL engine can't knnow this and will need to work out MIN()/MAX() distinct or any row numbering. It's easier to pick out the distinct customer ids and join data from where this is the primary key. You have no way to tell SQL Server to pick any row of PC_WiredOrder and are sure any record with same customerid also has same billingemail, so it doesn't help that you only query from one source table and avoid a join.

Indeed PC_WiredOrder looks like joined data and you should hook into whatever process generates it and merge new mailadresses from there instead of doing this as aftermath. You see, you first cause multiple rows in PC_WiredOrder and now want to condense them, you have a stage in the process where this already was condensed, was only single rows per customerid.

Bye, Olaf.

Olaf Doschke Software Engineering
https://www.doschke.name

Red Flag This Post

Please let us know here why this post is inappropriate. Reasons such as off-topic, duplicates, flames, illegal, vulgar, or students posting their homework.

Red Flag Submitted

Thank you for helping keep Tek-Tips Forums free from inappropriate posts.
The Tek-Tips staff will check this out and take appropriate action.

Reply To This Thread

Posting in the Tek-Tips forums is a member-only feature.

Click Here to join Tek-Tips and talk with other members! Already a Member? Login

Close Box

Join Tek-Tips® Today!

Join your peers on the Internet's largest technical computer professional community.
It's easy to join and it's free.

Here's Why Members Love Tek-Tips Forums:

Register now while it's still free!

Already a member? Close this window and log in.

Join Us             Close