Tek-Tips is the largest IT community on the Internet today!


INSERT .csv formatted records into SQL w/o dupes

Status
Not open for further replies.

bogjacket

Programmer
Jan 10, 2005
19
US
Greetings!

I'm basically a SQL newbie.

I have a need to take a comma-separated file and add those records to an existing table. The existing table has different field names and far more fields than the csv file. One of the fields I'm inserting is the primary key and I'll need to overwrite any duplicate records with the new copy.

Unfortunately, I'll need to do this on a regular basis. So, is there a way to put something together to automate these tasks?

I believe we are using SQL 2000 and I have Enterprise Manager to interface with the DB. I have dabbled with DTS a little, but it's still pretty foreign to me.

I hope this is enough info. Any help is greatly appreciated.
 
You can automate this: either schedule a DTS package, or use BCP/BULK INSERT and make it a job.
What I would do is create a staging table, import into that table, and then move only the records that you need into the production table.
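A rough sketch of that staging approach, with made-up table and column names (your own schema and file layout will differ):

```sql
-- Illustrative staging pattern; every name here is a placeholder.
CREATE TABLE StagingLoans (loan_id varchar(20), first_name varchar(50))

-- 1) Load the raw file into the throwaway staging table
BULK INSERT StagingLoans
FROM 'c:\LoanRecords.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n')

-- 2) Move only the rows you need into the production table
INSERT INTO ProductionLoans (loan_id, first_name)
SELECT DISTINCT loan_id, first_name
FROM StagingLoans
```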



Denis The SQL Menace
 
Thanks SQLDenis,

Would you please explain a little about what BCP/BULK INSERT means?
 
The best way to learn about it is to look it up in BOL (Books Online).

The bcp utility copies data between an instance of Microsoft® SQL Server™ 2000 and a data file in a user-specified format.

BULK INSERT copies a data file into a database table or view in a user-specified format.
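As a sketch: bcp is a command-line utility rather than a T-SQL statement, so one way to run it from Query Analyzer is through xp_cmdshell. The server name, table, and file path below are placeholders, not anything from this thread:

```sql
-- Export a table to a comma-delimited file by calling bcp via xp_cmdshell.
-- MYSERVER, mvdata.dbo.Table1, and the file path are made-up examples.
-- -T = trusted connection, -c = character mode, -t/-r = field/row delimiters.
EXEC master..xp_cmdshell 'bcp mvdata.dbo.Table1 out c:\LoanRecords.csv -S MYSERVER -T -c -t, -r\n'
```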



 
Thanks again, Denis the SQL Menace!

Could you take a look at my BULK INSERT statement and help me with the syntax?

BULK INSERT sql.mvdata.Table1
FROM 'c:\LoanRecords.csv'
WITH
(
FIELDTERMINATOR = ' , ',
ROWTERMINATOR = ' \n '
)

It told me that my syntax was incorrect near ',',
and it also said that there may be a problem with access.

Thanks again.
 
save this in a file named 'c:\LoanRecords.csv'
1,2
1,3
4,5
6,7

then

create table testBulkInsert (id int,id2 int)
go
BULK INSERT testBulkInsert
FROM 'c:\LoanRecords.csv'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'

)

go
select * from testBulkInsert
go
drop table testBulkInsert
go

You had some spaces around the delimiters.

 
Yes, it looks like I'll need to look into my permissions and make sure that the text file is in the right spot, because the csv file is on my machine, while the server is not.

Also, if I just copy and paste the stuff you gave me into Query Analyzer, where will it create the object testBulkInsert by default?
 
Once you get the bulk insert accomplished, there is still more work to do. You say that you have records already existing that will be replaced with the new data. You do this with update statements. Then you need to run an insert statement to add any new records. Will you also be deleting records that previously were there but which are not in the new file? If yes, then you need to run a delete. Don't forget to consider any related tables. Are there other records in other tables which need to be created when a new record is added to the table you are inserting records into?
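The update-then-insert pattern described above might look roughly like this; the table and column names are invented for illustration:

```sql
-- 1) Overwrite rows that already exist with the staged copy
UPDATE p
SET    p.first_name = s.first_name
FROM   ProductionLoans p
JOIN   StagingLoans s ON s.loan_id = p.loan_id

-- 2) Insert rows that are new
INSERT INTO ProductionLoans (loan_id, first_name)
SELECT s.loan_id, s.first_name
FROM   StagingLoans s
WHERE  NOT EXISTS (SELECT 1 FROM ProductionLoans p WHERE p.loan_id = s.loan_id)
```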

Other considerations are the other fields that exist that are not in your feed. Do you need to supply the values for these when inserting new records? Do any of the values change (like a last-updated field) when an update is done? Are any of the fields for which you don't have data required? If so, how will you determine what value goes in that field?

Also, will you need to do any cleanup to make the data from the feed match the business rules or formatting of the fields in the table you are importing into? For instance, do they have non-dates in the date fields which need to be scrubbed before trying to send that data to a table which is using a datetime type for that field? Do they have names in one field where you have names separated out to lastname, firstname, and all sorts of issues like that? Are there nulls in the data in the work table for fields that are required?

Importing data into an existing table is a very complex and technical thing. You need to be very sure that you know what you are doing or you can royally mess up the database. Since you are new at this, make sure you do this first on a development database before moving the data to production. Never even consider attempting this for the first time on a database that does not have a recent backup!

If the import involves lots of records and will take a lot of processing time, or will update a lot of records that are in use, then you will need to consider scheduling it during off hours. And you should personally babysit it the first time it runs, or you may get an out-of-control process that takes the whole system down.

Questions about posting. See faq183-874
 
Thanks, Denis. I got the BULK INSERT working.

SQLSister,

Thank you so much for all that info. I do need to consider most of those things, and I'd be thrilled if you'd help me write the correct statements. Let's see:

- I will need to use UPDATE. How do I account for dupes?
- No DELETE
- No related tables, yet
- There are more fields in the table than in the feed and all of those can be null, because they will no longer be needed.
- There may be a little formatting, such as splitting a string. But dates and such should be ok, as far as I can tell.
- It doesn't seem like a lot of records to me. It will begin at around 50 and grow by about 20-30 per workday, and each record shouldn't have more than 50 or so fields in them.


Here is what I've used so far:

USE mvdata
BULK INSERT table1
FROM 'f:\commatext.csv'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)

It worked except that it created duplicate records. It also added the field names as a record.

Thanks again!
 
>>It worked except that it created duplicate records. It also added the field names as a record

This is your staging table; now you can use DISTINCT to move the data over.

To skip the first row (the field names), use FIRSTROW = 2. Try:

USE mvdata
BULK INSERT table1
FROM 'f:\commatext.csv'
WITH
(
FIELDTERMINATOR = ',',
FIRSTROW = 2,
ROWTERMINATOR = '\n'
)

 
To avoid dupes and to get the correct data matched up to the correct record, you need something unique about the record to match the work table to the primary table. Do you have that?

If you don't, this can be a real nightmare. I've just spent the last three hours trying to match 255 records with no one unique thing to several tables, each of which has over a million records, and I know for a fact that all 255 records are from our database. I can find all the names for that client, but I have multiple records for those names (after all, more than one person can have the same name and even live at the same address and have the same phone number). Now I'll be doing this faster later on, but this is a new job and I'm not that familiar with the picky little details that are needed to find some of this information out.

To give more specific advice on what to do, I would need to see examples of the data records in the work table and examples of existing records in the table you want to import into.

 
The first field in the feed is the primary key for the destination table.

I think it should be pretty easy from here. Here is what the feed will look like:

"Loan Name","Borr First Name","Borr Last Name","Borr SSN","Subj Prop Address","Subj Prop City","Subj Prop State","Subj Prop Zip","Subj Prop County","Loan Amount"

"0512096","JOE","DOE","111-22-3333","1802 BARNESDALE WAY","Fayetteville","GA","30066",,"$89,800.00"

The first group is the field names and the second is an example record.

"Loan Name" is the primary key and will need to be inserted into the destination field named "loan_id".

"Borr First Name" will go to "f1b1#first_name_buyer" and so on...

Is that enough info to build the INSERT statement, without taking formatting into consideration?

Can you please give me an example of the correct syntax?
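For reference, the kind of INSERT being asked for might be sketched like this. Only loan_id and f1b1#first_name_buyer are real column names from this thread; the staging table, its column names, and the destination table name are assumptions:

```sql
-- Hypothetical mapping from a staging table into the destination table.
-- staging_table, its columns, and dest_table are made-up names.
INSERT INTO dest_table (loan_id, [f1b1#first_name_buyer])
SELECT s.loan_name, s.borr_first_name
FROM   staging_table s
WHERE  NOT EXISTS (SELECT 1 FROM dest_table d WHERE d.loan_id = s.loan_name)
```

The NOT EXISTS clause keeps rows whose key is already in the destination from being inserted twice; existing rows would be refreshed with a separate UPDATE, as discussed earlier in the thread.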
 