×
INTELLIGENT WORK FORUMS
FOR COMPUTER PROFESSIONALS

Log In

Come Join Us!

Are you a
Computer / IT professional?
Join Tek-Tips Forums!
  • Talk With Other Members
  • Be Notified Of Responses
    To Your Posts
  • Keyword Search
  • One-Click Access To Your
    Favorite Forums
  • Automated Signatures
    On Your Posts
  • Best Of All, It's Free!
  • Students Click Here

*Tek-Tips's functionality depends on members receiving e-mail. By joining you are opting in to receive e-mail.

Posting Guidelines

Promoting, selling, recruiting, coursework and thesis posting is forbidden.

Students Click Here

Jobs

Group By Optimize Problem

Group By Optimize Problem

Group By Optimize Problem

(OP)
I am trying to delete duplicate records from a table (tempfile). I populate the file using the following sql statement:

INSERT INTO tempfile
(counter, ssn, encrypt_ssn, field1, field2,...field50)
select  count(encrypt_ssn) as counter
    ,ssn  
    ,encrypt_ssn  
    ,field1
    ,field2   
    ...
    ,field50
from permfile
group by encrypt_ssn

DELETE FROM tempfile
WHERE counter > 1 AND field1 = ""

The ... indicates there are 50 fields total (field3, field4, field5, etc). I have about 100,000 records and it is taking a long time to do. Is there a faster way of doing this? The tempfile is not indexed but the permfile has ssn and field4 as the primary key.

Thanks


Mark


 

RE: Group By Optimize Problem

this is mysql, yes?

perhaps you will get better answers in that forum

in ANSI SQL (the forum you posted in) your GROUP BY clause is illegal

r937.com | rudy.ca
Buy my new book Simply SQL from Amazon

RE: Group By Optimize Problem

(OP)
I am using Sybase 11.9. The group by works it is just slow. Should I try doing a SELECT INTO before doing an INSERT INTO?
 

RE: Group By Optimize Problem

This is somewhat confusing. The general GROUP BY rule is:
If a GROUP BY clause is specified, each column reference in the SELECT list must either identify a grouping column or be the argument of a set function.

You have lots of columns not fulfilling that requirement.

What exactly are you trying to do?
 

RE: Group By Optimize Problem

(OP)
Hi,

I have a table with 50 fields and 100,000 records. I want to get rid of all duplicate records based on the encrypt_ssn field only which is why I do a group by on that field only, the other fields I don't care about. Once I find the records where the counter is greater than 1 I delete all records in which field1 is blank.

I hope that explains it a little better. If I am doing this incorrectly, please tell me and I will redo my sql code.

Thanks,

Mark
 

RE: Group By Optimize Problem

i am concerned about the "other fields i don't care about" remark

 encrypt_ssn other fields
     123       NULL NULL NULL NULL NULL NULL NULL    9
     123       NULL NULL NULL NULL   60 NULL NULL NULL
     123        125  400  815   10   52  187  204   37
     123       NULL NULL  937 NULL NULL NULL NULL NULL

which row would you like to keep?

your way, you could get any one of them
 

r937.com | rudy.ca
Buy my new book Simply SQL from Amazon

RE: Group By Optimize Problem

oops i overlooked the "where field 1 is blank" part

sorry

r937.com | rudy.ca
Buy my new book Simply SQL from Amazon

RE: Group By Optimize Problem

(OP)
If you can see a better way to do the query I am open to suggestions. What I was thinking was something like this:

select count(encrypt_ssn) as counter ,encrypt_ssn  
INTO temp
from permfile
group by encrypt_ssn

DELETE permfile
FROM permfile, temp
WHERE permfile.encrypt_ssn = temp.encrypt_ssn
      AND temp.counter > 1 AND field1 = ""

Thanks,

Mark
 

RE: Group By Optimize Problem

I suppose that would make it, but the syntax is far from ANSI SQL compliant.

The ANSI SQL way looks something like:
INSERT INTO temp (column-list) SELECT count(encrypt_ssn) ...

DELETE FROM permfile WHERE encrypt_ssn IN (SELECT encrypt_ssn FROM permfile, temp WHERE ...)

Red Flag This Post

Please let us know here why this post is inappropriate. Reasons such as off-topic, duplicates, flames, illegal, vulgar, or students posting their homework.

Red Flag Submitted

Thank you for helping keep Tek-Tips Forums free from inappropriate posts.
The Tek-Tips staff will check this out and take appropriate action.

Reply To This Thread

Posting in the Tek-Tips forums is a member-only feature.

Click Here to join Tek-Tips and talk with other members! Already a Member? Login

Close Box

Join Tek-Tips® Today!

Join your peers on the Internet's largest technical computer professional community.
It's easy to join and it's free.

Here's Why Members Love Tek-Tips Forums:

Register now while it's still free!

Already a member? Close this window and log in.

Join Us             Close