Using DISTINCT Properly 1

jw2000 · Aug 23, 2005

I am try to retrieve all fields for non Southwest Pilots from the Pilot table but I only want unique emails. The reason is because there are some pilots with duplicate records.

--no error but I need all fields
select distinct(pilotEmail1)
from Pilot
where pilotEmail1 NOT LIKE '%@southwest.com%'

--returns all records, distinct doesn't work here, using a group by returns an error
select distinct(pilotEmail1), *
from Pilot
where pilotEmail1 NOT LIKE '%@southwest.com%'

How can I return the correct records for pilots with unique Emails here?

jw2000 · Aug 24, 2005

I agree with you that find the problem is key - the problem is pilots with duplicate pilotemail1 can also have null pilotemail2 as well as recordquality = 1 for both. The only difference is between the two (or more) duplicate records is the primary key, pilotId. Since this is the case the record with the smallest pilotId should be returned.

amrita418 · Aug 24, 2005

What does this give you?

Code:

union
select     * 
from       Pilot
where      pilotid in (select  min(pilotid) 
                       from    Pilot 
                       where   pilotemailid1 in (select   pilotemailid1
                                                 from     pilot a
                                                 where    pilotemailid2 is null
                                                          and not exists (select   'x' 
                                                                          from     pilot b 
                                                                          where    a.pilotemailid1 = b.pilotemailid1 
                                                                                   and b.pilotemailid2 is not null)
                                                 group by pilotemailid1
                                                 having   count(*) > 1)
                                and RecordQuality = 1
                       group by pilotemailid1)

Regards,
AA

jw2000 · Aug 24, 2005

select *
from Pilot
where pilotEmail2 in (select pilotEmail2
from Pilot
where politemail1 in (select pilotEmail1
from Pilot
where pilotEmail1 NOT LIKE '%@southwest.com%'
group by politEmail1
having count(*) > 1)
and pilotEmail2 is not null)
union
select *
from Pilot
where pilotEmail1 in (select pilotEmail1
from Pilot
where pilotEmail1 NOT LIKE '%@southwest.com%'
group by politEmail1
having count(*) = 1)
union
select *
from Pilot
where pilotId in (select min(pilotId)
from Pilot
where pilotEmail1 in (select pilotEmail1
from Pilot a
where pilotEmail2 is null
and not exists (select 'x'
from Pilot b
where a.pilotEmail1 = b.pilotEmail1
and b.pilotEmail2 is not null)
group by pilotEmail1
having count(*) > 1)
and RecordQuality = 1
group by pilotEmail1)

returns 13 more records as compared to:

select distinct(pilotEmail1)
from Pilot
where pilotEmail1 NOT LIKE '%@southwest.com%'

I am not sure why, but will look over the records returned. Can you explain how your last union work? What does select 'x' do?

amrita418 · Aug 24, 2005

A little more research should get you the reason for the difference. Good luck with the research.

You can replace the 'x' with NULL or 1 or pilotmail1, all we need to know is that for this pilotmailid1, there is no pilotemailid2 that is not null.
Using 'x' has performance benefit over getting a value from the table. Nothing more than that.

Regards,
AA

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Using DISTINCT Properly 1

jw2000

Programmer

jw2000

Programmer

amrita418

Technical User

jw2000

Programmer

amrita418

Technical User

Similar threads

Part and Inventory Search

Sponsor