Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations bkrike on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Using DISTINCT Properly 1

Status
Not open for further replies.

jw2000

Programmer
Aug 12, 2005
105
US
I am try to retrieve all fields for non Southwest Pilots from the Pilot table but I only want unique emails. The reason is because there are some pilots with duplicate records.

--no error but I need all fields
select distinct(pilotEmail1)
from Pilot
where pilotEmail1 NOT LIKE '%@southwest.com%'

--returns all records, distinct doesn't work here, using a group by returns an error
select distinct(pilotEmail1), *
from Pilot
where pilotEmail1 NOT LIKE '%@southwest.com%'


How can I return the correct records for pilots with unique Emails here?

 
I agree with you that find the problem is key - the problem is pilots with duplicate pilotemail1 can also have null pilotemail2 as well as recordquality = 1 for both. The only difference is between the two (or more) duplicate records is the primary key, pilotId. Since this is the case the record with the smallest pilotId should be returned.

 
What does this give you?
Code:
union
select     * 
from       Pilot
where      pilotid in (select  min(pilotid) 
                       from    Pilot 
                       where   pilotemailid1 in (select   pilotemailid1
                                                 from     pilot a
                                                 where    pilotemailid2 is null
                                                          and not exists (select   'x' 
                                                                          from     pilot b 
                                                                          where    a.pilotemailid1 = b.pilotemailid1 
                                                                                   and b.pilotemailid2 is not null)
                                                 group by pilotemailid1
                                                 having   count(*) > 1)
                                and RecordQuality = 1
                       group by pilotemailid1)

Regards,
AA
 
select *
from Pilot
where pilotEmail2 in (select pilotEmail2
from Pilot
where politemail1 in (select pilotEmail1
from Pilot
where pilotEmail1 NOT LIKE '%@southwest.com%'
group by politEmail1
having count(*) > 1)
and pilotEmail2 is not null)
union
select *
from Pilot
where pilotEmail1 in (select pilotEmail1
from Pilot
where pilotEmail1 NOT LIKE '%@southwest.com%'
group by politEmail1
having count(*) = 1)
union
select *
from Pilot
where pilotId in (select min(pilotId)
from Pilot
where pilotEmail1 in (select pilotEmail1
from Pilot a
where pilotEmail2 is null
and not exists (select 'x'
from Pilot b
where a.pilotEmail1 = b.pilotEmail1
and b.pilotEmail2 is not null)
group by pilotEmail1
having count(*) > 1)
and RecordQuality = 1
group by pilotEmail1)


returns 13 more records as compared to:

select distinct(pilotEmail1)
from Pilot
where pilotEmail1 NOT LIKE '%@southwest.com%'

I am not sure why, but will look over the records returned. Can you explain how your last union work? What does select 'x' do?
 
A little more research should get you the reason for the difference. Good luck with the research.

You can replace the 'x' with NULL or 1 or pilotmail1, all we need to know is that for this pilotmailid1, there is no pilotemailid2 that is not null.
Using 'x' has performance benefit over getting a value from the table. Nothing more than that.

Regards,
AA
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top