×
INTELLIGENT WORK FORUMS
FOR COMPUTER PROFESSIONALS

Log In

Come Join Us!

Are you a
Computer / IT professional?
Join Tek-Tips Forums!
  • Talk With Other Members
  • Be Notified Of Responses
    To Your Posts
  • Keyword Search
  • One-Click Access To Your
    Favorite Forums
  • Automated Signatures
    On Your Posts
  • Best Of All, It's Free!
  • Students Click Here

*Tek-Tips's functionality depends on members receiving e-mail. By joining you are opting in to receive e-mail.

Posting Guidelines

Promoting, selling, recruiting, coursework and thesis posting is forbidden.

Students Click Here

select distinct each record

select distinct each record

select distinct each record

(OP)
Hi guys,
i have two tables would be joined using ref_n and because ref_n the same, if i joined using ref_n, it would create cartessian join.
How would I avoid it by having one auth person each records like example below.

CODE

table1
post_id ref_n  post_d
1        50    2018-10-01
2        50    2018-10-02

table2
auth_id ref_n auth_person
3        50   Nina 
4        50   Bill

If I join table1 and table2 with ref_n the result would be below


post_id ref_n  post_d        auth_person
1        50    2018-10-01    Nina
1        50    2018-10-01    Bill
2        50    2018-10-02    Nina
2        50    2018-10-02    Bill

I would like to have result like below, one auth person each for record

post_id ref_n  post_d        auth_person
1        50    2018-10-01    Nina
2        50    2018-10-02    Bill 

RE: select distinct each record

Your outcome right now is this:
post_id ref_n  post_d        auth_person
1        50    2018-10-01    Nina
1        50    2018-10-01    Bill
2        50    2018-10-02    Nina
2        50    2018-10-02    Bill
 
But you want it to be:
post_id ref_n  post_d        auth_person
1        50    2018-10-01    Nina
2        50    2018-10-02    Bill 
 
So what is the rule of getting the BLUE dates and not the RED dates in your outcome? ponder


---- Andy

There is a great need for a sarcasm font.

RE: select distinct each record

This is a case of bad data design and you don't make it manageable creating a join matching lowest post_id with lowest auth_id etc.

There is no such join type. So where does this data originally come from, I assume you do have a per row ID, that you simply dropped out of your lists, but that would now help make the right matching condition.

Technically you can join to orderable lists by ordering them and giving each intermediate result (subquery) a row_number to match with the "corresponding" row of the other list. But that's nonsensical in terms of database relationships and even if that solves this and similar cases, this would be a bad way of handling it.

Bye, Olaf.





Olaf Doschke Software Engineering
https://www.doschke.name

RE: select distinct each record

(OP)
Hi Andrzejek,

I know this is not the great data design but that is the data that we deal with when we source from external party.

what we are trying to do is to avoid cartessian join and dedup them.

the result would be either one of them below by joining from ref_n

CODE

post_id ref_n  post_d        auth_person
1        50    2018-10-01    Nina
2        50    2018-10-02    Bill 

post_id ref_n  post_d        auth_person
1        50    2018-10-01    Bill
2        50    2018-10-02    Nina 

The question is more like how do we dedup them from cartessian join output?

RE: select distinct each record

What I'd like to see is further data, a third table could make a normal join possible.

What you're doing is guessing which record belongs to which, ref_n isn't allowing that. I'd look for tables with post_id and auth_id in them, that would be the missing link.

CODE

SELECT COLUMN_NAME,TABLE_NAME
FROM        INFORMATION_SCHEMA.COLUMNS
WHERE       COLUMN_NAME LIKE '%post_id%' OR COLUMN_NAME LIKE '%auth_id%' OR COLUMN_NAME LIKE '%ref_n%'
ORDER BY    TableName,ColumnName; 

Does this show some more tables?

Bye, Olaf.

Olaf Doschke Software Engineering
https://www.doschke.name

RE: select distinct each record

Both Nina and Bill have 2 records with dates of 2018-10-01 and 2018-10-02
You want your outcome to be for Nina 2018-10-01, but for Bill 2018-10-02 WHY?
What determines which date goes with which person?

I would understand if you want both people to have 2018-10-01 (Min date) or 2018-10-02 (Max date), but your example shows no logic (that I can see).

So you still did not answer my question. sad

"the result would be either one of them below by joining from ref_n"
In my opinion - you CANNOT do it.

(Unless somebody could prove me wrong smile )

---- Andy

There is a great need for a sarcasm font.

RE: select distinct each record

(OP)
Hi Andrzejek,

Here I give you more data, if you see the source data, table 2 is actually extension of table 1 but the linkage sometimes are not quite unique.
the reference number is meant to be unique for each bill but sometimes people make mistake by putting same number.

CODE

table1
post_id ref_n  post_d
1        50    2018-10-01
2        50    2018-10-02
3        11    2018-10-03
4        12    2018-10-04

table2
auth_id ref_n auth_person tran_a
3        50   Nina          25
4        50   Bill          25
5        11   Tom           20
6        12   Shan          30 

At the end, we would like to get total transaction with who roughly owns the bill but we don't want double counting the number.
usual query is below

CODE

Select 
 T1.*,
 T2.*
FROM TABLE1 T1
  INNER JOIN TABLE2 T2 ON T1.REF_N = T2.REF_N 

but the query above will create cartessian join for ref_n 50 with 4 records and we would like to dedup it with 1 bill for each Nina and Bill.

RE: select distinct each record

Quote (peac3)

the query above will create cartessian join for ref_n 50 with 4 records and we would like to dedup it with 1 bill for each Nina and Bill.

Again, with which date for Bill and Nina: 2018-10-01 or 2018-10-02?
Or doesn't matter?


---- Andy

There is a great need for a sarcasm font.

RE: select distinct each record

(OP)
doesn't matter

RE: select distinct each record

Something like:

Select T1.post_id, T1.ref_n, MAX(T1.post_d) AS post_d, T2.auth_person
FROM TABLE1 T1
INNER JOIN TABLE2 T2 ON T1.REF_N = T2.REF_N
GROUP BY T1.post_id, T1.ref_n, T2.auth_person


(Not tested)


---- Andy

There is a great need for a sarcasm font.

RE: select distinct each record

(OP)
it doesn't work using either max(post_d) or min(post_d), because it will result the post_d for bill and nina the same.
and when we segregate daily, it would not be reconciled.

your codes result would be like below:

CODE

post_id ref_n  post_d        auth_person tran_a
1        50    2018-10-02    Nina         25
2        50    2018-10-02    Bill         25 

we want one post_d for each which doesn't matter who so the result would be either

CODE

post_id ref_n  post_d        auth_person tran_a
1        50    2018-10-01    Nina         25
2        50    2018-10-02    Bill         25

or

post_id ref_n  post_d        auth_person tran_a
1        50    2018-10-01    Bill         25
2        50    2018-10-02    Nina         25 

RE: select distinct each record

So, what you want to say to your data base is: give me any date for Bill, I don’t care which date you give me, just pick one arbitrary, random date from table1, it does not matter to me. And then, give me any date for Nina as long as it is not the same date as the date you gave me for Bill’s record.

If that’s the case, you can do it – providing you have additional fields in your tables, like a Primary Key field. But the SELECT statement will be so convoluted you will not like it, because you would have to have imbedded Select inside another Select referring to the previous record number asking for the data that does not appear in the previous record, etc. Ugly…

Unless somebody else, smarter than me (which is easy smile ) will jump in with the solution.


---- Andy

There is a great need for a sarcasm font.

RE: select distinct each record

Quote (peac3)

the reference number is meant to be unique for each bill but sometimes people make mistake by putting same number.
Well, there you describe the core problem. It's not impossible to fix this, is it? update one ref_n.

You can easily find double ref_n by

And technically make sure no double ref_n is allowed by making a unique constraint on that field:
A database should take care of the integrity of data, not only but also the referentiaöl integrity. You don't fix wrong data with random pick choices, that works in some examples but then you don't fix a problem, you suppress an error and seep it under a rug, that's not solving it and may introduce wrong deduping in further scenarios, eg when a ref_n is even used three times. I guess it also would be much more important if tran_a would differ.

It would be best to know what ref_n references to, what table1 and2 are, what the whole real-world meaning of this is, but one thing is clear, this case has an easy solution by changing the ref_n and ensuring no double entry in the future, you don't need complicated SQL, you just need fixed data.


Olaf Doschke Software Engineering
https://www.doschke.name

RE: select distinct each record

Andy,

This seems like a common table expression may be the right choice here. Commonly it is used for Hierarchy joins with multiple ancestors where you want the full chain.

My brain always hurts when I have to look up the example but this may be another choice case for one. That said, they are on the "avoid it if you can" list. - I thought you might want to look into it.

Red Flag This Post

Please let us know here why this post is inappropriate. Reasons such as off-topic, duplicates, flames, illegal, vulgar, or students posting their homework.

Red Flag Submitted

Thank you for helping keep Tek-Tips Forums free from inappropriate posts.
The Tek-Tips staff will check this out and take appropriate action.

Reply To This Thread

Posting in the Tek-Tips forums is a member-only feature.

Click Here to join Tek-Tips and talk with other members! Already a Member? Login

Close Box

Join Tek-Tips® Today!

Join your peers on the Internet's largest technical computer professional community.
It's easy to join and it's free.

Here's Why Members Love Tek-Tips Forums:

Register now while it's still free!

Already a member? Close this window and log in.

Join Us             Close