Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations TouchToneTommy on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

join question 1

Status
Not open for further replies.

von2

MIS
Jul 25, 2005
11
US
Hello,

I have two tables, tableA, tableB. tableA has userid, memberid columns with a total count of 300. tableB has userid, address columns. I have a flat file with memberid and address columns with around 6000 records. I need to update tableB's address column (all rows that currently have null address) using tableA and the flatfile. I believe the first step is to create a temp table tableC(from flatfile) in the database. Is it something that can be done with sql? If I have to use pl/sql, could you suggest a good approach? Thanks in advance
 
Von,

Yes, you can load the flat file into a "TableC", but you can accomplish the same result without having to create an actual "TableC"...you can let Oracle treat the flat file as a table. Here is a flat file I created (with far less records than you have) that simulates your situation:
Code:
D:\dhunt\SQLDBA>type addresses.txt
2^123 Main St., Anytown, Anystate, USA
5^45 London Rd., Anytown, AnyCounty, UK
Note that I used the "^" character as a field delimiter.

Here is the code to cause Oracle to treat the flat file as a table:
Code:
SQL> create or replace directory yada as 'd:\dhunt\sqldba';

Directory created.

SQL> create table address (memberid number, address varchar2(100))
organization external
(Type oracle_loader
default directory yada
access parameters (fields terminated by '^' missing field values are null
(memberid, address))
location ('addresses.txt')
)
parallel reject limit 0
/

Table created.

select * from address;

  MEMBERID ADDRESS
---------- -------------------------------------
         2 123 Main St., Anytown, Anystate, USA
         5 45 London Rd., Anytown, AnyCounty, UK

2 rows selected.

Here are contents for TableA:
Code:
select * from TableA;

    USERID   MEMBERID
---------- ----------
       101          1
       102          2
       103          3
       104          4
       105          5
       106          6

6 rows selected.

Here are the contents for TableB:
Code:
select * from TableB;

    USERID ADDRESS
---------- ------------------------
       101 Original Address for 101
       102
       103 Original Address for 103
       104 Original Address for 104
       105 Original Address for 105
       106
       107 Original Address for 107

7 rows selected.
Notice that based upon your specifications, the only row that should be updated is for USERID = 102 its ADDRESS is NULL and has an update address in the flat file.

(Note 1: USERID 105 has an update address, but the original value on its matching TableB row is NOT NULL.

Note 2: USERID 106, although its ADDRESS is NULL, does not have a matching update address in the flat file.)

Here is the UPDATE code:
Code:
update tableB b set address =
(select address
   from TableA a, Address c
  where a.memberid = c.memberid
    and a.userid = b.userid)
where b.address is null
  and exists
(select 'x' 
   from TableA a, Address c
  where a.memberid = c.memberid
    and a.userid = b.userid);

1 row updated.

select * from TableB;

    USERID ADDRESS
---------- ------------------------------------
       101 Original Address for 101
       102 123 Main St., Anytown, Anystate, USA
       103 Original Address for 103
       104 Original Address for 104
       105 Original Address for 105
       106
       107 Original Address for 107

7 rows selected.
BTW, if you choose to create an actual "TableC" from the flat file, the UPDATE statement will be identical to the one above (except tablename "ADDRESS" becomes "TableC".)

Let us know if this resolves your request.


[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
[ Providing low-cost remote Database Admin services]
Click here to join Utah Oracle Users Group on Tek-Tips if you use Oracle in Utah USA.
 
Santa,

Thanks for the fast reply. It took over a minute (70 secs) to update the records. Could you also address the following?

1. What is the advantage of creating a directory and treating flatfile as table instead of creating a table using sql*loader? Can we drop the directory when we are done with the table creation? I am also considering running these statements in a pl/sql block as a job.

2. In the sql solution that you gave could you please tell what's the purpose of "and exists" and the subquery following it?

I verified the following

tableA count = 555
tableB count = 574
Address table count = 6000.

select count(*) from tableA a, tableB b where a.userid=b.userid -- 574

i.e tableB has 22 repititions of userid. I verified that tableB doesn't have a pk or unique constraint on userid.

Out of curiosity, I attempted the following single sql statement but it gave me "single subquery resulting in many rows" error.

select a.firstname, a.lastname, c.address
from tableA a, tableB b, Address c
where a.memberid = c.memberid
and a.userid = b.userid
and b.address is null;

and I believe it was because of the userid repititions

Is that the reason why the "and exists" and the subsequent subquery was used?

3. Is 70 seconds for this update normal?

Thank you again.
 
Von,

Q. 'What is the advantage of creating a directory and treating flatfile as table instead of creating a table using sql*loader?

A. The advantage is simply not needing to run SQL*Loader and not creating an internal table that is redundant to storage-consuming data already on disk.

Q. 'Can we drop the directory when we are done with the table creation?'

A. Yes.

Q. '...what's the purpose of "and exists" and the subquery following...'?

A. The "and exists" and its subquery determine if there is a 3-way match between TableA, TableB, and Address. If there is no match, there is no need to UPDATE the current row. Without this code, the UPDATE (for example) would set (the NULL) ADDRESS for USERID 106 to NULL...which is unnecessary and wasteful.

Q. 'Is 70 seconds for this update normal?'

A. Using an EXTERNAL (flat-file) table exhibits slower performance than a regular table. If this exercise will occur regularly, then you probably should use SQL*Loader to place the rows in a standard table. Besides this technique, ensure that indexes exist on all columns that your WHERE clause references.

Von said:
I attempted the following single sql statement but it gave me "single subquery resulting in many rows" error.

The code you posted has no subquery, so it cannot be the offending code in its current form. Please post the entire query (outer and sub queries) so we can recommend a solution.

In anticipation of the solution, however, we typically use the "IN" comparison operator (instead of "=") for cases where the subquery results might yield more than one row of results.

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
[ Providing low-cost remote Database Admin services]
Click here to join Utah Oracle Users Group on Tek-Tips if you use Oracle in Utah USA.
 
Thanks for all the help. I just wanted to clarify from you that the directory method will also create an internal table right? You are right, the subquery that I ran resulted in multiple rows and so I got that error ... Also, I made a small change to the original query that you gave. I changed the closing bracket and it ran well. Thank you again.

update tableB b set address =
(select address
from TableA a, Address c
where a.memberid = c.memberid
and a.userid = b.userid
where b.address is null
and exists
(select 'x'
from TableA a, Address c
where a.memberid = c.memberid
and a.userid = b.userid));
 
Actually, Von, the EXTERNAL table/directory method does not create an internal table. That is one of the benefits of using the flat-file/EXTERNAL method...it takes up no space in the database.[smile]

I'm glad you got everything working well.

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
[ Providing low-cost remote Database Admin services]
Click here to join Utah Oracle Users Group on Tek-Tips if you use Oracle in Utah USA.
 
Santa,

Appreciate your answer again. Excuse my ignorance, when you say the external table/directory method doesn't create internal table, do you mean it doesn't create an ORACLE internal table? If so, can you detail it little bit? I am using ADDRESS table in the query and it still exists in the schema. Thank you.
 
Yes, Von, your ADDRESS table "appears" in the schema as table...you can DESCRIBE it:
Code:
SQL> desc address
 Name                    Null?    Type
 ----------------------- -------- -------------
 MEMBERID                         NUMBER
 ADDRESS                          VARCHAR2(100)
...you can SELECT from it:
Code:
SQL> select * from address;

  MEMBERID ADDRESS
---------- -------------------------------------
         2 123 Main St., Anytown, Anystate, USA
         5 45 London Rd., Anytown, AnyCounty, UK
...but if you try to find space that it occupies in the database:
Code:
SQL> select count(*) from user_extents where segment_name = 'ADDRESS';

  COUNT(*)
----------
         0
...you see that it consumes no space (i.e. extent count = 0)...and that is my point...it acts like a "real" table, but it occupies no space in the database...the only place that it consumes space is outside the database in a plain-old, notepad-editable flat-file on disk.

Does this answer your question?


[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
[ Providing low-cost remote Database Admin services]
Click here to join Utah Oracle Users Group on Tek-Tips if you use Oracle in Utah USA.
 
Santa,

One last question .. I ran these two queries and inspite of the and exists clause, the second query is updating addresses (with null values) for memberids that do not exist in Address table. I can tell this from "6 rows updated" etc .. message after the queries are run.

Code:
UPDATE tableB b SET address = (SELECT address FROM tableA a, Address c
				WHERE a.memberid = c.memberid
    				AND a.userid = b.userid) WHERE b.address IS NULL;
									   
									   

UPDATE tableB b SET address = (SELECT address FROM tableA a, Address c
					      WHERE a.memberid = c.memberid
    					      AND a.empid = b.empid
  					      AND EXISTS (SELECT 'x' FROM tableA a, Address c
  							  WHERE a.memberid = c.memberid
    							  AND a.userid = b.userid)) WHERE b.address IS NULL;
Both the above queries are updating equal number of records each time. As planned, the second query is not leaving out the memberids that are not found in Address (flatfile). Any idea how to resolve this? Thank you.

 
Yes, Von, we can talk about why your code is updating too many rows: Notice that your code (from a logic perspective) differs significantly from my code (My Code):
Code:
update tableB b set address =
(select address                 -- Begining of correlated sub-query
   from TableA a, Address c
  where a.memberid = c.memberid -- Inner WHERE joins tables
    and a.userid = b.userid)    -- End of correlated sub-query
where b.address is null         -- Outer WHERE which limits rows to update
  and exists
(select 'x' 
   from TableA a, Address c
  where a.memberid = c.memberid
    and a.userid = b.userid);
Your code, however, inappropriately combines the two WHERE clauses, limiting just the subquery rows; there is no limit (to the rows being updated) to just the rows that match all three tables.

Does this answer your question?

[santa]Mufasa
(aka Dave of Sandy, Utah, USA)
[ Providing low-cost remote Database Admin services]
Click here to join Utah Oracle Users Group on Tek-Tips if you use Oracle in Utah USA.
 
Santa,

That's it. This answers all my questions. You are great.

 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top