Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Shaun E on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Compare two tables without primar key

Status
Not open for further replies.

shyamsundar

Programmer
Aug 23, 2002
75
IN
Hi techies,

I am facing the problem on how to compare two tables, while going quality assurance check.

- 10% of documents are re-entered in a seperate table consisting of fields ranging from 30 to 150.

- Tables don't have primary keys and validation criteria is too tedious, as there are differenct combination of data entered/populated in 150 fields.

Now my problem is how to compare the data entered later as second entry with the first actual entry for all the fields.

Any Ideas, I am using ACCESS 2000

Thanx Shyam
cwizshyam@yahoo.com
 
Quality assurance
Quality design
Quality control
Quality improvement

Define primary key columns for your tables. If you ever find a way to match up the re-entered rows with the original rows fill in the foreign key values in the re-entry table.

This will be an improvement that will make quality control a snap.

With so many fields it seems almost certain that you will be able to match up original and re-entered rows. Or at least narrow the possibilities enough to make manual comparisons to the paper documents feasible.

Let's call our tables Intakes and Updates, Intakes has columns 1 through 150 and Updates has columns 30 through 150.

First add new columns named id to both tables, make these autonumber columns. These are now the primary keys to the tables. Add another new column to Updates named intake_id, this is the foreign key that we hope someday to be able to fill in with values from the corresponding rows in Intakes.

Now we try some matching rules until we get a good enough result.

Assume that columns 31, 37, and 49 are good candidates for matching up original with re-entered records.

SELECT Intakes.id,
Intakes.col_31,
Intakes.col_37,
Intakes.col_49,
Updates.id,
Updates.col_31,
Updates.col_37,
Updates.col_49
FROM Intakes, Updates
WHERE
Intakes.col_31 = Updates.col_31
AND Intakes.col_37 = Updates.col_37
AND Intakes.col_49 = Updates.col_49

This is in effect an INNER JOIN using composite keys

If we knew that the combination of these three fields was a unique triple, then this would be a composite key. This is how you choose which columns to try, your judgment that together they might identify a row. If you guess right then the result of this query will be one row for every row in the Updates table and the Intakes.id value will be the value you need as a foreign key in the Updates table. If you are lucky there will be very few repeated rows from the Intakes table. You can tell that they are repeated because the Intakes.id will be repeated.

This is just a theory, it might work but I've never tried it.


Finally, your comment "...validation criteria is too tedious..." is interesting. After cleaning up the data you may want to re-consider, a little tedium today may save you a lot of time cleaning up a mess tomorrow. Ah, but you already knew that and you inherited this problem from someone who no longer works there (we know why) or from your boss (sorry about that).

Good luck with this.
 
hi rac2,

Great suggestions, and thanx very much for the time. Yes u r right..The tables were in excel, and lot of data for the different countries were entered, that has been imported into ACCESS database. I would like to explain you more in detail about this database.

The data entered in the tables are from secondary data (most un-standardized data) collected from different sources (like publications/journals etc). Number of tables are 20, each table is for different subjects,having a minimum of 30 fields and maximum of 150 fields. The first table is having a primary key that is [Source ID] for the source document,which all other tables refers to.

Tables other than table 1 can have more than 1 row with same[source ID], which fetched from table 1 and entry cannot continue without Source ID.

For example :
Table 1
Source ID
S-123
S-124
S-125
S-126

table 2
S-123
S-123
S-125

TABLE 3
S-124
S-124
s-125

From the above table it shows that S-124 is having data in table 3 only, where s-125 is having data both in 2 & 3 and may be vice versa (it must be in table 1). I think I am clear. And I tried to insert autnumbers as a primary key for all the tables other than table 1.

Now in second-entry, since the entry is only 10%, the autonumber is not necessarily same as in first entry. That is where I struck. Great if you can suggest me something on the above.

Thanx

Shyam
cwizshyam@yahoo.com
 
Can you describe the second-entry process a little more, please.

Is the first entry made in Table1, are the rows in Table2 and Table3 all "second-entry"? Or do we have a first, second and third entry for S-123, S-124, and S-125?

Do you expect that when the entries are done correctly that the row for S-123 in Table1 will match exactly the rows for S-123 in Table2? Or at least that some of the common columns will match exactly?

To clarify the point of adding an autonumber column: it is more a matter of basic relational database design than a solution to this problem. Every table needs a primary key, a value that uniquely identifies the rows, this can be meaningful as with Table1.SourceID, or arbitrary as with the autonumber column. Without primary keys for the tables you don't have a relational database, you have buckets full of stuff.

 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top