Pulling Random Records

boblang · Jan 6, 2002

Does anyone know how to pull random records from a Teradata table with SQL?

rohitrevo · Jan 6, 2002

select * from tablename sample n

You can substitute 'n' with the actual number of records you want the sample. Say if you want 5 records and the table name is emp...the sql will be

sel * from emp sample 5

TheTeradataKid · Jan 7, 2002

The sample 5 example will give you the first 5 rows of the table, not the randomness that you are looking for.

dhauer · Jan 7, 2002

You can also get a percentage of the records:
sel * from emp sample .05;

This will return 5% of the records from table emp. (if you have qualified records using where condition(s) the percent of records returned is based on total number of records meeting criteria).

rohitrevo · Jan 7, 2002

Hello TheTeraKid

In Teradata there is a concept of rowid as in Oracle where in we know that the row has been inserted in which sequence. In TD rows are stored randomly on different AMPs and there is no way to know which one was inserted first except if you put in a time stamp in a record. So Sample 5 will not give you first 5 rows of a table.

e.g. A table t2 has d1,d2,d3 integer columns having data as
1 2 3
4 5 6
7 8 9
10 11 12
13 14 15
16 17 18
19 20 21
22 23 24
25 26 27
28 29 30

Now fire statements like

sel * from t2 sample 4;

*** Query completed. 4 rows found. 3 columns returned.
*** Total elapsed time was 1 second.

d1 d2 d3
----------- ----------- -----------
16 17 18
22 23 24
10 11 12
4 5 6

sel * from t2 sample 4;

*** Query completed. 4 rows found. 3 columns returned.
*** Total elapsed time was 1 second.

d1 d2 d3
----------- ----------- -----------
16 17 18
19 20 21
7 8 9
22 23 24

You see different rows are returned even when you fire the same SQL and also keep the Sample as 4.

Also, you may try the new Teradata feature of RANDOM function.

sel * from t2 where random(1,100)<10; (Returns less than 10% of random rows)

*** Query completed. One row found. 3 columns returned.
*** Total elapsed time was 1 second.

d1 d2 d3
----------- ----------- -----------
19 20 21

sel * from t2 where random(1,100)<30; (Returns less than 30% rows)

*** Query completed. 4 rows found. 3 columns returned.
*** Total elapsed time was 1 second.

d1 d2 d3
----------- ----------- -----------
28 29 30
19 20 21
25 26 27
4 5 6

I am trying to find out the exact difference between RANDOM and SAMPLE function in TD.

rohitrevo · Jan 8, 2002

I wrongly gave the explanation of RANDOM fn in earlier example. Actually in the sql stmt below a random no between 1 and 100 is generated by the system and if it is less than 30 that row is returned to the user.

sel * from t2 where random(1,100)<30;

TheTeradataKid · Jan 8, 2002

Thank You for the clarification rohitrevo.

Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

Pulling Random Records

boblang

Programmer

rohitrevo

Technical User

TheTeradataKid

Programmer

dhauer

Programmer

rohitrevo

Technical User

rohitrevo

Technical User

TheTeradataKid

Programmer

Similar threads

Part and Inventory Search

Sponsor