Hello TheTeraKid
In Teradata there is a concept of rowid as in Oracle where in we know that the row has been inserted in which sequence. In TD rows are stored randomly on different AMPs and there is no way to know which one was inserted first except if you put in a time stamp in a record. So Sample 5 will not give you first 5 rows of a table.
e.g. A table t2 has d1,d2,d3 integer columns having data as
1 2 3
4 5 6
7 8 9
10 11 12
13 14 15
16 17 18
19 20 21
22 23 24
25 26 27
28 29 30
Now fire statements like
sel * from t2 sample 4;
*** Query completed. 4 rows found. 3 columns returned.
*** Total elapsed time was 1 second.
d1 d2 d3
----------- ----------- -----------
16 17 18
22 23 24
10 11 12
4 5 6
sel * from t2 sample 4;
*** Query completed. 4 rows found. 3 columns returned.
*** Total elapsed time was 1 second.
d1 d2 d3
----------- ----------- -----------
16 17 18
19 20 21
7 8 9
22 23 24
You see different rows are returned even when you fire the same SQL and also keep the Sample as 4.
Also, you may try the new Teradata feature of RANDOM function.
sel * from t2 where random(1,100)<10; (Returns less than 10% of random rows)
*** Query completed. One row found. 3 columns returned.
*** Total elapsed time was 1 second.
d1 d2 d3
----------- ----------- -----------
19 20 21
sel * from t2 where random(1,100)<30; (Returns less than 30% rows)
*** Query completed. 4 rows found. 3 columns returned.
*** Total elapsed time was 1 second.
d1 d2 d3
----------- ----------- -----------
28 29 30
19 20 21
25 26 27
4 5 6
I am trying to find out the exact difference between RANDOM and SAMPLE function in TD.