Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations Chriss Miller on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

SQL/Select Random records from a table 7

Status
Not open for further replies.

tamayok

Programmer
Sep 4, 2001
99
PR
I am using VFP 8 and would like to be able to obtain a RANDOM selection of X number of records from a given table.

These records need to be chosen at RANDOM so that everytime the Select Statement is executed a different result is obtained. The resulting output/cursor/table cannot contain duplicate records.

 
You might want to check RAND() function. You can first write a loop where you play around with RAND() function X times to create an array of random numbers in the range of your unique key or at least record numbers in the table (and check whether the number doesn't exist in the array already), and then just write SELECT statement which selects records using ASCAN() function in WHERE clause. You can look for a unique key or RECNO() containing in the array.
 
tamayok

There is no such function in VFP, you may have to create a function to do this.

Mike Gagnon

If you want to get the best response to a question, please check out FAQ184-2483 first.
 
RAND()is here in VFP6. Is it gone in VFP8?

Code:
[b]RAND( ) Function[/b]
[COLOR=blue]Example   See Also[/color]

Returns a random number between 0 and 1.

[b]Syntax[/b]

RAND([nSeedValue])

[b]Returns[/b]

Numeric

[b]Arguments[/b]

[i]nSeedValue[/i]

Specifies the seed value that determines the sequence of values RAND( ) returns.

RAND( ) returns the same sequence of random numbers if you use the same seed value for nSeedValue the first time you issue RAND( ) followed by subsequent RAND( ) function calls without nSeedValue.

If nSeedValue is negative the first time you issued RAND( ), a seed value from the system clock is used. To achieve the most random sequence of numbers, issue RAND( ) initially with a negative argument and then issue RAND( ) without an argument.

If you omit nSeedValue, RAND( ) uses a default seed value of 100,001.
 
No, Rand() lives in VFP 7.0 and beyond! I believe Mike was suggesting that the SQL Select command doesn't offer a Select Random option.

Rick
 
Many thanks to all... RAND() is definitely the key element to obtaining this... I was looking to see if there was an alternative that didn't involve LOOPS external to the SQL/Select statement... Thanks in particular to stella740pl, as it points in pseudo-code to the solution.
 
Note: One very handy routine to get integer numbers from 1 to n, is the following:
Code:
FUNCTION GetRandom
LPARAMETERS pnMin, pnMax
RETURN INT((pnMax - pnMin + 1) * RAND( ) + pnMin)
Sample usage is:
Code:
nRandRecNo = GetRandom(1, reccount("myTable"))
Rick
 
Thanks RGBean,

Handy function indeed... Do you think a 3rd parameter could be implemented with GetRandom() in order to obtain an x number of records?

For example:
- To obtain 500 Random records from DVD....

SELECT * FROM DVD HAVING GETRAND(1,RECCOUNT("DVD"),500)



 
Tamayok,

Say you have a table, BigTable, with 100,000 records, and you want to select 1,000 of them at random. Here's the code:

SELECT * FROM BigTable WHERE BETWEEN(RAND(),0.0,0.01) INTO TABLE SampleTable

By adjusting the second and third parameters of BETWEEN(), you can vary the numbe of records selected.

Mike


Mike Lewis
Edinburgh, Scotland

My Visual Foxpro web site: My Crystal Reports web site:
 
Mike, is anything missing in the code?
It didn't look to me like it would work, but I tried it anyway, on a table with over 2000000 records, and on a small one as well - 0 records selected. Also, as it is, it wouldn't guarantee that records are not duplicated.
 
Thanks to both,

> SELECT * FROM MyTable HAVING BETWEEN(RAND(),0.0,0.01)

The proposed method proves interesting and in the intended manner but somewhat flawed. The problems is:

There's no "guarantee" one will obtain the desired percentage of records... As far as dupes, I think IT DOES WORK as it is evaluating whether or not it takes a record on a record-by-record basis...

 
I changed from "WHERE" to "HAVING" in order to obtain results.
 
I played with it, with both WHERE and HAVING. It always selects RANDOM number of RANDOM records, although WHERE returns 0 records very often.
 
A couple of things I noted:

- The size of the table seems to make no difference
- Even when a =RAND() or =RAND(-1) instruction is executed previous to the SQL statement, the amount of records returned is indeed, RANDOM.

It seems like the alternative that doesn't record temporary tables is the one you stated (stella740pl) that included LOOPs and ASCANs...

---------

I find it is kind of weird that there is no native SQL clause/function that would be able to achieve this!!!
 
Hi,

maybe you can try something like this:
Code:
set decimals to 6
select top x rand() as dummy, ;
 * from yourtable ;
 order by dummy
where 'x' is the number of records you want.
 
I have to say NO-ONE beats TheRambler on this one!!!!!! Works like a charm!!!! Many many thanks!!!!

PS. TheRambler, Where do I send the $? (kidding!)
 
TheRambler's method does indeed work perfectly.

For the record, I now realise tht my method will produce a correct result with some versions of Foxpro, but not others. The reason is to do with the way that the SELECT is optimised. (For background information on this issue, see FoxPro Advisor, Nov 2001, page 42.)

However, changing WHERE to HAVING, as TamayOK suggested, seems to solve the problem and produce the required result in all versions. But it's true that you can't guarantee an exact record count (although in my tests it's pretty close in most cases).

Mike


Mike Lewis
Edinburgh, Scotland

My Visual Foxpro web site: My Crystal Reports web site:
 
Hi again,

Thanks for the stars. I am glad it is working for you.
I tested it with a few thousand records and I got a quick answer, but when I tried to get 10 random records from a million, it took 90 seconds!

In that case, I would use a function like GetRandom() (posted above) to generate random record numbers, go to that record, scatter memvar and insert into cursor from memvar. That would be very much quicker.

Regarding the $, you can send them to on my behalf.[wink]
 

TheRambler,

A star from me, too. I also experimented with TOP X idea a couple of times, but screwed up the SELECT statement in another place, and it didn't work correctly. You did it.

I now realize that this statement was exactly what I needed once in a while to produce demo versions of big tables. I used to use some artificial criteria to make a selection, and this method would create very representative subsets of data.

Mike,

But it's true that you can't guarantee an exact record count (although in my tests it's pretty close in most cases).
They were relatively close for me, too, but in some cases you do need an exact number. Say, if you select a random set of questions for a test from a big database. Each student has to get exactly same number of questions.

Thanks all! It was very useful.

Stella
 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top