×
INTELLIGENT WORK FORUMS
FOR COMPUTER PROFESSIONALS

Are you a
Computer / IT professional?
Join Tek-Tips Forums!
• Talk With Other Members
• Be Notified Of Responses
• Keyword Search
Favorite Forums
• Automated Signatures
• Best Of All, It's Free!

*Tek-Tips's functionality depends on members receiving e-mail. By joining you are opting in to receive e-mail.

#### Posting Guidelines

Promoting, selling, recruiting, coursework and thesis posting is forbidden.

# Random Number Problem

## Random Number Problem

(OP)
Hi there,

I am trying to get 200 (records) sample data from 50,000 records.  I am using the RDNORM function but it's not giving me the random data, instead the output is in chronological order.  Here's the code:

TABLE FILE BAN
PRINT
BAN
BAN_STATUS
BAN_TYPE
BAN_AR_BALANCE

COMPUTE RANDOM/D10.2=RDNORM(RANDOM);

BY BAN_BAN
ON TABLE HOLD AS 'RDM2A'
END

TABLE FILE RDM2A
PRINT
BAN
BAN_STATUS
BAN_TYPE
BAN_AR_BALANCE
RANDOM

BY BAN_BAN
WHERE RANDOM LT 0
WHERE RECORDLIMIT EQ 100
ON TABLE PCHOLD FORMAT EXCEL
END

Any help would be much appreciated.  Thank You.

### RE: Random Number Problem

First off, understand that RDNORM returns a random number, such that ALL the random numbers are in a 'normal' (bell shaped) distribution. The values returned are usually between 3 and -3, and the mean of all the numbers is 0, with a standard deviation of 1. This is used primarily for statistical analysis.

For random selection, use the function RDUNIF, which returns a value which is uniformly distributed (all values are equally likely), having values between 0 and 1.

Now, if you want approximately 200 records, you can do the following, in ONE pass:

1. what's the ratio of the number of records desired to the total population? Here, it's 200 out of 50000, or .004.

2. Calculate RDUNIF for each record (as you've done), but ONLY accept records where the result is LE .004

The problem with wanting EXACTLY 200 records is that, then the selection is no longer random (independent). Whether a record gets selected is not only a function of randomness, but also how many records you've already retrieved.

However, if THAT'S what you want, you can do it, but it will take two passes, which you're already doing.

First, calculate the value for RDUNIF with a DEFINE, rather than a COMPUTE. In your first pass, retrieve EVERYTHING, but SORT it by the DEFINEd random number. In your second pass, use RECORDLIMIT to retrieve the first 200 records.

The code might look like this:

#### CODE

DEFINE FILE BAN
RANDOM/D10.2 WITH BAN=RDUNIF(RANDOM);
END

TABLE FILE BAN
PRINT
BAN
BAN_STATUS
BAN_TYPE
BAN_AR_BALANCE
BAN_BAN
BY RANDOM
ON TABLE HOLD AS 'RDM2A'
END

TABLE FILE RDM2A
PRINT
BAN
BAN_STATUS
BAN_TYPE
BAN_AR_BALANCE
RANDOM
BY BAN_BAN
WHERE RECORDLIMIT EQ 200
ON TABLE PCHOLD FORMAT EXCEL
END

Note the use of 'WITH' in the DEFINE. This says to calculate a new value every time you get a different value for the specified field. It should be at the lowest level of the retrieved structure. If the structure is a flat single segment, then ANY field will do.

#### Red Flag This Post

Please let us know here why this post is inappropriate. Reasons such as off-topic, duplicates, flames, illegal, vulgar, or students posting their homework.

#### Red Flag Submitted

Thank you for helping keep Tek-Tips Forums free from inappropriate posts.
The Tek-Tips staff will check this out and take appropriate action.

Close Box

# Join Tek-Tips® Today!

Join your peers on the Internet's largest technical computer professional community.
It's easy to join and it's free.

Here's Why Members Love Tek-Tips Forums:

• Talk To Other Members
• Notification Of Responses To Questions
• Favorite Forums One Click Access
• Keyword Search Of All Posts, And More...

Register now while it's still free!