scoring records that were used in training/validation

(OP)
Hello,

I have a question pertaining to data partitioning, model training, and ultimately, scoring a data set (predictive modeling).

The heart of the question is this:  If you have a population of 20,000 divided up into training/validating/testing (40/30/30) for modeling purposes, is it incorrect to use the resulting score code to score the same population of 20,000?
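To make the setup concrete, here is a minimal sketch of a 40/30/30 partition of 20,000 records. It is not the code from the modeling tool in question; the function name `partition_indices`, the fixed seed, and the use of plain record indices are assumptions made for illustration.

```python
import random


def partition_indices(n, fractions=(0.4, 0.3, 0.3), seed=0):
    """Shuffle record indices 0..n-1 and split them into train/validate/test.

    Hypothetical helper: the fractions match the 40/30/30 split described
    above, and the seed is fixed only so the split is repeatable.
    """
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)

    n_train = round(n * fractions[0])
    n_valid = round(n * fractions[1])

    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]  # whatever remains goes to testing
    return train, valid, test


train, valid, test = partition_indices(20_000)
print(len(train), len(valid), len(test))
```

Keeping track of which partition each record landed in makes it possible to look at scores for the training records separately from the rest later on.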

That was the way I did it, accidentally.  So I went back and sampled the entire database (rather than using the very specific population of 20,000), reconstructed my modeling table, and went through the modeling/scoring process again.  This time, I used my new score code to re-score the original 20,000, so that I could compare the results.

I compared the scores of 100 records.  For each record I took the absolute value of the difference between the two scores, then averaged those values.  The result was 0.05.  In other words, the two sets of scores differed by about 5 percentage points on average, so a record scored at 80% by one model would typically come out around 75% or 85% from the other.  So there was a difference, but it could be attributed to many things.  All it really told me was that I need to ask the question!
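The comparison described above is a mean absolute difference between two sets of scores. A minimal sketch follows; the function name and the four sample scores are made up for illustration and are not the actual 100 records.

```python
def mean_absolute_difference(scores_a, scores_b):
    """Average of |a - b| over paired scores for the same records."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both score lists must cover the same records")
    if not scores_a:
        raise ValueError("need at least one pair of scores")
    total = sum(abs(a - b) for a, b in zip(scores_a, scores_b))
    return total / len(scores_a)


# Illustrative values only: scores from the first model and from the
# rebuilt model for the same four records.
old_scores = [0.80, 0.55, 0.30, 0.10]
new_scores = [0.85, 0.50, 0.35, 0.05]

mad = mean_absolute_difference(old_scores, new_scores)
print(round(mad, 2))
```

Note that this figure only says how far apart the two sets of scores are on average; it does not say which model is closer to the truth.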

So back to the question - what is considered "best practice" for scoring records that were used to train the model?

Many thanks!
