Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations TouchToneTommy on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

Getting a random selection from an SQL Query Result 3

Status
Not open for further replies.

grantwilliams

Programmer
Sep 8, 2003
66
AU
Hi!

Hopefully this is a new one here and might give someone a challenge!

The task I've been given is to get the details of a selection of clients for the previous 12 months. The selections are as follows:
- Major clients who have approached us 3 or more times a month for the full year (on average)
- A random 5% of the remainder of clients for the year.

So, firstly I have the retrieved the ClientID and number of times the client has approached us for the years using the following SQL Query named getCount:
Code:
SELECT DISTINCT tRegister.ClientID, Count(tRegister.ClientID) AS Total
FROM tRegister
WHERE tRegister.DateCompleted >= 15/05/2003
GROUP BY tRegister.ClientID;

Secondly, I can retrieve the major clients and their details:
Code:
SELECT getCount.ClientID, Right(Format(Now,"Short Date"),2) AS IDCODE, tClient.DivisionID, tClient.BranchID
FROM tClient INNER JOIN getCount ON tClient.ClientID = getCount.ClientID
WHERE getCount.Total >= 36;

Thirdly, I can retrieve the remainder:
Code:
SELECT getCount.ClientID, Right(Format(Now,"Short Date"),2) AS IDCODE, tClient.DivisionID, tClient.BranchID
FROM tClient INNER JOIN getCount ON tClient.ClientID = getCount.ClientID
WHERE getCount.Total < 36;

NB The Right(Format(Now,"Short Date"),2) is in there as the table this will be stored in, will store multiple years data, so the combination of the year and the ClientID form the Primary Key.

Now is where I run into trouble....

I need to be able to insert the details of all of the major clients into a table named tClientSurvey... I imagine I can do this with a Recordset and use a For loop and the MoveFirst, MoveNext commands to insert each record into the table.

I also need to insert a random 5% of the rest of the clients into the table. For this I thought I could once again use a Recordset and determine which records to insert using a Rnd * Recordset.length... however I do not know how to move to the record specified by Rnd * Recordset.Length....
Alternatively, I thought if I were able to insert all of the query data into an array, then I could use Rnd * Array.Length to get the ClientID's and then use a query to pull the remainder of the data out which could then be inserted into table.

OR if someone has a better suggestion, PLEASE PLEASE PLEASE come forth and let me know! [smile]
 
On your third query, can't you add a calculated field,
RND() AS PICK_ME

and use that as a subquery, like

WHERE PICK_ME < .05

to get your randomized selection?

Rusty

The early bird gets the worm, but the second mouse gets the cheese.
 
GrantWilliams,
It is no problem navigating to selected rows in the recordset. All you need to do is:

(1) Before opening the recordset. establish a local cursor
like so: rs.CursorLocation = adUseClient

(2) Determine the number of rows in the recordset either
with the rs.RecordCount property, if your recordset
supports it, or by doing a rs.MoveFirst, then a Do
While not rs.EOF loop, if the RecordCount property
returns a value of -1.

(3) Navigate to the selected row by using the recordset
AbsolutePosition property like so:
rs.AbsolutePosition = <fieldname>

You will need to be cautious because the AbsolutePosition
property is 1-based, so if you set it to zero, you will have a problem.

Also, as a guy who's done a bit of statistical stuff in Access, I'd suggest that each time you process one of the random rows, store the row number in an array. Then each time you generate a random number, check the array to see if you've already processed that row. Otherwise, you *will* be processing duplicates, and there goes your randomness.

To insert the major client data, you can use the Execute Method of the Connection Object like so:
(1) First empty out the tClientSurvey table:
CurrentProject.Connection.Execute "Delete from
tClientSurvey"

(2) Insert fields from each selected client record into the survey table:
CurrentProject.Connection.Execute "INSERT INTO tClientSurvey ( field1, field2, field3 )
SELECT tClient.field1, tClient.field2, tClient.field3
FROM tClient where <clientID = '<whatever>'"

You get the idea.

If I've been to brief in my explanation, just write back and I'll give you a more detailed explanation.

Good luck with your project.

Tranman
 
You say:
I need to be able to insert the details of all of the major clients into a table named tClientSurvey

You should do this with another saved query. Name your second query and take it in as the input recordset to an Append query that appends the records in the table: tClientSurvey

Code:
Insert into tClientSurvey ([i]field1, field2, field3, field4 [/i]) Select A.ClientID, A.IDCode, A.DivisionID, A.BranchID FROM [i]yourqueryname[/i];

Now take a look at the code in a thread that I recently responded to that uses the Rnd function to select a random number of records. This code can be modified to select a random % of records by changing the code that builds the query SQL.

Read that over and if you need more help post back and I can help you clean it up and apply it to your database problem.





Bob Scriver
[blue]Want the best answers? See FAQ181-2886[/blue]


 
No need of recordset at all, you may try something like this:
DoCmd.RunSQL "INSERT INTO tClientSurvey (Column1,Column2,Column3,Column4) " _
& "SELECT g.ClientID,Format(Now,"yy"),t.DivisionID,t.BranchID " _
& "FROM tClient AS t INNER JOIN getCount AS g ON t.ClientID=g.ClientID " _
& "WHERE g.Total>=36;"
Randomize
DoCmd.RunSQL "INSERT INTO tClientSurvey (Column1,Column2,Column3,Column4) " _
& "SELECT TOP 5 PERCENT g.ClientID,Format(Now,"yy"),t.DivisionID,t.BranchID " _
& "FROM tClient AS t INNER JOIN getCount AS g ON t.ClientID=g.ClientID " _
& "WHERE g.Total<36 ORDER BY Rnd;"

Hope This Help, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ222-2244
 
Thank you everyone for these valuable replies!

Am now attempting to use an ADODB Recordset and have the following code producing a "Variable undefined" error at the "Set recClientDetails.ActiveConnection = CurrentProject.Connection" line

I am using Access 97. I have Microsoft ActiveX Data Objects 2.0 Library checked unter Tools -> References.

Can anyone see a problem here?
Code:
    Dim strListQuery As String
    Dim recClientDetails As ADODB.Recordset
    
    Dim iTotalRows As Integer, iCounter As Integer
    
    strListQuery = "SELECT ClientID FROM getCount WHERE Total < 36;"
    
    Set recClientDetails = New ADODB.Recordset
    Set recClientDetails.ActiveConnection = CurrentProject.Connection
    recClientDetails.CursorLocation = adUseClient
    
    recClientDetails.Open strListQuery
    
    recClientDetails.MoveLast
    iTotalRows = recClientDetails.RecordCount

Tranman, thanks for the INSERT statement. Worked a charm! Some more info on using the Recordset would be MOST appreciated! [wink]


 
grantwilliams, why insist on recordset when no need ?
Have you tried my suggestion ?

Hope This Help, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ222-2244
 
PHV,

Running the following SQL appends ALL of the remainder of the records, not just the random 5%. Is there an error in the way I have attempted to use your code?

Randomize
DoCmd.RunSQL "INSERT INTO tSurveyData (ClientID, IDCode, DivisionID, BranchID) " & _
"SELECT TOP 5 PERCENT getCount.ClientID, Format(Now, 'yy'), tClient.DivisionID, tClient.BranchID " & _
"FROM getCount INNER JOIN tClient ON getCount.ClientID = tClient.ClientID WHERE getCount.Total < 36 ORDER BY Rnd;"
 
Have you tried the SELECT TOP 5 PERCENT in the query window ?

Hope This Help, PH.
Want to get great answers to your Tek-Tips questions? Have a look at FAQ219-2884 or FAQ222-2244
 
PH,

Have tried the SELECT TOP 5 PERCENT in the query window. This is randomising the results, but is selecting all, not just the 5 percent.

I could use this to populate a DAO Recordset (which is friendly with my version of Access!) and then just use a RecordCount * 0.05 to count out the first 5 percent of records and insert each record into the table. Sure, it's not a nice way of doing it, but this task is only done once a year, so anyway is a good way!

Unless you have another better way?

Thanks,

Grant.
 
Sorry, I don't understand why the TOP 5 PERCENT selects all.
 
Let's modify PHV's code to the following:

Code:
Randomize
DoCmd.RunSQL "INSERT INTO tClientSurvey (Column1,Column2,Column3,Column4) " _
 & "SELECT TOP 5 PERCENT g.ClientID,Format(Now,"yy"),t.DivisionID,t.BranchID " _
 & "FROM tClient AS t INNER JOIN getCount AS g ON t.ClientID=g.ClientID " _
 & "WHERE g.Total<36 ORDER BY Rnd([red]Asc(g.ClientID)[/red]);"

This seeding with the Asc(g.ClientID) value generates a different number each time and every time the query runs it is in a different order. Without the argument entry all records have the same value for Rnd() so all records were selected. The Asc conversion is used to make sure that the argument is a numeric value. I don't know if the ClientID is numeric or not so always just use Asc function.

See if that works for you. It does here.

Bob Scriver
[blue]Want the best answers? See FAQ181-2886[/blue]


 
After re-reading this thread I see that I never posted the thread# that I was referring to in my initial posting. Here is that thread: thread701-824877 The code there prompts for a table and builds the query necessary to generate the SELECT TOP XX sql on the fly. It uses the Rnd function with ASC function value to seed it as posted above.

Good luck with your project as I think the last adjustment that I suggested to PHV's code will give you the TOP 5 PERCENT that you were looking for.

Bob Scriver
[blue]Want the best answers? See FAQ181-2886[/blue]


 
There is now just one more problem... but first here's what we've got right!

I took PH's Randomised list and added Bob's Asc(getCount.ClientID), however I'd already managed to get my 5% with a recordset. The code for the Sub is below:

Code:
Private Sub PopulateClientData_Click()

    DoCmd.SetWarnings False
    Dim MajorClientSQL

    MajorClientSQL = "INSERT INTO tSurveyData (ClientID, IDCode, DivisionID, BranchID) SELECT getCount.ClientID, Right(Format(Now, 'Short Date'), 2), tClient.DivisionID, tClient.BranchID FROM getCount INNER JOIN tClient ON getCount.ClientID = tClient.ClientID WHERE getCount.Total >= 36;"

    DoCmd.RunSQL MajorClientSQL
    DoCmd.SetWarnings True
    
    Dim strListQuery As String, wks As Workspace, db As Database, recClientDetails As DAO.Recordset

    Dim iTotalRows As Integer, iCounter As Integer, iUseRows As Integer
        
    strListQuery = "SELECT getCount.ClientID, right(Format(Now, 'Short Date'),2) AS IDCode, tClient.DivisionID, tClient.BranchID FROM getCount INNER JOIN tClient ON getCount.ClientID = tClient.ClientID WHERE getCount.Total < 36 ORDER BY Rnd(Asc(getCount.ClientID));"
    
    Set wks = DBEngine.Workspaces(0)
    Set db = wks.Databases(0)
    
    Set recClientDetails = db.OpenRecordset(strListQuery, dbOpenSnapshot)
            
    recClientDetails.MoveLast
    
    iTotalRows = recClientDetails.RecordCount
    iUseRows = iTotalRows * 0.05
    
    recClientDetails.MoveFirst
    
    DoCmd.SetWarnings False
    
    For iCounter = 0 To iUseRows
    
        Dim InsertSQL

        InsertSQL = ("INSERT INTO tSurveyData (ClientID, IDCode, DivisionID, BranchID) VALUES (" & recClientDetails.Fields("ClientID") & "," & recClientDetails.Fields("IDCode") & "," & recClientDetails.Fields("DivisionID") & "," & recClientDetails.Fields("BranchID") & ");")
        
        DoCmd.RunSQL InsertSQL
        
        recClientDetails.MoveNext
    
    Next
    
    DoCmd.SetWarnings True
    
    recClientDetails.Close
    
    MsgBox "List Population Complete"
        
End Sub

*Apologies for the messy code*

NOW.... here's the problem. The getCount query is this:

Code:
SELECT DISTINCT tRegister.ClientID, Count(tRegister.ClientID) AS Total
FROM tRegister
WHERE tRegister.DateCompleted >= #30/04/2003# 
And tRegister.DateCompleted <= #01/05/2004#
GROUP BY tRegister.ClientID;

THIS WORKS! However I'd like to pick the start and end dates off a form. To do this, I have 2 text boxes: Date and StartDate. StartDate has its visible property set to 'No'. Its value is picked up as per the code below:

Code:
Private Sub Date_AfterUpdate()

    Dim getDate As Date, StartDate As Date
    getDate = Format(Me.Date, "dd/mm/yyyy")
    StartDate = getDate - 365
    Me.StartDate = StartDate
      
End Sub

Private Sub Form_Load()

    Dim getDate As Date, StartDate As Date, StartDateString
    getDate = Format(Now, "dd/mm/yyyy")
    Me.Date = getDate
    StartDate = getDate - 365
    Me.StartDate = StartDate
    
End Sub

Now, if I change my criteria lines in getCount to:

Where tRegister.DateCompleted >= Forms!ClientSurvey!StartDate
AND tRegister.DateCompleted <= Forms!ClientSurvey!Date

The getCount query STILL WORKS, however the:

Set recClientDetails = db.OpenRecordset(strListQuery, dbOpenSnapshot)

line gives me the following error:
"Run-time Error '3061' Too few parameters. Expected 2."

Any ideas why this is happening?

THANKS GUYS!
 
First of all change the query code to this to get your TOP 5 PERCENT:

Code:
strListQuery = "SELECT [red]TOP 5 PERCENT [/RED]getCount.ClientID, right(Format(Now, 'Short Date'),2) AS IDCode, tClient.DivisionID, tClient.BranchID FROM getCount INNER JOIN tClient ON getCount.ClientID = tClient.ClientID WHERE getCount.Total < 36 ORDER BY Rnd(Asc(getCount.ClientID));"

Now you can remove all the code to get your TOP 5 PERCENT which is where this other error is comming from. Just make sure that the above SQL executes correctly and you should be all set.

Bob Scriver
[blue]Want the best answers? See FAQ181-2886[/blue]


 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top