Tek-Tips is the largest IT community on the Internet today!

Members share and learn making Tek-Tips Forums the best source of peer-reviewed technical information on the Internet!

  • Congratulations bkrike on being selected by the Tek-Tips community for having the most helpful posts in the forums last week. Way to Go!

indexing in loops 1

Status
Not open for further replies.

sdpsc

Technical User
Feb 10, 2001
76
I want the below code to loop through each field and index in (thus showing the unique count for each field). But it just counts 1 unique for each field (which is not accurate). How can I do this? Is there a completely different way? Thanks.


numFields=aFields(aColumns)
set safety off
set stat on

For i=1 To numFields
index on Upper(aColumns(i,1)) to temp uniq
? Upper(aColumns(i,1))
? 'Unique records:'
count
? ''
set index to
Next
 

SELECT COUNT(DISTINCT MyField) FROM MyTable INTO ARRAY MyCount

?MyCount(1)
?_TALLY
 

Well, remove the ?_TALLY line in that example.
It would go to another example:

SELECT MyField FROM MyTable GROUP ON 1 INTO CURSOR MyCount
?_TALLY
 

Just set up the time test:

Index takes the longest, and _TALLY shows the total number of records in the table (all that indexed), not only unique values. (Sorry, Jim!)

SELECT COUNT(DISTINCT...
takes just slightly less, shows the correct number.

SELECT ... GROUP ON
takes significantly shorter time, and shows the correct number of unique values.
 

Jim, did you mean _TALLY after the COUNT?
Then yes, this would show the correct result; but still INDEX + COUNT work the longest in a few tests that I set up.
 
Hi Stella,

I meant something like

SELECT mytable
INDEX ON myfield TAG mytag UNIQUE
?_TALLY

In my tests this show the correct unique count. Yes, I'm sure it's not the fastest way.

Jim
 

Oh, so I didn't misunderstand you the first time around.
It didn't show the correct count in my test: _TALLY showed over a million records, all that table has, and only 251 unique values in the field in question. I got that number only after COUNT and _TALLY. That's interesting. Something else could be in play.

On the other hand, SELECTs shown correct results, and were noticably faster, especially GROUP ON (it always works faster than DISTINCT).
 

Of course.
But as I said (possibly, I didn't make it clear enough), the first _TALLY was issued after INDEX, and showed 1xxxxxx (over 1 mln, the number of records in that table); and the second _TALLY after the COUNT showed 251.
 

By the way, in sdpsc's post there are words "But it just counts 1 unique for each field", which means that INDEX...UNIQUE - COUNT approach doesn't work for him/her.

Of course, it could be due to the fact that the value sdpsc is printing after the count is '', not _TALLY or a variable in which he/she should have been counting (but didn't); but in this case, I wonder how even "1 unique for each field" can be achieved.

Something could be amiss in the OP.



 
_TALLY keeps track of the number of records processed. So if you index 1 million records, _TALLY will show 1 million, even if there are only 250 unique records.
If you use COUNT, since there will only be 250 records COUNTed, _TALLY will then show 250 after the COUNT command.

sdpsc,
Is the correct table selected in the routine? I don't see why you would only get one record with count.


-Dave Summers-
[cheers]
Even more Fox stuff at:
 

DSummZZZ[b/],

_TALLY keeps track of the number of records processed. So if you index 1 million records, _TALLY will show 1 million, even if there are only 250 unique records.

Thanks for confirming.
That's exactly the result I got.
 
Yeah, its on the right table. I open up a Fox Pro file with 36 fields and approximately 41K records, and run this code:

numfields = afields(acolumns)
set safety off
set stat on

For i=1 To numFields
index on Upper(aColumns(i,1)) to temp uniq
? Upper(aColumns(i,1))
? 'Unique records:'
count
? ''
set index to
Next

And it goes through the fields, prints the name of the field on screen as it should, but always says '1 record indexed.' I'm not sure why this is happenening. (Count also just gives '1'.)
 

I think I know why.

Try this:

index on Upper(&aColumns(i,1)) to temp uniq

Also, put that count in a variable:

COUNT to MyVar
?MyVar

OR

COUNT
?_TALLY

But still, SELECT-SQL would be faster.
 
Well hey this is exciting, I got it to work like this:

dimen clist(255)
dimen cuniq(255)

numfields = afields(acolumns)
set safety off
set stat on

For i = 1 To numFields
clist(i) = upper(acolumns(i,1))
next

For i = 1 To numFields
? i, clist(i)
&&index on Upper(aColumns(i,1)) to temp uniq
index on (&clist(i)) to temp uniq
go bott
&&cuniq(i) = recno()
count
cuniq(i) = _tally
Next

I may try to put it in select-sql later, but for now I'm happy to have something working :) Thanks everyone for your input!
 
Actually, it wasn't the the count clause causing the issue. I believe if you use this it work better:
Code:
numfields = afields(acolumns)
set safety off
set stat on

For i=1 To numFields
        index on [COLOR=red]Evaluate([/color]Upper(aColumns(i,1))[COLOR=red])[/color] to temp uniq
        ? Upper(aColumns(i,1))
        ? 'Unique records:'
        count
        ? ''
        set index to
Next

-Dave Summers-
[cheers]
Even more Fox stuff at:
 
Hi Stella and Dave,

When I run this code I get the result 2. Try it.

CREATE CURSOR cc (x c(1))
INSERT INTO cc VALUE('a')
INSERT INTO cc VALUE('a')
INSERT INTO cc VALUE('b')
INDEX ON x TAG x UNIQUE
?_TALLY && I get 2 - VFP5

Jim
 
BTW removing UNIQUE from the INDEX statement changes the result to 3.

Jim
 
Hi sdpsc,

you already got it, that index on Upper(acolumns(i,1)) will not index correctly. In fact that way you are creating an index with a constant: The field name in upper case. So it's no wonder, that you always get only 1 unique value, e.g. "FIELDNAME" in your index. Simply test it with SEEK "FIELDNAME", whatever the fieldname really is...

But there is one last flaw in your code:
I'd say with you wanted to have an index on the uppercase VALUES of the fields. And with your code:

Code:
clist(i) = upper(acolumns(i,1))
...
index on (&clist(i)) to temp uniq
You again make the field name upper case and index on the field this time. Not on the uppercase values, but on the unchanged values of the field. So "McDonalds" and "Mcdonalds" will count as 2 unique values.

That should work:
Code:
local lnNumfields, lcField, i
local array laFields[1]
lnNumfields = AFields(laFields)
set safety off
set status on

For i = 1 To lnNumFields
    lcField = laFields[i,1]
    ? i, lcField
    index on upper(&lcField) to temp uniq
    count
Next
Stellas solution with
index on Upper(&aColumns(i,1)) to temp uniq
is almost the same, but I wonder if makro substitution will work correctly for array elements in each vfp version.

Dave Summers: And with evaluate it should be
Upper(Evaluate()), as you want upper of the field value and not the evaluation of the uppercase field name, which is the same as Evaluate() alone, as foxpro isn't case sensitive in field or variable names.

 
Status
Not open for further replies.

Part and Inventory Search

Sponsor

Back
Top