And just for the sake of fun, when you do delete orphaned rows and INDEX ON NOT DELETED() TAG deleted BINARY.
The index size now is <1.5MB.
And then SELECT COUNT(*) FROM t1, as you don't need to join t2, and VFP9 Rushmore takes binary indexes on DELETED() besides also supporting filtered FOR DELETED() and more (see the VFP9 help "Indexes Based on Deleted Records") it breaks down to this on my computer:
Time to count ~10 million records with 5% deleted rows
Without index: 2.4 Seconds
With index: 0.012 seconds
So that's the most promising way of fast counting in the single table and even if you want to refrain modifying your production data, for the migration you could and perhaps even should use a data copy, ie restored a backup to work without being influenced by the current application usage of the data. On the copy, you can even work with exclusive on and have your migration EXE on the file server, etc.
And then you can not only delete, you can not only index on DELETED(), you can also PACK, that makes it even better because RECCOUNT() then is 1:1 the count.
Since you brought up that migration topic I'm not sure anymore, if it's really just about the count but any handling of data without a high selectivity, ie boiling a result down to just a few records. Most important that's not to be confused with just having one aggregate value. That's a small result set, but to get it you need to go through all data. High selectivity means only addressing a small number of rows.
Well, from my migration experiences I can only say partitioning data helps a lot but I don't see why that would need such a join optimization. Aside from some small tables when you go for the massive tables and migrate chunks of data the most natural way in nested or hierarchical structured relationships is going from one head record to all detail data and not vice versa.
Bye, Olaf.
Olaf Doschke Software Engineering