
Creating Indexes 3


audiopro (Programmer, GB)
Apr 1, 2004
I am saving a DBF file for backup purposes and, to save space, have opted not to save the CDX file.
How do I programmatically create a CDX file?
The language reference book I have covers creating CDX files along with DBF files, but not creating one on its own.


Keith
 
Code:
COPY TO ..... WITH CDX
works a treat.
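
For anyone finding this thread later, here is a minimal sketch of both routes; the table, path and tag names are just examples, not from Keith's app:
Code:
* 1) Copy the table and build its structural CDX in the same step
USE customer                           && example source table
COPY TO d:\backup\customer WITH CDX

* 2) Rebuild a CDX from scratch on a restored DBF that has none
USE d:\restore\customer EXCLUSIVE
INDEX ON cust_id TAG cust_id           && recreate each tag you need
INDEX ON UPPER(cname) TAG cname
USE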
The whole exercise has been worth the effort for the knowledge gained. This is just part of a planned automatic backup and analysis system, and I'm sure more questions will be aimed at Tek-Tips before the project is complete.
Thanks to all who have taken time to respond to my questions.

Keith
 
Hi AudioPro,

I'm getting to this discussion late, but wanted to introduce a couple concepts that haven't been discussed (and add my opinions to some that have):

1) There are significant differences between backing up DBC's+DBF's and backing up free DBF tables. With DBC's, I agree fully with Olaf: back up the whole directory (DBC, DCX, DCT, DBF's, FPT's, CDX's).

Regarding Free Tables:
1) If you use "COPY TO" without the "WITH CDX" clause, then the resulting DBF's can be backed up, restored, and reindexed with no problem, since the backed-up DBF's weren't marked as having a CDX.

2) If you back up the DBF's "as is", after restoring them you can remove the "This DBF has an attached CDX" marker in the header before trying to USE them...
3) If you don't like hacking DBF headers, you can create a "stub" CDX (as mentioned above) on a dummy DBF (the DBF structure and CDX contents are irrelevant... VFP doesn't check), copy it next to the restored table, and immediately after "USE mytable EXCLUSIVE" opens the DBF with the dummy CDX, just "DELETE TAG ALL" to wipe out the bad CDX (see the sketch after this list).

4) I highly recommend NOT backing up the CDX's because they are so susceptible to corruption (not to mention that they alone can take up more space than all the DBF's). I think that any VFP program that relies on DBF's must have a built-in way to recreate the indexes for those DBF's. Instead of trying to use ATAGINFO to salvage-and-save the info, have the indexes originate from INDEX commands in code... never distribute CDX's with an application.
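
Roughly, items 2 to 4 look like this in code. The table-flags byte at offset 28 is per the published VFP table file structure; all file, table and tag names here are just examples:
Code:
* Item 2: clear the "has a CDX" flag (bit 0 of the table-flags byte at
* offset 28 in the DBF header) so the restored table opens without its CDX
LOCAL lnHandle, lcFlag
lnHandle = FOPEN("customer.dbf", 12)        && open read/write
IF lnHandle > 0
   FSEEK(lnHandle, 28, 0)
   lcFlag = FREAD(lnHandle, 1)
   FSEEK(lnHandle, 28, 0)
   FWRITE(lnHandle, CHR(BITAND(ASC(lcFlag), 0xFE)))
   FCLOSE(lnHandle)
ENDIF

* Item 3: the "stub CDX" route instead of touching the header
COPY FILE dummy.cdx TO customer.cdx
USE customer EXCLUSIVE
DELETE TAG ALL                              && throw the dummy tags away

* Item 4: recreate the real tags from INDEX commands kept in code
INDEX ON cust_id TAG cust_id
USE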


With DBC's
I don't have as much experience with these, but I have one app that has been stable using this system:

Upon installation, a GENDBC script creates the DBC and all related files.

Back up the DBC, DCX, DCT, DBF's & FPT's w/o any alterations.

To Restore:
1) Unzip the DBC, DCX, DCT, DBF's and FPT's into a temp location.
2) Create one dummy CDX; for each DBF in turn, copy that CDX to the appropriate name, USE the DBF, and DELETE TAG ALL. Validate that the tables you care about do exist.
3) Run the GenDBC script to create an empty database of the current version. (The GenDBC code in the application that is running may be newer than the one in the application that created the backup.)
4) Open the old database under a different name, loop through all the tables in the new empty DBC, importing the data from the corresponding table in the old DBC. As this happens, any changes in schema can be accounted for. I have a stored procedure in each DBC that identifies its version, which makes deciding what conversions to do pretty simple.

This system makes it rather easy to improve the Database Schema while still being able to restore backups from previous versions of your program.
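
A compact sketch of step 2 of that restore, looping the dummy-CDX trick over everything in the temp location (the paths and the dummy CDX name are assumptions):
Code:
* Give every restored DBF a throwaway CDX, open it, and clear the tags
LOCAL laDbf[1], lnCount, lnI, lcDbf
lnCount = ADIR(laDbf, "c:\restoretemp\*.dbf")
FOR lnI = 1 TO lnCount
   lcDbf = "c:\restoretemp\" + laDbf[lnI, 1]
   COPY FILE c:\restoretemp\dummy.cdx TO (FORCEEXT(lcDbf, "cdx"))
   USE (lcDbf) EXCLUSIVE
   DELETE TAG ALL
   USE
ENDFOR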

- Bill

Get the best answers to your questions -- See FAQ481-4875.
 
This part of the project is now working OK but the thread has thrown up some interesting opinions and I am grateful to everyone who has taken the time to respond.
What started out as a 'simple' backup routine is now developing into a much larger and interesting project. Some will call it project creep but I just view it as development.
I have some additional questions but I will start new threads rather than try to combine multiple questions in one.

Keith
 
Hi Bill,

Right: backing up one or a few free DBFs is quite easy compared to a DBC+DBFs, if you want to remove the indexes and recreate them at restore time. For example, free tables can have no primary key, only regular or candidate indexes (or binary ones since VFP9).

Bill said:
I have a stored procedure in each DBC that identifies its version which makes deciding what conversions to do pretty simple.

While I'd say it's not the task of a restore to bring an old database version up to the new database format, the idea is quite nice. Do you store code for transforming each older version to the current one? Or do you even do this data-driven?

I'd rather restore the database as is and then redo all the updates that had been applied to the old (and perhaps now defective) database, using an incremental database update script:

Pseudocode:
Code:
if not exists dbc
   * code to create an empty database in version 1
endif

if dbcversion()<2
   * do update from version 1 to 2
endif

if dbcversion()<3
   * do update from version 2 to 3
endif

if dbcversion()<4
   * do update from version 3 to 4
endif

set dbcversion to 4

This can bring any restored backup up to date, or create an empty database from scratch...
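
For illustration, here is roughly what that skeleton can look like in real code, assuming the version number is kept in a one-record dbversion table (that storage choice and all table and field names are just examples, not taken from Olaf's apps):
Code:
IF !FILE("mydata.dbc")
   CREATE DATABASE mydata                        && creates, opens and selects it
   CREATE TABLE dbversion (nversion I)           && version bookkeeping table
   INSERT INTO dbversion VALUES (1)
   USE IN dbversion
   * ... CREATE TABLE commands for the rest of the version 1 schema ...
ELSE
   OPEN DATABASE mydata
ENDIF
SET DATABASE TO mydata
USE dbversion IN 0

IF dbversion.nversion < 2
   ALTER TABLE customer ADD COLUMN email C(80)   && example update 1 -> 2
   REPLACE nversion WITH 2 IN dbversion
ENDIF
IF dbversion.nversion < 3
   * ... update 2 -> 3 ...
   REPLACE nversion WITH 3 IN dbversion
ENDIF

USE IN dbversion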

Bye, Olaf.
 
Hi Keith,
I'm glad you have something working... I think this thread will be of great use to many future readers, too.

Hi Olaf,
Olaf said:
While I'd say it's not the task of a restore to bring an old database version up to the new database format, the idea is quite nice. Do you store code for transforming each older version to the current one? Or do you even do this data-driven?
The most data-driven I've gotten is one project that used ADBOBJECTS to look through the new DBC and APPEND FROM any tables that existed in the old DBC. This application, fortunately, only added tables, and hasn't yet needed data manipulation.
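
A sketch of that data-driven pass, assuming the restored DBC was renamed (say to olddata.dbc) so both databases can be open at once; all paths and names here are examples:
Code:
LOCAL laTables[1], lnTables, lnI, lcTable, lcOldDbf
OPEN DATABASE c:\restoretemp\olddata.dbc        && the renamed backup
OPEN DATABASE c:\app\data\mydata.dbc            && the freshly generated, empty DBC

SET DATABASE TO mydata
lnTables = ADBOBJECTS(laTables, "TABLE")        && tables of the new DBC
FOR lnI = 1 TO lnTables
   lcTable = laTables[lnI]
   SET DATABASE TO olddata
   IF INDBC(lcTable, "TABLE")                   && does the backup have it too?
      lcOldDbf = "c:\restoretemp\" + DBGETPROP(lcTable, "TABLE", "Path")
      SET DATABASE TO mydata
      SELECT 0
      USE (lcTable)                             && the empty table in the new DBC
      APPEND FROM (lcOldDbf)                    && fields are matched by name
      USE
   ENDIF
ENDFOR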

The other products don't use DBC's, so the Restore/Reindex code is essentially organized like this:
Copy all old tables (current data or from a backup) into a "Reindex" subdirectory.
If "Table1" is in "Reindex" subdirectory or doesn't exist in Data Directory:
Create new "Table1"
Try (carefully) to open "Reindex\Table1"
Append from "Reindex\Table1"
if SourceData is version "X"
Do any data transformations that this table needed to go from version "X" to the current version.
if SourceData is version "Y"
Do any data transformations that this table needed to go from version "Y" to the current version.

Do everything again for "Table2"

This way, all the Table definitions are in Code, all together. Updates to the Schema are noted in the code in comments. Every time the program runs, if the Database version is out of date, it simply reindexes and is left with a current database format.
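
To make the shape of that concrete, here is a sketch of the per-table pattern for one free table; the table name, fields, tags, and the lcSourceVersion variable are all just examples, and how the source version gets detected is up to the app:
Code:
* Rebuild "table1" from scratch and pull the old data back in
IF FILE("reindex\table1.dbf") OR !FILE("data\table1.dbf")
   CREATE TABLE data\table1 FREE (cust_id C(10), cname C(40), email C(80))
   INDEX ON cust_id TAG cust_id              && the schema and tags live in code
   IF FILE("reindex\table1.dbf")
      APPEND FROM reindex\table1             && matching field names are copied
      IF lcSourceVersion = "X"
         * transformations needed to go from version "X" to the current version
         REPLACE ALL email WITH LOWER(email)
      ENDIF
   ENDIF
   USE
ENDIF
* ...then the same block again for "table2", and so on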

The two applications we have that use this system have evolved over the last 15 years or more, constantly having more functionality added that wasn't dreamed of when they were first designed.

Olaf said:
I'd rather restore the database as is and then redo all the updates that had been applied to the old (and perhaps now defective) database, using an incremental database update script:
One of our products used to do this. It got really frustrating because occasionally this convert routine was "forgotten" when a release was made that slightly changed the data schema, and there was no record of what the interim schema had been.
I had to reverse-engineer the changes based on backups that we had lying around. (This was a Pascal application that stored the data by simply writing the list of records/structures directly to a file, so the exact record definition was required before you could load the data. Small schema changes in VFP .DBFs are probably not quite as annoying to account for in an incremental update.)

It seems like the processing for a single table gets stretched out across multiple versions with incremental updating, though, which I think is the main reason I shy away from it.

I like having the data schema in code right next to the conversion code, so that if you change the schema: you change the conversion code.

Also, the final schema does not depend at all on the starting point: that is, the final database is constructed directly by CREATE TABLE commands, rather than by taking the current table and using ALTER TABLE commands to make v1-to-v2 adjustments, then ALTERing again from v2 to v3, etc.

- Bill

Get the best answers to your questions -- See FAQ481-4875.
 
Hi Bill,

I should add that I mostly write individual software for a single customer, not standard software whose design I fully control. In such software, customer wishes, business rule changes, or misunderstandings often require more advanced changes than just adding tables, so your kind of restore wouldn't work for me.

With freshly generated/restored tables, several ALTER TABLEs have not yet been a problem for me. In fact, the restored backup is usually only one version behind the current one. The customer should back up his data once a day, so being more than one version behind is very rare, not to say impossible. Keeping all the other update steps in an incremental update script is mostly not needed, but it does not hurt, and it has the extra feature of being able to generate an empty database from scratch by going through all the update steps.

You are right that this requires a lot of discipline: you can't make even the simplest change without adding it to the script, or the script won't be able to create the final version of the database structure. That may be the downside.
Another thing is that if one script version fails, it can get quite hard to correct it and make it run reproducibly on a half-updated database. Therefore the script should run on a copy of the database, so you can revert to the last version if the update fails, and only copy the new version back to its path if the update fully succeeded.
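
In case it helps a future reader, that "update a copy, commit only on success" idea can be as simple as this; the paths are examples and RunUpdateScript is a hypothetical stand-in for whatever the real update routine is called:
Code:
LOCAL llOk
CLOSE DATABASES ALL
* run the update on a scratch copy, so a failed run can simply be discarded
* (c:\data_work is assumed to exist and be empty)
COPY FILE c:\data\*.* TO c:\data_work\*.*
llOk = RunUpdateScript("c:\data_work\")        && hypothetical update routine
IF llOk
   * only now replace the live data with the updated copy
   COPY FILE c:\data_work\*.* TO c:\data\*.*
ENDIF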

As complex as this may be, once you have the skeleton of such a script, you really only need to add that one ALTER TABLE, perhaps followed by some DBSETPROPs and an UPDATE-SQL or a call to a more complex function.

In my situation I often add single fields and also fill them with data, partly depending on other fields or even other tables. Sometimes a denormalisation has to be done, or a normalisation. I wouldn't be able to do this directly from each version x to the current version, or I would have to write several script versions.

If a database is broken, restoring the last known intact version will save your customers so much time and money compared to a complete loss of data that it mostly won't matter if it takes much longer because two or even three ALTER TABLE commands run instead of a single one to bring the table to the structure needed for the newest application version. Each ALTER TABLE creates a fresh copy of the DBF file and recreates the indexes from the index expressions, so the table structure does not get worse with an alteration; it even gets refreshed. The running time for large tables is really bad that way, but as the amount of data grows its value rises too, so that time shouldn't matter. And a restore really is something of a worst case scenario. Often enough, repairing with SDT (which I mentioned earlier) saves the day...

Bye, Olaf.
 