Dear Experts,
We have the following scenario:
We are building a system that will perform several functions.
Database: SQL Server 2000
Client: Visual FoxPro
Function A:
Import and export of data to and from text files. The import will consist of 20,000,000 records
per week (every week is treated as one period).
Each record will pass through several verification and data-transformation steps.
This will be a very heavy-duty job for the system.
During transformation, a period (week number) will be stored with every record processed.
Afterwards the data will be appended to, or replace data in, a table that is used by Function B.
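The import stage described above could be staged roughly as follows. This is only a sketch; the file path, the table names (StagingImport, PeriodData), the columns, and the verification predicates are all placeholders, not part of the actual design:

```sql
-- Load the weekly text file into a staging table first (names/path are placeholders).
BULK INSERT dbo.StagingImport
FROM 'D:\feeds\weekly_feed.txt'
WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n', TABLOCK);

DECLARE @PeriodNo int;
SET @PeriodNo = 352;   -- week number of the current period (placeholder value)

-- Verify and transform, stamping each record with its period (week number),
-- then append to the table used by Function B.
INSERT INTO dbo.PeriodData (PeriodNo, CustomerId, Amount)
SELECT @PeriodNo, CustomerId, Amount
FROM dbo.StagingImport
WHERE CustomerId IS NOT NULL    -- example verification step
  AND Amount >= 0;              -- example verification step

TRUNCATE TABLE dbo.StagingImport;
```

Loading into a staging table first keeps the heavy verification work off the table that Functions B and C read from.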
Function B:
This function provides viewing and editing of the data produced by Function A.
We want to keep at least 15 periods in the system (300,000,000 records).
The data for each period is taken from Function A.
Function C:
Reporting of data.
I) Selection of data based on 2 periods (20,000,000 records) or on all 15 periods
(300,000,000 records), from one table holding 300,000,000 records.
II) The same selections, but with the data split into 15 tables (one per period). If we
select more than one period, several tables of 20 million records each must be
combined (strictly a UNION rather than a join); worst case is combining all 15 tables
(300 million records).
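For option II, SQL Server 2000 offers local partitioned views, which combine the per-period tables with UNION ALL; with a CHECK constraint on the period column, the optimizer reads only the tables whose periods the query asks for. A sketch, with all names and columns as placeholders:

```sql
-- One table per period; the CHECK constraint lets the optimizer skip
-- tables whose period is not requested (partition elimination).
CREATE TABLE dbo.PeriodData_351 (
    PeriodNo   int   NOT NULL CHECK (PeriodNo = 351),
    CustomerId int   NOT NULL,
    Amount     money NOT NULL,
    PRIMARY KEY (PeriodNo, CustomerId)
);
-- ... dbo.PeriodData_352 through dbo.PeriodData_365 defined the same way ...

CREATE VIEW dbo.AllPeriods AS
    SELECT PeriodNo, CustomerId, Amount FROM dbo.PeriodData_351
    UNION ALL
    SELECT PeriodNo, CustomerId, Amount FROM dbo.PeriodData_352;
    -- ... UNION ALL the remaining period tables ...

-- A query naming specific periods touches only the matching tables:
SELECT SUM(Amount) FROM dbo.AllPeriods WHERE PeriodNo IN (351, 352);
```

Reports can then query the view as if it were one 300-million-record table, while each underlying table stays at 20 million records.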
.....
Function ...
.....
Questions:
----------
1) When should a database be split into two or more? Is splitting the database a good idea in
our case, or can we assign priorities to job processing (low priority for the import, so that
data viewing/editing/reporting performs well)?
2) What is the best strategy for cleaning up the database? (Removing 20 million records each time.)
3) What benefits will we get from a clustered index on huge tables? (One will have 300
million records.)
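On questions 2 and 3, assuming the single-table layout and placeholder names, one common SQL Server 2000 approach is sketched below: cluster the big table on the period so each period's rows are stored contiguously, and delete an expired period in batches so the transaction log and locks stay manageable. (With the 15-table layout, cleanup is instead a near-instant DROP TABLE of the oldest period's table.)

```sql
-- Q3: cluster the 300-million-record table on the period, so per-period
-- scans and the per-period delete below touch contiguous pages.
CREATE CLUSTERED INDEX IX_PeriodData_Period
    ON dbo.PeriodData (PeriodNo, CustomerId);

-- Q2: remove the oldest period in batches; SET ROWCOUNT is the SQL 2000
-- idiom for limiting how many rows one DELETE statement affects.
SET ROWCOUNT 100000;
WHILE 1 = 1
BEGIN
    DELETE FROM dbo.PeriodData WHERE PeriodNo = 337;  -- 337 is a placeholder
    IF @@ROWCOUNT = 0 BREAK;
END
SET ROWCOUNT 0;  -- restore normal behaviour
```

Batching the delete avoids one 20-million-row transaction that would bloat the log and block the viewing/editing users for the duration.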
Looking forward to your helpful replies.
Thanks in advance,
Spyros Christodoulou