
Pausing and restarting Cube builds


drlex

Technical User
Jul 3, 2002
Hello listers,
I am running Version 7 of Transformer (Administrator), although not MR1, and am having difficulties with a large model.
All our data is held on and read from a back-up computer, which is refreshed every morning with the previous day's data as part of the "overnight" routines on the main computer.
The problem is that Transformer cannot read in all the records between the database becoming available in the morning (7 am) and being disconnected the next day (1 am) for the refresh; I calculate it would take 24 hours to read them all.
If I manually pause the cube build in the evening and restart it the next day, Transformer believes it has read all the records and proceeds to build the cube with just the data it read before pausing (perhaps because the database is slightly different from what it was when the build was paused).

I can't find anything related to this on the Cognos site.
Does anyone have a suggestion other than asking the IT department to hold off with the overnight routines?

many thanks,
lex
 
Firstly, you must have a massive amount of data or a very slow machine for the Transformer query to take so long!

Are you rebuilding your cubes from scratch each time? If so, could you use incremental cube builds to add in only the previous day's data?

Are you building your cube from one large query? If so, it would be much quicker to change this to several smaller queries to bring in the disparate information, e.g. one query for transactional data, another for customer data, another for product data, and so on.

Gary Parker
Systems Support Analyst
 
Gary,
Possibly the ODBC connection is not optimally configured, but the PC I am using (W2K Pro on a Pentium III with a 500Mbs connection) pulls in 1M records per hour on a single query for a small-ish model (10 dimensions, most of them single-level). That's better than the 500K per hour I manage on a lesser PC with 10Mbs.
The query looks at only 2 tables, one holding the data and the other the ledger descriptions. However, the description table is aliased 5 times to bring in the relevant descriptions for the 5 levels of ledger analysis (not my DB set-up!). The problem is the 24M records on the nominal ledger.
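
In rough SQL terms, the single query is shaped something like this (the table and column names are just my shorthand, not our real schema):

SELECT nl.*,
       d1.descr AS level1_descr,
       d2.descr AS level2_descr,
       d3.descr AS level3_descr,
       d4.descr AS level4_descr,
       d5.descr AS level5_descr
FROM   nominal_ledger nl
JOIN   ledger_descr d1 ON d1.code = nl.level1_code
JOIN   ledger_descr d2 ON d2.code = nl.level2_code
JOIN   ledger_descr d3 ON d3.code = nl.level3_code
JOIN   ledger_descr d4 ON d4.code = nl.level4_code
JOIN   ledger_descr d5 ON d5.code = nl.level5_code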

I think I may have to build 2 cubes of 3 years each, or look at building a model that I can increment with 1 year's worth of data at a time to give me the monster cube.

The reason for this? The accountants would like to be able to find a brought-forward (b/f) figure by looking at the ledger from start to finish, rather than examining 2 or more cubes.
many thanks,
lex
 
Typical accountants: can't manage 2 things at once!

As I said, you may be able to speed this up by splitting it into multiple queries: a 1st query to bring in your data, and one query for each of your ledger descriptions.

i.e.

1st Query
TransactionData
LedgerCode1
LedgerCode2
LedgerCode3 etc..

2nd Query
LedgerCode1
LedgerCode1Description

3rd Query
LedgerCode2
LedgerCode2Description

and so on for each of your Ledger Descriptions.

This should speed up the database read for the transactional data, because you are only looking at one table. And as long as the column names match across your sub-queries, Transformer will match each ledger code with the correct description.
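
In rough SQL terms (made-up names again, just to show the shape), the split would be:

-- 1st query: transactional data straight off the one table, no joins
SELECT trans_date, amount,
       level1_code, level2_code, level3_code,
       level4_code, level5_code
FROM   nominal_ledger

-- 2nd query: a small lookup for the first ledger level; the alias makes
-- the column name match the 1st query so Transformer can pair them up
SELECT code  AS level1_code,
       descr AS level1_descr
FROM   ledger_descr

-- ...and one similar small query for each remaining level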

Hope this helps a bit more.

Gary Parker
Systems Support Analyst
 
Gary, after Enron, do you want accountants adding 10-digit numbers...?
I wish I could separate the queries. However, in the main dimension, I need to make a hierarchy of the 5 element codes with their descriptions for drill-down purposes.

If I bring in 5 IQD data sources for the element descriptions to be used as labels for the codes, I get contextual errors in Transformer that can't be solved by specifying uniqueness, given the hierarchy and the non-uniqueness of the codes.

To get the labels, the description table is aliased and joined 5 times to the ledger table to bring the correct descriptions into the sole query. I appreciate this is why I can only draw in 1M records per hour. For a single-year cube, fine; for 6 years, too long!
Am I missing something obvious?
Otherwise, it looks like I'll be asking the IT dept to belay the Sunday am routines.
thanks again,
lex
 
OK drlex, I think I understand your predicament a little more. What database are you reporting off? Could you ask your DB administrator to create a view over your ledger data and description tables?
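
Something like this sketch, reusing the made-up names from earlier in the thread:

-- the database does the five-way join once, behind a view
CREATE VIEW v_nominal_ledger AS
SELECT nl.*,
       d1.descr AS level1_descr,
       d2.descr AS level2_descr,
       d3.descr AS level3_descr,
       d4.descr AS level4_descr,
       d5.descr AS level5_descr
FROM   nominal_ledger nl
JOIN   ledger_descr d1 ON d1.code = nl.level1_code
JOIN   ledger_descr d2 ON d2.code = nl.level2_code
JOIN   ledger_descr d3 ON d3.code = nl.level3_code
JOIN   ledger_descr d4 ON d4.code = nl.level4_code
JOIN   ledger_descr d5 ON d5.code = nl.level5_code

-- Transformer's data source then becomes one flat read:
SELECT * FROM v_nominal_ledger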
Gary Parker
Systems Support Analyst
 
I might be able to get the DBA to concoct a description table that I can join to, but with 15,000 combinations, it may only be a small speed improvement.
We use Progress 9.1c on a Compaq AS400 and report via OpenLink ODBC.
It annoys me that I can't pause the build successfully, so I may try and take it up with Cognos via our (unimpressive) VAR.
Anyhow Gary, thanks for your suggestions.
lex
 
It was explained to me by a Cognos consultant that Transformer is just as happy (maybe even happier) reading the data from a file. If you could dump out the relevant records in the available time, you could then let Transformer chew on the file for as long as it wants.

An alternative (the one we're looking at) is to create a second database, copy over an exact copy of the data in its relational form, then denormalize it into a FACT table (Data Warehousing 101). At that point, both our report queries and Transformer are much happier.
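
As a sketch, and borrowing the invented view from earlier in the thread (CREATE TABLE AS syntax varies by database):

-- one-off load: pay the five-way join cost once at load time,
-- not on every cube build
CREATE TABLE ledger_fact AS
SELECT * FROM v_nominal_ledger

Reports and Transformer then read ledger_fact with no joins at all.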
 
I know this thread is a little old, but you might also want to try the following:

Only load the current year and prior year live from the AS/400. Then do a one-time dump of all other required years of history to a flat file. This file should not need updating unless there is a prior-period adjustment.

You would now have two identically-structured data sources:
1) the query against the AS/400
2) the flat file
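
As a sketch, again borrowing the invented names from earlier in the thread (the year column and cutoff are only examples):

-- live source: current and prior year only, read daily from the AS/400
SELECT * FROM v_nominal_ledger
WHERE  fiscal_year >= 2001

-- one-time history dump, written out to a flat file with the same columns
SELECT * FROM v_nominal_ledger
WHERE  fiscal_year < 2001

Transformer then reads both sources into the one model, since the column layouts match.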
 