Report Speed Tips
Report Speed Tips
(OP)
I would like to collect your ideas on improving report speed. I have to chuckle when I see people complaining about reports that take 20 seconds. We have a statewide database with millions of records, hundreds of tables, and an old inefficient server, and if we can get our reports to run in under 15 minutes, we think we have a fast report. We also are unable to upgrade our clients, meaning we can't use the optimal native driver.
The following are some strategies for improving speed that I've picked up, but I would like to hear about other people's methods to expand (or correct) upon this list:
1-Pass all elements of the record selection criteria to the SQL statement.
a- Do not use (most) formulas for record selection
b- Follow guidelines in SynapseVampires FAQ
2-Perform grouping on server (in our case, it's faster NOT to, though)
3-Link tables on indexed fields whenever possible.
4-Select on indexed fields when possible.
5-Remove unused tables, unused formulas, unused running totals from the report.
6-Use conditional formulas instead of running totals when possible.
7-Use conditional formulas to return a desired field result or not, instead of using suppression to eliminate unwanted records.
8-If using running totals, evaluate the minimum number of times, i.e., on change of group instead of for every record, if this is an option.
9-Group on indexed fields when possible (not sure, just surmising)
10-Group on database fields instead of formulas when possible (not sure, just surmising)
11-Avoid subreports (in most cases), although they may result in a faster report if they eliminate significant row inflation.
12-Using a command as the datasource will be faster than using CR to link tables, select records, etc.
13-If using a command as datasource, develop parameters within the command.
14-Link tables by selecting a lead table that has one indexed record to link to tables containing multiple corresponding records.
15-Suppress unnecessary sections; suppress or hide details to facilitate grouping on the server.
16-Avoid nested formulas.
17-Whenever possible, limit records through selection, not suppression.
18-Use SQL expressions to convert fields to be used in record selection instead of using formula functions.
I'm sure I've missed many obvious concepts. Please add any you can think of. Links to other sources are welcome as well.
Thanks for contributing!
-LB
The following are some strategies for improving speed that I've picked up, but I would like to hear about other people's methods to expand (or correct) upon this list:
1-Pass all elements of the record selection criteria to the SQL statement.
a- Do not use (most) formulas for record selection
b- Follow guidelines in SynapseVampires FAQ
2-Perform grouping on server (in our case, it's faster NOT to, though)
3-Link tables on indexed fields whenever possible.
4-Select on indexed fields when possible.
5-Remove unused tables, unused formulas, unused running totals from the report.
6-Use conditional formulas instead of running totals when possible.
7-Use conditional formulas to return a desired field result or not, instead of using suppression to eliminate unwanted records.
8-If using running totals, evaluate the minimum number of times, i.e., on change of group instead of for every record, if this is an option.
9-Group on indexed fields when possible (not sure, just surmising)
10-Group on database fields instead of formulas when possible (not sure, just surmising)
11-Avoid subreports (in most cases), although they may result in a faster report if they eliminate significant row inflation.
12-Using a command as the datasource will be faster than using CR to link tables, select records, etc.
13-If using a command as datasource, develop parameters within the command.
14-Link tables by selecting a lead table that has one indexed record to link to tables containing multiple corresponding records.
15-Suppress unnecessary sections; suppress or hide details to facilitate grouping on the server.
16-Avoid nested formulas.
17-Whenever possible, limit records through selection, not suppression.
18-Use SQL expressions to convert fields to be used in record selection instead of using formula functions.
I'm sure I've missed many obvious concepts. Please add any you can think of. Links to other sources are welcome as well.
Thanks for contributing!
-LB
RE: Report Speed Tips
Some of this depends upon your software version, and optimizing tips are also relative to the database being used, so I can only respond generically, but here's a few thoughts:
2-Perform grouping on server (in our case, it's faster NOT to, though)
This isn't always true, but the idea here is to allow the database to do the grouping/aggregates if it can. Most turn this on thinking that by doing so a group by clause will magically appear in the sql statement, which is NOT true unless the report is properly designed to allow for it, and it can hinder performance if it doesn't pass the group by clause.
9-Group on indexed fields when possible (not sure, just surmising)
Won't matter unless the database is doing the grouping, which means grouping on server would be on and the report properly designed to pass a group by clause. A SQL Command would allow for this.
10-Group on database fields instead of formulas when possible (not sure, just surmising)
I tested this years ago, certainly if the database can do your grouping it's better to do so, but a formula field containing simply {table.field} doesn't have much overhead as opposed to using the database field directly if the grouping is done locally.
12-Using a command as the datasource will be faster than using CR to link tables, select records, etc.
I doubt that this is true, at least not significantly (there's minimal overhead in the Crystal GUI), more of a factor would be the SQL being passed to the database.
15-Suppress unnecessary sections; suppress or hide details to facilitate grouping on the server.
I thought that you turned off the group on server? If so, I don't think that it will pass a group by clause from the GUI.
Would be a Lovely thing were you to post back your completed list afterwards, many people will benefit from your efforts here.
-k
RE: Report Speed Tips
About the command as datasource vs. CR GUI--I've found that I can copy the Show SQL query from a conventionally developed report into a command and reap huge increases in speed. This doesn't seem like an obvious result--since both are passing to the SQL, but I am quite confident that this is the case. Maybe the "advantage" of having a slow system is that something that has an impact can easily be identified.
I hope a lot of people will respond with more ideas.
-LB
RE: Report Speed Tips
If performance is a factor I use Stored Procedures, which generally prove much faster. I gather that isn't an option for you.
Then as opposed to a SQL Command, I tend to create reusable Views on the database, avoiding embedding this sort of thing in the report.
Not always a choice, but it reaps huge benefits for commonly used info.
-k
RE: Report Speed Tips
I didn't mention our database and connectivity as I meant for this to be a general discussion, but the database is Oracle (can't determine the version), and the clients are Oracle 8.0.5 for Windows NT. Currently using a CR Oracle ODBC Driver 4.20 (DataDirect). Upgrading the clients statewide is also not an option at this point. CR version 11.0 SP2.
I'm also noticing that the same report run in 11.0 is dramatically slower than in 8.0 when there is a lot of server traffic, but both are equally fast when there is no traffic. Any ideas why this would be? I'm also having the odd experience in 11.0 of a much more complex report running dramatically faster than a scaled down version (fewer groups and fields) of the same report--with the same tables, linking, and SQL passing in both cases. Still testing why this is.
-LB
RE: Report Speed Tips
RE: Report Speed Tips
-LB
RE: Report Speed Tips
The network traffic issue seems odd if all things are equal, but who knows with Crystal...
A scaled down version of a report might not be using an index that is availble to a report using more fields due to a compound index, meaning mutiple fields.
So if one field is omitted, then the index can't be used, and it's back to a full table scan or a less efficient index.
There are tools for Oracle which allow you to see precisely which indexes are being used, and many other factors, it's not for the faint of heart or newbies.
Here's some delightfully romantic bedtime reading on the topic ;)
http://www.dba-oracle.com/art_sql_tune.htm
-k
RE: Report Speed Tips
mocgp, Thanks for your comments also. This led me to test out and confirm another conclusion:
19-Set parameters to one value versus multiple values whenever possible, e.g., if you have separate sets of codes depending upon a hierarchy, e.g., state, region, city, then instead of using only one parameter that selects one or more cities that then constitute a region or a state, set up a string parameter {?location level} and use a select case statement like:
(
select {?location level}
case "State" : {table.state} = {?state}
case "Region" : {table.region} = {?region}
case "City" : {table.city} = {?city}
)
The lists of values are then set up with "None" as an option for each of the subparameters, and the prompt text instructs users like this: "If the above "Location Level" = "State", then select a state; otherwise select "None". They don't really have to select "None" for the formula to work, it makes more sense to the user on the parameter screen. This made one of my reports significantly faster.
-LB
RE: Report Speed Tips
I see no logical reason why the above criteria would differ from a "ALL" or "NONE"scnario from a SQl standpoint, but ot add to your dilema of optimizing SQL, rearranging a WHERE clause can change performace radically.
Stating that the above will increase performance means nothing until it's qualified, I'm glad that it sped one up, but what is in Crystal means nothing, what is passed to the database is everything.
I'd bet that I coulod write this out another way and net the same performance.
To have some fun with this, copy and paste a query into your query execution tool of choice and rearrange it and execute it.
lso keep in mind that if you run the report one way, then another, the data may be in cahce and that alone will jump the performance.
Getting Crystal to pass SQL can be voodoo, but optimizing queries is a refined, complex issue, and I see no logical reason why your means of passing parms would have any effect on it unless the query built differed from the previous in some way.
Ignoring parts of a record selection formula in different ways nets the same SQL, which is nothing.
-k
RE: Report Speed Tips
{table.city} = {?city}
Then in a grouping formula, I used:
if {?Location Level} = "City" then
{table.city} else
if {?Location Level} = "Region" then
{table.region} else
if {?Location Level} = "State" then
{table.state}
...where each level represented a rollup of cities.
At times I have wondered about whether the data was somehow already organized in some way from a previous report run. If I'm testing report speed, how can I determine whether this is the case? Or how can I clear the cache (I know absolutely nothing about this, and the server is not under my control)?
-LB
RE: Report Speed Tips
LB you say that you wanted to keep this generic to the environment, but I've found that for report performance issues, the environment is a very important piece. I originally started out using CR with a SQL Server database. I am now working with a Borland Interbase database. What I've found is that many of the things I learned and used with SQL Server needed to be unlearned/relearned as it performs very differently.
I have found that using a command rather than the CR table linking GUI can result in huge increases in performances with Interbase. By making this change, I had one report go from taking over 17 seconds to run to under 2 seconds. Since in the application, some clients will run a batch of a couple hundred reports, this is huge.
When I have some more time, I will give the other suggestions a better read and hopefully have some more information to post.
B
RE: Report Speed Tips
-k
RE: Report Speed Tips
Since the current DBMS can't be touched, it may make sense to pursue a data warehouse or a data mart approach.
If there's a chance the organization would be willing to pursue this, propose a meeting and pitch the idea.
If there's no chance of this happening, you may be able to create your own data mart. Using views and materialized views on even a platform as simple as MS Access can make huge improvements in reporting speed and other benefits...
- Ido
view, e-mail, export, burst, distribute, and schedule Crystal Reports.
www.MilletSoftware.com
RE: Report Speed Tips
Then you cna create queries within that filter the rows down to mamageable entities, and perform aggregates where it makes sense, and then you can use Access as the reporting database.
It's fugly, but I've seen it in very large IT shops that can't leverage the database fully.
-k
RE: Report Speed Tips
I haven't tried the linking/filtering in Access method, but will. I guess this becomes a two-step process--first you'd be running a report in Access to populate the database, and then running a second report (or multiple reports) in Crystal against that new database. I guess the assumption is that the Access database would only be updated periodically? Otherwise, how would a speed savings be realized? Or maybe I'm misunderstanding how to go about this.
-LB
RE: Report Speed Tips
The only 2-step cases are those where you wish to create "materialized views" in Access by running "Make Table" queries. You can automate such processes using simple Macros.
- Ido
view, e-mail, export, burst, distribute, and schedule Crystal Reports.
www.MilletSoftware.com
RE: Report Speed Tips
The more formulas you have the slower the report will run because formulas are processed on the local machine so, groupng on them is less efficient than grouping on a database field.
Using page number instead of page n of m is fater becasue n of m has to process all pages before displaying first page in order to calculate total page count.
selecting "select distinct records" slows performance.
Running totals slows performance.
These are just a few I try to keep in mind while writing a report. I would love to see any other suggestions you receive.
RE: Report Speed Tips
"In reference to grouping on server. The server will only process if your report sql has a "where" clause in it. "
A grouping on server is required to create a where clause in the SQL, as is the design of the report. This also applies to generating a Having clause (see below),
"The more formulas you have the slower the report will run because formulas are processed on the local machine so, groupng on them is less efficient than grouping on a database field."
Using SQL Expressions in lieu of formulas is a plus here, perhaps that's part of what Robbie is trying to say.
The significant gain is realized by performing the grouping on the server, if you place a database field in a formula, and Crystal has to do the grouping in either case, it will be similar. Formulas slow reports according to their complexity, not simply as the result of being formulas. Placing a date formula in the report header will not crush performance, so don't use this as a generic rule.
"Using page number instead of page n of m is fater becasue n of m has to process all pages before displaying first page in order to calculate total page count. "
It isn't faster overall, it simply makes the first page return faster, and that is also based on many other factors, such as formulas/record selection/Top N/etc. which is based on the total rowset. The trick LB often uses of a formula to count or sum a value and then using the group selection formula when it isn't passed as a HAVING clause will impede performance as well.
-k
RE: Report Speed Tips
RE: Report Speed Tips
80 formulas in a report is probably going to prove punishing and many of those can probably be converted to SQL Expressions which should speed things up, and I pity anyone forced to work in those enviromnets, that's when it's time to find a client/amployer that considers business intelligence critical to their futture ;)
-k
RE: Report Speed Tips
RE: Report Speed Tips
My preference is to generate Views or SPs to base the report on. Then if something changes, I tweak the View/SP and may not have to touch the report at all.
Second might be to use a Command Object, it depends, sometimes I prefer the solution below over this.
Third would be the Crystal GUI and SQL Expressions.
-k
RE: Report Speed Tips
In general, are there certain report options or options settings that enhance speed other than the obvious "Use indexes of server for speed"?
-LB
RE: Report Speed Tips
That would be a quick performance leap.
Asynchronously does mean that you can interrupt processing from my memory, you shouldn't see any performance gains there.
-k
RE: Report Speed Tips
As I mentioned in a different thread, our local client is Oracle 8, and the Oracle Server driver option isn't available. I did try to research what would be involved in upgrading to an Oracle 9 client (cost, etc.), but gave up as I wasn't sure what I was looking at on the Oracle site, and then I had a conversation with a colleague about the fact that everyone else in the state is using Oracle 8 and since we are developing reports to be used statewide, it made sense to develop them based on a driver that works with Oracle 8.
I'd like to followup another comment you made. In an earlier post you mentioned: "rearranging a WHERE clause can change performance radically." I was under the impression that CR automatically optimized SQL statements developed with the CR GUI, but recently ran into a situation where I added criteria to the beginning of the selection formula, and discovered that part of the remaining formula no longer was passed to the SQL. I then moved the new criteria to the end of the formula, and then all criteria passed (no difference to parens or anything)--which made me think that the order of the clauses in the formula DO matter.
In the link you provided re: Oracle optimization, when discussing optimizing SQL, the author suggests that in the From clause (e.g., if writing a command), you should start with the table that has the most records (if I'm reading this correctly). I guess I thought you should lead with a table that has one record that is then linked to those with many records.
Are there specific rules about the order of tables in the from clause of commands or about the order of clauses in the where part of a record selection formula in the GUI or in a command?
Anyone with comments please jump in.
-LB
RE: Report Speed Tips
I spent weeks trying to figure out the pattern, and there just isn't a discernible one.
The point, of course, is that you must always get the SQL to pass, but aside from that, the order in which the WHERE clause passes can improve performance, but this has more to do with indexes and the optimizer than simply the number of the rows in tables. So rearranging the record selection should alter how the where clause is generated, so don't be shy about switching things around and testing. If you find that in one configuration it doesn't pass, you can probably still get it todo so in that order, it just takes lots of patience and creativity. Really, it's Voodoo...
As for the Oracle 8 client not allowing for native connectivity through CR XI, I didn't recall that. I guess you mean that Oracle Server doesn't show as an option, and I'm not surprised.
Anyway, rearrange the record selection in different configurations in each report and test for gains.
Also one can check the execution plan on the Oracle server to see whether you are using full table scans, indexes, etc.
Getting a sharp Oracle dba proficient in tuning (this is rare, most think that they are) involved is critical.
-k
RE: Report Speed Tips
-LB
RE: Report Speed Tips
RE: Report Speed Tips
I usually give the reverse advice, and occasionally bump up against Oracle dbas who don't want additional objects in the database, but of course to truly sinmply reusablility, the repository is a bad idea as tools other than crystal will have to go through the Crystal API to use them.
I'm sure Business Objects appreciates your thinking, however Mr. Elison probably would not.
Sounds like you understand databases well, I hope that you'll make it a habit to contribute here.
-k
RE: Report Speed Tips
My thinking is the the business logic/rules are the responsibilty of the report writer and not the DBA, so the report writer should have some flexibility in controlling this logic (i.e the SQL).
RE: Report Speed Tips
But it's akin to having business rules embedded in application development.
In a small shop without competent database programmers and BAs you're limited, and go with what you have.
And reporting tends to be the redheaded stepchild of IT anyway, little justification or forethought is applied, yet it's the foundation for upper management in making critical business decisions.
That's a catastrophic disconnect, and explains most contracts I take that need to improve efficiencies.
To emphasize this point, I often use the simplified analogy that you have 2 ways to supply the prices for your parts, you can hardcode the values in code somewhere, or you can create a lookup/reference table.
Obviously you'd prefer the latter. Yet people often think very little about hardcoding other business rules into a proprietary layer, instead of taking the time to make them part of the database or an appropriate data mart so that all tools use the same rules, and enjoy simplified maintenance.
True, it requires that you measure twice and cut once, which is analogous to excellent, and too rare in IT management.
Everything that you store in the repository or in a BO View/Universe can be done on the database, and then it's available to all tools and you have a unified layer of business rules.
The notion that a report writer is competent enough to write SQL confirms that the dba thinks them competent enough to write Views/SPs anyway, it is afterall, all SQL executed on the database (OK, some other languages such as Java in SPs on Oracle can be used, but you get the point).
The fact that they allow software developers to use pass through SQL smacks of short term resource planning and management. A few here and there won't significantly impact performance, but it snowballs, and so the DBAs require more hardware and workarounds, and it continues to feed on itself, spiralling into a hodge podge of disparate, inefficient code that requires constant maintenance and additional hardware. Plus the method for extracting the data can vary between reports then, further complicating maintenance.
BAs, database programmers and software developers are all different titles for a valid reason.
Bring me the best report writer and show me her/his work, and I'll bring the best BA, database programmer and report writer and this theory that the report writer should have flexibility in implementing the business rules using SQL will prove optimistic.
I've had this discussion with the top IT people in the top firms for over 10 years regarding implementing Crystal. and even longer for software in general, and listened to the best arguments, and not once have they decided that in the best scenario a report writer should have this responsibility, it's stopgap and demonstrates poor management.
It's true that under resourced shops with unrealistic timeframes for their projects might be able to hit a few goals earlier on by this approach, but long term it's an expensive workaround to implementing quality software.
-k
RE: Report Speed Tips
11-Avoid subreports (in most cases), although they may result in a faster report if they eliminate significant row inflation.
If you must use subreports, opt for on-demand subreports, if possible. Then, Pass 2 of the multi-pass flow only needs to happen during the drill down.
I've also read that using arrays instead of subreports can also help performance although I'm not quite sure how to use arrays that well so I can't elaborate. I'll try to find the link to the article.
RE: Report Speed Tips
If all of your field heading objects, and field objects which you know should be one line high do not have the can grow flag set then Crystal Reports is able to skip over them when its working on calculating the page n of m.
Also if you are just using a formula to concatenate two fields it is faster to embed the fields in a text object.
RE: Report Speed Tips
Avoiding formulas in general (using text objects is one way, you can also concatenate in a SQL Expression) makes sense, add to that minimizing applying any formatting attributes makes sense as well, thanks for the tips!
-k