visual c++ version6 program slowness 2

dwcasey · Aug 6, 2004

I have a developer that has a stand-alone executable that he compiled on his Windows XP laptop. This program reads in some data, then gives the output of the data read.

On one of our Windows 2000 servers, this program runs in about 10.5 minutes, but on his laptop it takes about 5.5 minutes.

I would think that the server would be quicker? The only thing that seems different is that he is using Windows XP.

Does XP have some optimizations? Is Visual C++ version 6 doing something different on XP than it does on Win2k?

Thanks.

Zech · Aug 6, 2004

A compiler optimizes your codes based on the system where you are compiling. In your case, your program is optimized for Windows XP and the laptop's hardware configuration.

However, I don't usually find such a big difference in performance when moving applications between Windows XP and Windows 2000. You should double-check the server configuration and the program's code. You also need to make sure that the program is compiled as a 'release' build and not a 'debug' build.
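One way to confirm which build is actually being run is to have the program report it. The following is a minimal sketch, not the developer's code: the function name `buildFlavour` is hypothetical. It relies on `_DEBUG`, which Visual C++ defines when a debug runtime library is selected, and on the standard `NDEBUG` macro, which release configurations normally define.

```cpp
#include <string>

// Hypothetical helper: reports how this translation unit was compiled.
// _DEBUG is defined by Visual C++ when a debug runtime (/MDd, /MTd, /LDd)
// is selected; NDEBUG is the standard macro that disables assert() and is
// normally defined in release configurations.
std::string buildFlavour()
{
#if defined(_DEBUG)
    return "debug (_DEBUG defined)";
#elif defined(NDEBUG)
    return "release (NDEBUG defined)";
#else
    return "unknown (neither _DEBUG nor NDEBUG defined)";
#endif
}
```

Printing the result of `buildFlavour()` at startup on both machines would show whether the same kind of build is being compared.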

dwcasey · Aug 6, 2004

Thanks for the tip about release vs. debug, I will have the developer check that out.

We have moved the .exe file from his XP laptop to the Win2k server and also recompiled on the Win2k server and it is still several minutes slower on the Win2k server.

ArkM · Aug 7, 2004

The server may be slower than a (new) laptop. For example, we have a server with a Pentium/800 on our network, but we also have workstations with a Pentium/2600 and faster.
Check the processor specifications of your server and the laptop before starting an investigation. Also check your application's priority on the server...

dwcasey · Aug 8, 2004

I checked those. Server is a 2.0GHz Xeon with 1GB ram. The laptop is a Dell with a Centrino 1.4GHz and 512MB ram.

Once the program starts, it averages 97% CPU usage.

mingis · Aug 9, 2004

That could be. I have read that a 1.6 GHz Pentium M is nearly as fast as ordinary 3 GHz systems. Clock frequency is not the only parameter that determines the overall speed of a system; it also depends on many other parameters of the chipset. For example, AMD processors are commonly said to be about 1.5 times faster than Intel ones at the same frequency.

> I would think that the server would be quicker?

Server OSes are always SLOWER than single-user ones.

They are optimised to serve many users at a time, not to be quick.

Does the program produce much display output? If so, graphics card speed is a critical point, and servers usually have weak graphics.

If it reads a huge amount of data from the disk, check the hard disk speeds: compare the time it takes to simply copy the same big file on each machine.
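As an alternative to copying a file by hand, the disk comparison can be sketched as a small timed read. This is only an illustration under assumptions: the names `ReadResult` and `timeSequentialRead` are hypothetical, the 1 MiB chunk size is arbitrary, and `<chrono>` requires a C++11 compiler (with Visual C++ 6 something like `clock()` or `GetTickCount()` would be needed instead).

```cpp
#include <chrono>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

struct ReadResult {
    std::size_t bytes;   // total bytes read from the file
    double seconds;      // elapsed wall-clock time
};

// Reads the whole file sequentially in 1 MiB chunks and times the reads.
// If the file cannot be opened, the loop is skipped and bytes stays 0.
ReadResult timeSequentialRead(const std::string& path)
{
    std::ifstream in(path.c_str(), std::ios::binary);
    std::vector<char> buffer(1 << 20);
    std::size_t total = 0;

    const auto start = std::chrono::steady_clock::now();
    while (in) {
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        total += static_cast<std::size_t>(in.gcount());  // counts the final partial chunk too
    }
    const auto stop = std::chrono::steady_clock::now();

    ReadResult result;
    result.bytes = total;
    result.seconds = std::chrono::duration<double>(stop - start).count();
    return result;
}
```

Run it on the same large file on both machines. A second run on the same machine may be served from the operating system's file cache, so the first run is usually the more honest number.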

dwcasey · Aug 9, 2004

The only thing it displays is a message in a DOS command window saying it has read in the data; then, when it's done, it displays a result summary.

mingis · Aug 9, 2004

Try measuring the speed of a simple loop application like

for(int ii=0; ii<20; ii++)
for(int jj=0; jj<1000000000; jj++);

It should be OS and optimisation independent and will show the real relative CPU speed (provided those 97% do not include time wasted on switching between processes).
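One caveat with an empty loop is that an optimizing release build is allowed to remove it completely, in which case nothing is measured. A minimal sketch of a timed variant follows; `timeBusyLoop` is a hypothetical name, the `volatile` counter is there to keep the compiler from discarding the loop, and `<chrono>` assumes a C++11 compiler rather than Visual C++ 6.

```cpp
#include <chrono>

// Increments a volatile counter `iterations` times and returns the elapsed
// wall-clock time in seconds. Because the counter is volatile, every read
// and write must actually be performed, so the loop cannot be optimized away.
double timeBusyLoop(unsigned long iterations)
{
    volatile unsigned long counter = 0;

    const auto start = std::chrono::steady_clock::now();
    for (unsigned long i = 0; i < iterations; ++i) {
        counter = counter + 1;
    }
    const auto stop = std::chrono::steady_clock::now();

    return std::chrono::duration<double>(stop - start).count();
}
```

Calling it with the same iteration count on both machines and comparing the returned seconds gives a rough relative figure for raw integer loop speed, nothing more.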

Also, Zech wrote:

> A compiler optimizes your codes based on the system where you are compiling

Are you sure? I don't think so.

Zech · Aug 9, 2004

mingis said:

> Also, Zech wrote:
>
> > A compiler optimizes your codes based on the system where you are compiling
>
> Are you sure? I don't think so.

When a compiler compiles source code, it does more than strip the comments, variable names, etc. and convert the result into machine language. A compiler does its own additional optimization based on the OS and the hardware of the computer where you are compiling. That's why many Linux geeks prefer to compile their applications, patches, etc. from source rather than use pre-compiled binaries. In fact, there are many source-based Linux distros that distribute their packages only as source code. The primary reason they go through the hassle of recompiling is performance: they want to take advantage of compiler optimization and tune the software to better suit the host machine's configuration.

In the following article (http://msdn.microsoft.com/visualc/v...ry/en-us/dv_vstechart/html/optimization.asp), you will see that one feature of the compiler for Visual C++ 7.0 or .NET is that it can take advantage of specific features of Intel Pentium 4 and AMD Athlon.

I hope that clarifies my statement a bit.

Also, regarding dwcasey's problem: I am not quite sure what the exact problem is, but I do feel that your developer could do much more to optimize the code. Again, that depends on what kind of work the software is trying to do.

Zech · Aug 9, 2004

In addition, doing looping is not a very good way of measuring performance because the software may be sharing the processor's time with other running applications/services and these running applications/services may differ between the two machines. Thus, you will not get an objective view of the software performance.

Also, looping will not give you an OS- and optimization-independent result. Compiler optimization always takes place at compile time, and the program will always be optimized for the specific OS on which it is compiled. That's why C++ decompilers are non-existent: the resulting binary is already optimized to such an extent that reversing it back to its original C++ code is impossible.

If this "slowness" truly bothers you, you can ask your developer to use a profiling tool (if they haven't done so already) to find which pieces of code are the performance bottlenecks. I am not sure about Visual Studio .NET, but a profiling tool is already built into the development environment of the Visual C++ 6.0 Professional edition. Alternatively, you can also find free profiling tools on the internet.

I hope that helps.

mingis · Aug 9, 2004

> In the following article (http://msdn.microsoft.com/visualc/v...ry/en-us/dv_vstechart/html/optimization.asp), you will see that one feature of the compiler for Visual C++ 7.0 or .NET is that it can take advantage of specific features of Intel Pentium 4 and AMD Athlon.

OK, but I don't see anything there saying that optimisation flags are chosen by the compiler automatically. You must decide manually what kind of optimisation to use:

> On a Pentium 4 or AMD Athlon machine, the /G7 /arch:SSE2 version runs about 10% faster. This code cannot be run on a machine without the appropriate chip.

A compiler cannot automatically generate code that will not work on certain machines. That would mean that compiling on an Athlon would produce code that does not run on a Celeron.
So my opinion is that dwcasey could try testing different optimisation options; maybe some of them will be more suitable for the Xeon.
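When experimenting with such options, it can help to have the binary state which instruction set it was built for. The sketch below is an assumption-laden illustration, not something from the article: `targetedInstructionSet` is a hypothetical name, `_M_IX86_FP` is a macro that later 32-bit Visual C++ versions set according to the `/arch` option (it may not exist in the versions discussed here), and `__SSE2__` is the GCC/Clang equivalent.

```cpp
#include <string>

// Hypothetical helper: reports the SIMD level the compiler was told to
// target, based purely on predefined macros fixed at compile time.
std::string targetedInstructionSet()
{
#if defined(_M_IX86_FP)
    // 32-bit Visual C++: 0 = no SSE targeted, 1 = /arch:SSE, 2 = /arch:SSE2 or higher
  #if _M_IX86_FP >= 2
    return "SSE2 or newer";
  #elif _M_IX86_FP == 1
    return "SSE";
  #else
    return "x87 only";
  #endif
#elif defined(__SSE2__) || defined(_M_X64)
    // GCC/Clang with SSE2 enabled, or any 64-bit x86 Visual C++ build
    return "SSE2 or newer";
#else
    return "SSE2 not guaranteed";
#endif
}
```

The value is decided when the program is compiled, not when it runs, which matches the point above: the flags are a choice made by whoever builds the program, not something detected from the build machine's CPU.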

> In addition, doing looping is not a very good way of measuring performance because the software may be sharing the processor's time with other running applications/services and these running applications/services may differ between the two machines.

Yes, of course it will measure the performance of the whole system (hardware + OS + running environment), but dwcasey said his application uses 97% of processor time. What do you think: is this pure process running time, or does it also include some time spent switching between processes?

In addition, you can measure the execution time of the different kinds of calculation used in your application by adding calls to different functions to the loop body. Maybe, for instance, the floating-point unit is slow on your server; floating-point calculation is probably also an unusual job for servers. Possible optimisations should be taken into account here:

for(ii=0; ii<1000; ii++) rr=sqrt(123.45);

could be optimised to

rr=sqrt(123.45);
for(ii=0; ii<1000; ii++);

or more likely to

rr=11.11;
for(ii=0; ii<1000; ii++);
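To keep the optimizer from hoisting or constant-folding the call like that, the input can be made unknown at compile time and different on every pass, with the results accumulated and returned. A minimal sketch, assuming a C++11 compiler for `<chrono>`; the names `SqrtTiming` and `timeSqrtLoop` are hypothetical.

```cpp
#include <chrono>
#include <cmath>

struct SqrtTiming {
    double sum;      // accumulated results, returned so the work is observable
    double seconds;  // elapsed wall-clock time
};

// Times `iterations` calls to sqrt on inputs the compiler cannot predict.
SqrtTiming timeSqrtLoop(unsigned long iterations)
{
    volatile double seed = 123.45;  // volatile read: value is not a compile-time constant
    const double base = seed;
    double sum = 0.0;

    const auto start = std::chrono::steady_clock::now();
    for (unsigned long i = 0; i < iterations; ++i) {
        // The argument changes on every pass, so the call cannot be hoisted.
        sum += std::sqrt(base + static_cast<double>(i));
    }
    const auto stop = std::chrono::steady_clock::now();

    SqrtTiming result;
    result.sum = sum;
    result.seconds = std::chrono::duration<double>(stop - start).count();
    return result;
}
```

Printing both `sum` and `seconds` after the call makes sure the computed values are actually used, and the same iteration count on both machines gives a rough comparison of floating-point speed.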

dwcasey · Aug 10, 2004

Wow! Thanks a bunch for all the advice.

The program the developer is writing uses an optimization engine (ilog) to optimize a driver's route on the highway.

So there is a lot of "math" involved when this thing runs.

I would think that an Intel Xeon 1.8GHz would be sufficient for this. And I am still mystified that his Pentium M laptop outpaces the server, but there ya go.

Thanks again for all the tips. Keep 'em coming if you can. I will update the list if anything new happens.
