Auto-parallelized code runs, but why no speedup?
(OP)
I have a working program (Fortran 77) that I’m trying to auto-parallelize on a cluster. The changes made in the compiling makefiles (abbreviated) are shown in the curly brackets below.
Build library:
FOR=ifort -c -O3 {-parallel}
LINK=ifort {-parallel}
PROG_DIR=/export/home/mydir/
HJS=$(PROG_DIR)/hjs
.f.o:
$(FOR) $<
rm libprog.a
$(FOR) $(HJS)/*.f
ar -rv $(PROG_DIR)/libprog.a *.o
rm *.o
Compile executable:
FOR=ifort -c -O3 {-parallel}
LINK=ifort {-parallel}
PROG_DIR=/export/home/mydir/
LIBPROG=-L$(PROG_DIR) -lprog
.f.o:
$(FOR) $(OPT) $<
prog_hjs.o: prog_hjs.f
$(FOR) prog_hjs.f
prog_hjs: prog_hjs.o
$(LINK) -o prog_hjs prog_hjs.o $(LIBPROG)
The program appears to utilize as many cores as are given to it, but execution speed is unchanged. What is missing? Is anything more required in the compilation above?
RE: Auto-parallelized code runs, but why no speedup?
In general, it is better to write the parallel constructs yourself, for instance with OpenMP. This is usually much more efficient than automatic parallelization, although it is not easy to do.
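To give an idea of what hand-written OpenMP looks like in fixed-form Fortran 77, here is a minimal sketch. It is not taken from the program above: the program name OMPSUM, the array A and the size N are made up for illustration. It only shows the usual pattern of marking one independent loop as parallel and declaring the accumulated variable as a reduction.

```fortran
      PROGRAM OMPSUM
C     Hypothetical example: sum of squares with an OpenMP reduction.
C     The names OMPSUM, A, N and TOTAL are illustrative only.
      INTEGER N, I
      PARAMETER (N = 100000)
      DOUBLE PRECISION A(N), TOTAL
C     Serial initialization of the data.
      DO 10 I = 1, N
         A(I) = DBLE(I)
   10 CONTINUE
      TOTAL = 0.0D0
C     Iterations are independent except for the sum, so TOTAL is
C     declared as a reduction; the loop index I is private by default.
C$OMP PARALLEL DO REDUCTION(+:TOTAL)
      DO 20 I = 1, N
         TOTAL = TOTAL + A(I) * A(I)
   20 CONTINUE
C$OMP END PARALLEL DO
      PRINT *, 'TOTAL = ', TOTAL
      END
```

The directives are ordinary comments unless OpenMP is switched on at compile and link time (-openmp with older ifort versions, -qopenmp with newer ones, -fopenmp with gfortran). The number of threads can then be set with the OMP_NUM_THREADS environment variable. Whether a given loop actually speeds up still depends on how much work it does per iteration, so it is worth timing it with one thread and with several.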
François Jacq