Increasing double loop efficiency

(OP)
I have the following double loop:

CODE -->

DO i=1,100
  DO j=1,100
    S(i,j) = S(i,j) + ALPHA*exp((real(i)/b1)**2)*exp((real(j)/b2)**2)
  ENDDO
ENDDO 

Here, S is a 100x100 symmetric REAL array, and ALPHA, b1, and b2 are REAL constants.

I am hoping to use the BLAS routine dgemm (distributed with LAPACK) or some other library function to speed up this calculation. However, dgemm( TRANSA, TRANSB, M, N, K, ALPHA, A, LDA, B, LDB, BETA, C, LDC ) needs arrays A and B as input. I could fill A with the first exponential in one DO loop and B with the second exponential in another DO loop, but this approach may be less efficient.

What would be the most efficient way to calculate the above double loop in Fortran?

Thanks,
Vahid

RE: Increasing double loop efficiency

In your case you compute
S = 1*S + ALPHA*A*B 

so the call
DGEMM(TRANSA, TRANSB, M, N, K, ALPHA, A, LDA, B, LDB, BETA, C, LDC ) 
seems to be
DGEMM('N', 'N', 100, 100, 100, ALPHA, A, 100, B, 100, 1.0D0, S, 100)

where
S is your 100x100 matrix

A is a 100x100 matrix of this form, in which only the first column is non-zero and all other columns are zero:
    | exp((  1/b1)**2)  0   0   0   0 ... 0 |
    | exp((  2/b1)**2)  0   0   0   0 ... 0 |
    | ..................................... |
    | exp((100/b1)**2)  0   0   0   0 ... 0 |
 

B is a 100x100 matrix of this form, in which only the first row is non-zero and all other rows are zero:
    | exp((1/b2)**2)  exp((2/b2)**2) ... exp((100/b2)**2) |
    |        0            0          ...     0            |
    | ....................................................|
    |        0            0          ...     0            |
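
As a quick sanity check of this formulation, the sketch below builds the zero-padded A and B and compares S + ALPHA*A*B against the double loop. It uses the intrinsic matmul as a stand-in for DGEMM so that it needs no external library. The values of ALPHA, b1, and b2 and the initial contents of S are placeholders chosen for illustration, not values from this thread.

```fortran
program check_padded_product
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 100
  ! Placeholder constants (assumptions, not from the thread)
  real(dp), parameter :: alpha = 0.5_dp, b1 = 60.0_dp, b2 = 80.0_dp
  real(dp) :: s_loop(n,n), s_mat(n,n), a(n,n), b(n,n)
  integer :: i, j

  s_loop = 1.0_dp
  s_mat  = 1.0_dp

  ! Reference: the double loop from the original post
  do i = 1, n
    do j = 1, n
      s_loop(i,j) = s_loop(i,j) &
                    + alpha*exp((real(i,dp)/b1)**2)*exp((real(j,dp)/b2)**2)
    end do
  end do

  ! A: only the first column is non-zero; B: only the first row is non-zero
  a = 0.0_dp
  b = 0.0_dp
  do i = 1, n
    a(i,1) = exp((real(i,dp)/b1)**2)
    b(1,i) = exp((real(i,dp)/b2)**2)
  end do

  ! The same update written as S = 1*S + ALPHA*A*B
  s_mat = s_mat + alpha*matmul(a, b)

  print *, 'max abs difference:', maxval(abs(s_mat - s_loop))
  if (maxval(abs(s_mat - s_loop)) > 1.0e-10_dp*maxval(abs(s_loop))) then
    error stop 'matrix form and double loop disagree'
  end if
end program check_padded_product
```

The two results can differ in the last bits because the products are formed in a different order, so the comparison uses a relative tolerance rather than exact equality.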
 

RE: Increasing double loop efficiency

(OP)
Thanks mikrom for your response. Out of curiosity, wouldn't it be more efficient to define A as a column matrix, B as a row matrix and set K=1? Wouldn't this involve fewer calculations?

Vahid

RE: Increasing double loop efficiency

Yes, of course (I didn't even think of that).

A is a 100x1 matrix of this form:
    | exp((  1/b1)**2) |
    | exp((  2/b1)**2) |
    | .................|
    | exp((100/b1)**2) |
 

B is a 1x100 matrix of this form:
    | exp((1/b2)**2)  exp((2/b2)**2) ... exp((100/b2)**2) |
 
Then the call seems to be
DGEMM('N', 'N', 100, 100, 1, ALPHA, A, 100, B, 1, 1.0D0, S, 100)
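
A minimal sketch of the K=1 call, assuming a BLAS library is available at link time (for example something like `gfortran rank1_update.f90 -lblas`; the file name and link flag are assumptions that depend on your installation). DGEMM takes double precision arguments, so S, A, B, ALPHA, and BETA are all declared double precision here; for default REAL data the single precision counterpart SGEMM would be the one to call. The sketch also shows DGER, the BLAS routine for the rank-1 update S := ALPHA*x*y**T + S, which is the same operation. The constants are placeholders, not values from this thread.

```fortran
program rank1_update
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 100
  ! Placeholder constants (assumptions, not from the thread)
  real(dp), parameter :: alpha = 0.5_dp, b1 = 60.0_dp, b2 = 80.0_dp
  real(dp) :: s_gemm(n,n), s_ger(n,n), a(n), b(n)
  integer :: i
  external :: dgemm, dger   ! provided by the linked BLAS library

  ! Column vector a(i) = exp((i/b1)**2), row vector b(j) = exp((j/b2)**2)
  do i = 1, n
    a(i) = exp((real(i,dp)/b1)**2)
    b(i) = exp((real(i,dp)/b2)**2)
  end do

  s_gemm = 1.0_dp
  s_ger  = 1.0_dp

  ! K = 1: A is 100x1 (LDA = 100), B is 1x100 (LDB = 1), BETA = 1
  call dgemm('N', 'N', n, n, 1, alpha, a, n, b, 1, 1.0_dp, s_gemm, n)

  ! Equivalent rank-1 update with unit strides for both vectors
  call dger(n, n, alpha, a, 1, b, 1, s_ger, n)

  print *, 'max abs difference, DGEMM vs DGER:', maxval(abs(s_gemm - s_ger))
end program rank1_update
```

Which of the two routines is faster depends on the BLAS implementation, so it is worth timing both.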

RE: Increasing double loop efficiency

I was thinking about how to use DGEMM for your case.
But now that I think about it again, I doubt that DGEMM is faster for your simple case than your two simple loops.
Look at the source of DGEMM to see how many loops it has:
https://netlib.org/lapack/explore-html/d7/d2b/dgem...
Maybe DGEMM is efficient for more complicated cases ...
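
Independently of any library, the plain loops can be restructured. As written, the original loop asks for two exp evaluations per element (20,000 in total) unless the compiler hoists them, and its inner loop runs over the second index, while Fortran stores arrays column by column. The sketch below precomputes the 200 exponentials once and makes the first index the inner loop. The array names ea and eb and the constants are illustrative assumptions, and whether this matters for a given compiler and optimization level has to be measured.

```fortran
program hoisted_loops
  implicit none
  integer, parameter :: n = 100
  ! Placeholder constants (assumptions, not from the thread)
  real, parameter :: alpha = 0.5, b1 = 60.0, b2 = 80.0
  real :: s(n,n), s_ref(n,n), ea(n), eb(n)
  integer :: i, j

  s     = 1.0
  s_ref = 1.0

  ! Reference: the double loop from the original post
  do i = 1, n
    do j = 1, n
      s_ref(i,j) = s_ref(i,j) + alpha*exp((real(i)/b1)**2)*exp((real(j)/b2)**2)
    end do
  end do

  ! Evaluate each exponential once: 2*n calls to exp instead of 2*n*n
  do i = 1, n
    ea(i) = exp((real(i)/b1)**2)
    eb(i) = exp((real(i)/b2)**2)
  end do

  ! First index innermost, so memory is traversed contiguously
  do j = 1, n
    do i = 1, n
      s(i,j) = s(i,j) + alpha*ea(i)*eb(j)
    end do
  end do

  print *, 'max abs difference:', maxval(abs(s - s_ref))
  if (maxval(abs(s - s_ref)) > 1.0e-5*maxval(abs(s_ref))) then
    error stop 'restructured loops disagree with the original'
  end if
end program hoisted_loops
```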

RE: Increasing double loop efficiency

(OP)
I will try both cases using 1) the two do loops, and 2) DGEMM, to see which is faster. It may well be that they are similar in speed.

Thanks,
Vahid
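
One way such a comparison could be set up is sketched below with the intrinsic system_clock. It repeats each kernel nrep times so that the elapsed time is measurable, and it rebuilds the two vectors inside the DGEMM loop so that their setup cost is included. It assumes a BLAS library is linked; nrep and the constants are placeholders, and the timings it prints say nothing about the full application discussed in this thread.

```fortran
program time_kernels
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 100, nrep = 1000
  ! Placeholder constants (assumptions, not from the thread)
  real(dp), parameter :: alpha = 0.5_dp, b1 = 60.0_dp, b2 = 80.0_dp
  real(dp) :: s1(n,n), s2(n,n), a(n), b(n)
  integer(int64) :: t0, t1, t2, rate
  integer :: i, j, rep
  external :: dgemm   ! provided by the linked BLAS library

  s1 = 0.0_dp
  s2 = 0.0_dp

  call system_clock(t0, rate)

  ! Kernel 1: the double loop from the original post
  do rep = 1, nrep
    do i = 1, n
      do j = 1, n
        s1(i,j) = s1(i,j) &
                  + alpha*exp((real(i,dp)/b1)**2)*exp((real(j,dp)/b2)**2)
      end do
    end do
  end do

  call system_clock(t1)

  ! Kernel 2: build the two vectors, then one DGEMM call with K = 1
  do rep = 1, nrep
    do i = 1, n
      a(i) = exp((real(i,dp)/b1)**2)
      b(i) = exp((real(i,dp)/b2)**2)
    end do
    call dgemm('N', 'N', n, n, 1, alpha, a, n, b, 1, 1.0_dp, s2, n)
  end do

  call system_clock(t2)

  print *, 'double loop (s):', real(t1 - t0, dp)/real(rate, dp)
  print *, 'DGEMM, K=1  (s):', real(t2 - t1, dp)/real(rate, dp)
  ! Printing the sums keeps the compiler from discarding the work
  print *, 'checksums:', sum(s1), sum(s2)
end program time_kernels
```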

RE: Increasing double loop efficiency

Yes, try it and let me know.

RE: Increasing double loop efficiency

(OP)
I replaced the double loop with DGEMM in my code. The DGEMM call sits inside three other loops, and the whole code runs in parallel across 4 nodes of 64 cores each.

With the double DO loop, the total run time is 7h19m. With DGEMM replacing the double loop, the run time is 3h34m, a significant speedup.

Thanks mikrom for all your help.

Cheers,
Vahid

RE: Increasing double loop efficiency

Vahid - good job, an amazing result!
