How to parallelize a counter?

Hello everybody,

I have the following piece of Fortran code:

do kk=1,ncz
do jj=1,ncy
do ii=1,ncx

I have been trying to parallelize it but I have not been able to. So far I have done as follows but without good results:

!$OMP PARALLEL PRIVATE(ii,jj,kk,cont), &
!$OMP SHARED(derdensi)
!$OMP DO REDUCTION (+:grdtgrvmsft)
do kk=1,ncz
do jj=1,ncy
do ii=1,ncx

Does anyone have advice on how to correct this?

RE: How to parallelize a counter?

I think what you need to do is calculate the "counter" as a function of the loop variables, so that it can be calculated at any time, independent of the order in which the loops are carried out (in parallel). Something like:

cont = ii + ncx*(jj-1) + ncx*ncy*(kk-1)
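
A minimal sketch of how that formula might be used inside the parallel loop. The loop body of the original post is not shown, so this assumes the counter started at zero and was incremented once per innermost iteration. The array sizes and the `seen` array are hypothetical and exist only to check that every counter value is produced exactly once.

```fortran
program counter_from_indices
  implicit none
  integer, parameter :: ncx = 4, ncy = 3, ncz = 2   ! illustrative sizes only
  integer :: ii, jj, kk, cont
  integer :: seen(ncx*ncy*ncz)                      ! hypothetical check array

  seen = 0

  ! cont is private: each thread computes its own value from the indices,
  ! so no thread depends on a running count kept by another thread.
  !$omp parallel do private(ii, jj, cont) shared(seen)
  do kk = 1, ncz
    do jj = 1, ncy
      do ii = 1, ncx
        ! Assumed to equal what a serial "cont = cont + 1" would give here
        cont = ii + ncx*(jj-1) + ncx*ncy*(kk-1)
        seen(cont) = seen(cont) + 1   ! each cont is unique, so no race
      end do
    end do
  end do
  !$omp end parallel do

  if (any(seen /= 1)) error stop 'a counter value was missed or repeated'
  print *, 'every counter value from 1 to', ncx*ncy*ncz, 'was visited once'
end program counter_from_indices
```

With gfortran this would be built with `-fopenmp`; without that flag the directives are treated as comments and the program runs serially with the same result.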

RE: How to parallelize a counter?

Sigh - this would be so easy in APL.

How about an implied do loop array?

(/(i, i=1, n)/)

Then RESHAPE it...

If the compiler is good enough, it would optimize that with an array operation.

I'll let you figure out the details.
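
One possible way to fill in those details, as a sketch only: it assumes the goal is an array `cont(ncx,ncy,ncz)` holding the counter value for each grid point. The array name and the sizes are hypothetical.

```fortran
program counter_array
  implicit none
  integer, parameter :: ncx = 4, ncy = 3, ncz = 2   ! illustrative sizes only
  integer :: i
  integer :: cont(ncx, ncy, ncz)   ! hypothetical array of counter values

  ! The implied do loop builds 1, 2, ..., ncx*ncy*ncz. RESHAPE then fills
  ! the 3-D array in storage order, i.e. with the first index varying fastest.
  cont = reshape((/ (i, i = 1, ncx*ncy*ncz) /), (/ ncx, ncy, ncz /))

  ! Spot check against the index formula ii + ncx*(jj-1) + ncx*ncy*(kk-1)
  if (cont(2, 3, 1) /= 2 + ncx*(3-1) + ncx*ncy*(1-1)) then
    error stop 'unexpected counter value'
  end if

  print *, 'first and last counter values:', cont(1, 1, 1), cont(ncx, ncy, ncz)
end program counter_array
```

Whether a compiler turns this into something faster than the explicit loops is not guaranteed; the point is only that no running counter is needed.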

It's even possible that the most modern Fortran versions already have a single function that does it. I haven't kept up.

Do not use loops if you want to parallelize it - the compiler might not be that smart.

RE: How to parallelize a counter?

One minor note: if you can, you may want to re-order the subscripts of your arrays so that the first two subscripts are reversed, because the implied do loop would put things in the wrong order for what you want. In particular, storage order increments the first index first (i.e., incrementing the first index takes you to the next location in memory, which is where that implied do loop would put the next value), then the second subscript, then the third.

In addition, memory caching works best if you work in storage order - i.e., where possible the innermost loop should increment the first index, and the outermost loop should increment the last index. Many CPUs can also execute multiple operations at once if you work in storage order. So that improves efficiency even without parallel execution - though you could have gotten that just by switching the do loop order...
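
A small sketch of what working in storage order looks like, assuming an array indexed as `a(ii,jj,kk)`. The array `a` and the sizes are hypothetical, since the original arrays are not shown.

```fortran
program storage_order
  implicit none
  integer, parameter :: ncx = 4, ncy = 3, ncz = 2   ! illustrative sizes only
  integer :: a(ncx, ncy, ncz)                       ! hypothetical array
  integer :: ii, jj, kk, i

  do kk = 1, ncz            ! last index: outermost loop
    do jj = 1, ncy
      do ii = 1, ncx        ! first index: innermost loop, contiguous in memory
        a(ii, jj, kk) = ii + ncx*(jj-1) + ncx*ncy*(kk-1)
      end do
    end do
  end do

  ! Flattening the array in storage order should give 1, 2, 3, ...
  if (any(reshape(a, (/ ncx*ncy*ncz /)) /= (/ (i, i = 1, ncx*ncy*ncz) /))) then
    error stop 'array was not filled in storage order'
  end if

  print *, 'array filled in storage order'
end program storage_order
```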

Hope that helps.
