Process Array in batches 2

stevio · Apr 19, 2010

I have an array of integers whose size is variable; it is determined at processing time.

What I need to do is work out how many elements there are in the array and process them no more than 50 at a time.

So, if there are 320 elements, then I need to process them in 6 lots of 50, then process the last 20.

I was thinking of working out the total array size and, if it's greater than or equal to 50, dividing by 50 to get the number of batches, while also working out the remainder.

Code:

@array = (30,34,35,40,51); # ..... variable size
$arraysize = $#array + 1;
$remainder = $arraysize % 50;
$iterations = sprintf("%d",$arraysize/50);
if ($arraysize >= 50){
  for ($i=0; $i < $iterations; $i++){
    foreach $elem (@array[0..49]) {

      #run fork command
      #get result
    }
    splice(@array,0,50); #should I splice the top 50 elements after every lot of commands?
  }
}

I guess I'm struggling to work out how to run commands in batches of 50 and also process the remaining elements.

Any help would be appreciated.

stevexff · Apr 19, 2010

You don't need all the baggage of the modulus checking; splice does what you need, and even handles the final odd-sized remainder correctly. Run the following example to see what I mean (I have used a batch size of 2 to save typing).

Perl:

my @data = qw{a b c d e};

while (my @batch = splice(@data, 0 , 2)) {
  print join(',', @batch), "\n";
}

Steve

"Every program can be reduced by one instruction, and every program has at least one bug. Therefore, any program can be reduced to one instruction which doesn't work." (Object::PerlDesignPatterns)
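
As a rough sketch of the same splice idea applied to the original numbers (320 elements, batches of 50): the helper name batches and the 1..320 stand-in data are assumptions for illustration, not part of the posts above. It works on a copy, so the caller's array is not emptied.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a list into batches of at most $size elements.
# @items is a copy of the caller's data, so the original array is untouched.
sub batches {
    my ($size, @items) = @_;
    my @out;
    # splice returns fewer than $size elements on the last pass,
    # and an empty list (false) once nothing is left.
    while (my @batch = splice(@items, 0, $size)) {
        push @out, [@batch];
    }
    return @out;
}

my @array = (1 .. 320);            # stand-in data: 320 integers
my @lots  = batches(50, @array);

for my $i (0 .. $#lots) {
    printf "batch %d: %d elements\n", $i + 1, scalar @{ $lots[$i] };
}
```

With 320 elements this reports six batches of 50 followed by one of 20, with no separate remainder calculation.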

stevio · Apr 21, 2010

Thanks Steve

I have redefined my process a bit:

As I had in my code, I wanted to fork off some commands.

Code:

@array = (30,34,35,40,51); # ..... variable size
$arraysize = $#array + 1;
$remainder = $arraysize % 50;
$iterations = sprintf("%d",$arraysize/50);
if ($arraysize >= 50){
  for ($i=0; $i < $iterations; $i++){
    foreach $elem (@array[0..49]) {

      #run fork command
      #get result
    }
    splice(@array,0,50); #should I splice the top 50 elements after every lot of commands?
  }
}

Instead of processing in batches of 5, I want to keep 5 running at any given time.

What I can't work out is how to fork off 5 threads using those variables and then kick off another one when one or more has finished (good or bad return code), so that at any given time 5 are running.

Example
Start
Command/Thread 1
Command/Thread 2
Command/Thread 3
Command/Thread 4
Command/Thread 5

Thread 2 finishes first, so kick off new thread 6? How do I get a return code from the forked process so that I can kick off the next one?

stevexff · Apr 22, 2010

Been looking at the threads support, which seems to have everything you need apart from the ability to wait on a list. You can list the running threads, join() individual ones, and even list ones that can be join()ed without blocking; but unless I'm missing something obvious, you can't give it a list of running threads and have it block until the first one completes.

Steve


stevio · Apr 22, 2010

Steve,

Are you saying threads are better than using fork? Would fork achieve what I need to do?

stevexff · Apr 22, 2010

fork is kind of old-school - it uses *nix signals to communicate between the processes. I must admit I've never used it, although I have experience of threading in Java, Ruby, and even running subtasks on mainframes. Based on that experience, I tend to favour the option of threads because a lot of the hard interprocess communication gets done for you, without having to write signal handlers and similar.

Maybe one of the hardcore Unix gurus on the forum can offer some advice? Kirsle springs to mind; he always seems to be attracted to hard stuff like this.

Steve
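
For comparison, here is a minimal sketch of how a parent can pick up a forked child's exit code with waitpid and $?, without writing any signal handlers. The helper name run_child and the code-ref argument are made up for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

$| = 1;   # autoflush STDOUT so a child never inherits unflushed output

# Run $code in a child process and return the child's exit code.
sub run_child {
    my ($code) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        exit $code->();    # child: exit status is the sub's return value
    }
    waitpid($pid, 0);      # parent: block until this particular child ends
    return $? >> 8;        # the high byte of $? holds the exit code
}

my $rc = run_child(sub { return 3 });
print "child exited with code $rc\n";
```

This prints "child exited with code 3". The low bits of $? are only set when the child is killed by a signal, which this sketch ignores.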


Annihilannic · Apr 27, 2010

Here's an example of using forks to run 6 processes in parallel:

Code:

#!/usr/bin/perl -w
use strict;

my @array = (30,34,35,40,21,20,12,35,10,28,15,23);
my $parallel = 6;
my $running = 0;
my $result;

foreach my $elem (@array) {
        my $childpid = fork();
        if ($childpid == 0) {
                print localtime() . ": execing /usr/bin/sleep $elem\n";
                exec '/usr/bin/sleep',$elem;
        } else {
                $running++;
                print localtime() . ": child pid $childpid spawned, $running running\n";
                if ($running >= $parallel) {
                        $result = wait;
                        $running--;
                        print localtime() . ": child pid $result completed, $running running\n";
                }
        }
}

until (($result = wait) == -1) {
        $running--;
        print localtime() . ": child pid $result completed, $running running\n";
}

Annihilannic.
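
The example above reaps children with wait but does not look at their exit status, which was part of the original question. The following is a rough sketch of one way to keep five children running and record each return code. The names run_pool and the job callback are hypothetical, and the items are assumed to be unique because they are used as hash keys.

```perl
#!/usr/bin/perl
use strict;
use warnings;

$| = 1;   # autoflush STDOUT so children never inherit unflushed output

# Run one job per item, keeping at most $limit children alive at once.
# Returns a hash of item => exit code. Items are assumed to be unique.
sub run_pool {
    my ($limit, $job, @items) = @_;
    my (%item_for, %exit_code);

    # Block until any one child finishes, then record its exit code.
    my $reap = sub {
        my $pid = wait();
        die "wait failed: no children left" if $pid == -1;
        $exit_code{ delete $item_for{$pid} } = $? >> 8;
    };

    for my $item (@items) {
        # All slots busy: wait for one child to finish before forking again.
        $reap->() if scalar(keys %item_for) >= $limit;
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        exit $job->($item) if $pid == 0;   # child runs the job and exits
        $item_for{$pid} = $item;           # parent remembers pid => item
    }
    $reap->() while %item_for;             # drain the remaining children
    return %exit_code;
}

# Hypothetical job: exit 0 for even numbers, 1 for odd ones.
my %rc = run_pool(5, sub { $_[0] % 2 }, 30, 34, 35, 40, 51, 21, 20, 12);
print "$_ => $rc{$_}\n" for sort { $a <=> $b } keys %rc;
```

In real use the job would exec or call system on the actual command; a good or bad return code frees a slot either way, so the next item starts as soon as any child finishes.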

Annihilannic · May 5, 2010

I'm curious to know whether that solved your problem, or whether you went with another method of multi-threading?

Annihilannic.

stevio · May 6, 2010

It solved the problem, thank you.
