Chris Miller
Programmer
In thread184-1820648 a lot of ideas were posted about Griff's need to process data into a very large number of PDFs. In the end it became a bit too convoluted for me, and I won't hide that I'm embarrassed by the confusion I had and caused, so I opted out.
But I wanted to pull together some thoughts about improving performance and responsiveness through parallel processing. It doesn't really fit the tek-tips categories of question or tip, and it's not really just news either, but I have to pick something, since this is neither a question nor a tip.
Parallel processing in either form, multi-threading or multiple processes, is something you can always make use of on today's computers, as multiple CPU cores have become the norm, while VFP is still single-threaded.
In thread184-1820648 the idea of using multiple computers to produce the many PDFs was even questioned because of the effort needed to organize this, such as splitting up the data. I'm not saying that's wrong. Yes, there is usually organizational overhead that keeps the acceleration factor below the number of computers, CPUs or CPU cores used. But usually you get a factor of much more than 1, and even if you only end up with N-1 it pays to make that effort.
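To put rough numbers on that, here is a minimal sketch of the arithmetic. The figures are made up for illustration, and it assumes the organizing overhead runs once up front and the remaining work divides evenly across N workers, which real jobs only approximate:
Code:
* Hypothetical numbers, for illustration only
Local lnTotal, lnOverhead, lnN, lnParallel
lnTotal    = 600   && seconds for the whole job in a single process
lnOverhead = 60    && seconds spent splitting data and collecting results
lnN        = 4     && number of worker processes (or computers)
lnParallel = lnOverhead + lnTotal/lnN          && 60 + 150 = 210 seconds
? 'Parallel run time (s):', lnParallel
? 'Acceleration factor:', lnTotal/lnParallel   && 600/210, roughly 2.86
So even with the overhead eating into it, the factor stays well above 1 in this example, though below the ideal of 4.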
I stated there that my first experiments, setting process affinities manually in Task Manager, worked out as I expected. Then I looked into doing the same thing programmatically and posted this code:
Code:
Declare INTEGER GetCurrentProcess In Kernel32
Declare INTEGER GetProcessAffinityMask In Kernel32 ;
    INTEGER hProcess, STRING @lpProcessAffinityMask, STRING @lpSystemAffinityMask
Declare INTEGER SetProcessAffinityMask In Kernel32 ;
    INTEGER hProcess, INTEGER dwProcessAffinityMask
Declare INTEGER GetLastError In WIN32API
Local lnProcessHandle, lcPA, lcSA, lnPA, lnSA, lnCPU
lnProcessHandle = GetCurrentProcess()
* Both masks are DWORD_PTR values, i.e. 4 bytes in the 32-bit VFP process,
* so the receiving buffers must be 4 bytes long.
lcPA = Replicate(Chr(0),4)
lcSA = Replicate(Chr(0),4)
Clear
If GetProcessAffinityMask(lnProcessHandle, @lcPA, @lcSA) <> 0
    ? 'Affinity Masks (Process, System):'
    ? CreateBinary(lcPA), CreateBinary(lcSA)
    * Translate System affinity mask string to number
    * It will have bits set for all available cores, i.e. for 4 cores it will be 0h0F000000 = 15
    lnSA = CToBin(lcSA,"4RS")
    lnCPU = 2 && 0..3 for 4 cores
    lnPA = Bitand(Bitset(0,lnCPU),lnSA) && Bitand with lnSA ensures no unavailable CPU core bit is set
    * Could also set multiple CPUs by setting multiple bits, i.e. Affinity Mask=15 would mean no specific CPU core affinity.
    * Compare with <>0 rather than >0, as bit 31 makes the signed integer negative.
    If lnPA<>0 And SetProcessAffinityMask(lnProcessHandle,lnPA) <> 0
        ? 'OK: Process affinity set to CPU '+Alltrim(Str(lnCPU))
    Else
        If lnPA=0
            ? 'Error: CPU number not available according to System Affinity Mask.'
        Else
            ? 'Error:',GetLastError()
        Endif
    Endif
    * Read the masks again to verify the change
    If GetProcessAffinityMask(lnProcessHandle, @lcPA, @lcSA) <> 0
        ? 'Affinity Masks (Process, System):'
        ? CreateBinary(lcPA), CreateBinary(lcSA)
    Endif
Endif
And this works out okay, too.
So I think this is the next logical step beyond using multiple processes and relying on the OS to balance the load across the cores. The ability to set CPU core affinities can also be merged into what I posted about multiprocessing in thread184-1820019.
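As a sketch of how that merge could look, the affinity code can be wrapped into a function that a worker process calls once at startup. PinToCore is a hypothetical name, not part of VFP, ParallelFox or the code in thread184-1820019, and how a worker learns its core number (a startup parameter, for example) is left open here:
Code:
* Hypothetical helper: pin the current process to one CPU core (0-based).
* Returns .T. on success, .F. if the core is unavailable or an API call fails.
Function PinToCore(tnCPU)
    Local lnProcess, lcPA, lcSA, lnSA
    Declare INTEGER GetCurrentProcess In Kernel32
    Declare INTEGER GetProcessAffinityMask In Kernel32 ;
        INTEGER hProcess, STRING @lpProcessAffinityMask, STRING @lpSystemAffinityMask
    Declare INTEGER SetProcessAffinityMask In Kernel32 ;
        INTEGER hProcess, INTEGER dwProcessAffinityMask
    lnProcess = GetCurrentProcess()
    * 4-byte buffers for the two DWORD_PTR masks of a 32-bit process
    lcPA = Replicate(Chr(0),4)
    lcSA = Replicate(Chr(0),4)
    If GetProcessAffinityMask(lnProcess, @lcPA, @lcSA) = 0
        Return .F.
    Endif
    lnSA = CToBin(lcSA,"4RS")
    * Reject core numbers outside the mask or not available on this system
    If !Between(tnCPU, 0, 31) Or !Bittest(lnSA, tnCPU)
        Return .F.
    Endif
    Return SetProcessAffinityMask(lnProcess, Bitset(0, tnCPU)) <> 0
Endfunc
A worker would then call something like PinToCore(2) before it starts processing and could fall back to the default affinity if the call returns .F.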
I once more looked into ParallelFox, a library that is by far more advanced in features and more mature on that topic. I don't find any use of processor affinity in the ParallelFox source code. That's what you could file under "news".
Maybe Joel Leach, the project manager and the only contributor I see, already tested this and found that it doesn't improve the effectiveness of the worker concept. The only aspect related to the number of CPU cores I find in the source code is that ParallelFox uses the number of CPU cores as the default number of workers. It's sensible to think that multiple processes will make balanced use of the cores by themselves, without manipulating CPU affinities. By default any process, including a VFP executable, has no specific affinity. This means its affinity mask has the bits of all CPU cores set, so it is not limited in which core it uses. That can also be read as: any other affinity will just lower the ability of a process to make use of all CPU cores available at any time.
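The "all bits set" default also gives a simple way to count the cores available to a process: count the set bits in its affinity mask. A minimal sketch follows. CountCores is a hypothetical helper, it only covers the 32 bits a 32-bit process sees, and I'm not claiming this is how ParallelFox determines its default number of workers:
Code:
* Hypothetical helper: count the CPU cores represented in an affinity mask
Function CountCores(tnMask)
    Local lnBit, lnCount
    lnCount = 0
    For lnBit = 0 To 31
        If Bittest(tnMask, lnBit)
            lnCount = lnCount + 1
        Endif
    Endfor
    Return lnCount
Endfunc
For the 4-core mask from the code above, CountCores(15) returns 4, while a mask of 4 (only bit 2 set) returns 1.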
I expect setting the affinity to one specific core to actually improve the parallelism of processing. Besides lowering the overhead of switching cores, it would simply ensure that N processes use N different cores and therefore run in parallel by definition, whereas the OS might assign some cores to more than one worker, so you don't get the full parallelism of all cores for all workers.
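One way to hand each of N workers its own core is to map the worker number onto the set bits of the system affinity mask, wrapping around if there are more workers than cores. This is only a sketch under that assumption. WorkerMask is a hypothetical helper and the 1-based worker numbering is my choice:
Code:
* Hypothetical helper: single-core affinity mask for worker tnWorker (1-based),
* picked round-robin from the cores set in tnSystemMask. Returns 0 if the mask is empty.
Function WorkerMask(tnWorker, tnSystemMask)
    Local lnBit, lnCores, lnTarget, lnSeen
    * Count the available cores
    lnCores = 0
    For lnBit = 0 To 31
        lnCores = lnCores + Iif(Bittest(tnSystemMask, lnBit), 1, 0)
    Endfor
    If lnCores = 0
        Return 0
    Endif
    * 0-based position of this worker's core among the available cores
    lnTarget = Mod(tnWorker - 1, lnCores)
    lnSeen = 0
    For lnBit = 0 To 31
        If Bittest(tnSystemMask, lnBit)
            If lnSeen = lnTarget
                Return Bitset(0, lnBit)
            Endif
            lnSeen = lnSeen + 1
        Endif
    Endfor
    Return 0
Endfunc
With a system mask of 15, workers 1 to 4 get the masks 1, 2, 4 and 8, and worker 5 wraps around to 1 again. The result is what a worker would pass to SetProcessAffinityMask.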
I already opted out of making multiprocessing with COM a new project in itself; this time I might take on the effort of modifying ParallelFox.
Chriss