These are the results I obtained:
Code:
#1 fgets
before DEBUG: BEGIN MEMORY=82124
before DEBUG: MEMORY PEAK=135516
after DEBUG: BEGIN MEMORY=28215304
after DEBUG: MEMORY PEAK=28227756
#2 explode
before DEBUG: BEGIN MEMORY=77176
before DEBUG: MEMORY PEAK=80044
after DEBUG: BEGIN MEMORY=30714224
after DEBUG: MEMORY PEAK=30714276
#3 the file function
before DEBUG: BEGIN MEMORY=71940
before DEBUG: MEMORY PEAK=80240
after DEBUG: BEGIN MEMORY=27942284
after DEBUG: MEMORY PEAK=33477884
#4 preg_split
before DEBUG: BEGIN MEMORY=68536
before DEBUG: MEMORY PEAK=80056
after DEBUG: BEGIN MEMORY=30705700
after DEBUG: MEMORY PEAK=30705752
#5 mb_split
before DEBUG: BEGIN MEMORY=63136
before DEBUG: MEMORY PEAK=80052
after DEBUG: BEGIN MEMORY=5597216
after DEBUG: MEMORY PEAK=5599432
#6 split
before DEBUG: BEGIN MEMORY=68524
before DEBUG: MEMORY PEAK=80044
after DEBUG: BEGIN MEMORY=5602604
after DEBUG: MEMORY PEAK=5604820
The last iteration used split in lieu of mb_split.
At first glance this suggests that neither split nor mb_split is memory-greedy, while the others are.
None, however, breached the memory limit of 32MB that I placed on the script.
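For anyone wanting to reproduce figures like these, here is a minimal sketch of how the before/after numbers can be taken. The helper name debug_memory and the small stand-in data are my own assumptions, not the actual test script; memory_get_usage() and memory_get_peak_usage() are the standard PHP functions.

```php
<?php
// Hypothetical helper: formats the two figures in the same shape as the log lines above.
function debug_memory($label)
{
    return array(
        sprintf('%s DEBUG: BEGIN MEMORY=%d', $label, memory_get_usage()),
        sprintf('%s DEBUG: MEMORY PEAK=%d', $label, memory_get_peak_usage()),
    );
}

$lines_before = debug_memory('before');

// Stand-in for the real test data: 1000 short lines joined by "\n".
$data  = implode("\n", array_fill(0, 1000, 'hello world'));
$array = explode("\n", $data);

$lines_after = debug_memory('after');

foreach (array_merge($lines_before, $lines_after) as $line) {
    echo $line, "\n";
}
```

Swapping the explode() call for any of the other variants gives the matching measurement for that variant.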
The split and mb_split results are aberrations: on examination, the array is not being formed properly. The regex is not matching \x, \n or \v, so only a single-element array is returned.
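A quick way to catch this is to check the element count before trusting a memory figure. The sketch below uses explode() with a delimiter that never occurs as a stand-in for a pattern that fails to match; the sample data is made up.

```php
<?php
$data = "alpha\nbeta\ngamma";

// Delimiter that occurs in the data: one element per line.
$good = explode("\n", $data);

// Delimiter that never occurs (stand-in for a pattern that fails to match):
// the whole input comes back as a single element.
$bad = explode("\t", $data);

echo count($good), "\n"; // 3
echo count($bad), "\n";  // 1
```

A single-element result holds the input as one string with no per-line array overhead, which would explain a much lower reading.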
I also examined the memory usage within the loop at each iteration. Whilst the increase in usage is not strictly linear, there is certainly no big jump.
Taking one of the functions, if we unset the array after it has been built, the memory usage drops back down to the anticipated level. We can therefore be reasonably confident that there is no memory leak.
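The unset check can be sketched roughly like this. The array size and contents are arbitrary stand-ins rather than the real test data, and the exact numbers will vary by PHP version.

```php
<?php
$before = memory_get_usage();

// Build a throwaway array; size and contents are arbitrary stand-ins.
$array = array();
for ($i = 0; $i < 100000; $i++) {
    $array[] = 'line number ' . $i;
}
$built = memory_get_usage();

// Release the array and measure again.
unset($array);
$after_unset = memory_get_usage();

printf("before=%d built=%d after unset=%d\n", $before, $built, $after_unset);
```

If the figure after unset() stayed close to the built figure, that would point towards a leak; a drop back towards the starting figure points away from one.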
The above testing was done on a MAMP installation (as the OP reports that Ubuntu is not a common denominator), PHP version 5.2.9.
Wanting to experiment further, I tried running the explode variant via the command line on a 5.3.3 installation.
These were the results:
Code:
ver 5.3.3
#YY file method
before DEBUG: BEGIN MEMORY=631176
before DEBUG: MEMORY PEAK=637176
after split DEBUG: BEGIN MEMORY=54764784
after split DEBUG: MEMORY PEAK=57541328
This suggests that PHP 5.3.3 is significantly more memory-hungry than 5.2.9: a peak of roughly 55MB, against 27 to 32MB for the working 5.2.9 runs above.
Wanting to test whether this was something specific to multibyte strings (which struck me as unlikely), I tested a loop that created an array of 300000 elements (roughly the number of lines in the test data), each an eleven-character word. The memory usage was around 30MB.
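A minimal sketch of that kind of loop is below. The use of sprintf() to produce a distinct eleven-character string per element is my assumption about a reasonable way to do it, not necessarily what the original test did, and the totals will differ between PHP versions.

```php
<?php
$count  = 300000; // roughly the number of lines in the test data
$before = memory_get_usage();

$array = array();
for ($i = 0; $i < $count; $i++) {
    // Zero-padded to eleven characters, so every element is a distinct string.
    $array[] = sprintf('%011d', $i);
}

$used = memory_get_usage() - $before;
printf(
    "elements=%d total=%d bytes, per element=%.1f bytes\n",
    count($array),
    $used,
    $used / $count
);
```

The per-element figure is the interesting one: anything well above eleven bytes is array and string bookkeeping rather than the text itself.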
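This indicates that there is simply an overhead in the array structures (indexes etc.) that increases memory usage: around 30MB for 300000 elements works out at about 100 bytes per element, against eleven characters of actual text. That in turn makes it very likely that this is NOT a bug but expected behaviour.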
Solutions to the issue would depend on what you actually want to do with the data.
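As one possible direction, and only if the lines can be handled one at a time, the data could be read with fgets and processed as it goes instead of being held in an array. This is a sketch under that assumption; the temporary file and the "longest line" calculation are placeholders for the real file and the real processing.

```php
<?php
// Stand-in input: a temporary file with a few lines (the real data file is not shown).
$path = tempnam(sys_get_temp_dir(), 'lines');
file_put_contents($path, "one\ntwo\nthree\n");

$handle     = fopen($path, 'r');
$line_count = 0;
$longest    = 0;

// Only one line is held in memory at a time; nothing accumulates in an array.
while (($line = fgets($handle)) !== false) {
    $line = rtrim($line, "\r\n");
    $line_count++;
    $longest = max($longest, strlen($line));
}

fclose($handle);
unlink($path);

echo "lines=$line_count longest=$longest\n"; // lines=3 longest=5
```

If the whole set really does need to be in memory at once (sorting, random access), this does not help and the array overhead has to be budgeted for.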