performance - Forcing sequential processing in Haskell's Data.Binary.Get


After trying to import the basic Java runtime library rt.jar with language-java-classfile, I've discovered that it uses huge amounts of memory.

I've reduced the program demonstrating the problem to 100 lines and uploaded it to hpaste. Without forcing evaluation of stream in line #94, I have no chance of ever running it, because it eats up all my memory. Forcing stream before passing it to getClass lets the program finish, but it still uses huge amounts of memory:

  34,302,587,664 bytes allocated in the heap
  32,583,990,728 bytes copied during GC
     139,810,024 bytes maximum residency (398 sample(s))
      29,142,240 bytes maximum slop
             281 MB total memory in use (4 MB lost due to fragmentation)

  Generation 0: 64992 collections,     0 parallel, 38.07s, 37.94s elapsed
  Generation 1:   398 collections,     0 parallel, 25.87s, 27.78s elapsed

  INIT  time    0.01s  (  0.00s elapsed)
  MUT   time   37.22s  ( 36.85s elapsed)
  GC    time   63.94s  ( 65.72s elapsed)
  RP    time    0.00s  (  0.00s elapsed)
  PROF  time   13.00s  ( 13.18s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time  114.17s  (115.76s elapsed)

  %GC time      56.0%  (56.8% elapsed)

  Alloc rate    921,369,531 bytes per MUT second

  Productivity  32.6% of total user, 32.2% of total elapsed

I thought the problem was the ConstTables staying around, so I tried forcing cls in line #94 as well. But this just makes both memory consumption and runtime worse:

  34,300,700,520 bytes allocated in the heap
  23,579,794,624 bytes copied during GC
     487,798,904 bytes maximum residency (423 sample(s))
      36,312,104 bytes maximum slop
             554 MB total memory in use (10 MB lost due to fragmentation)

  Generation 0: 64983 collections,     0 parallel, 71.19s, 71.48s elapsed
  Generation 1:   423 collections,     0 parallel, 344.74s, 353.01s elapsed

  INIT  time    0.01s  (  0.00s elapsed)
  MUT   time   40.60s  ( 42.38s elapsed)
  GC    time  415.93s  (424.49s elapsed)
  RP    time    0.00s  (  0.00s elapsed)
  PROF  time   56.53s  ( 57.71s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time  513.07s  (524.58s elapsed)

  %GC time      81.1%  (80.9% elapsed)

  Alloc rate    844,636,801 bytes per MUT second

  Productivity   7.9% of total user, 7.7% of total elapsed

So my question is basically: how can I force sequential processing of the files involved, so that after each one is processed, only the String result (cls) remains in memory?

Edit 2: I just realized that your code does this:

stream <- BL.pack <$> fileContents [] classfile

Don't do that. The pack functions are notoriously slow. You'll need to find a solution that doesn't involve using pack to create a ByteString.
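To illustrate the difference, here is a minimal sketch that builds the same lazy ByteString two ways: from a list with pack, and by reading the bytes directly into a packed buffer. It only uses the bytestring package. The scratch file name and the helper names viaPack and viaReadFile are made up for the demo, and the sketch does not assume anything about what the zip library offers for reading an entry without going through a list.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word8)

-- Slow path: every byte first lives in a list cell and is then copied
-- into the ByteString one element at a time.
viaPack :: [Word8] -> BL.ByteString
viaPack = BL.pack

-- Direct path: the bytes are read straight into a packed buffer,
-- so no intermediate list is ever built.
viaReadFile :: FilePath -> IO BL.ByteString
viaReadFile path = BL.fromStrict <$> BS.readFile path

main :: IO ()
main = do
  let path  = "pack-demo.bin"                      -- hypothetical scratch file
      bytes = take 100000 (cycle [0 .. 255]) :: [Word8]
  BS.writeFile path (BS.pack bytes)                -- one-off setup of test data
  direct <- viaReadFile path
  print (BL.length direct)                         -- number of bytes read
  print (direct == viaPack bytes)                  -- both paths hold the same bytes
```

Both paths produce equal values; the point is only that the second one never materializes a [Word8].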

I'm leaving the rest of my answer because I still think it applies, but this is the biggest problem.

Unfortunately I can't test this, because I don't recognize your imports.

If you want the result cls to remain in memory, why don't you force it instead of forcing stream? Change line 94 to

cls `seq` return cls 

It may be necessary to use deepseq instead of seq, although I have a suspicion that plain seq is sufficient here.
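The difference between the two can be seen in a small sketch, assuming the deepseq package is available (it ships with GHC). seq evaluates only to the outermost constructor, which for a String is the first cons cell; deepseq walks the whole value. The names partial, survives, whnfOnly and fullyForced are made up for this demo.

```haskell
import Control.DeepSeq (deepseq)
import Control.Exception (ErrorCall, evaluate, try)

-- A String whose first cell is fine but whose tail fails when demanded.
partial :: String
partial = 'a' : error "tail was forced"

-- Run an action and report whether it finished without an ErrorCall.
survives :: IO () -> IO Bool
survives act = do
  r <- try act :: IO (Either ErrorCall ())
  return (either (const False) (const True) r)

-- seq stops at the first (:) cell, so the failing tail is never touched.
whnfOnly :: IO Bool
whnfOnly = survives (evaluate (partial `seq` ()))

-- deepseq traverses the entire list and therefore hits the error.
fullyForced :: IO Bool
fullyForced = survives (evaluate (partial `deepseq` ()))

main :: IO ()
main = do
  whnfOnly    >>= print   -- True: seq did not reach the tail
  fullyForced >>= print   -- False: deepseq forced the tail
```

Whether plain seq is enough for cls depends on how much of the parse has to run before the first character of the result is available.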

However, I think there's a better solution, and that's to use mapM_ instead of mapM. I think it's better style (and better performance) to write a function that does what it's supposed to do with each result, rather than returning a list of results. Here, you can change your main function to:

main =
  withArchive [CheckConsFlag] jarPath $ do
    classfiles <- filter isClassfile <$> fileNames []
    forM_ classfiles $ \classfile -> do
      stream <- BL.pack <$> fileContents [] classfile
      let cls = runGet getClass stream
      lift $ print cls

Now print is lifted into the function passed to forM_ and runs once for each classfile. The value cls is used internally and never returned, so it's both evaluated and GC'd on each iteration of forM_.
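The same shape can be tried without the zip library. The following is a minimal sketch using only the binary and bytestring packages: Header and getHeader are hypothetical stand-ins for the real class type and getClass, and the inputs are tiny in-memory ByteStrings rather than jar entries.

```haskell
import Control.Monad (forM_)
import Data.Binary.Get (Get, getWord16be, getWord32be, runGet)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word16, Word32)

-- Hypothetical stand-in for the parsed class: a 32-bit magic number
-- followed by one 16-bit field.
data Header = Header Word32 Word16
  deriving (Show, Eq)

-- Hypothetical stand-in for getClass.
getHeader :: Get Header
getHeader = Header <$> getWord32be <*> getWord16be

-- Three six-byte inputs; pack is harmless for literals this small.
inputs :: [BL.ByteString]
inputs = [ BL.pack [0xCA, 0xFE, 0xBA, 0xBE, 0x00, n] | n <- [0 .. 2] ]

main :: IO ()
main =
  -- Each result is parsed, printed and dropped before the next input is
  -- touched; no list of results is accumulated.
  forM_ inputs $ \stream -> do
    let hdr = runGet getHeader stream
    print hdr
```

With mapM the results would be collected into a list that stays reachable until the whole traversal finishes; with forM_ each hdr becomes garbage as soon as it has been printed.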

Making use of this style in a larger application may require some refactoring or redesign, but the results may well be worth it.

Edit: if you're going to the trouble of redesigning your code, use iteratees and avoid this problem entirely.

