How to write parallel code with Haskell vectors? -
one 1 hand, in haskell vector a seems preferred type use array of numbers. there (incomplete) vector tutorial.
on other hand, control.parallel.strategies defined in terms of traversable. vector library doesn't provide these instances.
the minimal complete definition of traversable t should define foldable and
traverse :: applicative f => (a -> f b) -> t -> f (t b) sequencea :: applicative f => t (f a) -> f (t a) i don't see how sequencea can defined data.vector.unboxed.vector. so, best approach writing parallel code unboxed vectors? defining new ad hoc strategies evalvector or using par , pseq explicitly or using plain data.array instead of vectors?
p.s. plain arrays parallelizable without problems: https://gist.github.com/701888
it's hack job parvector worked me:
import qualified data.vector v import control.parallel.strategies import control.parallel import control.deepseq ack :: int -> int -> int ack 0 n = n+1 ack m 0 = ack (m-1) 1 ack m n = ack (m-1) (ack m (n-1)) main = let vec = v.enumfromn 1 1000 let res = (v.map (ack 2) vec) `using` parvector print res parvector :: nfdata => strategy (v.vector a) parvector vec = eval vec `seq` done vec chunksize = 1 eval v | vlen == 0 = () | vlen <= chunksize = rnf (v v.! 0) -- fix handle chunks > 1 | otherwise = eval (v.take half v) `par` eval (v.drop half v) vlen = v.length v half = vlen `div` 2 and running code:
[tommd@mavlo test]$ ghc --make -o2 -threaded t.hs ... dumb warning ... [tommd@mavlo test]$ time ./t +rts -n1 >/dev/null real 0m1.962s user 0m1.951s sys 0m0.009s [tommd@mavlo test]$ time ./t +rts -n2 >/dev/null real 0m1.119s user 0m2.221s sys 0m0.005s when run code integer instead of int in type signature:
[tommd@mavlo test]$ time ./t +rts -n2 >/dev/null real 0m4.754s user 0m9.435s sys 0m0.028s [tommd@mavlo test]$ time ./t +rts -n1 >/dev/null real 0m9.008s user 0m8.952s sys 0m0.029s rock!
edit: , solution bit closer earlier attempt cleaner (it doesn't use functions 3 separate modules) , works great:
parvector :: nfdata => strategy (v.vector a) parvector vec = let vlen = v.length vec half = vlen `div` 2 minchunk = 10 in if vlen > minchunk let v1 = v.unsafeslice 0 half vec v2 = v.unsafeslice half (vlen - half) vec parvector v1 parvector v2 return vec else evalchunk (vlen-1) >> return vec evalchunk 0 = rpar (rdeepseq (vec v.! 0)) >> return vec evalchunk = rpar (rdeepseq (vec v.! i)) >> evalchunk (i-1) things learn solution:
- it uses
evalmonad, strict we're sure spark (compared wrapping things inlet, remembering use bang patterns). - contrary proposed implementation (a) doesn't construct new vector, costly (b)
evalchunkforces evaluation of each element usingrpar,rdeepseq(i don't believerpar vecforces of vector's elements). - contrary belief,
slicetakes start index , length, not start , end index. oops! - we still need import
control.deepseq (nfdata), i've e-mailed libraries list try , fix issue.
performance seems similar first parvector solution in answer, won't post numbers.
Comments
Post a Comment