[Haskell-cafe] advice on space efficient data structure with
efficient snoc operation
Manlio Perillo
manlio_perillo at libero.it
Tue Apr 7 07:49:56 EDT 2009
Hi.
I'm still working on my Netflix Prize project.
For a function I wrote, I really need a data structure that is both
space efficient (unboxed elements) and with an efficient snoc operation.
I have pasted a self contained module with the definition of the
function I'm using:
http://hpaste.org/fastcgi/hpaste.fcgi/view?id=3453
The movie ratings are loaded from serialized data, and the result is
serialized again, using the binary package:
transcodeIO :: IO ()
transcodeIO = do
input <- L.hGetContents stdin
let output = encodeZ $ transcode $ decodeZ input
L.hPut stdout output
(here encodeZ and decodeZ are wrappers around Data.Binary.encode and
Data.Binary.decode, with support to gzip compression/decompression)
This function (transcodeIO, not transcode) takes, on my
Core2 CPU T7200 @ 2.00GHz:
real 30m8.794s
user 29m30.659s
sys 0m10.313s
1068 Mb total memory in use
The problem here is with snocU, that requires a lot of copying.
I rewrote the transcode function so that the input data set is split in
N parts:
http://hpaste.org/fastcgi/hpaste.fcgi/view?id=3453#a3456
The mapReduce function is the one defined in the Real World Haskell.
The new function takes (using only one thread):
real 18m48.039s
user 18m30.901s
sys 0m6.520s
1351 Mb total memory in use
The additional required memory is probably caused by unionsWith, that is
not strict.
The function takes less time, since array copying is optimized.
I still use snocU, but on small arrays.
GC time is very high: 54.4%
Unfortunately I can not test with more then one thread, since I get
segmentation faults (probably a bug with uvector packages).
I also got two strange errors (but this may be just the result of the
segmentation fault, I'm not able to reproduce them):
tpp.c:63: __pthread_tpp_change_priority: Assertion `new_prio == -1 ||
(new_prio >= __sched_fifo_min_prio && new_prio <=
__sched_fifo_max_prio)' failed.
internal error: removeThreadFromQueue: not found
(GHC version 6.8.2 for i386_unknown_linux)
Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug
Now the question is: what data structure should I use to optimize the
transcode function?
IMHO there are two solutions:
1) Use a "lazy" array.
Something like ByteString.Lazy, and what is available in
storablevector package.
Using this data structure, I can avoid the use of appendU.
2) Use an "unboxed" list.
Something like:
http://mdounin.ru/hg/nginx-vendor-current/file/tip/src/core/ngx_list.h
That is: a linked list of unboxed arrays, but unlike the lazy array
solution, a snoc operation avoid copying if there is space in the
current array.
I don't know if this is easy/efficient to implement in Haskell.
Any other suggestions?
Thanks Manlio Perillo
More information about the Haskell-Cafe
mailing list