> By the way: I have written the first version of the program to parse
> Netflix training data set in D.
> I also used ncpu * 1.5 threads, to parse files concurrently.

> However execution was *really* slow, due to garbage collection.
> I have also tried to disable garbage collection, and to manually run a
> garbage cycle from time to time (every 200 files parsed), but the
> performance was the same.

Maybe it would be better to use something like MapReduce: split
your job into 100-file parts and process them with ncpu scripts
running concurrently?
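
A rough sketch of what I mean, in Python rather than D, just to show the
shape of it. Everything here is an assumption on my side: the names
(chunked, parse_chunk, parse_all), the "*.txt" file pattern, and the
placeholder map step that only counts lines instead of really parsing
ratings. The idea is that each worker is a separate process with its own
heap, so a collection in one worker should not pause the others.

```python
import multiprocessing as mp
import sys
from collections import Counter
from pathlib import Path

CHUNK_SIZE = 100  # files per part, as suggested above


def chunked(items, size):
    """Split a list into consecutive parts of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def parse_chunk(paths):
    """Map step: handle one part of the job.

    Placeholder logic (an assumption): it only counts the lines in each
    file. Real parsing of the rating records would go here.
    """
    counts = Counter()
    for p in paths:
        with open(p, "r", encoding="utf-8", errors="replace") as f:
            counts[Path(p).name] = sum(1 for _ in f)
    return counts


def parse_all(paths, workers=None):
    """Run parse_chunk on every part in separate processes, then merge."""
    total = Counter()
    parts = chunked(list(paths), CHUNK_SIZE)
    with mp.Pool(processes=workers or mp.cpu_count()) as pool:
        # Reduce step: fold each partial result into the total as it arrives.
        for partial in pool.imap_unordered(parse_chunk, parts):
            total.update(partial)
    return total


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical layout): python parse.py <directory with *.txt files>
    files = sorted(str(p) for p in Path(sys.argv[1]).glob("*.txt"))
    print(sum(parse_all(files).values()), "lines parsed")
```

The part size and the number of workers are separate knobs: 100 files
per part keeps each task short, and the pool keeps ncpu of them running
at any time.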

