[Haskell-cafe] [Very long] (CHP?) Compressing, MD5 and big files
nccb2 at kent.ac.uk
Tue Jan 5 11:39:19 EST 2010
Sorry for the slightly delayed reply -- I didn't have time to look
through all your code and understand it until just now. Your code has
one (no doubt frustratingly!) small problem, which is in the deadlocking
pipeline quoted below.
Maciej Piechotka wrote:
>> pipeline3 :: CHP ()
>> pipeline3 = enrolling $ do
>> file <- oneToManyChannel' $ chanLabel "File"
>> fileGZ <- oneToOneChannel' $ chanLabel "File GZ"
>> data_ <- oneToManyChannel' $ chanLabel "Data"
>> compressed <- oneToManyChannel' $ chanLabel "Data Compressed"
>> md5 <- oneToOneChannel' $ chanLabel "MD5"
>> md5Compressed <- oneToOneChannel' $ chanLabel "MD5 Compressed"
>> fileGZ' <- Enroll (reader file)
>> fileData <- Enroll (reader file)
>> dataMD5 <- Enroll (reader data_)
>> dataCompress <- Enroll (reader data_)
>> compressedFile <- Enroll (reader compressed)
>> compressedMD5 <- Enroll (reader compressed)
>> liftCHP $ runParallel_ [getFiles (writer file),
>> (forever $ readChannel fileGZ' >>=
>> writeChannel (writer fileGZ) .
>> (poison fileGZ' >> poison (writer fileGZ)),
>> readFromFile fileData (writer data_),
>> calculateMD5 dataMD5 (writer md5),
>> compressCHP dataCompress
>> (writer compressed),
>> writeToFile (reader fileGZ) compressedFile,
>> calculateMD5 compressedMD5
>> (writer md5Compressed),
>> forever $ readChannel dataMD5 >>=
>> liftIO . print >>
>> readChannel compressedMD5 >>=
>> liftIO . print]
> (CHP) Thread terminated with: thread blocked indefinitely in an STM
> < _b3, _b4, File GZ."test1.gz" >
Where you have "readChannel dataMD5" and "readChannel compressedMD5" in
the last few lines, you actually meant to have "readChannel (reader
md5)" and "readChannel (reader md5Compressed)". Your mistake meant that
the former two channels were being used more times in parallel than you
had enrolled and that the latter two channels were being written to but
not read from. Either of these mistakes could cause deadlock, which is
why you were getting a strange deadlock. Unfortunately, the type system
didn't save you this time, because the channel types happened to be the
same. It took me a while to find it, too!
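For concreteness, the corrected final entry of the runParallel_ list might look like the sketch below. It is only a fragment, it reuses the names from the quoted pipeline3 unchanged, and it has not been run against the rest of the program:

```haskell
-- Replacement for the last process in the runParallel_ list of pipeline3.
-- It reads the digests from the md5 and md5Compressed channels, which
-- previously had writers but no readers, and leaves dataMD5 and
-- compressedMD5 to the two calculateMD5 processes that are enrolled on them.
forever $ readChannel (reader md5) >>=
          liftIO . print >>
          readChannel (reader md5Compressed) >>=
          liftIO . print
```

With this change each enrolled reader end is used by exactly one process, and both result channels have a reader as well as a writer.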
On a side note, it would be good to have a static check for these
mistakes (using a channel in parallel unsafely, and only using one end
of a channel), but the only way I found to use Haskell's type-system for
this is a rather nasty type-indexed monad. I guess if you use
newChannelRW and name both the results, you would get an unused variable
warning if you left one end of the channel unused. This would fix one
issue, but not the other.
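A minimal sketch of the unused-variable idea follows. It assumes the chp package is installed and that compiler warnings are switched on with -Wall. The module name, the function name writeOnly and the Int payload are made up for illustration, and the type annotation is there only to pin down the channel-end types:

```haskell
{-# OPTIONS_GHC -Wall #-}
module UnusedEnd (writeOnly) where

import Control.Concurrent.CHP

-- Hypothetical process that creates a channel but only uses the writing end.
-- Because both ends are bound by name, GHC should report that r is
-- defined but not used, which points at the forgotten reading end.
writeOnly :: CHP ()
writeOnly = do
  (r, w) <- newChannelRW :: CHP (Chanin Int, Chanout Int)
  writeChannel w 42
```

This is meant to be compiled rather than run: at run time the write would block, since nothing ever reads from the channel. As noted above, the warning does nothing for the other mistake, using one channel end from more parallel processes than were enrolled.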
Hope that helps,