[Haskell-cafe] [Very long] (CHP?) Compressing, MD5 and big files

Neil Brown nccb2 at kent.ac.uk
Tue Jan 5 11:39:19 EST 2010


Hi,

Sorry for the slightly delayed reply -- I didn't have time to look 
through all your code and understand it until just now.  Your code has 
one (no doubt frustratingly!) small problem, which is in the deadlocking 
pipeline3:

Maciej Piechotka wrote:
>> pipeline3 :: CHP ()
>> pipeline3 = enrolling $ do
>>   file <- oneToManyChannel' $ chanLabel "File"
>>   fileGZ <- oneToOneChannel' $ chanLabel "File GZ"
>>   data_ <- oneToManyChannel' $ chanLabel "Data"
>>   compressed <- oneToManyChannel' $ chanLabel "Data Compressed"
>>   md5 <- oneToOneChannel' $ chanLabel "MD5"
>>   md5Compressed <- oneToOneChannel' $ chanLabel "MD5 Compressed"
>>   fileGZ' <- Enroll (reader file)
>>   fileData <- Enroll (reader file)
>>   dataMD5 <- Enroll (reader data_)
>>   dataCompress <- Enroll (reader data_)
>>   compressedFile <- Enroll (reader compressed)
>>   compressedMD5 <- Enroll (reader compressed)
>>   liftCHP $ runParallel_ [getFiles (writer file),
>>                           (forever $ readChannel fileGZ' >>=
>>                                      writeChannel (writer fileGZ) . 
>>                                      (++".gz"))
>>                           `onPoisonRethrow`
>>                           (poison fileGZ' >> poison (writer fileGZ)),
>>                           readFromFile fileData (writer data_),
>>                           calculateMD5 dataMD5 (writer md5),
>>                           compressCHP dataCompress
>>                                       (writer compressed),
>>                           writeToFile (reader fileGZ) compressedFile,
>>                           calculateMD5 compressedMD5
>>                                        (writer md5Compressed),
>>                           forever $ readChannel dataMD5 >>=
>>                                     liftIO . print >>
>>                                     readChannel compressedMD5 >>= 
>>                                     liftIO . print]
>>     
>
> Problems:
>
> (CHP) Thread terminated with: thread blocked indefinitely in an STM
> transaction
> < _b3, _b4, File GZ."test1.gz" >
>   
Where you have "readChannel dataMD5" and "readChannel compressedMD5" in 
the last few lines, you actually meant to have "readChannel (reader 
md5)" and "readChannel (reader md5Compressed)".  Your mistake meant that 
the former two channels were being used more times in parallel than you 
had enrolled and that the latter two channels were being written to but 
not read from.  Either of these mistakes could cause deadlock, which is 
why you were getting a strange deadlock.  Unfortunately, the type system 
didn't save you this time, because the channel types happened to be the 
same.  It took me a while to find it, too!
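
To make the point about identical channel types concrete, here is a
minimal sketch that uses plain MVars from base in place of CHP channels.
Everything in it is an assumption for illustration: digestStage is a
hypothetical stand-in for calculateMD5 (it returns the length of its
input, not a real MD5), and an MVar is only a rough analogue of a CHP
channel.  It shows the intended wiring, with the printer reading the
digest channel rather than the data channel.

```haskell
module Main where

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)

-- Hypothetical stand-in for calculateMD5: reads one chunk of data and
-- writes a "digest" (here just the length, not a real MD5).
digestStage :: MVar String -> MVar String -> IO ()
digestStage input output = do
  chunk <- takeMVar input
  putMVar output (show (length chunk))

main :: IO ()
main = do
  dataChan <- newEmptyMVar   -- carries file contents
  md5Chan  <- newEmptyMVar   -- carries digests; same type as dataChan
  _ <- forkIO (putMVar dataChan "hello")
  _ <- forkIO (digestStage dataChan md5Chan)
  -- Correct wiring: the printer reads the digest channel.  Writing
  -- `takeMVar dataChan` here would still type-check, because both
  -- channels carry String, but the printer would then compete with
  -- digestStage for the data and md5Chan might never be read.
  digest <- takeMVar md5Chan
  putStrLn digest
```

With the mistaken wiring, GHC's runtime would typically report a thread
blocked indefinitely in an MVar operation, which is the MVar counterpart
of the STM message quoted above.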

On a side note, it would be good to have a static check for these 
mistakes (using a channel in parallel unsafely, and only using one end 
of a channel), but the only way I found to use Haskell's type-system for 
this is a rather nasty type-indexed monad.  I guess if you use 
newChannelRW and name both the results, you would get an unused variable 
warning if you left either end of the channel unused.  That would catch 
the second mistake (using only one end), but not the first.
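
The naming idea can be sketched without CHP as follows.  This is only an
illustration under stated assumptions: ReadEnd, WriteEnd, newChanRW,
readEnd and writeEnd are hypothetical names built on an MVar, standing in
for the pair of ends that newChannelRW returns, and the digest string is
a placeholder.

```haskell
{-# OPTIONS_GHC -Wall #-}
module Main where

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)

-- Hypothetical stand-ins for the two ends of a channel.
newtype ReadEnd a  = ReadEnd (MVar a)
newtype WriteEnd a = WriteEnd (MVar a)

-- Hypothetical analogue of newChannelRW: returns both ends as a pair.
newChanRW :: IO (ReadEnd a, WriteEnd a)
newChanRW = do
  v <- newEmptyMVar
  return (ReadEnd v, WriteEnd v)

readEnd :: ReadEnd a -> IO a
readEnd (ReadEnd v) = takeMVar v

writeEnd :: WriteEnd a -> a -> IO ()
writeEnd (WriteEnd v) = putMVar v

main :: IO ()
main = do
  -- Both ends get a name, so leaving one out below is visible to GHC.
  (md5R, md5W) <- newChanRW
  _ <- forkIO (writeEnd md5W "placeholder-digest")
  digest <- readEnd md5R
  putStrLn digest
```

If the last two lines of main were deleted, md5R would be bound but never
used, and compiling with -Wall should flag it as defined but not used.
The check is purely syntactic, though: it says nothing about one end
being used by too many processes in parallel.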

Hope that helps,

Neil.

