[Haskell-cafe] network package and SIGVTALRM

Andrey Sverdlichenko blaze at ruddy.ru
Tue Jul 12 15:40:33 UTC 2016


Hi Ruben,

This signal is used internally by Haskell IO manager, nothing is wrong with
it.

What I see in the dump is that 198.255.92.74 is announcing window of size 0
from the beginning and tapioca ends up sending 0 window too. Are you sure
they both reading what other one sends to them? Given that your code only
fails on large files, I guess sender never reads this 4-byte status
messages receiver sends to it, and processes deadlock after socket buffer
is filled up.

As a side node, consider using putWord32be from binary or cereal packages
instead of serializing data yourself.
On Tue, 12 Jul 2016 at 03:35, Ruben Astudillo <ruben.astud at gmail.com> wrote:

> Hi all
>
> I am doing a DCC subsystem on a irc client. After all the handshakes are
> done I just connect to the server and start `recv`. The code I use for
> this is:
>
>      getPackets :: MVar Int
>                 -> FilePath -- ^ Name media
>                 -> Int      -- ^ File size
>                 -> AddrInfo -> ExceptT DCCError IO ()
>      getPackets mvar name totalSize addr =
>        do receivedSize <- lift $ bracket acquire release receive
>           let delta = (totalSize - receivedSize)
>           if delta > 0 then throwE (NotFullRecv delta) else return ()
>        where
>          bufferSize = 16384
>
>          acquire :: IO (IO.Handle,Socket)
>          acquire = (,) <$> (IO.openFile name IO.WriteMode)
>                        <*> newSocket addr
>
>          release :: (IO.Handle,Socket) -> IO ()
>          release (hdl, sock) = IO.hClose hdl >> close sock
>
>          receive :: (IO.Handle,Socket) -> IO Int
>          receive (hdl, sock) =
>              flip execStateT 0 . fix $ \loop -> do
>                  mediaData <- lift (B.recv sock bufferSize)
>                  unless (B.null mediaData) $ do
>                      S.modify' (+ (B.length mediaData))
>                      currentSize <- S.get
>                      lift $ B.hPut hdl mediaData
>                             >> B.send sock (int2BS currentSize)
>                             >> swapMVar mvar currentSize
>                      loop
>
> Part of the protocol is that on each `recv` I send the current received
> size on network byte order. Hence the B.send line on receive I use this
> function:
>
>      -- | given a number forms a bytestring with each digit on a separated
>      -- Word8 in network byte-order
>      int2BS :: Int -> B.ByteString
>      int2BS i | w <- (fromIntegral i :: Word32) =
>          B.pack [ (fromIntegral (shiftR w 24) :: Word8)
>                , (fromIntegral (shiftR w 16) :: Word8)
>                , (fromIntegral (shiftR w  8) :: Word8)
>                , (fromIntegral w             :: Word8)]
>
> Everything works correctly until around 1/2 of a test transfer (ie in a
> file of 340M it gets 170). That first half is gotten in the right order
> (I tested with a video and it was playable until the middle). On tinier
> files the bug doesn't happen, the file is received completly. I did a
> little bit of `strace` and `tcpdump` and I got this
>
> -- strace -e trace=network -p $client
>      (..)
>      30439 recvfrom(13,
>
> "\312\255\201\337\376\355\253\r\177\276\204X]8\6\221\301#\361<>\273+\355\5\343B
> \333\366\351W"..., 16384, 0, NULL, NULL) = 1380
>      30439 sendto(13, "\n\273\31l", 4, 0, NULL, 0) = 4
>      30439 recvfrom(13, 0x20023f010, 16384, 0, NULL, NULL) = -1 EAGAIN
> (Resource temporarily unavailable)
>      30439 recvfrom(13,
>
> "\222llq_H\23\17\275\f}\367\"P4\23\207\312$w\371J\354aW2\243R\32\v\n\251"...,
> 16384, 0, NULL, NULL) = 1380
>      30439 sendto(13, "\n\273\36\320", 4, 0, NULL, 0) = -1 EAGAIN (Resource
> temporarily unavailable)
>      30438 --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TIMER,
> si_timerid=0, si_overrun=0, si_value={int=0, ptr=0}} ---
>      30438 --- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TIMER,
> si_timerid=0, si_overrun=0, si_value={int=0, ptr=0}} ---
>      (..)
>
> -- tcpdump
>      05:20:22.788273 IP tapioca.36346 > 198.255.92.74.36103: Flags [.], ack
> 48004680, win 489, options [nop,nop,TS val 627805332 ecr
> 675358947,nop,nop,sack 1 {48006060:48066780}], length 0
>      05:20:22.975627 IP 198.255.92.74.36103 > tapioca.36346: Flags [.], seq
> 48004680:48006060, ack 82629, win 0, options [nop,nop,TS val 675359033 ecr
> 627805248], length 1380
>      05:20:23.014991 IP tapioca.36346 > 198.255.92.74.36103: Flags [.], ack
> 48066780, win 4, options [nop,nop,TS val 627805559 ecr 675359033], length 0
>      05:20:23.768012 IP 198.255.92.74.36103 > tapioca.36346: Flags [P.],
> seq 48066780:48067292, ack 82629, win 0, options [nop,nop,TS val 675359232
> ecr 627805559], length 512
>      05:20:23.768143 IP tapioca.36346 > 198.255.92.74.36103: Flags [.], ack
> 48067292, win 0, options [nop,nop,TS val 627806312 ecr 675359232], length 0
>      05:20:24.523397 IP 198.255.92.74.36103 > tapioca.36346: Flags [.], ack
> 82629, win 0, options [nop,nop,TS val 675359421 ecr 627806312], length 0
>
> What bothers me is that SIGVTALRM on the strace output. I am not the
> greatest unix hacker but that signal is related to settimer and I haven't
> explicitly set that up. So I am scratching me head a little. Maybe
> somebody has experienced something related with the network package? Do
> you notice something on the logs? thanks in advance.
>
> --
> -- Ruben Astudillo
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/haskell-cafe/attachments/20160712/dc82073d/attachment.html>


More information about the Haskell-Cafe mailing list