From folsk0pratima at cock.li Fri Apr 12 05:58:19 2024
From: folsk0pratima at cock.li (Folsk Pratima)
Date: Fri, 12 Apr 2024 08:58:19 +0300
Subject: [Haskell-beginners] FFI to POSIX libc; strerror (errnum); unfreed memory
Message-ID: <20240412085819.6862f051@folsk0pratima.cock.li>

I need to turn POSIX errno to the error string.

StrError.hs

import Foreign.C.String
import Foreign.C.Types

foreign import ccall unsafe "strerror" c_strerror :: CInt -> IO CString

strError :: Int -> IO String
strError i = do
    cstr <- c_strerror ci
    peekCAString cstr
  where
    ci = fromIntegral i

main = mapM_ (>>= putStrLn) $ map strError [1 .. 450]


$ ghc -g StrError.hs
$ valgrind --leak-check=full --show-leak-kinds=all ./StrError
==2350==
==2350== HEAP SUMMARY:
==2350==     in use at exit: 18 bytes in 1 blocks
==2350==   total heap usage: 685 allocs, 684 frees, 97,998 bytes allocated
==2350==
==2350== 18 bytes in 1 blocks are still reachable in loss record 1 of 1
==2350==    at 0x48407B4: malloc (vg_replace_malloc.c:381)
==2350==    by 0x496A427: __vasprintf_internal (vasprintf.c:71)
==2350==    by 0x493DBD5: asprintf (asprintf.c:31)
==2350==    by 0x498A9F0: strerror_l (strerror_l.c:45)
==2350==    by 0x408BC0: ??? (StrError.hs:8)
==2350==
==2350== LEAK SUMMARY:
==2350==    definitely lost: 0 bytes in 0 blocks
==2350==    indirectly lost: 0 bytes in 0 blocks
==2350==      possibly lost: 0 bytes in 0 blocks
==2350==    still reachable: 18 bytes in 1 blocks
==2350==         suppressed: 0 bytes in 0 blocks
==2350==
==2350== For lists of detected and suppressed errors, rerun with: -s
==2350== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)


I am already frustrated with having to use IO for this. I feel like it
is better to manually create a pure solution by copying whatever the C
code from libc does.

But I still do not understand what causes this memory to remain
reachable. Calling strerror from pure C code does not leave reachable
memory regions.

Can anyone explain?
And maybe you have a suggestion as to how to implement this strError
function properly?

Just in case you are the type of a person to inquire as to *why* I need
this, for fun. I need it for fun.

From sylvain at haskus.fr Fri Apr 12 08:59:44 2024
From: sylvain at haskus.fr (Sylvain Henry)
Date: Fri, 12 Apr 2024 10:59:44 +0200
Subject: [Haskell-beginners] FFI to POSIX libc; strerror (errnum); unfreed memory
In-Reply-To: <20240412085819.6862f051@folsk0pratima.cock.li>
References: <20240412085819.6862f051@folsk0pratima.cock.li>
Message-ID: <5af554d0-3e5f-4939-aa31-00e14c4eb817@haskus.fr>

Hi,

I get exactly this valgrind profile with the following C code:

#include <stdio.h>
#include <string.h>

int main() {
    for (int i=0; i<=450; i++) {
        printf("%s\n", strerror(i));
    }
}

So there isn't anything Haskell specific here. Looking at glibc
sources, there is a "strerror_l_buf" buffer kept in the TLS
(https://elixir.bootlin.com/glibc/latest/source/string/strerror_l.c#L45)
so that's expected.

Use strerror_r instead to use a buffer you manage explicitly (e.g.
with allocaBytes).

Sylvain

PS: are you running valgrind on your Haskell program? A valgrind
profile for a Haskell program has many more entries. E.g.
for a program compiled with GHC 9.6.4:

==36067== 2,097,152 bytes in 1 blocks are still reachable in loss record 21 of 21
==36067==    at 0x4843788: malloc (vg_replace_malloc.c:442)
==36067==    by 0x2AD470: stgMallocBytes (in /home/hsyl20/projects/ghc/scratch/strerror/Test)
==36067==    by 0x2B96D8: initEventLogging (in /home/hsyl20/projects/ghc/scratch/strerror/Test)
==36067==    by 0x2B7CFE: initTracing (in /home/hsyl20/projects/ghc/scratch/strerror/Test)
==36067==    by 0x2A60AE: hs_init_ghc (in /home/hsyl20/projects/ghc/scratch/strerror/Test)
==36067==    by 0x2A56F0: hs_main (in /home/hsyl20/projects/ghc/scratch/strerror/Test)
==36067==    by 0x22D725: main (in /home/hsyl20/projects/ghc/scratch/strerror/Test)

From folsk0pratima at cock.li Fri Apr 12 10:39:27 2024
From: folsk0pratima at cock.li (Folsk Pratima)
Date: Fri, 12 Apr 2024 10:39:27 -0000
Subject: [Haskell-beginners] FFI to POSIX libc; strerror (errnum); unfreed memory
In-Reply-To: <5af554d0-3e5f-4939-aa31-00e14c4eb817@haskus.fr>
References: <20240412085819.6862f051@folsk0pratima.cock.li>
 <5af554d0-3e5f-4939-aa31-00e14c4eb817@haskus.fr>
Message-ID: <20240412103927.5447c9f7@folsk0pratima.cock.li>

On Fri, 12 Apr 2024 10:59:44 +0200
Sylvain Henry wrote:

Thank you. After getting some of my C hello worlds to have all heap
blocks freed at exit, I somehow thought that strerror can't be the
culprit.

And yes, I run valgrind on the Haskell program. No idea why so little
output, I used the -g flag.
I do not really understand profiling, so I do not know what you mean by
"A valgrind profile for a Haskell program".

From folsk0pratima at cock.li Fri Apr 12 15:27:57 2024
From: folsk0pratima at cock.li (Folsk Pratima)
Date: Fri, 12 Apr 2024 15:27:57 -0000
Subject: [Haskell-beginners] FFI. unIO safety considerations
Message-ID: <20240412152757.6a3ba4d6@folsk0pratima.cock.li>

Also see
[Haskell-beginners] FFI to POSIX libc; strerror (errnum); unfreed memory

Exactly how unsafe is this code? What do I do about it? It is so very
stupid to have a pure function strerror_r in the IO monad!

StrError.hs


import Control.Exception
import Control.Monad.ST
import Control.Monad.ST.Unsafe
import Data.Word
import Foreign.C.String
import Foreign.C.Types
import Foreign.Marshal
import Foreign.Ptr
import Foreign.Storable

-- strerror_r_portable is the wrapper defined in strerror_portable.c below
foreign import ccall unsafe "strerror_r_portable"
    c_strerror_r :: CInt -> Ptr CChar -> CSize -> IO CInt

-- fill the first `size` bytes at ptr with `byte`
memSet :: Ptr a -> Word8 -> Word32 -> IO (Ptr a)
memSet ptr _ 0 = return ptr
memSet ptr byte size = do
    let voidptr = castPtr ptr :: Ptr Word8
        acts =
            map
                (\i -> pokeByteOff voidptr i byte)
                [0 .. fromIntegral (size - 1)]
    mapM_ id acts
    return ptr

strError :: Int -> String
strError errnum = runST $ unsafeIOToST ioString
  where
    buflen = 512 :: CSize
    ioString =
        allocaBytes (fromIntegral buflen) $ \ptr -> do
            _ <- memSet ptr 0 (fromIntegral buflen)
            code <- c_strerror_r (fromIntegral errnum) ptr buflen
            -- heuristic! 22 is EINVAL and 34 is ERANGE on Linux
            case code of
                22 -> return $ "Unknown error " ++ show errnum
                -- this is very dangerous, as far as I understand, and
                -- nobody in his right mind should do it if he later
                -- intends to unsafely escape IO
                34 ->
                    throwIO $
                    userError $
                    "strError: " ++
                    show code ++
                    ": ERANGE: " ++
                    "Numerical result out of range: " ++
                    "this is an internal error, which means not enough space " ++
                    "was allocated to store the error. You can not recover"
                _ -> peekCAString ptr

main = mapM_ (putStrLn) $ map strError [1 .. 1000]




strerror_portable.c


#define _POSIX_C_SOURCE 200112L
#include <string.h>

int
strerror_r_portable (int e, char *c, size_t s)
{
    return strerror_r (e, c, s);
}




The C code is needed because GHC uses _GNU_SOURCE, which I personally
do not want to use. Besides, I do not know how to predict whether GHC
will define _GNU_SOURCE or not, so this also feels more reliable.

To compile, do
$ mkdir lib
$ gcc -c -o lib/portable.o strerror_portable.c
$ ar -csr lib/portable.a lib/portable.o
$ ghc StrError.hs lib/portable.a

No leaks, of course.

From folsk0pratima at cock.li Sat Apr 13 14:38:45 2024
From: folsk0pratima at cock.li (Folsk Pratima)
Date: Sat, 13 Apr 2024 14:38:45 -0000
Subject: [Haskell-beginners] FFI. unIO safety considerations
In-Reply-To: <20240412152757.6a3ba4d6@folsk0pratima.cock.li>
References: <20240412152757.6a3ba4d6@folsk0pratima.cock.li>
Message-ID: <20240413143845.49711ba7@folsk0pratima.cock.li>

import Control.Exception
import Control.Monad.ST
import Control.Monad.ST.Unsafe
import Data.Word
import Foreign.C.String
import Foreign.C.Types
import Foreign.Marshal
import Foreign.Ptr
import Foreign.Storable

foreign import ccall unsafe "strerror_r_portable"
    c_strerror_r :: CInt -> Ptr CChar -> CSize -> IO CInt

memSet :: Ptr a -> Word8 -> Word32 -> IO (Ptr a)
memSet ptr _ 0 = return ptr
memSet ptr byte size = do
    let voidptr = castPtr ptr :: Ptr Word8
        acts =
            map
                (\i -> pokeByteOff voidptr i byte)
                [0 .. fromIntegral (size - 1)]
    mapM_ id acts
    return ptr

strError :: Int -> String
strError errnum = runST $ unsafeIOToST ioString
  where
    baseSize :: CSize
    baseSize = 50
    ioString = run baseSize
    -- retry with a larger buffer until the message fits
    run :: CSize -> IO String
    run size
        | size > 100000 =
            return $
            "!!! INTERNAL strError memory leak detected, " ++
            "you can not recover !!!"
        | otherwise = do
            may <- tryIOString size
            case may of
                Just str -> return str
                Nothing -> run (size + baseSize)
    tryIOString :: CSize -> IO (Maybe String)
    tryIOString size =
        allocaBytes (fromIntegral size) $ \ptr -> do
            _ <- memSet ptr 0 (fromIntegral size)
            st <- c_strerror_r (fromIntegral errnum) ptr size
            -- heuristic: 22 is EINVAL and 34 is ERANGE on Linux
            case st of
                22 -> return . return $ "Unknown error " ++ show errnum
                34 -> return Nothing
                _ -> peekCAString ptr >>= return . return

main = mapM_ (putStrLn) $ map strError [1 .. 1000]
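
Sylvain's advice to call strerror_r with a buffer the caller manages,
combined with the grow-and-retry idea from the last message, can also be
sketched directly in C. This is only a minimal sketch under stated
assumptions, not code from the thread: strerror_dup is a hypothetical
helper name, the 32-byte starting size and the 100000-byte cap are
arbitrary choices (the cap mirrors the Haskell version), and it relies on
the XSI strerror_r selected by _POSIX_C_SOURCE, as strerror_portable.c
does.

```c
#define _POSIX_C_SOURCE 200112L /* select the XSI strerror_r, as strerror_portable.c does */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not from the thread: returns a malloc'd message for
 * errnum, or NULL if allocation fails or strerror_r reports something
 * unexpected. The caller owns the result and must free() it. */
char *strerror_dup(int errnum)
{
    /* Assumed starting size: 32 bytes always hold "Unknown error " plus any int. */
    size_t size = 32;

    while (size <= 100000) { /* arbitrary cap, mirroring the Haskell version */
        char *buf = malloc(size);
        if (buf == NULL)
            return NULL;

        int rc = strerror_r(errnum, buf, size);
        if (rc == -1) /* glibc before 2.13 returned -1 and set errno */
            rc = errno;

        if (rc == 0)
            return buf;
        if (rc == EINVAL) { /* unknown error number: write our own text */
            snprintf(buf, size, "Unknown error %d", errnum);
            return buf;
        }
        free(buf);
        if (rc != ERANGE)
            return NULL;
        size *= 2; /* buffer too small: grow and try again */
    }
    return NULL;
}
```

A caller would write char *msg = strerror_dup(errno); and free(msg) once
done. Comparing against the named constants EINVAL and ERANGE from
errno.h avoids hard-coding 22 and 34, which are the Linux values.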