[Haskell-cafe] Lazy vs Strict ByteStrings

Dimitri DeFigueiredo defigueiredo at ucdavis.edu
Wed Mar 15 20:57:44 UTC 2017


I think I need to be more specific here.

My gripe with ByteString is that there are two distinct ByteString types
disguised as one. They are “disguised” simply by having the same name.
This leads me to think the use of ByteString will be simpler than it
usually turns out to be. I fear Backpack may soon make this worse.

Obviously, this is not exclusive to ByteString. Using the same name for
strict and lazy versions is all over the place. Here is another example:

https://hackage.haskell.org/package/unordered-containers/docs/Data-HashMap-Lazy.html
https://hackage.haskell.org/package/unordered-containers/docs/Data-HashMap-Strict.html

But the problem is especially insidious with ByteStrings because of their
usage patterns and also because the process of finding out which version
is which through GHC error messages is very inefficient.
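
To make the name clash concrete, here is a minimal sketch that assumes only
the `bytestring` package. The names `Strict`, `Lazy`, `roundTrip` and
`strictLength` are made up for illustration. Importing both modules
qualified and giving each type its own alias at least makes the signatures
say which one is meant:

```haskell
import qualified Data.ByteString      as BS
import qualified Data.ByteString.Lazy as BSL

-- Hypothetical aliases: both modules export a type called ByteString,
-- so distinct local names keep signatures unambiguous.
type Strict = BS.ByteString
type Lazy   = BSL.ByteString

-- Explicit conversions: fromStrict wraps the value as a single chunk,
-- toStrict collects all chunks into one strict buffer.
roundTrip :: Strict -> Strict
roundTrip = BSL.toStrict . BSL.fromStrict

-- Length of a lazy value after forcing it into one strict ByteString.
strictLength :: Lazy -> Int
strictLength = BS.length . BSL.toStrict
```

Passing a `Lazy` where a `Strict` is expected is still a type error, but
the message then refers to two differently qualified names.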

Here is a very ugly example where I believe I was misled by both
ByteString types having the same name.

I needed to make a “random looking but secretly not random” number and
package it up in a 128-bit UUID. In other words, I wanted to mark the
random numbers I create, but in a way that nobody else knows they are mine.

The `hash` function from the cryptonite library accepts ByteString and
its output can be converted to ByteString.
The UUID type also has a function `fromByteString`.
What could be easier?

Here’s what I ended up with:

import           System.Random
import           Data.Maybe
import           Data.Word

import           Data.UUID
import           Crypto.Hash
import           Crypto.Hash.Algorithms

import           Data.ByteArray
import           Data.ByteString.Builder
import qualified Data.ByteString            as BS
import           Data.ByteString.Lazy       as BSL

-- Ugly! Step-by-step
markAsSelf :: Word64 -> Word64 -> UUID
markAsSelf selfDetectKey rand =
  let selfKeyAsLazyBS   = toLazyByteString (word64BE selfDetectKey)
      randAsLazyBS      = toLazyByteString (word64BE rand)
      hashInput         = BSL.concat [randAsLazyBS, selfKeyAsLazyBS]
      digest            = hash (toStrict hashInput) :: Digest SHA256
      hashBitsStrict    = BS.take 8  (convert digest)
      halfAndHalf       = BSL.concat [randAsLazyBS, fromStrict hashBitsStrict]
   in fromJust (fromByteString halfAndHalf)


secret :: Word64
secret = 12345

main = do
  r <- randomIO
  print (markAsSelf secret r)

--------- end ---------

In Data.ByteString.Builder

toLazyByteString   :: Builder -> ByteString -- exists
toStrictByteString :: Builder -> ByteString -- was not found

Similarly, in Data.UUID

fromByteString :: ByteString -> Maybe UUID  -- only exists for lazy

I think this lack of a common interface makes the code above a type
conversion disaster and the process of debugging it painfully
inefficient. Is there a better approach to go about doing this?
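
One possible way to cut down on the conversions, offered as a sketch only:
`Builder` is a `Monoid`, so both words can be appended before anything is
materialised, and the strict conversion can be written by hand. The names
`builderToStrict` and `packPair` are hypothetical; only `toLazyByteString`,
`word64BE` and `toStrict` are taken from the `bytestring` package.

```haskell
import           Data.ByteString.Builder (Builder, toLazyByteString, word64BE)
import qualified Data.ByteString         as BS
import qualified Data.ByteString.Lazy    as BSL
import           Data.Word               (Word64)

-- Hypothetical helper standing in for the strict conversion
-- that was not found in Data.ByteString.Builder.
builderToStrict :: Builder -> BS.ByteString
builderToStrict = BSL.toStrict . toLazyByteString

-- Serialise two words big-endian into one 16-byte lazy ByteString,
-- appending Builders instead of concatenating ByteStrings.
packPair :: Word64 -> Word64 -> BSL.ByteString
packPair a b = toLazyByteString (mconcat [word64BE a, word64BE b])
```

With these, `hashInput` in `markAsSelf` could be written as
`packPair rand selfDetectKey`. The UUID step would still need a lazy value,
so this reduces the conversions rather than removing them.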


Cheers,


Dimitri


-- 
2E45 D376 A744 C671 5100 A261 210B 8461 0FB0 CA1F



