[Haskell-cafe] Fast code question

Michael Snoyman michael at snoyman.com
Thu Jun 4 17:38:53 EDT 2009


I *do* know what Packed Decimal is; at my previous job, I actually had a
whole Haskell library for parsing them. The only immediate suggestion that
pops to mind is to use Int instead of Integer (Int uses regular 32- or
64-bit integers, Integer uses arbitrary precision integers). If you send me
a sample Packed Decimal file, I can test out your code and get a better feel
for it that way.

Good luck with those mainframes, they can be beasts sometimes. Have you had
to parse EBCDIC yet? *That* was fun, manually copying all those character
codes from some IBM web page... ;)

Michael

On Fri, Jun 5, 2009 at 2:31 AM, Bartosz Wójcik <bartek at sudety.it> wrote:

> Hi Folks,
>
> I had to transform Packed Decimal file into csv format (does anybody here
> know
> what this Mainframe format is?).  Because of the file size I could not do
> this on mainframe directly. So I've created simply program using
> ByteString.
> Generally I'm happy with my solution: pgm processes arroud 2MB/s on my pc,
> so
> my 3GB file was transformed in reasonable 30 min time.
>
> My question is: how to do this faster?
>
> {code}
> module Main
> where
>
> import qualified Data.ByteString.Lazy as B
>
> main =  B.getContents >>= myPrint . myConcat . B.unpack
>
> myConcat = myConcat' 0
>
> myConcat' :: (Integral a) => Integer -> [a] -> [Integer]
> myConcat'  _  [] = []
> myConcat' acc (x:xs) = case x `mod` 16 of
>                12 -> (10*acc + (restOf . fromIntegral) x) : myConcat' 0 xs
>                13 -> ((-10)*acc + (restOf . fromIntegral) x) : myConcat' 0
> xs
>                15 -> (10*acc + (restOf . fromIntegral) x) : myConcat' 0 xs
>                10 -> fail $ show acc
>                11 -> fail $ show acc
>                14 -> fail $ show acc
>                 v  -> myConcat' (100*acc + (numberOf . fromIntegral) x) xs
>         where restOf n = (n - 12) `div` 16
>               numberOf n = n - 6 * (n `div` 16)
>
> myPrint [] = return ()
> myPrint xs = mapM_ myShow (take 14 xs) >> putStrLn "" >> myPrint (drop 14
> xs)
>
> myShow x = (putStr . show) x >> putStr ";"
> {code}
>
> I knew that csv output had to be 14 fields per line.
>
> Best,
> Bartek
>
>
> _______________________________________________
> Haskell-Cafe mailing list
> Haskell-Cafe at haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.haskell.org/pipermail/haskell-cafe/attachments/20090604/71bcf87d/attachment.html


More information about the Haskell-Cafe mailing list