[Haskell-cafe] Re: SHA1 again

Malte Milatz malte at gmx-topmail.de
Sun Jul 15 16:26:43 EDT 2007


Dominic Steinitz:
> Malte Milatz:
> > 	http://hpaste.org/1695#a2
> > 
> > It performs better than the SHA1 algorithm in Crypto:  It is faster by a
> > factor of approximately e.   It is also competitive (regarding time)
> > with the »unsafe« SHA1 implementation posted here some days ago,
> 
> Great. Can you post some timings?

On a 12 MB file I get


	   malte +RTS -prof -RTS

	total time  =        9.80 secs   (196 ticks @ 50 ms)
	total alloc = 3,764,831,396 bytes

	
	   anatoly +RTS -prof -RTS

	total time  =       14.35 secs   (287 ticks @ 50 ms)
	total alloc = 1,773,131,228 bytes


	   crypto +RTS -prof -RTS

	total time  =       24.55 secs   (491 ticks @ 50 ms)
	total alloc = 8,703,970,360 bytes 


> > although it uses considerably more memory.  Of course, you may safely
> 
> Does it have a space leak or does it run in constant but larger space than the
> other versions?

I haven't discovered a space leak so far.  When profiling with -S, the
first column holds fairly constant values.

> > forget it if you're interested in performance near a C implementation
> > such as GNU sha1sum.
> I don't think it's unreasonable to think we could get near to C performance and
> we've been getting closer.

Hm, but I suppose, if we want the code to remain in its pure style, that
will mean that someone of the more clever Haskell people has to invent
again a low-level library that improves performance behind the curtains.
On the other hand, we can still hope for someone to find a better
algorithm.

> PS I had a 5 minute (well actually less) look at the code and it doesn't look
> dissimilar to other versions I've seen in Haskell. I wonder if the speedup is
> principally because of ByteString.

Before I started tweaking (with the help of the IRC channel), I had read
the input in as a ByteString, then unpacked it to [Word8] for further
processing.  That version can be found on top of the same hpaste page.
It delivers:

	   malte_old +RTS -prof -RTS

	total time  =       15.95 secs   (319 ticks @ 50 ms)
	total alloc = 5,844,911,772 bytes

This is still better than crypto.  (By the way, I'm really astonished by
this.  I translated the SHA1 specification into Haskell, and as it seems
so far, I'm better than a library.  That's worth a party!)

> PPS We have to put splitn in a library.

Yea.  It's disappointing not to find it in Data.List.  However, splitByN
might yet be a better name.

Regards, Malte



More information about the Haskell-Cafe mailing list