Robert Dockins robdockins at fastmail.fm
Sat Apr 8 13:21:06 EDT 2006

```On Apr 8, 2006, at 4:24 AM, Bulat Ziganshin wrote:

> Hello Daniel,
>
> Saturday, April 8, 2006, 4:21:14 AM, you wrote:
>
>> Unless I overlooked something, I use foldBits only via size
>> (though that's
>> used a lot).
>
> size of set? there is much faster method - use a table
>
> [0..255] -> number of bits in this number seen as set

Or:

bitcount :: Word -> Int
bitcount 0 = 0
bitcount x = 1 + bitcount (x .&. (x-1))

-- | /O(1)/. The number of elements in the set.
size :: Set a -> Int
size (Set w) = bitcount w

Taking a look at the generated core (with -O2) , bitcount gets
unboxed the way I'd expect, so this might do the trick.

> then we split Word to the bytes and count total size of set
> by adding number of bits set in each byte
>
> foldBits can be made faster (may be) by adding strict annotations:
>
> foldBits :: Bits c => (a -> Int -> a) -> a -> c -> a
> foldbits _ z bs | z `seq` bs `seq` False  = undefined
>
> foldBits' :: Bits c => (a -> Int -> a) -> Int -> c -> a -> a
> foldbits' _ i bs z | i `seq` bs `seq` z `seq` False  = undefined
>
> moreover, GHC don't inline recursive functions! so foldbits' is out of
> luck and it seems that GHC generates polymorphic version that is of
> course very-very slow. what you can do?
>
> 1. use SPECIALIZE pragma. this allow to make faster version at least
> for typical cases (a=c=Int, for example)
>
> 2. use recursion on the internal foldbits' function. may be this will
> help to inline and therefore specialize each call to foldbits'. it's
>
> --
> Best regards,
>  Bulat                            mailto:Bulat.Ziganshin at gmail.com
>
> _______________________________________________