Fast Mutable arrays and a general question about IO

Garner, Robin Robin.Garner@crsrehab.gov.au
Sun, 4 May 2003 12:46:29 +1000


This message is in MIME format. Since your mail reader does not understand
this format, some or all of this message may not be legible.

------_=_NextPart_001_01C311E7.5CFE67C0
Content-Type: text/plain;
	charset="iso-8859-1"

Here's an example of mine, that uses a hash table to construct a word
frequency table.  By no means is this great code - for example the hash
table filling up will lead to an infinite loop ...

This function produces a word frequency list using an intermediate hash
table to accumulate the counts.

> makemap :: [String] -> [(String,Int)]
> makemap l  = filter ((> 0) . snd) $
>	runST (emptyHT >>= flip (foldM insertHT) l >>= getElems)

The hash table structure, an array.ST

> maxHash = 10000 ::Int
> 
> type HT s = STArray s Int (String,Int)
> 
> emptyHT :: ST s (HT s)
> emptyHT = newArray (0,maxHash) ("",0)

Update the hash table with a single entry

> insertHT :: HT s -> String -> ST s (HT s)
> insertHT h s = findSlot s h (hash s) >>= updateSlot h s >> return h

Finds the correct slot in the hash table, returning the slot index
and the old count.

> findSlot :: String -> HT s -> Int -> ST s (Int,Int)
> findSlot s h i = do (s',c) <- readArray h i
>                     if c == 0 || s == s'
>                        then return (i,c)
>                        else findSlot s h ((i+1) `mod` maxHash)
> 
> updateSlot :: HT s -> String -> (Int,Int) -> ST s ()
> updateSlot h s (i,n) = writeArray h i (s,n+1)

Simple (simplistic ?) hash function

> hash :: String -> Int
> hash = (`mod` maxHash) . foldl' hash' 0
>   where hash' acc c = acc * 2 + ord c


------_=_NextPart_001_01C311E7.5CFE67C0
Content-Type: text/html;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV=3D"Content-Type" CONTENT=3D"text/html; =
charset=3Diso-8859-1">
<META NAME=3D"Generator" CONTENT=3D"MS Exchange Server version =
5.5.2654.89">
<TITLE>RE: Fast Mutable arrays and a general question about IO</TITLE>
</HEAD>
<BODY>

<P><FONT SIZE=3D2>Here's an example of mine, that uses a hash table to =
construct a word frequency table.&nbsp; By no means is this great code =
- for example the hash table filling up will lead to an infinite loop =
....</FONT></P>

<P><FONT SIZE=3D2>This function produces a word frequency list using an =
intermediate hash table to accumulate the counts.</FONT>
</P>

<P><FONT SIZE=3D2>&gt; makemap :: [String] -&gt; [(String,Int)]</FONT>
<BR><FONT SIZE=3D2>&gt; makemap l&nbsp; =3D filter ((&gt; 0) . snd) =
$</FONT>
<BR><FONT SIZE=3D2>&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; runST =
(emptyHT &gt;&gt;=3D flip (foldM insertHT) l &gt;&gt;=3D =
getElems)</FONT>
</P>

<P><FONT SIZE=3D2>The hash table structure, an array.ST</FONT>
</P>

<P><FONT SIZE=3D2>&gt; maxHash =3D 10000 ::Int</FONT>
<BR><FONT SIZE=3D2>&gt; </FONT>
<BR><FONT SIZE=3D2>&gt; type HT s =3D STArray s Int (String,Int)</FONT>
<BR><FONT SIZE=3D2>&gt; </FONT>
<BR><FONT SIZE=3D2>&gt; emptyHT :: ST s (HT s)</FONT>
<BR><FONT SIZE=3D2>&gt; emptyHT =3D newArray (0,maxHash) =
(&quot;&quot;,0)</FONT>
</P>

<P><FONT SIZE=3D2>Update the hash table with a single entry</FONT>
</P>

<P><FONT SIZE=3D2>&gt; insertHT :: HT s -&gt; String -&gt; ST s (HT =
s)</FONT>
<BR><FONT SIZE=3D2>&gt; insertHT h s =3D findSlot s h (hash s) =
&gt;&gt;=3D updateSlot h s &gt;&gt; return h</FONT>
</P>

<P><FONT SIZE=3D2>Finds the correct slot in the hash table, returning =
the slot index</FONT>
<BR><FONT SIZE=3D2>and the old count.</FONT>
</P>

<P><FONT SIZE=3D2>&gt; findSlot :: String -&gt; HT s -&gt; Int -&gt; ST =
s (Int,Int)</FONT>
<BR><FONT SIZE=3D2>&gt; findSlot s h i =3D do (s',c) &lt;- readArray h =
i</FONT>
<BR><FONT =
SIZE=3D2>&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp=
;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if c =
=3D=3D 0 || s =3D=3D s'</FONT>
<BR><FONT =
SIZE=3D2>&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp=
;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp=
;&nbsp; then return (i,c)</FONT>
<BR><FONT =
SIZE=3D2>&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp=
;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp=
;&nbsp; else findSlot s h ((i+1) `mod` maxHash)</FONT>
<BR><FONT SIZE=3D2>&gt; </FONT>
<BR><FONT SIZE=3D2>&gt; updateSlot :: HT s -&gt; String -&gt; (Int,Int) =
-&gt; ST s ()</FONT>
<BR><FONT SIZE=3D2>&gt; updateSlot h s (i,n) =3D writeArray h i =
(s,n+1)</FONT>
</P>

<P><FONT SIZE=3D2>Simple (simplistic ?) hash function</FONT>
</P>

<P><FONT SIZE=3D2>&gt; hash :: String -&gt; Int</FONT>
<BR><FONT SIZE=3D2>&gt; hash =3D (`mod` maxHash) . foldl' hash' =
0</FONT>
<BR><FONT SIZE=3D2>&gt;&nbsp;&nbsp; where hash' acc c =3D acc * 2 + ord =
c</FONT>
</P>

</BODY>
</HTML>
------_=_NextPart_001_01C311E7.5CFE67C0--