finding sublist
Garner, Robin
Robin.Garner@crsrehab.gov.au
Mon, 6 May 2002 14:15:55 +1000
This message is in MIME format. Since your mail reader does not understand
this format, some or all of this message may not be legible.
------_=_NextPart_001_01C1F4B4.B78BAF00
Content-Type: text/plain;
charset="iso-8859-1"
See Dijkstra's 'Discipline of Programming' for an o(M + N) algorithm - naive
approches are o(MN) where M and N are the length of the list and substring
respectively.
Essentially the algorithm calculates a sort of autocorrelation table of the
substring, showing where to resume the search after a failed match. For
example, if you're matching the substring [1,2,3,4], and fail on the last
comparison, you already know that there's no point in advancing one element
and attempting another match - you can actually start again at the element
that failed. When matching the string [1,2,1,3], you can resume at the
current element of the main string, and the second element of the search
string.
How you would do this in a functional implementation is another question -
Dijkstra's example is comparing two arrays, and there may be inefficiencies
translating it to a list-based implementation.
Cheers
-----Original Message-----
From: Serge D. Mechveliani [mailto:mechvel@botik.ru]
Sent: Thursday, 2 May 2002 17:37
To: haskell@haskell.org
Subject: finding sublist
Thanks to people who helped me with the task
>> Import two space separated columns of integers from file.
Claus Reinke <claus.reinke@talk21.com> recommends to exploit `lines'.
Indeed, it bocomes shorter now:
main = readFile "data" >>= (putStr . show . twoIntLists)
where
twoIntLists str = case span (not . null) $ dropWhile null $ lines str
of
(lns, lns') -> (readInts lns, readInts lns')
readInts = map (\ str -> read str :: Integer) . dropWhile null
Another question:
has the Haskell Standard library a function for such a usable task
as finding of occurence of a segment in a list? Say
findSegmentBy (...) [2,2,3] [0,0,2,2,1,2,2,3,1,2,3] -->
([0,0,2,2,1], [2,2,3,1,2,3])
I have heard, an efficient algorithm (for large lists) for this
is not so simple.
-----------------
Serge Mechveliani
mechvel@botik.ru
_______________________________________________
Haskell mailing list
Haskell@haskell.org
http://www.haskell.org/mailman/listinfo/haskell
------_=_NextPart_001_01C1F4B4.B78BAF00
Content-Type: text/html;
charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV=3D"Content-Type" CONTENT=3D"text/html; =
charset=3Diso-8859-1">
<META NAME=3D"Generator" CONTENT=3D"MS Exchange Server version =
5.5.2653.12">
<TITLE>RE: finding sublist</TITLE>
</HEAD>
<BODY>
<P><FONT SIZE=3D2>See Dijkstra's 'Discipline of Programming' for an o(M =
+ N) algorithm - naive approches are o(MN) where M and N are the length =
of the list and substring respectively.</FONT></P>
<P><FONT SIZE=3D2>Essentially the algorithm calculates a sort of =
autocorrelation table of the substring, showing where to resume the =
search after a failed match. For example, if you're matching the =
substring [1,2,3,4], and fail on the last comparison, you already know =
that there's no point in advancing one element and attempting another =
match - you can actually start again at the element that failed. =
When matching the string [1,2,1,3], you can resume at the current =
element of the main string, and the second element of the search =
string.</FONT></P>
<P><FONT SIZE=3D2>How you would do this in a functional implementation =
is another question - Dijkstra's example is comparing two arrays, and =
there may be inefficiencies translating it to a list-based =
implementation.</FONT></P>
<P><FONT SIZE=3D2>Cheers</FONT>
</P>
<P><FONT SIZE=3D2>-----Original Message-----</FONT>
<BR><FONT SIZE=3D2>From: Serge D. Mechveliani [<A =
HREF=3D"mailto:mechvel@botik.ru">mailto:mechvel@botik.ru</A>]</FONT>
<BR><FONT SIZE=3D2>Sent: Thursday, 2 May 2002 17:37</FONT>
<BR><FONT SIZE=3D2>To: haskell@haskell.org</FONT>
<BR><FONT SIZE=3D2>Subject: finding sublist</FONT>
</P>
<BR>
<P><FONT SIZE=3D2>Thanks to people who helped me with the task</FONT>
</P>
<P><FONT SIZE=3D2>>> Import two space separated columns of =
integers from file.</FONT>
<BR><FONT SIZE=3D2> </FONT>
<BR><FONT SIZE=3D2>Claus Reinke <claus.reinke@talk21.com> =
recommends to exploit `lines'.</FONT>
</P>
<P><FONT SIZE=3D2>Indeed, it bocomes shorter now:</FONT>
</P>
<P><FONT SIZE=3D2> main =3D readFile "data" >>=3D =
(putStr . show . twoIntLists)</FONT>
<BR><FONT SIZE=3D2> where</FONT>
<BR><FONT SIZE=3D2> twoIntLists str =3D case =
span (not . null) $ dropWhile null $ lines str</FONT>
<BR><FONT =
SIZE=3D2> &nb=
sp; =
of</FONT>
<BR><FONT =
SIZE=3D2> &nb=
sp; &nb=
sp; (lns, lns') -> (readInts lns, readInts lns')</FONT>
</P>
<P><FONT SIZE=3D2> readInts =3D map (\ str -> read =
str :: Integer) . dropWhile null</FONT>
</P>
<BR>
<P><FONT SIZE=3D2>Another question:</FONT>
<BR><FONT SIZE=3D2>has the Haskell Standard library a function for such =
a usable task</FONT>
<BR><FONT SIZE=3D2>as finding of occurence of a segment in a list? Say =
</FONT>
</P>
<P><FONT SIZE=3D2> findSegmentBy (...) [2,2,3] =
[0,0,2,2,1,2,2,3,1,2,3] --> </FONT>
</P>
<P><FONT =
SIZE=3D2> &nb=
sp; &nb=
sp; ([0,0,2,2,1], =
[2,2,3,1,2,3])</FONT>
</P>
<P><FONT SIZE=3D2>I have heard, an efficient algorithm (for large =
lists) for this </FONT>
<BR><FONT SIZE=3D2>is not so simple.</FONT>
</P>
<P><FONT SIZE=3D2>-----------------</FONT>
<BR><FONT SIZE=3D2>Serge Mechveliani</FONT>
<BR><FONT SIZE=3D2>mechvel@botik.ru</FONT>
<BR><FONT =
SIZE=3D2>_______________________________________________</FONT>
<BR><FONT SIZE=3D2>Haskell mailing list</FONT>
<BR><FONT SIZE=3D2>Haskell@haskell.org</FONT>
<BR><FONT SIZE=3D2><A =
HREF=3D"http://www.haskell.org/mailman/listinfo/haskell" =
TARGET=3D"_blank">http://www.haskell.org/mailman/listinfo/haskell</A></F=
ONT>
</P>
</BODY>
</HTML>
------_=_NextPart_001_01C1F4B4.B78BAF00--