Incremental Regex Parsing with Fixed Strings

Andrew Martin andrew.thaddeus at gmail.com
Thu Feb 28 21:54:39 UTC 2019


Let me make sure I'm interpretting "a suffix tree of your text" correctly.
You mean just stripping the common prefix "Lost" from both the regex (the
texual representation of it) and from the sample text? If so, it should be
possible to do both prefixes and suffixes. But it's still not the win I'm
hoping for.

On Thu, Feb 28, 2019 at 4:47 PM David Feuer <david.feuer at gmail.com> wrote:

> I'm not too optimistic in general about editing regex machines
> interactively. I'd expect you to do a lot of work to save a little. But I'd
> expect you could win a good bit (at least in some common cases) by using
> something like a suffix tree of your text. That'll let you jump quickly to
> the end of "Lost ", for example, and may offer other interesting
> opportunities.
>
> On Thu, Feb 28, 2019, 1:43 PM Andrew Martin <andrew.thaddeus at gmail.com>
> wrote:
>
>> Greetings,
>>
>> In Dan Piponi's blog post Fast Incremental Regular Expression Matching
>> with Monoids [1], he outlines a way to take advantage of fingertrees to
>> perform increment regex matching. In his article, the regular expression
>> (more accurately, its equivalent DFA) is fixed, and the strings we are
>> matching on are modified incrementally. What I'm interested in is the
>> opposite: What if the regular expression is modified incrementally and the
>> string we are testing is static?
>>
>> Thinking about this, it seems to run into problems immediately.
>> Typically, to evaluate a regular expression, I would convert it to a NFA
>> (and then maybe to a DFA). But there is no way to do incremental graph
>> modification on the resulting graphs. (Or is there?). And even if there
>> were a way, it seems unclear how one would go about exploiting its
>> incremental nature. To provide an example, consider the following regex and
>> sample string:
>>
>> Regex: Lost (dog|cat) loo for home
>> Sample: Lost dog looking for home
>>
>> In the regex, I'm currently in the middle of typing "looking", so it
>> doesn't match the sample string. But large parts of it do. So when I update
>> the regex by typing "k", I'd like to be able to not redo this work. Maybe
>> there's a certain subset of regex that's amenable to this kind of
>> incremental evaluation. Maybe not. If anyone has any additional insights or
>> can point me to any research, I would appreciate it greatly. Thanks.
>>
>> [1]
>> http://blog.sigfpe.com/2009/01/fast-incremental-regular-expression.html
>>
>>
>> --
>> -Andrew Thaddeus Martin
>> _______________________________________________
>> Libraries mailing list
>> Libraries at haskell.org
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/libraries
>>
>

-- 
-Andrew Thaddeus Martin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.haskell.org/pipermail/libraries/attachments/20190228/3e918f87/attachment.html>


More information about the Libraries mailing list