logo       

Re: Slow regular-expression engine: msg#02392

ruby-talk

Subject: Re: Slow regular-expression engine

On Jul 31, 11:37 am, domini...@xxxxxxx wrote:
> w_a_x_man <w_a_x_...@xxxxxxxxx> writes:
> > This Awk  code takes less than one hundredth of a second to run:
>
> > BEGIN {
>
> >   regex = \
> >     "o?o?o?o?o?o?o?o?o?o?o?o?o?o?" \
> >     "o?o?o?o?o?o?o?o?o?o?o?o?o?" \
> >     "ooooooooooooooooooooooooooooo"
>
> >   print "ooooooooooooooooooooooooooooo" ~ regex
>
> > }
>
> > This Ruby code takes 27.5 seconds:
>
> > t = Time.now
>
> > regex = Regexp.new(
> >   "o?o?o?o?o?o?o?o?o?o?o?o?o?o?" +
> >   "o?o?o?o?o?o?o?o?o?o?o?o?o?" +
> >   "ooooooooooooooooooooooooooooo" )
>
> > p "ooooooooooooooooooooooooooooo" =~ regex
>
> > puts "#{ Time.now - t } seconds"
>
> > Seehttp://swtch.com/~rsc/regexp/regexp1.html
>
> In Ruby 1.9 its 1/4 of the time (still slow).
>
> But: One might ask if that is a problem at all, considering that this
> Regexp looks and is awful and could be written in a way better way which
> takes times of 0.000139965 seconds.
>
> t = Time.now
> regex = Regexp.new(/o{0,27}ooooooooooooooooooooooooooooo/ )
> p "ooooooooooooooooooooooooooooo" =~ regex
> puts "#{ Time.now - t } seconds"
>
> --
> Dominik Honnef
> domini...@xxxxxxx

Quoting the article:

... it is possible to write so-called "pathological" regular
expressions that Perl matches very very slowly. In contrast,
there are no regular expressions that are pathological for
the Thompson NFA implementation. Seeing the two graphs side
by side prompts the question, "why doesn't Perl use the
Thompson NFA approach?" It can, it should ...


Google Custom Search

News | Mail Home | sitemap | FAQ | advertise