Re: non-greedy regexp: msg#01124 (lang.ruby.general)
Hello --

On Tue, 13 Aug 2002, Tom Robinson wrote:

> Hi,
>
> The following regexp is supposed to chop off the last / of a string
> and all characters following it, but it seems to be ignoring the
> non-greedy indicator (?):
>
> irb(main):001:0> "http://www.x.com/y/z.html".sub(%r|/.+?\.html$|, '')
> "http:"
>
> The expected result should be "http://www.x.com/y". I thought this
> was a bug but perl produces the same result, so what am I missing?

You're missing the notion of a leftmost match. The regex engine reads
from left to right, so to speak, in looking for the '/'. It finds it
at the sixth character. Then it does what you ask: namely, look for
'.html' at the end of the line. The non-greedy '.+?' matches as few
characters as it can from that starting point, but it still has to
stretch across the remaining slashes to reach '.html', so everything
after "http:" is removed.

To do what you were trying to do, try this:

  irb> "http://www.x.com/y/z.html".sub(%r|/[^/]+/?\.html$|, '')
  "http://www.x.com/y"

That also finds the leftmost match -- but in this case, the leftmost
match doesn't start until the last '/' (because none of the other
'/'s, even though they're further left, allow the rest of the match to
succeed).

David

-- 
David Alan Black
home: dblack@xxxxxxxxxxxxxxxxxxxx
work: blackdav@xxxxxxx
Web: http://pirate.shu.edu/~blackdav
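To see where each match actually begins, here is a small sketch that compares the two patterns from this message. The variable names (url, lazy, strict and so on) are made up for illustration; the patterns and the input string are the ones quoted above.

```ruby
url = "http://www.x.com/y/z.html"

# Original pattern: '.+?' is non-greedy, but the engine still commits to
# the leftmost '/' from which the whole pattern can succeed.
lazy = %r|/.+?\.html$|
lazy_start  = url =~ lazy        # index where the match begins
lazy_result = url.sub(lazy, '')

# Suggested pattern: '[^/]+' cannot cross a '/', so every earlier '/'
# fails as a starting point and the match begins at the last one.
strict = %r|/[^/]+/?\.html$|
strict_start  = url =~ strict
strict_result = url.sub(strict, '')

puts "lazy:   starts at #{lazy_start}, leaves #{lazy_result.inspect}"
puts "strict: starts at #{strict_start}, leaves #{strict_result.inspect}"
```

String#=~ returns a zero-based index, so a start of 5 corresponds to the sixth character mentioned above, and the second pattern should report the position of the final '/'.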