logo       

[SPOILER] Expert QOTW #14: msg#00059

Subject: [SPOILER] Expert QOTW #14
Hello, everyone:

My first time posting on this list, so greetings, hola, zdrastvuite, and
so on. :)

First off, I'd like to thank Mark-Jason for coming up with this
deceptively simple-sounding question. I've had a heck of a lot of fun
playing with this, even though the result is... well... I expected to do
better. However, I've learned a lot, so it's all good.

One thing I've learned in doing this is just how awful the regex engine
is at doing recursive backtracking, especially in long strings. Yeesh.
Despite a few years of using it, I never realized it until now, although
I'd heard rumors. The regex-based solution took three lines... and the
best part of forever to do just the SQL string! So, you won't see that
one. I've hidden it in an unused corner of "/dev/null"...

Also, I got a small surprise while playing around with the solution I
did choose: it seemed to me that it would make sense to jump forward,
just past the already-checked part of the string, instead of simply
incrementing the "pointer". The solution worked fine - but it was
*slower* than just incrementing. <shrug> I guess I'll have to start
digging a little deeper in "perlguts" to find out what's going on there.

Anyway... I ended up letting "substr" do all the heavy lifting for me; I
couldn't do any better with my own custom-mangled solutions. It's fairly
fast, despite the simplicity: under .2 seconds for the SQL string, just
over 5 seconds for the Constitution, and looks like well under two hours
for "Tristram" (it's past the half-way point at 1 hour, and it speeds up
quite a bit toward the end as the remainder string gets shorter.) I'd
appreciate any comments that you folks may have on improving it,
anything I've missed that would bite me in special cases, etc.

BTW, all of the above was tested on a 1.7GHz Dell laptop, 256MB.

------------------------------------------------------------------------
#!/usr/bin/perl -wl
# Created by Ben Okopnik on Sat Jun 14 10:21:19 EDT 2003

sub repeated_string {
        $str = shift;
        $longest = "";
        
        return "" unless $str =~ /(.).*\1/;
        $off = 0;
        $len = 1;
        
        {
                $match = substr $str, $off, $len;
                if ( ( substr $str, $off + $len ) =~ /\Q$match\E/ ){
                        $longest = $match;
                        $len++;
                }
                else{
                        $off++;
                }
                return $longest if ( $off + $len ) >= length $str;
                redo;
        }
}

{
        local $/;
        ( $string = <> ) =~ s/\n//g;
}


print repeated_string $string
------------------------------------------------------------------------



Ben Okopnik
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
I expect the gods to play pranks; I suspect immortality gets /dull/ if
one doesn't.
 -- Darkhawk



<Prev in Thread] Current Thread [Next in Thread>
Google Custom Search

Recently Viewed:
linux.arklinux....    user-groups.lin...    kde.usability/2...    ietf.ipp/2002-0...    mail.spam.spamc...    os.netbsd.devel...    audio.cd-record...    text.unicode.de...    php.documentati...    games.fps.halfl...    window-managers...    suse.oracle.gen...    bug-tracking.gn...    video.dvdrip.us...    xfree86.cvs/200...    java.netbeans.m...    network.argus/2...    culture.sf.kill...    debian.ports.al...    freebsd.questio...    qplus.devel/200...    handhelds.palm....   
Home | blog view | USPTO Patent Archive | advertise | OSDir is an inevitable website. super tiny logo

Free Magazines

Cisco News
Receive a free quarterly e-newsletter with exclusive articles on how Cisco IT uses its own products and solutions to enable the business.
subscribe

Systems Management News, the newspaper for IT systems administration and data center managers! Each issue of Systems Management News is chock-full of news and analysis to help you understand what's happening in your field.
subscribe

The Enterprise Newsweekly eWeek is the essential technology information source for builders of e-business.
subscribe

Oracle Magazine Oracle Magazine contains technology strategy articles, sample code, tips, Oracle and partner news, how to articles for developers and DBAs, and more. Oracle (NASDAQ: ORCL) is the world's largest enterprise software company.
subscribe

Total Telecom Total Telecom is "The Economist of the communications industry".
subscribe