[QUIZ] Perl 'Easy' Quiz of the Week #2005-1: msg#00030

Subject: [QUIZ] Perl 'Easy' Quiz of the Week #2005-1
IMPORTANT: Please do not post solutions, hints, or other spoilers
        until at least 60 hours after the date of this message.
        Thanks.

IMPORTANT: S'il vous plaît, attendez au minimum 60 heures après la
        date de ce message avant de poster solutions, indices ou autres
        révélations. Merci.

[Other translations omitted as I am not in a position to verify them.
Besides, I always found it bizarre to include translations of the
caveat without including translations of the quiz]

This week's quiz was inspired by a problem I saw at my day job (*)
about a year ago and recently decided to look at again.
Without further ado:

Your task is to transform an input file containing unique sorted lines
matching the regular expression /^\w+\.[A-Z]$/.  For example, a
typical input file might contain:

ALDA.D
ALDA.Q
ALDA.W
AMTA.B
AMTA.E
AMTA.M
AMTA.X
BMX.F
C.X
DMZ.A
DMZ.X

Note that the input can be grouped by what comes before the '.' - in
the example, we have everything beginning with 'ALDA.', then
everything beginning with 'AMTA.', etc.  Call the bit before the '.'
the prefix, and the bit after the dot the suffix.
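
To make the prefix/suffix terminology concrete, here is a minimal
sketch that splits one line using the regular expression given above,
with capture groups added.  The name split_line is hypothetical and
not part of the quiz; lines that do not match are simply rejected.

```perl
use strict;
use warnings;

# Split a quiz line into (prefix, suffix).  Returns an empty list when
# the line does not match /^\w+\.[A-Z]$/.  split_line is a hypothetical
# helper name, not something the quiz defines.
sub split_line {
    my ($line) = @_;
    chomp $line;
    return $line =~ /^(\w+)\.([A-Z])$/ ? ($1, $2) : ();
}

my ($prefix, $suffix) = split_line("ALDA.D\n");
print "prefix=$prefix suffix=$suffix\n";
```

Note that \w also matches digits and underscores, so a prefix such as
"A_1" would be accepted by this pattern.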

Your job is to insert lines into the input so that for every unique
prefix, there is one (and only one!) line with the suffix "M", that is,
a line ending in ".M".  For example, the input above would yield the
output:

ALDA.D
ALDA.M
ALDA.Q
ALDA.W
AMTA.B
AMTA.E
AMTA.M
AMTA.X
BMX.F
BMX.M
C.M
C.X
DMZ.A
DMZ.M
DMZ.X

Note that the output will have every line that was in the input, plus
some extra .M lines (AMTA.M was already present in the input, so it is
not added again).  Also, the output must consist only of sorted unique
lines.
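
The output requirements above can be checked mechanically.  The
following is a rough sketch of such a checker, not a solution to the
quiz: check_output is a hypothetical helper, it assumes "sorted" means
plain string order (Perl's lt/ge), and it holds one hash entry per
prefix, so it is only meant for small test files.

```perl
use strict;
use warnings;

# Return a list of problems found in the given output lines; an empty
# list means the output looks valid.  check_output is a hypothetical
# helper.  Assumption: "sorted" means plain string comparison order.
sub check_output {
    my @lines = @_;
    my (@problems, %has_m);
    my $prev;
    for my $line (@lines) {
        unless ($line =~ /^(\w+)\.([A-Z])$/) {
            push @problems, "malformed line: $line";
            next;
        }
        my ($prefix, $suffix) = ($1, $2);
        # Strictly increasing lines are both sorted and unique, which
        # also rules out a second .M line for the same prefix.
        push @problems, "not sorted or not unique at: $line"
            if defined $prev && $prev ge $line;
        $prev = $line;
        $has_m{$prefix} ||= ($suffix eq 'M');
    }
    push @problems, "no .M line for prefix $_"
        for grep { !$has_m{$_} } sort keys %has_m;
    return @problems;
}

# BMX has no .M line here, so one problem is reported.
print "$_\n" for check_output(qw(BMX.F C.M C.X));
```

This does not verify that every input line survived into the output;
that would need the input file as well.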

Your program should take 2 arguments, an input filename and an output
filename.  It would be convenient if your program also worked with
fewer arguments, but I'm not going to make that part of the quiz.  (If
you do decide to go the extra mile and also accept fewer arguments,
with only one argument your program should use STDOUT as the output
file, and with no arguments your program should behave as a filter -
STDIN for input, STDOUT for output.)
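
For the optional argument handling described above, one possible
arrangement is sketched below.  It only chooses the filehandles and
does none of the quiz's actual work; open_handles is a hypothetical
name, and rejecting more than two arguments is my own assumption.

```perl
use strict;
use warnings;

# Return (input handle, output handle) for zero, one, or two arguments.
# With no arguments: STDIN and STDOUT.  With one: the named file and
# STDOUT.  With two: both named files.  open_handles is a hypothetical
# name; dying on more than two arguments is an assumption.
sub open_handles {
    my @args = @_;
    die "usage: $0 [input [output]]\n" if @args > 2;
    my ($in, $out) = (\*STDIN, \*STDOUT);
    if (@args >= 1) {
        open my $fh, '<', $args[0]
            or die "Cannot read $args[0]: $!\n";
        $in = $fh;
    }
    if (@args == 2) {
        open my $fh, '>', $args[1]
            or die "Cannot write $args[1]: $!\n";
        $out = $fh;
    }
    return ($in, $out);
}

# Typical use in a script:
#   my ($in, $out) = open_handles(@ARGV);
#   while (my $line = <$in>) { print {$out} $line }
```

Separate lexical handles are used for the files so that STDIN and
STDOUT themselves are never reopened.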

As an added challenge, the file may be entirely too big to fit into
available memory.  Your program should still behave properly.  It is
in fact possible to write a solution whose memory requirements are
determined by the maximum length of an individual line, not the number
of lines in the file.

Be careful of the following edge cases:

echo C.X > test_input_1
perl ricfilter.pl test_input_1 test_output_1
# test_output_1 should now contain two lines: C.M followed by C.X

echo C.G > test_input_2
perl ricfilter.pl test_input_2 test_output_2
# test_output_2 should now contain two lines: C.G followed by C.M

cat /dev/null > test_input_3
perl ricfilter.pl test_input_3 test_output_3
# test_output_3 should be empty

Have fun!

(*) However, the details(**) of the problem have been scrubbed so as
to have little to do with my employer.

(**) Extra points to someone who guesses what the real suffix that had
to be added was (it wasn't .M), and why, though you'd need to be in a
very small specific corner of the financial data processing world to
know that.
