logo       

[SPOILER] My Solutions to the Problem: msg#00037

lang.perl.qotw.discuss

Subject: [SPOILER] My Solutions to the Problem

Hi all!

Well, I realized I could do the task in Perl using a linear scan on the lines,
and adding .M lines in between where appropriate. However, I also thought how
I do it in shell, and came to the following program:

<<<<<<<<<<<<<<<<<<<<<<<
#!/bin/bash
IN_FILE="$1"
shift
OUT_FILE="$1"
shift
(
( cat < "$IN_FILE" |
sed 's!\..*$!!' |
uniq | # [1]
(while read T ; do
echo "$T.M" ;
done)
) ;
( cat < "$IN_FILE" )
) |
sort |
uniq > "$OUT_FILE"

# [1] - This "uniq" command isn't absolutely necessary, but it helps in the
# running time.

>>>>>>>>>>>>>>>>>

It was tested in bash, but I believe it will run in any Bourne
Shell-derivative. In case you're not familiar with bash, I've translated the
algorithm to Perl:

<<<<<<<<<<<<<<<<<<<<<<
#!/usr/bin/perl

use strict;
use warnings;

sub uniq
{
my $before = undef;
my @ret;
foreach my $item (@_)
{
if (defined($before) && ($item eq $before))
{
# Do Nothing
}
else
{
push @ret, $item;
$before = $item;
}
}
return @ret;
}

sub get_lines
{
my $filename = shift;
open I, "<", $filename;
my @lines = (<I>);
close(I);
chomp(@lines);
return @lines;
}

my $in_file = shift(@ARGV);
my $out_file = shift(@ARGV);

my @file_lines = get_lines($in_file);

my @added_Ms =
(map { "$_.M" }
(uniq(
map { /^(\w+)\.[A-Z]$/; $1; }
@file_lines
)
)
);

my @total_lines = (@added_Ms, @file_lines);

my @result = uniq(sort { $a cmp $b } @total_lines);

open O, ">", $out_file;
print O (map { "$_\n" } @result);
close(O);
>>>>>>>>>>>>>>>>>>>>>>

Considerably more verbose, and I had to implement my own "uniq" function.

Then I thought of a slightly different algorithm and implemented it in Perl
(albeit it too is highly shell-like):

<<<<<<<<<<<<<<<<<<<<<
#!/usr/bin/perl

use strict;
use warnings;

sub uniq
{
my $before = undef;
my @ret;
foreach my $item (@_)
{
if (defined($before) && ($item eq $before))
{
# Do Nothing
}
else
{
push @ret, $item;
$before = $item;
}
}
return @ret;
}

sub get_lines
{
my $filename = shift;
open I, "<", $filename;
my @lines = (<I>);
close(I);
chomp(@lines);
return @lines;
}

my $in_file = shift(@ARGV);
my $out_file = shift(@ARGV);

my @result =
uniq(
sort { $a cmp $b }
map { my $s = $_; $s=~s{[A-Z]$}{M}; ($_, $s) }
get_lines($in_file)
);

open O, ">", $out_file;
print O (map { "$_\n" } @result);
close(O);
>>>>>>>>>>>>>>>>>>>>>

The real algorithm within it appears here:

<<<
my @result =
uniq(
sort { $a cmp $b }
map { my $s = $_; $s=~s{[A-Z]$}{M}; ($_, $s) }
get_lines($in_file)
);
>>>>

Now, all these solutions use sorting which has the unfortunate side-effects of
increasing the complexity to O(N*log(N)). But I also wrote a linear-time Perl
solution:

<<<
#!/usr/bin/perl -w

use strict;
use warnings;

my $in_file = shift;
my $out_file = shift;

open IN, "<", $in_file;
open OUT, ">", $out_file;

my $prev_prefix = undef;
my $prev_suffix = undef;
my ($prefix, $suffix);
my $line;

while ($line = <IN>)
{
chomp($line);
$line =~ /^(\w+)\.([A-Z])$/ or
die "Invalid line - $line!";

($prefix, $suffix) = ($1, $2);

if ((!defined($prev_prefix)) ||
($prev_prefix ne $prefix))
{
if (defined($prev_suffix) && ($prev_suffix lt "M"))
{
print OUT "$prev_prefix.M\n";
}
if ($suffix gt "M")
{
print OUT "$prefix.M\n";
}
}
# The prefixes are the same.
elsif (($suffix gt "M") && ($prev_suffix lt "M"))
{
print OUT "$prefix.M\n";
}
}
continue
{
print OUT "$line\n";
$prev_prefix = $prefix;
$prev_suffix = $suffix;
}

if (defined($prev_prefix) && ($prev_suffix lt "M"))
{
print OUT "$prev_prefix.M\n";
}

close(IN);
close(OUT);
>>>

This solution is similar to Bill Smith' one, except that I used a $prev_suffix
variable instead of a flag that indicates whether the .M line was already
printed.

Regards,

Shlomi Fish

---------------------------------------------------------------------
Shlomi Fish shlomif-ik1l9ssToec+JF/nGntIXQ@xxxxxxxxxxxxxxxx
Homepage: http://www.shlomifish.org/

Knuth is not God! It took him two days to build the Roman Empire.



<Prev in Thread] Current Thread [Next in Thread>
Google Custom Search

News | FAQ | advertise