|
[SPOILER] My Solutions to the Problem: msg#00037lang.perl.qotw.discuss
Hi all! Well, I realized I could do the task in Perl using a linear scan on the lines, and adding .M lines in between where appropriate. However, I also thought how I do it in shell, and came to the following program: <<<<<<<<<<<<<<<<<<<<<<< #!/bin/bash IN_FILE="$1" shift OUT_FILE="$1" shift ( ( cat < "$IN_FILE" | sed 's!\..*$!!' | uniq | # [1] (while read T ; do echo "$T.M" ; done) ) ; ( cat < "$IN_FILE" ) ) | sort | uniq > "$OUT_FILE" # [1] - This "uniq" command isn't absolutely necessary, but it helps in the # running time. >>>>>>>>>>>>>>>>> It was tested in bash, but I believe it will run in any Bourne Shell-derivative. In case you're not familiar with bash, I've translated the algorithm to Perl: <<<<<<<<<<<<<<<<<<<<<< #!/usr/bin/perl use strict; use warnings; sub uniq { my $before = undef; my @ret; foreach my $item (@_) { if (defined($before) && ($item eq $before)) { # Do Nothing } else { push @ret, $item; $before = $item; } } return @ret; } sub get_lines { my $filename = shift; open I, "<", $filename; my @lines = (<I>); close(I); chomp(@lines); return @lines; } my $in_file = shift(@ARGV); my $out_file = shift(@ARGV); my @file_lines = get_lines($in_file); my @added_Ms = (map { "$_.M" } (uniq( map { /^(\w+)\.[A-Z]$/; $1; } @file_lines ) ) ); my @total_lines = (@added_Ms, @file_lines); my @result = uniq(sort { $a cmp $b } @total_lines); open O, ">", $out_file; print O (map { "$_\n" } @result); close(O); >>>>>>>>>>>>>>>>>>>>>> Considerably more verbose, and I had to implement my own "uniq" function. Then I thought of a slightly different algorithm and implemented it in Perl (albeit it too is highly shell-like): <<<<<<<<<<<<<<<<<<<<< #!/usr/bin/perl use strict; use warnings; sub uniq { my $before = undef; my @ret; foreach my $item (@_) { if (defined($before) && ($item eq $before)) { # Do Nothing } else { push @ret, $item; $before = $item; } } return @ret; } sub get_lines { my $filename = shift; open I, "<", $filename; my @lines = (<I>); close(I); chomp(@lines); return @lines; } my $in_file = shift(@ARGV); my $out_file = shift(@ARGV); my @result = uniq( sort { $a cmp $b } map { my $s = $_; $s=~s{[A-Z]$}{M}; ($_, $s) } get_lines($in_file) ); open O, ">", $out_file; print O (map { "$_\n" } @result); close(O); >>>>>>>>>>>>>>>>>>>>> The real algorithm within it appears here: <<< my @result = uniq( sort { $a cmp $b } map { my $s = $_; $s=~s{[A-Z]$}{M}; ($_, $s) } get_lines($in_file) ); >>>> Now, all these solutions use sorting which has the unfortunate side-effects of increasing the complexity to O(N*log(N)). But I also wrote a linear-time Perl solution: <<< #!/usr/bin/perl -w use strict; use warnings; my $in_file = shift; my $out_file = shift; open IN, "<", $in_file; open OUT, ">", $out_file; my $prev_prefix = undef; my $prev_suffix = undef; my ($prefix, $suffix); my $line; while ($line = <IN>) { chomp($line); $line =~ /^(\w+)\.([A-Z])$/ or die "Invalid line - $line!"; ($prefix, $suffix) = ($1, $2); if ((!defined($prev_prefix)) || ($prev_prefix ne $prefix)) { if (defined($prev_suffix) && ($prev_suffix lt "M")) { print OUT "$prev_prefix.M\n"; } if ($suffix gt "M") { print OUT "$prefix.M\n"; } } # The prefixes are the same. elsif (($suffix gt "M") && ($prev_suffix lt "M")) { print OUT "$prefix.M\n"; } } continue { print OUT "$line\n"; $prev_prefix = $prefix; $prev_suffix = $suffix; } if (defined($prev_prefix) && ($prev_suffix lt "M")) { print OUT "$prev_prefix.M\n"; } close(IN); close(OUT); >>> This solution is similar to Bill Smith' one, except that I used a $prev_suffix variable instead of a flag that indicates whether the .M line was already printed. Regards, Shlomi Fish --------------------------------------------------------------------- Shlomi Fish shlomif-ik1l9ssToec+JF/nGntIXQ@xxxxxxxxxxxxxxxx Homepage: http://www.shlomifish.org/ Knuth is not God! It took him two days to build the Roman Empire. |
|
| <Prev in Thread] | Current Thread | [Next in Thread> |
|---|---|---|
| Previous by Date: | [SPOILER] Perl 'Easy' Quiz of the Week #2005-1: 00037, Brad Greenlee |
|---|---|
| Next by Date: | Re: [QUIZ] Perl 'Easy' Quiz of the Week #2005-1: 00037, Ronald J Kimball |
| Previous by Thread: | [SPOILER] Easy QOTW 2005-1 - Test Suitei: 00037, Shlomi Fish |
| Next by Thread: | [SPOILER] Minimalistic (but CPAN-enabled) solution: 00037, Shlomi Fish |
| Indexes: | [Date] [Thread] [Top] [All Lists] |
| News | FAQ | advertise |