On Fri, Jun 04, 2004 at 07:47:52AM -0400, Mark Jason Dominus wrote:
> John Macdonald <john-Z7w/En0MP3xWk0Htik3J/w@xxxxxxxxxxxxxxxx>:
> > > > The Berkeley DB package could be used with its numbered
> > > > line variant to accomplish this.
> > >
> > > No, because that violates the requirement that the file not be read
> > > into memory all at once. If you wanted to do that, you might as well
> > > use
> > >
> > > @words = <$words>;
> > > return $words[rand @words];
> >
> > You were talking about a prepass to rewrite the file
> > into a sorted order - using that same prepass to instead
> > rewrite the file into a DB recno file
>
> Perhaps you know something I don't, but I am not aware of any such
> thing as "a DB recno file". I believe DB_RECNO reads and writes plain
> text files.
>
> > It would no longer be a plain text file, though.
>
> I think you are mistaken.

The standard DB_File package provides tied access to a file
using a choice of three access methods. $DB_HASH and $DB_BTREE
tie the file to a hash keyed by a text string (DB_HASH does not
preserve key order, while DB_BTREE keeps the keys sorted), and
$DB_RECNO ties the file to an array, using the line number as
the key. (You have to be consistent: the file must have been
originally written using the same access method.) The $DB_HASH
form was used implicitly if you built perl with DB as the
underlying mechanism for dbmopen in perl3/4. With the advent of
tie and packages in perl5, you are no longer limited to that one
default method, or to a tied hash as the only interface to the
underlying file.
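
To make the three access methods concrete, here is a minimal
sketch. It assumes DB_File is installed and built against
Berkeley DB; the file names and the word list are made up for
illustration and are not from the earlier discussion.

```perl
use strict;
use warnings;
use DB_File;                       # exports $DB_HASH, $DB_BTREE, $DB_RECNO
use Fcntl;                         # O_RDWR, O_CREAT
use File::Temp qw(tempdir);

# Scratch directory so the sketch leaves nothing behind (hypothetical paths).
my $dir   = tempdir(CLEANUP => 1);
my @words = qw(pear apple fig);

# $DB_HASH: tied hash with text keys; key order is not preserved.
tie my %hash, 'DB_File', "$dir/words.hash", O_RDWR|O_CREAT, 0666, $DB_HASH
    or die "Cannot tie hash file: $!";
$hash{$_} = length $_ for @words;
my $hash_count = keys %hash;
untie %hash;

# $DB_BTREE: tied hash with text keys; with the default comparison
# the keys come back in sorted (lexical) order.
tie my %btree, 'DB_File', "$dir/words.btree", O_RDWR|O_CREAT, 0666, $DB_BTREE
    or die "Cannot tie btree file: $!";
$btree{$_} = length $_ for @words;
my @btree_keys = keys %btree;
untie %btree;

# $DB_RECNO: tied array; the index is the record (line) number, from 0.
tie my @recno, 'DB_File', "$dir/words.recno", O_RDWR|O_CREAT, 0666, $DB_RECNO
    or die "Cannot tie recno file: $!";
$recno[$_] = $words[$_] for 0 .. $#words;
my $second = $recno[1];            # second record
my $count  = scalar @recno;        # number of records
untie @recno;

print "btree keys: @btree_keys\n";
print "recno record 1 of $count: $second\n";
```

The same tie call is used each time; only the last argument
and the kind of variable being tied (hash or array) change.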