
Re: Conversion of Perl Perforce repository to Subversion - Part 1: msg#00003

version-control.revml

Subject: Re: Conversion of Perl Perforce repository to Subversion - Part 1

On Wed, May 19, 2004 at 12:26:30PM -0400, John Peacock wrote:
> I have [stupidly] agreed to test the feasibility of converting the main
> Perl repository from Perforce to Subversion. Initially, this would be to
> provide a readonly public repository; eventually, it might lead to
> development being moved permanently from P4 to SVN. I have two questions,
> the first more of a possible design issue with VCP and the second more of a
> practical question based on my incomplete understanding of VCP, so I'll
> leave the second question for another message.
>
> I am using CLKao's svk to mirror the Perforce repository, which ultimately
> uses VCP to do the heavy lifting. I've attempted the conversion twice and
> both times, the server eventually swapped itself almost to death due to the
> huge RAM requirements (first 512MB then 2GB actual memory installed).
> Based on my readings of the LIMITATIONS in VCP::Dest::revml, the odds are
> good that the basic design is flawed for such a large conversion (64k
> revisions).

I hope that you're not trying to use the VCP::Dest::revml driver for
serious conversions. Even if it didn't hog up a lot of disk space,
going to RevML and then away from RevML is going to be terribly slow.

The VCP::Dest::revml driver is definitely not meant to convert huge
repositories. It's a research and testing driver until someone comes up
with a good use case for RevML. We originally set out to develop RevML
with VCP's precursor as a desktop extractor/inserter to and from RevML,
but there seems to be no constituency for RevML the language. Doing
conversions by extracting from the source to RevML and then from RevML
into the destination is also going to be much less efficient than going
directly from one repository to another.

That being said, should a need for production support for RevML arise,
VCP's RevML drivers could be optimized to only cache a few files and
refresh the cache from the source repository, but only if the source
repository is also not RevML.
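
To illustrate the "cache a few files and refresh from the source" idea,
here is a minimal sketch. It is not VCP code and is written in Python
rather than VCP's Perl; the class name BoundedFileCache and the fetch
callable are hypothetical, and it assumes the source repository can
cheaply hand back any file on request (which is why a RevML source would
not qualify).

```python
from collections import OrderedDict


class BoundedFileCache:
    """Keep at most `capacity` file bodies in memory; refetch the rest on demand."""

    def __init__(self, fetch, capacity=4):
        self._fetch = fetch            # hypothetical callable: key -> file content
        self._capacity = capacity
        self._entries = OrderedDict()  # ordered from least to most recently used

    def get(self, key):
        if key in self._entries:
            self._entries.move_to_end(key)     # mark as most recently used
            return self._entries[key]
        content = self._fetch(key)             # cache miss: go back to the source
        self._entries[key] = content
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return content

    def __len__(self):
        return len(self._entries)
```

Memory use is then bounded by the capacity rather than by the size of
the repository, at the cost of extra round trips to the source.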

The RAM limitation should not apply to other drivers, though I can't
speak for the svn drivers. If you're seeing massive RAM use when using
VCP::Source::{p4,cvs,vss} and VCP::Dest::p4, then I need to get to the
bottom of it. But I don't think that's what you're doing.

If you want to send me a copy of the perl repository, I can work with it
here to narrow in on the problem; the core VCP filters and {p4,vss,cvs}
drivers need to be RAM friendly.

> I don't know where to start looking; I assume if I could find out what hash
> is being used to store the metadata, I could convert that to a tied hash
> and trade performance for being able to actually finish the conversion.
> I'm not even sure if this is a flaw in VCP::Dest::svk or if it is in one of
> the other modules that makes up VCP.
>
> Any hints and directions to start my hunt would be appreciated.

You can try using the null: destination and (first) no filter, then
(second and later) the filters VCP reports using in its log file on the
p4->svn conversion to isolate the RAM usage.

By far the most common data structure is the VCP::Rev object, so tracing
the lifecycle of VCP::Rev instances is likely to turn up some
information. In order to conserve memory, however, this is a packed
data structure in memory and a lot of the standard strings are stored in
tied hashes so that VCP::Rev instances can contain ints. Forcing a
coredump and looking at it with the strings command might be informative
(in case I forgot to tie a hash).
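
As a rough picture of that layout (a sketch only, in Python rather than
VCP's Perl, and not the actual VCP::Rev implementation): repeated strings
live once in a lookup table, and each record is a packed run of integers.
The names StringTable, pack_rev and unpack_rev, and the choice of fields,
are hypothetical. The table here is a plain in-memory dict, whereas the
tables described above are tied hashes.

```python
import struct


class StringTable:
    """Map each distinct string to a small integer id, and back."""

    def __init__(self):
        self._ids = {}       # string -> id
        self._strings = []   # id -> string

    def intern(self, s):
        i = self._ids.get(s)
        if i is None:
            i = len(self._strings)
            self._ids[s] = i
            self._strings.append(s)
        return i

    def lookup(self, i):
        return self._strings[i]


# Three unsigned 32-bit ints: user id, branch id, change number (assumed fields).
REV_FORMAT = struct.Struct("<III")


def pack_rev(table, user, branch, change):
    """Store a revision record as 12 packed bytes instead of two strings and an int."""
    return REV_FORMAT.pack(table.intern(user), table.intern(branch), change)


def unpack_rev(table, packed):
    user_id, branch_id, change = REV_FORMAT.unpack(packed)
    return table.lookup(user_id), table.lookup(branch_id), change
```

A string that shows up in a core dump many thousands of times would
suggest a field that is being stored per record instead of going through
such a table.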

- Barrie

