Re: [bdbxml] problems on concurrent container access (and some questions, t: msg#00009

Subject: Re: [bdbxml] problems on concurrent container access (and some questions, too)

Thanks to David for answering most of these questions.

Sebastian Wrede wrote:

* adding indexes to an already populated container fails when there are
more than about 5000 documents present.  all documents are smaller than
1kb. adding the index first and then populating the database works, but
we'd like to be able to add indexes afterwards to optimize query
performance.

The error is:
        Dbc::get: Cannot allocate memory.

Again, some code:
                DbTxn *txn;
                env_->txn_begin(0, &txn, 0);

                XmlIndexSpecification idxSpec = dbxml_.getIndexSpecification(txn);
                idxSpec.addIndex(spec.uri_, spec.node_, spec.index_);
                dbxml_.setIndexSpecification(txn, idxSpec);
                txn->commit(0);

URI is usually empty.


note that we used default database settings throughout and the
environment has been opened using the flags:
DB_INIT_LOG|DB_CREATE|DB_INIT_MPOOL|DB_INIT_TXN|DB_INIT_LOCK


Adding the index causes the indexer to walk over all the documents, creating
additional index entries for each document. With many documents this creates
a really huge transaction, which is most likely what runs out of memory. I'd
suggest you don't transact calls to addIndex. You probably don't want to do
indexing and update operations at the same time anyway.
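
As a rough sketch of what "not transacting" the call could look like, the same three steps from the code above can be run with a null transaction handle instead of one obtained from txn_begin. The helper name addIndexWithoutTxn is made up for illustration, and the sketch assumes the container accepts a null DbTxn pointer for both calls, as is the usual Berkeley DB convention; it is written as a template so that it only relies on the two container calls already shown in the original code.

```cpp
#include <string>

// Hypothetical helper, not part of Berkeley DB XML. Container is assumed to be
// any type offering the two calls used in the original message, for example
// the dbxml_ container object.
template <typename Container>
void addIndexWithoutTxn(Container& container,
                        const std::string& uri,
                        const std::string& node,
                        const std::string& index) {
    // Assumption: a null transaction handle means "no enclosing transaction",
    // so the reindexing pass is not collected into one large transaction.
    auto idxSpec = container.getIndexSpecification(nullptr);
    idxSpec.addIndex(uri, node, index);
    container.setIndexSpecification(nullptr, idxSpec);
}
```

With the names from the original code, the call would then be addIndexWithoutTxn(dbxml_, spec.uri_, spec.node_, spec.index_), made at a time when no other threads are updating the container.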

In addition to the above problems, we have two other feature questions:

* is it possible to abort a query in some way, e.g. from a signal
handler?  of course, without damaging the database ;-)

Hmm, we could maybe have a progress callback that the query thread calls between
operations... the callback would get some progress indication and could return an
error code to terminate the query...

But, why do you want to do this? Because queries take a long time and the user gives up and wants to abort them? How about an estimate of the query cost instead?
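
To make the progress callback idea a bit more concrete, here is a minimal, self-contained sketch of the pattern. Nothing in it is existing Berkeley DB XML API: runQuery, ProgressCallback, QueryStatus and the step list are all hypothetical stand-ins. The signal handler only sets a flag; the query loop sees the flag through the callback after each operation and stops cleanly at that point.

```cpp
#include <csignal>
#include <cstddef>
#include <functional>
#include <vector>

// All names below are hypothetical; this is not the DB XML query engine.
enum class QueryStatus { Completed, Aborted };

// Receives (operations done, total operations). A nonzero return value asks
// the query to terminate.
using ProgressCallback = std::function<int(std::size_t done, std::size_t total)>;

// Stand-in for a query: runs each operation in order and consults the
// callback after every operation.
QueryStatus runQuery(const std::vector<std::function<void()>>& steps,
                     const ProgressCallback& onProgress) {
    for (std::size_t i = 0; i < steps.size(); ++i) {
        steps[i]();
        if (onProgress && onProgress(i + 1, steps.size()) != 0) {
            return QueryStatus::Aborted;
        }
    }
    return QueryStatus::Completed;
}

// The handler does nothing but set a flag, which is safe inside a signal
// handler.
volatile std::sig_atomic_t g_abortRequested = 0;
extern "C" void onInterrupt(int) { g_abortRequested = 1; }

// Callback that turns the flag into the "terminate" return code.
int abortIfInterrupted(std::size_t, std::size_t) { return g_abortRequested; }
```

An application would install the handler with std::signal(SIGINT, onInterrupt) and pass abortIfInterrupted as the callback. Because the query only stops between operations, no operation is interrupted halfway through.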

* is there a way to use an index for queries like '[@timestamp > 10 and
@timestamp < 20]'?  an edge-attribute-equality-number index is used for
equality comparisons but not for interval expressions, apparently.

Sebastian Wrede & Ingo Lütkebohle

We do support 'range' operations against numeric indexes... BUT... only against node
indexes, not edge ones.  Would a node index work OK for you instead?

John



