Hi,
Many thanks for taking the time to respond.
It is very informative.
Best regards,
PhiHo
> -----Original Message-----
> From: David Warren [mailto:warren-EX0cT3Az47bauI2f2gSDlQ@xxxxxxxxxxxxxxxx]
> Sent: Wednesday, October 10, 2007 11:33 AM
> To: PhiHo Hoang
> Cc: 'David Warren';
> xsb-development-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@xxxxxxxxxxxxxxxx
> Subject: RE: [Xsb-development] Question of dynamic loading and multifile
>
> PhiHo Hoang writes:
> > > We regularly run programs with over 200MB of data (that's after
> > > loading it into XSB). With some care it runs fine (of course have to
> > > have enough RAM to store it all and everything else that is needed.)
> >
> > Would you please elaborate on this "data".
> >
> > Are they "facts", "rules" or just numbers, text...
>
> It varies. We are using the CDF package in XSB and it stores
> "ontologies". For example, we have a taxonomy which is an extension
> of the UNSPSC taxonomy, a taxonomy of parts and services, which is
> four levels deep and contains about 12,000 nodes. Our extension makes
> it about 60,000 nodes. It is represented in CDF, which identifies
> classes and objects with small binary terms, represents subclass as a
> binary predicate over these id's, and represents relationships as
> 3-ary predicates over these id's. So "all" of these are facts, except
> of course, for the routines that process these facts, doing
> inheritance, etc. Those facts are compiled and not stored in the
> dynamic database. They are both numbers and text and small
> structures.
>
> (And then we have part data, that describes the 5,000,000 parts that
> are managed by our customer. We don't (can't) load all that data into
> memory, but we sometimes process batches of it.)
>
> > > The big issue is good indexing.
> > >
> >
> > If the data are just relational facts, would it be better to leverage
> > "real" DBMS?
>
> Yes, we could (and sometimes do) put them into a database, but we
> generally don't use the database retrieval mechanisms (except to load
> them) since they are too slow to do things like inheritance. That
> requires a recursive evaluation and retrieving rows tuple-at-a-time
> from a relational database is just too slow. We have room to load
> them into memory and then we run at program speeds, not external
> database speeds. That can be up to a couple of orders of magnitude
> faster.
>
> > I have been wondering how well would Prolog inference engine scale in
> > handling relational data.
> >
> > If DBMS is used to store the relational facts, can their indexing
> > capability be used for Prolog?
>
> Yes, of course. The problem of efficiently interfacing Prolog with
> Relational databases has a long and rich history. There's a lot of
> work on it. My conclusion is that if you're doing things that
> relational databases do well (store large amounts of flat data in
> relatively few tables and can retrieve what you need with SQL
> queries), then by all means use an RDBMS. If your data is of complex,
> hierarchical structure, and it doesn't fit easily into a few
> relational tables (without ridiculously many null values), then you
> need something else. If it fits in memory (or you can batch load and
> process it in pieces that will fit in memory), then Prolog works
> better. If there's no way you could fit it in memory, then either you
> suffer with relational technology, or try object-oriented database
> technology, or build your own hybrid.
>
> Regards,
> -David
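
For readers of the archive, the representation described above (a binary
subclass predicate and 3-ary relationship predicates over ids, with
inheritance computed recursively in memory) might be sketched roughly as
follows. This is only an illustration in Python, not CDF or XSB code: the
class names, the relation name "has_attribute", and the helper functions are
all hypothetical, and the dictionaries merely stand in for first-argument
indexing on the facts.

```python
from collections import defaultdict

# Hypothetical facts, loosely modeled on the description in the message.
# subclass(Child, Parent): a binary predicate over class ids.
SUBCLASS = [
    ("laser_printer", "printer"),
    ("printer", "office_equipment"),
    ("office_equipment", "product"),
]

# relationship(Class, Relation, Target): a 3-ary predicate over ids.
RELATION = [
    ("product", "has_attribute", "unit_price"),
    ("printer", "has_attribute", "pages_per_minute"),
]

# Build in-memory indexes keyed on the first argument, so each lookup
# is a dictionary access rather than a scan over all facts.
parents = defaultdict(set)
for child, parent in SUBCLASS:
    parents[child].add(parent)

rels_by_class = defaultdict(list)
for cls, rel, target in RELATION:
    rels_by_class[cls].append((rel, target))


def ancestors(cls):
    """Return every strict superclass of cls (transitive closure)."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inherited_relations(cls):
    """Return the relationships of cls plus those of all its superclasses."""
    result = set(rels_by_class.get(cls, []))
    for ancestor in ancestors(cls):
        result.update(rels_by_class.get(ancestor, []))
    return result


if __name__ == "__main__":
    print(sorted(ancestors("laser_printer")))
    print(sorted(inherited_relations("laser_printer")))
```

Every step of the closure here is an in-memory lookup. Doing the same walk
against an external database would presumably cost one round trip per step,
which is the tuple-at-a-time overhead the message refers to.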