|
|
Subject: Re: Re: segmented fusion - a-ha! - msg#00086
List: parsers.spirit.devel
dan marsden wrote:
<snip lots of stuff about sourceforge crash issue leading to missing changes>
> Sorry if this is proving a distraction, I'll aim to get things back in sync
> with my local copy shortly. All of this in on the FUSION_V2 branch, I've no
> idea if anything has happened on other branches.
Ok, I think I've synced things up with my local copy to recover the changes
that went missing. I checked out a clean copy and built all the unit tests on a
couple of compilers, so hopefully all is well.
Any issues now, and I'll have a look at whats going on. Docs still need fixing,
will be done soon.
Cheers
Dan
-------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
Was this page helpful?
Thread at a glance:
Previous Message by Date:
click to view message preview
[spirit] Ternary Search Tree (TST) revisited
Hi!
As many of you might not know, there's a nice TST(*)
implementation in the Spirit CVS written by Steve Rowe
based on the one you see used by Spirit's symbol table. The
code has methods for iteration (begin/end), insert, erase,
find, find_longest and find_all_sub_matches. There's also a
full test suite.
I would want to use it as the basis for Spirit-2's symbol
table. The objective is to optimize the code and do some
benchmarks with it. The last time I heard from Steve Rowe,
he said that the code is not (yet) optimal. I was wondering
if anyone would be interested to adopt the code, bring it
up to date, and carry on with its development? I'm sure it
would be a good candidate for a Boost Library.
(*)
Ternary Search Tree implementation. The data structure is faster than
hashing for many typical search problems especially when the search
interface is iterator based. Searching for a string of length k in a
ternary search tree with n strings will require at most O(log n+k)
character comparisons. TSTs are many times faster than hash tables
for unsuccessful searches since mismatches are discovered earlier
after examining only a few characters. Hash tables always examine an
entire key when searching.
The code is in the Spirit CVS in the GENERALIZE_TST branch.
Please email me if you are interested.
Regards,
--
Joel de Guzman
http://www.boost-consulting.com
http://spirit.sf.net
-------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
Next Message by Date:
click to view message preview
Re: [spirit] Ternary Search Tree (TST) revisited
Joel de Guzman wrote:
>
> As many of you might not know, there's a nice TST(*)
> implementation in the Spirit CVS written by Steve Rowe
> I would want to use it as the basis for Spirit-2's symbol
> table. The objective is to optimize the code and do some
> benchmarks with it.
You may want to look at http://www.abc.se/~re/code/tst
It's not finished, but may be usable in present state.
The unfinished interface bit is: defining subkey iterators, and rewrite the
imprecise matching algorithms as nonmember functions.
I define the TST containers as Associative Sorted with internal structure
in the keys - the keys are Forward Containers.
(hence the name Structured Containers, but this may be an overstatement,
suggestions welcome)
So iterators should perhaps use a variant of segmented iterators
- though not for performance purposes.
I haven't found time to finish it (I just use it as is),
but with some suggestions for interface and implementation I could perhaps
get going again. The implementation is needlessly messy in spots,
as I tried to do without Boost.Iterator and MPL.
> (*)
> Ternary Search Tree implementation. The data structure is faster than
> hashing for many typical search problems especially when the search
> interface is iterator based. Searching for a string of length k in a
> ternary search tree with n strings will require at most O(log n+k)
> character comparisons. TSTs are many times faster than hash tables
> for unsuccessful searches
I have not found this to be generally true, at least not the "many times
faster" claim,
at least not for the implementations I've seen.
"About as fast as hashed containers" is a more reasonable statement.
Please see http://www.abc.se/~re/code/tst/tst_docs/tst_perf.html
re
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
Previous Message by Thread:
click to view message preview
Re: Re: segmented fusion - a-ha!
Joel de Guzman wrote:
>Eric Niebler wrote:
>
>> I'm in need of a stable branch point. Looks like you've been applying
>> changes as a result of the review, and there is at least one problem.
>> You've added some references to the non-existent "detail::tag_of". This
>> should be "traits::tag_of". Patch attached. OK to commit?
>
>Oh, sure. please.
>
>Thanks!
detail::tag_of should be detail::tag_of (its not a defect). The problem is that
since the sourceforge outage a variety of changes from just pre-review through
to when sourceforge came back up have gone missing. I've got the changes
locally, and I'm in the process of checking things back in, but it is a fiddly
process as cvs is getting confused about the missing version numbers, so I'm
taking my time about checking in the latest stuff. I think references to
detail::tag_of were checked in before I noticed how out of sync cvs had become
over the outage (diff doesn't identify all the diffs from what I can see, due
to confusion over version numbers).
Sorry if this is proving a distraction, I'll aim to get things back in sync
with my local copy shortly. All of this in on the FUSION_V2 branch, I've no
idea if anything has happened on other branches.
Cheers
Dan
-------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
Next Message by Thread:
click to view message preview
Re: segmented fusion - a-ha!
dan marsden wrote:
dan marsden wrote:
<snip lots of stuff about sourceforge crash issue leading to missing changes>
Sorry if this is proving a distraction, I'll aim to get things back in sync
with my local copy shortly. All of this in on the FUSION_V2 branch, I've no
idea if anything has happened on other branches.
Ok, I think I've synced things up with my local copy to recover the changes
that went missing. I checked out a clean copy and built all the unit tests on a
couple of compilers, so hopefully all is well.
Any issues now, and I'll have a look at whats going on. Docs still need fixing,
will be done soon.
Thanks, Dan. I've created a new branch called SEGMENTED_FUSION_V2. (It
is a branch from FUSION_V2.) I will be adding segmented support to
Fusion there, and periodically reverse integrating from FUSION_V2. When
Fusion moves into Boost main CVS, we'll add this branch there, too. When
segmented support is mature, we can merge it into the trunk, if there's
interest.
--
Eric Niebler
Boost Consulting
www.boost-consulting.com
-------------------------------------------------------
All the advantages of Linux Managed Hosting--Without the Cost and Risk!
Fully trained technicians. The highest number of Red Hat certifications in
the hosting industry. Fanatical Support. Click to learn more
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=107521&bid=248729&dat=121642
|
|