Re: Fusion-ifying proto parse trees: msg#00033

parsers.spirit.devel

Subject: Re: Fusion-ifying proto parse trees

Joel de Guzman wrote:
> Eric Niebler wrote:
>> Eric Niebler wrote:
>>> Joel de Guzman wrote:
>>>> Eric Niebler wrote:
>>>>> The segmented algorithm looks complicated, but fortunately, I think all the algorithms will follow a similar pattern, so we should be able to crank them out without too much fuss. It is perfectly general and will work with any segmented data structure, not just binary trees, assuming you give it suitably defined iterators.
>>>>>
>>>>> I have no performance numbers, but I have a good feeling about it. :-)
>>>>
>>>> Thanks! I'm perusing the code now. This would be a good addition
>>>> to Fusion. I wonder how algorithms that return views would look
>>>> like, or if they are as efficient as suggested.
>>>
>>> A view over a segmented sequence is itself a segmented sequence and would need segmented iterators. It would probably be best at this point to come up with some perf numbers to justify further work in this area.
>>
>> I've done a little perf testing of the various approaches we have for traversing parse trees. I've attached the test.
>
> Very interesting! Alas, I can't compile it :(
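
For readers without the attachments, the idea behind the segmented traversal quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions: leaf, node, for_each_segmented, sum_visitor and sum_example are hypothetical names invented here, and the plain recursion below is not the iterator-based for_each_s from the attached code. It only shows how each subtree of a binary expression tree can be treated as a segment that is visited in order.

```cpp
#include <cassert>

// Hypothetical stand-ins for a terminal and a binary expression node.
template<typename T>
struct leaf { T value; };

template<typename Left, typename Right>
struct node { Left left; Right right; };

// A leaf is a segment of length one: apply the function to its value.
template<typename T, typename F>
void for_each_segmented(leaf<T> const &l, F &f)
{
    f(l.value);
}

// An interior node is a sequence of two segments: visit them left to right.
template<typename Left, typename Right, typename F>
void for_each_segmented(node<Left, Right> const &n, F &f)
{
    for_each_segmented(n.left, f);
    for_each_segmented(n.right, f);
}

// Visitor that sums the terminals it sees.
struct sum_visitor
{
    long total;
    sum_visitor() : total(0) {}
    void operator()(int v) { total += v; }
};

// Builds a tree with the same shape as (3 >> 42) >> (6 >> 29) and sums it.
inline long sum_example()
{
    typedef node<leaf<int>, leaf<int> > pair_t;
    node<pair_t, pair_t> tree = { { {3}, {42} }, { {6}, {29} } };
    sum_visitor v;
    for_each_segmented(tree, v);
    return v.total;
}

// Self-check: 3 + 42 + 6 + 29
inline void self_check()
{
    assert(sum_example() == 80);
}
```

Because the tree shape is encoded in the types, the recursion is resolved at compile time and an optimizer is free to inline it completely, which is the property the perf test is trying to measure.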

Ok, I got it to compile. Be wary of "using namespace" at
global scope with VC7.1. It causes me lots of problems, so
I never do it anymore. In fact, this is most probably the
reason for the VC8 ICE, because I can now compile on VC8.
See attached main.cpp.

I got mixed results testing with VC7.1, VC8.0 and g++ 3.4
on my laptop (1.5 GHz Centrino) using Fusion list and vector
(I fixed the Fusion vector bug. It was a minor typo. Please
do a cvs update). Fusion gives better results, except for
VC7.1/list/long-test, which I don't quite understand.
See attached results.txt.

It's plausible that VC7.1 has an optimization bug, or it
fails to optimize the case for Fusion. I find it hard to
explain the 3x speedup. On VC8.0, that gain is lost.

The only way to verify this is to rewrite the test with
deterministic results. As it is, the numbers output
by the accumulator functor do not make sense. The results
of the traversals should be deterministic so that we can
determine that the code is doing the right thing. We
need reproducible results for all the tests.
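
One way such a deterministic check might look is sketched below. Two things in the attached test plausibly make the printed sums unpredictable: the accumulator keeps running through the calibration loop, whose iteration count depends on timing, and the signed int total overflows, which is undefined behavior. The sketch separates the correctness check from the timing loop by using a fixed pass count and an unsigned total. The names checked_accumulator, traverse, checksum and verify_short_sequence are hypothetical, and traverse is only a stand-in for the real for_each / for_each_s call.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical accumulator: unsigned arithmetic wraps in a well-defined
// way, unlike the signed int used in the attached test.
struct checked_accumulator
{
    explicit checked_accumulator(unsigned long &sum_) : sum(sum_) {}
    void operator()(int v) const { sum += static_cast<unsigned long>(v); }
    unsigned long &sum;
};

// Stand-in for a traversal; a real test would call the algorithm under
// test here instead of looping over an array.
template<std::size_t N, typename F>
void traverse(int const (&values)[N], F f)
{
    for(std::size_t i = 0; i != N; ++i)
        f(values[i]);
}

// Run the traversal a FIXED number of times, so the result does not depend
// on how many iterations a timing loop happened to need.
template<std::size_t N>
unsigned long checksum(int const (&values)[N], unsigned long passes)
{
    unsigned long sum = 0;
    for(unsigned long p = 0; p != passes; ++p)
        traverse(values, checked_accumulator(sum));
    return sum;
}

// The expected totals are known in advance, so every compiler and every
// container must agree on them.
inline void verify_short_sequence()
{
    int const values[] = {3, 42, 6, 29};
    assert(checksum(values, 1) == 80ul);       // 3 + 42 + 6 + 29
    assert(checksum(values, 1000) == 80000ul); // 1000 passes of 80
}
```

With a check like this run once before the timed loops, the timed loops are free to pick any iteration count without affecting what gets verified.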

Regards,
--
Joel de Guzman
http://www.boost-consulting.com
http://spirit.sf.net
/*=============================================================================
Copyright (c) 2006 Eric Niebler

Use, modification and distribution is subject to the Boost Software
License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
http://www.boost.org/LICENSE_1_0.txt)
==============================================================================*/
#define BOOST_PROTO_FUSION_V2
#define FUSION_MAX_LIST_SIZE 25
#define FUSION_MAX_VECTOR_SIZE 25

#ifdef _MSC_VER
// inline aggressively
# pragma inline_recursion(on) // turn on inline recursion
# pragma inline_depth(255) // max inline depth
#endif

#include <algorithm> // std::min
#include <iostream>
#include <boost/timer.hpp>
#include <boost/fusion/sequence.hpp>
#include <boost/xpressive/proto/fusion.hpp>
#include <boost/fusion/sequence/container/list.hpp>
#include <boost/fusion/sequence/container/vector.hpp>
#include "./for_each_s.hpp"
#include "./segmented_proto.hpp"

#define THE_SEQUENCE list
//~ #define THE_SEQUENCE vector

namespace test
{
using boost::proto::lit;
using boost::proto::unary_op;
using boost::proto::noop_tag;
using boost::proto::arg;

struct accumulator
{
    accumulator(int &i_)
      : i(i_)
    {}

    template<typename T>
    void operator()(T const &t) const
    {
        this->i += arg(t);
    }

    int &i;
};


int const REPEAT_COUNT = 10;

template<typename T>
double time_for_each_s(T const &t)
{
    boost::timer tim;
    int i = 0;
    long long iter = 65536;
    long long counter, repeats;
    double result = 0;
    double run;
    do
    {
        tim.restart();
        for(counter = 0; counter < iter; ++counter)
        {
            boost::fusion::for_each_s(t, accumulator(i));
        }
        result = tim.elapsed();
        iter *= 2;
    } while(result < 0.5);
    iter /= 2;

    // repeat test and report least value for consistency:
    for(repeats = 0; repeats < REPEAT_COUNT; ++repeats)
    {
        tim.restart();
        for(counter = 0; counter < iter; ++counter)
        {
            boost::fusion::for_each_s(t, accumulator(i));
        }
        run = tim.elapsed();
        result = (std::min)(run, result);
    }
    std::cout << i << std::endl;
    return result / iter;
}

#define EXPR1 (lit(3) >> 42) >> (lit(6) >> 29)
#define EXPR2 EXPR1 >> EXPR1 >> EXPR1 >> EXPR1 >> EXPR1 >> EXPR1

void test_short()
{
    typedef unary_op<int, noop_tag> T;
    boost::fusion::THE_SEQUENCE<T,T,T,T> const l(
        lit(3),lit(42),lit(6),lit(29));

    double list_time =
        time_for_each_s( l );

    double non_segmented_time =
        time_for_each_s( EXPR1 );

    double segmented_time =
        time_for_each_s( make_segmented_view( EXPR1 ) );

    std::cout << "Fusion list time : " << list_time << std::endl;
    std::cout << "Non-segmented proto time : " << non_segmented_time <<
        std::endl;
    std::cout << "Segmented proto time : " << segmented_time <<
        std::endl;
}

void test_long()
{
    typedef unary_op<int, noop_tag> T;

    boost::fusion::THE_SEQUENCE<T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,T>
    const l(
        lit(3),lit(42),lit(6),lit(29)
      , lit(3),lit(42),lit(6),lit(29)
      , lit(3),lit(42),lit(6),lit(29)
      , lit(3),lit(42),lit(6),lit(29)
      , lit(3),lit(42),lit(6),lit(29)
      , lit(3),lit(42),lit(6),lit(29));

    double list_time =
        time_for_each_s( l );

    double non_segmented_time =
        time_for_each_s( EXPR2 );

    double segmented_time =
        time_for_each_s( make_segmented_view( EXPR2 ) );

    std::cout << "Fusion list time : " << list_time << std::endl;
    std::cout << "Non-segmented proto time : " << non_segmented_time <<
        std::endl;
    std::cout << "Segmented proto time : " << segmented_time <<
        std::endl;
}
}

int main()
{
    std::cout << "Short test ... \n";
    test::test_short();

    std::cout << "Long test ... \n";
    test::test_long();

    return 0;
}
/////////////////////// list VC7.1
Short test ...
-5242880
-5242880
-5242880
Fusion list time : 4.19468e-009
Non-segmented proto time : 4.77582e-009
Segmented proto time : 4.18723e-009
Long test ...
-1642070016
1713373184
-31457280
Fusion list time : 1.04189e-007
Non-segmented proto time : 6.2561e-007
Segmented proto time : 2.93255e-008

/////////////////////// vector VC7.1
Short test ...
-5242880
-5242880
-5242880
Fusion list time : 4.19468e-009
Non-segmented proto time : 4.89503e-009
Segmented proto time : 4.76837e-009
Long test ...
-31457280
1713373184
2116026368
Fusion list time : 2.74777e-008
Non-segmented proto time : 5.35965e-007
Segmented proto time : 2.98023e-008

/////////////////////// list VC8.0
Short test ...
-5242880
-5242880
-5242880
Fusion list time : 4.06802e-009
Non-segmented proto time : 4.07547e-009
Segmented proto time : 4.07547e-009
Long test ...
-31457280
1713373184
-31457280
Fusion list time : 2.28286e-008
Non-segmented proto time : 5.96046e-007
Segmented proto time : 2.51532e-008

/////////////////////// vector VC8.0
Short test ...
-5242880
-5242880
-5242880
Fusion list time : 4.06802e-009
Non-segmented proto time : 4.07547e-009
Segmented proto time : 4.18723e-009
Long test ...
-31457280
1713373184
2116026368
Fusion list time : 1.86265e-008
Non-segmented proto time : 5.36919e-007
Segmented proto time : 4.00543e-008

/////////////////////// list g++ 3.4
Short test ...
-1078984704
-273678336
2008023040
Fusion list time : 4.65512e-08
Non-segmented proto time : 1.93834e-07
Segmented proto time : 2.01225e-07
Long test ...
-836763648
1478492160
-1306525696
Fusion list time : 2.98023e-07
Non-segmented proto time : 1.48773e-06
Segmented proto time : 1.46103e-06

/////////////////////// vector g++ 3.4
Short test ...
-1078984704
-273678336
2008023040
Fusion list time : 4.75049e-08
Non-segmented proto time : 2.00987e-07
Segmented proto time : 2.08378e-07
Long test ...
-836763648
-1306525696
-1306525696
Fusion list time : 2.90394e-07
Non-segmented proto time : 1.52016e-06
Segmented proto time : 1.54877e-06
