|
Re: Fusion-ifying proto parse trees: msg#00036parsers.spirit.devel
Joel de Guzman wrote: Joel de Guzman wrote: I think the problem with doing it this way is that the compiler may see that the assignments to i which happen further down in the times calls are never used. They might be getting optimized away. I've attached a version that tries harder to force the compiler to actually generate code. The results are interesting. I still see the bizarre 3x speed-up on VC7.1 for large ET trees, but now segmented proto is beating the pre-flattened lists for small ET trees, too! // VC7.1, fusion list Short test ... 80 80 80 Fusion list time : 2.84985e-009 Non-segmented proto time : 2.79397e-009 Segmented proto time : 2.79397e-009 Long test ... 480 480 480 Fusion list time : 4.1008e-008 Non-segmented proto time : 4.17233e-007 Segmented proto time : 1.86265e-008 Proto still lags badly with gcc: // gcc, fusion list Short test ... 80 80 80 Fusion list time : 2.7895e-08 Non-segmented proto time : 1.13606e-07 Segmented proto time : 1.19209e-07 Long test ... 480 480 480 Fusion list time : 1.74999e-07 Non-segmented proto time : 1.10245e-06 Segmented proto time : 9.53674e-07 The reason I used a list in the test is because that's what Spirit-2 was using. Would you really consider switching to vector, which would place limits on the number of nodes in a sequence? Also, these tests leave out one detail ... the cost of building the expression template. Is building an unflattened proto parse tree more or less expensive than flattening into a list/vector eagerly? We should really be taking that into account. The cost of building the ET would probably dominate. Regarding your question about taking the minimum result, I lifted this code from John Maddock's regex performance shoot-out. This is how he collects his timing. Yes, this is a black art -- I don't really understand it. -- Eric Niebler Boost Consulting www.boost-consulting.com /*============================================================================= Copyright (c) 2006 Eric Niebler Use, modification and distribution is subject to the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) ==============================================================================*/ #define BOOST_PROTO_FUSION_V2 #define FUSION_MAX_LIST_SIZE 25 #define FUSION_MAX_VECTOR_SIZE 25 #ifdef _MSC_VER // inline aggressively # pragma inline_recursion(on) // turn on inline recursion # pragma inline_depth(255) // max inline depth #endif #include <iostream> #include <boost/timer.hpp> #include <boost/xpressive/proto/fusion.hpp> #include <boost/fusion/sequence/container/list.hpp> #include "./for_each_s.hpp" #include "./segmented_proto.hpp" #define THE_SEQUENCE list //~ #define THE_SEQUENCE vector namespace test { using boost::proto::lit; using boost::proto::unary_op; using boost::proto::noop_tag; struct accumulator { accumulator(int &i_) : i(i_) {} template<typename T> void operator()(T const &t) const { this->i += arg(t); } int &i; }; int const REPEAT_COUNT = 10; template<typename T> double time_for_each_s(T const &t, int &j) { boost::timer tim; int i = 0; long long iter = 65536; long long counter, repeats; double result = 0; double run; do { tim.restart(); for(counter = 0; counter < iter; ++counter) { i = 0; boost::fusion::for_each_s(t, accumulator(i)); static_cast<void>(i); } result = tim.elapsed(); iter *= 2; } while(result < 0.5); iter /= 2; // repeat test and report least value for consistency: for(repeats = 0; repeats < REPEAT_COUNT; ++repeats) { tim.restart(); for(counter = 0; counter < iter; ++counter) { i = 0; boost::fusion::for_each_s(t, accumulator(i)); j += i; } run = tim.elapsed(); result = (std::min)(run, result); } std::cout << i << std::endl; return result / iter; } #define EXPR1 (lit(3) >> 42) >> (lit(6) >> 29) #define EXPR2 EXPR1 >> EXPR1 >> EXPR1 >> EXPR1 >> EXPR1 >> EXPR1 int test_short() { int j = 0; typedef unary_op<int, noop_tag> T; boost::fusion::THE_SEQUENCE<T,T,T,T> const l( lit(3),lit(42),lit(6),lit(29)); double list_time = time_for_each_s( l, j ); double non_segmented_time = time_for_each_s( EXPR1, j ); double segmented_time = time_for_each_s( boost::proto::make_segmented_view( EXPR1 ), j ); std::cout << "Fusion list time : " << list_time << std::endl; std::cout << "Non-segmented proto time : " << non_segmented_time << std::endl; std::cout << "Segmented proto time : " << segmented_time << std::endl; return j; } int test_long() { int j = 0; typedef unary_op<int, noop_tag> T; boost::fusion::list<T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,T,T> const l( lit(3),lit(42),lit(6),lit(29) ,lit(3),lit(42),lit(6),lit(29) ,lit(3),lit(42),lit(6),lit(29) ,lit(3),lit(42),lit(6),lit(29) ,lit(3),lit(42),lit(6),lit(29) ,lit(3),lit(42),lit(6),lit(29)); double list_time = time_for_each_s( l, j ); double non_segmented_time = time_for_each_s( EXPR2, j ); double segmented_time = time_for_each_s( boost::proto::make_segmented_view( EXPR2 ), j ); std::cout << "Fusion list time : " << list_time << std::endl; std::cout << "Non-segmented proto time : " << non_segmented_time << std::endl; std::cout << "Segmented proto time : " << segmented_time << std::endl; return j; } } int main() { std::cout << "Short test ... \n"; int i = test::test_short(); std::cout << "Long test ... \n"; int j = test::test_long(); return i + j; } |
|
| <Prev in Thread] | Current Thread | [Next in Thread> |
|---|---|---|
| Previous by Date: | Re: Fusion-ifying proto parse trees: 00036, Joel de Guzman |
|---|---|
| Next by Date: | Re: Fusion-ifying proto parse trees: 00036, Joel de Guzman |
| Previous by Thread: | Re: Fusion-ifying proto parse treesi: 00036, Joel de Guzman |
| Next by Thread: | Re: Fusion-ifying proto parse trees: 00036, Joel de Guzman |
| Indexes: | [Date] [Thread] [Top] [All Lists] |
| News | FAQ | advertise |