logo       

RE: GHC-6.4.1 much slower than GHC-6.4: msg#00070

lang.haskell.glasgow.bugs

Subject: RE: GHC-6.4.1 much slower than GHC-6.4

Interesting... there's no obvious culprit, although transF appears to be
taking a bit more time, and dfs_abort has moved up to 7% time from 2%.

The program is doing about 6x more allocation, and taking about 4x more
time, but the extra alloc and time is pretty much spread around the
program. I'd say that indicates that the program is really *doing more*
under 6.4.1, i.e. some major part of the algorithm is being
re-calculated until 6.4.1, whereas it was being shared under 6.4.

Can you insert any traces in the program that would tell you whether
this is happening? Also, check the entry counts in the profiled output.
They should indicate if any functions are being called a lot more often
in 6.4.1 (with the caveat that the entry counts aren't terribly
reliable).

Cheers,
Simon

On 28 October 2005 12:10, Christian Maeder wrote:

> (Moved to glasgow-haskell-bugs)
>
> Below is my profiling data (on Linux turing 2.6.11.4-21.9-default)
> Any ideas?
>
> Cheers Christian
>
>
> interesting-ghc-6.4 +RTS -p -RTS
>
> total time = 253.76 secs (12688 ticks @ 20 ms)
> total alloc = 15,991,765,312 bytes (excludes profiling
overheads)
>
> COST CENTRE MODULE %time %alloc
>
> transF NFA.Type 13.9 1.1
> trans_manyF NFA.Type 6.4 10.3
> steps Util.Graph.BFS 6.4 2.1
> build PCP.Rules.Excl.Type 5.5 9.4
> dfs_collect Util.Graph.DFS 5.0 3.9
> step Util.Graph.BFS 4.4 0.6
> isPartOf Util.Words 2.9 5.8
> lang_ok PCP.Rules.Type 2.9 5.0
> dfs_abort Util.Graph.DFS 2.2 2.3
> meet PCP.Search 2.2 0.5
> transF NFA.DFA.Type 1.9 1.4
> genericCross Util.Cross 1.8 4.3
> complement NFA.Det 1.7 1.1
> det_tqf NFA.Det 1.7 3.2
> det_dfa NFA.Det 1.6 1.0
> productive NFA.Intersect 1.5 3.6
> subset NFA.Subset 1.5 2.3
> map_fun Util.Map 1.5 1.2
> build NFA.DFA.Hopcroft 1.4 2.9
> apply PCP.Rules.Rule 1.3 2.6
> apply PCP.Rules.Type 1.2 1.5
> hop NFA.DFA.Hopcroft 1.1 1.0
> ok PCP.Util.OK 1.0 0.8
> select PCP.Rules.Type 0.9 1.3
> is_final NFA.Type 0.6 1.0
> final_conf PCP.Rules.Rule 0.6 1.1
> sublistsN Util.Words 0.5 1.2
>
>
> interesting-ghc-6.4.1 +RTS -p -RTS
>
> total time = 1020.74 secs (51037 ticks @ 20 ms)
> total alloc = 70,086,586,572 bytes (excludes profiling
overheads)
>
> COST CENTRE MODULE %time %alloc
>
> transF NFA.Type 17.3 1.6
> dfs_abort Util.Graph.DFS 7.4 7.4
> trans_manyF NFA.Type 7.3 11.5
> build PCP.Rules.Excl.Type 5.7 6.8
> dfs_collect Util.Graph.DFS 5.3 3.6
> intersect NFA.Intersect 4.6 9.8
> satisfies_conditions PCP.Rules.Excl.Excl 4.2 5.0
> subset NFA.Subset 4.1 7.1
> genericCross Util.Cross 2.3 5.3
> isPartOf Util.Words 2.2 3.5
> steps Util.Graph.BFS 2.1 0.5
> candidates PCP.Rules.Excl.Excl 2.0 2.4
> det_tqf NFA.Det 1.7 2.9
> det_dfa NFA.Det 1.6 1.0
> transF NFA.DFA.Type 1.5 1.0
> build NFA.DFA.Hopcroft 1.4 2.6
> power_splitL Util.Words 1.3 1.3
> map_fun Util.Map 1.2 1.0
> step Util.Graph.BFS 1.1 0.1
> productive NFA.Intersect 1.1 2.6
> complement NFA.Det 1.1 0.6
> candidates PCP.Rules.Excl.Div 1.0 1.0
> lang_ok PCP.Rules.Type 0.7 1.1
> splits Util.Words 0.6 1.0
>
> Mirko Rahn wrote:
>>
>> Hello,
>>
>> I was surprised about really big differences in running times between
>> 6.4 and 6.4.1, whereby 6.4.1 is *much* slower:
>>
>> *** 6.4: cpu-time interesting (3,3): 00:01:38.08
>> *** 6.4.1: cpu-time interesting (3,3): 00:05:58.91
>>
>> Here we have a factor of about 3.6!! Are there others that have
>> noticed such behavior? What could be the reason for that?
>>
>> Sorry, my example is a small monster, but at the moment there is
>> neither time left to search for the reason for this behavior nor to
>> break it down to some smaller pieces of code. I just noticed it...
>>
>> However, you can find the source tree in
>>
>> http://liinwww.ira.uka.de/~rahn/src/
>>
>> The file
>>
>> http://liinwww.ira.uka.de/~rahn/src/prog.tar.gz
>>
>> contains the tree as well. For the output above: Extract, cd
>> proj/prog and run
>>
>> rahn@i90pc18:~/local/proj/prog$ for version in 6.4 6.4.1; do find .
>> -name '*.hi' | rm -f ; ghc-$version --make -Wall -cpp
>> -D__POS__='__FILE__ ++ "[" ++ show(__LINE__::Int) ++ "]: " ++ '
>> -fglasgow-exts -fallow-overlapping-instances
>> -fallow-undecidable-instances -O Prog/Interesting.hs -o
>> Prog/Interesting.$version.bin 2>/dev/null >/dev/null; echo -n "***
>> $version: " ; ./Prog/Interesting.$version.bin 2>&1 | grep time ; done
>>
>> Regards, Mirko
>>
>> P.S.: I run a debian-box with i686 GNU/Linux 2.6.11.11 and a
>> dualcore P4
>> 3.2 GHz.
>>
>
> _______________________________________________
> Glasgow-haskell-bugs mailing list
> Glasgow-haskell-bugs@xxxxxxxxxxx
> http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs


<Prev in Thread] Current Thread [Next in Thread>
Google Custom Search

News | FAQ | advertise