|
Re: loop performance bug: msg#00077lang.haskell.glasgow.bugs
On Tue, 2005-03-15 at 01:39 +0100, Josef Svenningsson wrote: > Hi, > > On Mon, 14 Mar 2005 13:41:06 +0000, Duncan Coutts > <duncan.coutts@xxxxxxxxxxxxx> wrote: > > This sort of code runs very slowly when compared to the equivalent in C: > > > I'm afraid the kind of array processing code you posted will always be > slower than the C "equivalent". Haskell only has safe array meaning > that every operation checks the array bounds and that's going to cost > you in situations like these. True, True. I'm not being quite fair. For some programs I have switched to unsafe indexing (and manual array offset calculation) for a 25% or so performnce boost. (I did this for a rewrite of our ICFP'04 ant simulator and got the runtime down from ~5min to ~20sec. The main performance improvement was due to switching from finite maps of structs to structs of mutable unboxed arrays. The unsafe array indexing saved an additional 5 sec or so.) > But I do agree that the timings you showed are overly bad. Let's see > what we can do about it. As you indicate, the bounds checking is not really the significant factor here. > A first hint is to never try to guess what kind of code ghc generates. > If you're in need of performance you need to look at some lower level > code. I recommend the -fext-core flag to produce external core, a > sort-of-readable output of ghc's internal representation of the > program. One can learn a lot from that. Although when I do I don't necessarily understand what I see! I see 'let' (ie allocating forms) where I would have hoped there would not be but it doesn't help me to figure out why the code I've written ends up producing allocations. Simon M previously recommended: > Since you're interested in allocation, you might be better off > with -ddump-prep rather than -ddump-simpl: the former has all > the allocation made into explicit 'let' expressions ready for > code generation. > I'm very much surprised that this program was slower than the one > without arrays. You mean the one without lists. > Since all lists are produced and consumed with "good" > producers and consumers (as in section 7.10.3 in the User's Guide) > they should really go away. This seems to be a real bug. This is what Simon M thought when I brought this up previously. He suggested I use explicit recursion (which I thought I was doing with my second version though it doesn't do much better!) > [C program deleted] Good riddence! :-) > > I initially assumed that the second version would run fast since there > > is no obvious need for any memory allocations (which is the usual > > culprit). > > > Oh, there are still reasons for memory allocations. You use an > algebraic data type to index the arrays. That is also going to kill > performance. If we look at the external core we can clearly see that > the indexing with the Pos data type is adding overhead. So to improve > we can write the program in a lower level style like this: I was hoping that if the array reading was inlined that the calculation of the index would also be inlined and so the Pos constructor need never be allocated. > > test3 :: IO () > > test3 = do > > (arr :: IOUArray Int Bool) <- newArray_ (0,100*100-1) > > doFromTo 0 9999 $ \_ -> > > doFromTo 0 99 $ \y -> > > doFromTo 0 99 $ \x -> > > writeArray arr (x*(y+1)) False > > This gave me a factor two speedup compared to test2. So the Pos is really being allocated each time. I guess this would be the same with (,) (,,) etc. > Well, if we stare some more at the external core generated from ghc we > will see that there is still a lot of junk in the inner loop. Many of > these things I don't know what they're doing there. I'm sure the > implementors can explain what's going on there. Hope so. :-) Even if we use unsafeWriteArray with manual array offset calculation we're still a few orders of magnitude off where we'd like to be. It's the loop itself I think that is the problem rather than whetever we're doing inside it. > /Josef BTW have you thought about picking up hIDE development again? I might go back to it after Gtk2Hs hits version 1.0 Duncan
|
|
| <Prev in Thread] | Current Thread | [Next in Thread> |
|---|---|---|
| Previous by Date: | Re: loop performance bug, Sean Seefried |
|---|---|
| Next by Date: | Re: loop performance bug, Duncan Coutts |
| Previous by Thread: | Re: loop performance bug, Duncan Coutts |
| Next by Thread: | Re: loop performance bug, Josef Svenningsson |
| Indexes: | [Date] [Thread] [Top] [All Lists] |
| News | FAQ | advertise |