logo       

Re: loop performance bug: msg#00077

lang.haskell.glasgow.bugs

Subject: Re: loop performance bug

On Tue, 2005-03-15 at 01:39 +0100, Josef Svenningsson wrote:
> Hi,
>
> On Mon, 14 Mar 2005 13:41:06 +0000, Duncan Coutts
> <duncan.coutts@xxxxxxxxxxxxx> wrote:
> > This sort of code runs very slowly when compared to the equivalent in C:
> >
> I'm afraid the kind of array processing code you posted will always be
> slower than the C "equivalent". Haskell only has safe array meaning
> that every operation checks the array bounds and that's going to cost
> you in situations like these.

True, True. I'm not being quite fair. For some programs I have switched
to unsafe indexing (and manual array offset calculation) for a 25% or so
performnce boost.

(I did this for a rewrite of our ICFP'04 ant simulator and got the
runtime down from ~5min to ~20sec. The main performance improvement was
due to switching from finite maps of structs to structs of mutable
unboxed arrays. The unsafe array indexing saved an additional 5 sec or
so.)

> But I do agree that the timings you showed are overly bad. Let's see
> what we can do about it.

As you indicate, the bounds checking is not really the significant
factor here.

> A first hint is to never try to guess what kind of code ghc generates.
> If you're in need of performance you need to look at some lower level
> code. I recommend the -fext-core flag to produce external core, a
> sort-of-readable output of ghc's internal representation of the
> program. One can learn a lot from that.

Although when I do I don't necessarily understand what I see! I see
'let' (ie allocating forms) where I would have hoped there would not be
but it doesn't help me to figure out why the code I've written ends up
producing allocations.

Simon M previously recommended:

> Since you're interested in allocation, you might be better off
> with -ddump-prep rather than -ddump-simpl: the former has all
> the allocation made into explicit 'let' expressions ready for
> code generation.

> I'm very much surprised that this program was slower than the one
> without arrays.

You mean the one without lists.

> Since all lists are produced and consumed with "good"
> producers and consumers (as in section 7.10.3 in the User's Guide)
> they should really go away. This seems to be a real bug.

This is what Simon M thought when I brought this up previously. He
suggested I use explicit recursion (which I thought I was doing with my
second version though it doesn't do much better!)

> [C program deleted]
Good riddence! :-)

> > I initially assumed that the second version would run fast since there
> > is no obvious need for any memory allocations (which is the usual
> > culprit).
> >
> Oh, there are still reasons for memory allocations. You use an
> algebraic data type to index the arrays. That is also going to kill
> performance. If we look at the external core we can clearly see that
> the indexing with the Pos data type is adding overhead. So to improve
> we can write the program in a lower level style like this:

I was hoping that if the array reading was inlined that the calculation
of the index would also be inlined and so the Pos constructor need never
be allocated.

> > test3 :: IO ()
> > test3 = do
> > (arr :: IOUArray Int Bool) <- newArray_ (0,100*100-1)
> > doFromTo 0 9999 $ \_ ->
> > doFromTo 0 99 $ \y ->
> > doFromTo 0 99 $ \x ->
> > writeArray arr (x*(y+1)) False
>
> This gave me a factor two speedup compared to test2.

So the Pos is really being allocated each time. I guess this would be
the same with (,) (,,) etc.

> Well, if we stare some more at the external core generated from ghc we
> will see that there is still a lot of junk in the inner loop. Many of
> these things I don't know what they're doing there. I'm sure the
> implementors can explain what's going on there.

Hope so. :-)

Even if we use unsafeWriteArray with manual array offset calculation
we're still a few orders of magnitude off where we'd like to be.

It's the loop itself I think that is the problem rather than whetever
we're doing inside it.

> /Josef

BTW have you thought about picking up hIDE development again? I might go
back to it after Gtk2Hs hits version 1.0

Duncan


<Prev in Thread] Current Thread [Next in Thread>
Google Custom Search

News | FAQ | advertise