logo       

Re: loop performance bug: msg#00073

lang.haskell.glasgow.bugs

Subject: Re: loop performance bug

Hi,

On Mon, 14 Mar 2005 13:41:06 +0000, Duncan Coutts
<duncan.coutts@xxxxxxxxxxxxx> wrote:
> This sort of code runs very slowly when compared to the equivalent in C:
>
I'm afraid the kind of array processing code you posted will always be
slower than the C "equivalent". Haskell only has safe array meaning
that every operation checks the array bounds and that's going to cost
you in situations like these. But I do agree that the timings you
showed are overly bad. Let's see what we can do about it.

A first hint is to never try to guess what kind of code ghc generates.
If you're in need of performance you need to look at some lower level
code. I recommend the -fext-core flag to produce external core, a
sort-of-readable output of ghc's internal representation of the
program. One can learn a lot from that.

> > {-# OPTIONS -fglasgow-exts #-}
> > module Main where
> >
> > import Data.Array.MArray
> > import Data.Array.IO
> >
> > data Pos = Pos !Int !Int
> > deriving (Eq, Ord, Ix)
> >
> > main = test1
> >
> > test1 :: IO ()
> > test1 = do
> > (arr :: IOUArray Pos Bool) <- newArray_ (Pos 0 0, Pos 99 99)
> > sequence_ [ sequence_ [ writeArray arr (Pos y x) False
> > | y <- [0..99]
> > , x <- [0..99] ]
> > | _ <- [0..9999] ]
> >
>
> Timing:
>
> $ ghc --make -O2 SpeedTest.hs -o speedtest-hs
> $ time ./speedtest-hs
>
> real 0m10.968s
> user 0m10.952s
> sys 0m0.005s
>
I'm very much surprised that this program was slower than the one
without arrays. Since all lists are produced and consumed with "good"
producers and consumers (as in section 7.10.3 in the User's Guide)
they should really go away. This seems to be a real bug.

[C program deleted]

> > {-# INLINE doFromTo #-}
> > -- do the action for [from..to] ,ie it's inclusive.
> > doFromTo :: Int -> Int -> (Int -> IO ()) -> IO ()
> > doFromTo from to action =
> > let loop n | n > to = return ()
> > | otherwise = do action n
> > loop (n+1)
> > in loop from
> >
> > test2 :: IO ()
> > test2 = do
> > (arr :: IOUArray Pos Bool) <- newArray_ (Pos 0 0, Pos 99 99)
> > doFromTo 0 9999 $ \_ ->
> > doFromTo 0 99 $ \y ->
> > doFromTo 0 99 $ \x ->
> > writeArray arr (Pos y x) False
>
> Timing:
>
> $ ghc --make -O2 SpeedTest.hs -o speedtest-hs
> $ time ./speedtest-hs
>
> real 0m7.942s
> user 0m7.936s
> sys 0m0.001s
>
> So not much better really. :-(
>
> If there's way to write loops that is faster I'd dearly like to know.
>
> I initially assumed that the second version would run fast since there
> is no obvious need for any memory allocations (which is the usual
> culprit).
>
Oh, there are still reasons for memory allocations. You use an
algebraic data type to index the arrays. That is also going to kill
performance. If we look at the external core we can clearly see that
the indexing with the Pos data type is adding overhead. So to improve
we can write the program in a lower level style like this:

> test3 :: IO ()
> test3 = do
> (arr :: IOUArray Int Bool) <- newArray_ (0,100*100-1)
> doFromTo 0 9999 $ \_ ->
> doFromTo 0 99 $ \y ->
> doFromTo 0 99 $ \x ->
> writeArray arr (x*(y+1)) False

This gave me a factor two speedup compared to test2.

Well, if we stare some more at the external core generated from ghc we
will see that there is still a lot of junk in the inner loop. Many of
these things I don't know what they're doing there. I'm sure the
implementors can explain what's going on there.

HTH

/Josef


<Prev in Thread] Current Thread [Next in Thread>
Google Custom Search

News | FAQ | advertise