|
RE: loop performance bug: msg#00088lang.haskell.glasgow.bugs
I couldn't resist looking into this. Here's the story. All of this applies to the explicitly-recursive program; I have not looked at the list version. To preview the headlines: Original program 13.1s 1. Use Int instead of Bool 12.6s 2. Use -fliberate-case-threshold=100 4.8s 3. Give instance with unsafeIndex 1.25s 4. {-# NOINLINE main #-} 0.85s Lessons ~~~~~ Going from 13s to 0.85s is big jump. A. It would be a big help if someone felt like doing this kind of detective work in a more systematic way, and producing a list of recommendations for the implementation. B. Arising from (3), 'deriving Ix' should define unsafeIndex. I'll take that as an action. C. Maybe the liberate-case threshold should be higher. I'd be interested to hear if increasing the threshold has good effects for anyone else. D. (4) actually only arises because everything is constant here. I'm not super-inclined to fix this dark corner. Simon The details ~~~~~~~ 1. IOUArrays are represented in a space-efficient manner. In particular, Bool arrays take one bit per element, but that entails lots of bit-twiddling instructions. Using Int instead reduces the time from 13.1 to 12.6s on my machine. 2. The inner loop repeatedly takes apart the array data structure, the pair representing its bounds, and the Pos structure for each bound, each time around the loop. GHC has an optimisation designed specifically to catch this, called "liberate case" (only run with -O2). To avoid code duplication, though, it has a code-size threshold that is too low for this example. Try -fliberate-case-threshold=100. This reduces the time from 12.6s to 4.8s 3. Array-bound checks are performed each time around the loop. Actually, it's much worse than that. The call writeArray arr (Pos x y) 1 turns into unsafeWriteArray arr (index (bounds arr) (Pos x y)) 1 The derived method for index on Pos (generated by 'deriving Ix') looks like this index (Pos l1 h2, Pos l2 h2) i j = index (l2,h2) j + index (l2,h2) h2 * index (l1,h1) i And *each* of those calls to 'index (l,h) i' (done at type Int) checks that i>=l and i<=h. Result is six range checks for every access! You might think we could use unsafeIndex instead: unsafeWriteArray arr (unsafeIndex (bounds arr) (Pos x y)) 1 but unfortunately the *derived* instance for Ix defines unsafeIndex = index. So you can instead give the Ix instance by hand: instance Ix Pos where unsafeIndex (Pos l1 l2, Pos h1 h2) (Pos i j) = unsafeIndex (l2,h2) j + (unsafeIndex (l2,h2) h2 + 1) * unsafeIndex (l1,h1) i inRange (Pos l1 l2, Pos h1 h2) (Pos i j) = inRange (l1,h1) i && inRange (l2,h2) j This reduces the run-time to 1.25s 4. Finally, there's a rather obscure problem. You'd expect that since the array is explicitly allocated with fixed bounds, the program might be able to see those bounds when doing its index calculations. It turns out that a bad order of inlining stops this happening. Just to see the effect, I "fixed" the problem in two ways, both of which give the same end result: a) replace the (bounds arr) in the argument to unsafeIndex by (Pos 0 0, Pos 99 99). b) add an {-# NOINLINE main #-} pragme I won't explain (b) -- try -ddump-simpl and see! Anyway, this "fix" means that computations like 99+1 can be constant folded, and reduces the time to 0.85s | -----Original Message----- | From: glasgow-haskell-bugs-bounces@xxxxxxxxxxx [mailto:glasgow-haskell-bugs- | bounces@xxxxxxxxxxx] On Behalf Of Duncan Coutts | Sent: 14 March 2005 13:41 | To: GHC-bugs list | Subject: loop performance bug | | Hi, | | This sort of code runs very slowly when compared to the equivalent in C: | | > {-# OPTIONS -fglasgow-exts #-} | > module Main where | > | > import Data.Array.MArray | > import Data.Array.IO | > | > data Pos = Pos !Int !Int | > deriving (Eq, Ord, Ix) | > | > main = test1 | > | > test1 :: IO () | > test1 = do | > (arr :: IOUArray Pos Bool) <- newArray_ (Pos 0 0, Pos 99 99) | > sequence_ [ sequence_ [ writeArray arr (Pos y x) False | > | y <- [0..99] | > , x <- [0..99] ] | > | _ <- [0..9999] ] | > | | Timing: | | $ ghc --make -O2 SpeedTest.hs -o speedtest-hs | $ time ./speedtest-hs | | real 0m10.968s | user 0m10.952s | sys 0m0.005s | | Comparing to an 'equivalent' C program: | | > int main () { | > char arr[100][100]; | > int n,x,y; | > | > for (n = 0; n != 10000; n++) | > for (y = 0; y != 100; y++) | > for (x = 0; x != 100; x++) | > arr[y][x] = 0; | > } | | $ gcc -O2 speedtest.c -o speedtest-c | $ time ./speedtest-c | | real 0m0.129s | user 0m0.123s | sys 0m0.000s | | Now parhaps this is unfair sice the Haskell program is written in terms | of lists. So lets write it using explicit looping and no lists. | | | > {-# INLINE doFromTo #-} | > -- do the action for [from..to] ,ie it's inclusive. | > doFromTo :: Int -> Int -> (Int -> IO ()) -> IO () | > doFromTo from to action = | > let loop n | n > to = return () | > | otherwise = do action n | > loop (n+1) | > in loop from | > | > test2 :: IO () | > test2 = do | > (arr :: IOUArray Pos Bool) <- newArray_ (Pos 0 0, Pos 99 99) | > doFromTo 0 9999 $ \_ -> | > doFromTo 0 99 $ \y -> | > doFromTo 0 99 $ \x -> | > writeArray arr (Pos y x) False | | Timing: | | $ ghc --make -O2 SpeedTest.hs -o speedtest-hs | $ time ./speedtest-hs | | real 0m7.942s | user 0m7.936s | sys 0m0.001s | | So not much better really. :-( | | If there's way to write loops that is faster I'd dearly like to know. | | I initially assumed that the second version would run fast since there | is no obvious need for any memory allocations (which is the usual | culprit). | | Duncan | | _______________________________________________ | Glasgow-haskell-bugs mailing list | Glasgow-haskell-bugs@xxxxxxxxxxx | http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs
|
|
| <Prev in Thread] | Current Thread | [Next in Thread> |
|---|---|---|
| Previous by Date: | Re: GPF WinXP GHCi 6.4, Ross Paterson |
|---|---|
| Next by Date: | GADT internal error on incorrect program, Andres Loeh |
| Previous by Thread: | Re: loop performance bug, Donald Bruce Stewart |
| Next by Thread: | RE: loop performance bug, Manuel M T Chakravarty |
| Indexes: | [Date] [Thread] [Top] [All Lists] |
| News | FAQ | advertise |