|
Re: Memory usage: msg#00083lang.haskell.general
In message <3A1D3124.933312D8@xxxxxxxxxxxxxxxxx>, Dusan Kolar <kolar@xxxxxxxxxxxxxxxxx> writes >Hello, > > working on a small utility for de/compression of special files >using special algorithm (useless in other cases and not important >for the problem) I've encountered I cannot decompress even small >files because I always run out of the memory. > > After some time I've found the problem. It can be simplified >to the following function (computing nothing reasonable): Dusan's function, (reformatted - this message is a hugs script, unless the lines get wrapped by your email) is: > tst :: [a] -> [a] -> [a] -> [a] > tst l [] = id > tst l (x:xs) = if null l then ((++) [x]) . (tst (l++[x]) xs) > else if null (tail l) then ((++) [x,last l]) . > (tst (tail l++[last l]++[x]) xs) > else ((++) [x,last l]) . > (tst (tail (tail l)++[last l]++[x]) xs) >Trying to run it in hugs (default settings) this way: > > tst [] [1..20000] [] > >the program stops writing almost all the lines and running out of >memory. >From the example you give, I assume that the 'purpose' of tst is to compute p efficiently, where > p :: [a] -> [a] > p xs = tst [] xs [] p [1,2] = [1,2,1] p [1,2,3] = [1,2,1,3,2] p [1,2,3,4] = [1,2,1,3,2,4,3] It looks as though this function can be computed in constant space, but p leaks space. I check this by starting hugs with a small heap: hugs +g -h30k Then: Main> last $ p [1..10000] {{Gc:17584}}{{Gc:15496}}{{Gc:13652}}{{Gc:12036}}{{Gc:10608}}{{Gc:9352}} {{Gc:8246 }}{{Gc:7258}}{{Gc:6396}}{{Gc:5644}}{{Gc:4964}}{{Gc:4379}} {{Gc:3865}}{{Gc:3404}}{{Gc:3002}}{{Gc:2642}}{{Gc:2340}}{{Gc:2052}} {{Gc:1816}}{{Gc:1592}}{{Gc:1404}}{{Gc:1248}}{{Gc:1092}}{{Gc:960}} (89416 reductions, 159870 cells, 23 garbage collections) {{Gc:960}}ERROR: Garbage collection fails to reclaim sufficient space The tst function is 'equivalent' to tst1, which is derived by splitting out the cases and simplifying. I prefer this style! > tst1:: [a] -> [a] -> [a] -> [a] > tst1 _ [] b = b > tst1 [] (x:xs) b = [x] ++ tst1 [x] xs b > tst1 [y] (x:xs) b = [x,y] ++ tst1 [y,x] xs b > tst1 [y,z] (x:xs) b = [x,z] ++ tst1 [z,x] xs b > tst1 (y:z:m) (x:xs) b = [x,n] ++ tst1 (m ++ [n,x]) xs b > where n = last m > p1 :: [a] -> [a] > p1 xs = tst1 [] xs [] Also, p1 *doesn't leak space*. Why? I really haven't much idea: it may be because there is only one 'last m', but a better guess could be that tst is a bit too complicated! Main> last $ p1 [1..10000] {{Gc:19934}}{{Gc:19932}}{{Gc:19932}}{{Gc:19932}}{{Gc:19932}}{{Gc:19932}} {{Gc:19932}}{{Gc:19932}}{{Gc:19932}}{{Gc:19932}}{{Gc:19932}}{{Gc:19932}} {{Gc:19932}}{{Gc:19932}}{{Gc:19932}}{{Gc:19932}}{{Gc:19932}}{{Gc:19932}} {{Gc:19932}}{{Gc:19932}}{{Gc:19932}}{{Gc:19932}}{{Gc:19932}}9999 (240028 reductions, 440040 cells, 23 garbage collections) {{Gc:19967}}Main> The following 'specification' of p uses lots of space: > p0 :: [a] -> [a] > p0 [] = [] > p0 [x] = [x] > p0 xs = p0 (init xs) ++ [last xs, last $ init xs] Another 'implementation' using constant space is: > p2 :: [a] -> [a] > p2 [] = [] > p2 [x] = [x] > p2 xs@(x:ys) = (x:merge ys xs) > where > merge [] _ = [] > merge (a:as) (b:bs) = (a:b:merge as bs) There have been lots of messages on this list concerning space leaks in Haskell. It seems that only compiler writers can understand the causes of subtle space leaks. This isn't really very satisfactory. Hope this helps. -- William Marsh |
|
| <Prev in Thread] | Current Thread | [Next in Thread> |
|---|---|---|
| Previous by Date: | "Semantic Knowledge Acquisition and Categorisation" - Workshop at ESSLLI 2001: 00083, Alessandro Lenci |
|---|---|
| Next by Date: | MBR'01 Deadline extension: 00083, Lorenzo Magnani |
| Previous by Thread: | Memory usagei: 00083, Dusan Kolar |
| Next by Thread: | Union Types for Haskell!?: 00083, Bernd Holzmüller |
| Indexes: | [Date] [Thread] [Top] [All Lists] |
| News | FAQ | advertise |