The Haskell-Beginners Mailing List - Discussion of primarily beginner-level topics related to Haskell

beginners.haskell.org
Memory tuning

Thorsten Hater | Mon, 11 Oct 2010 14:14:18 +0000 (UTC)
Hello everyone,

I'm trying to reduce the memory footprint (and possibly the execution time,
too) of a small program I wrote. It's for a Project Euler problem; it
produces the correct result, but its performance is unsatisfying, as
indicated by a profiling run:

    2,633,675,888 bytes allocated in the heap
    1,221,377,976 bytes copied during GC
       27,240,696 bytes maximum residency (22 sample(s))
        1,576,200 bytes maximum slop
               80 MB total memory in use (1 MB lost due to fragmentation)
   Generation 0:  5002 collections,     0 parallel,  1.38s,  1.36s elapsed
   Generation 1:    22 collections,     0 parallel,  0.62s,  0.64s elapsed
   INIT  time    0.00s  (  0.00s elapsed)
   MUT   time    1.76s  (  1.80s elapsed)
   GC    time    1.99s  (  2.00s elapsed)
   EXIT  time    0.00s  (  0.00s elapsed)
   Total time    3.76s  (  3.81s elapsed)
   %GC time      53.1%  (52.6% elapsed)
   Alloc rate    1,494,074,809 bytes per MUT second
   Productivity  46.9% of total user, 46.3% of total elapsed

Not only is the actual memory consumption quite high, but the
GC/computation ratio is almost 1:1.
I assume that there is something to be done regarding strictness, but a
series of experiments with BangPatterns yielded no improvement at all
(and using foldr instead of foldl' produces a stack overflow).
Here is my program:

-- Project Euler, problem 75 (p75.hs)
import qualified Data.Map as Map
import Data.List (foldl')        -- strict left fold

-- perimeter of the primitive triple generated by (a, b) with a > b:
-- (a^2 - b^2, 2*a*b, a^2 + b^2) has perimeter 2*a*(a+b)
circum :: Int -> Int -> Int
circum a b = 2*a*(a+b)

-- bump the counter of every multiple of c that stays within the limit
count :: Int -> Map.Map Int Int -> Int -> Map.Map Int Int
count lmt mp c = foldl' step mp $ takeWhile (<= lmt) [k*c | k <- [1..]]
  where step m x = Map.insertWith' (+) x 1 m   -- value-strict insert

-- how many perimeters <= l belong to exactly one right triangle?
solve :: Int -> Int
solve l = Map.size $ Map.filter (== 1) primitive
  where lmt = floor $ sqrt $ fromIntegral l / 2
        cand = [c | m <- [2..lmt],
                    n <- [1..m],
                    odd (m+n),        -- opposite parity ...
                    1 == gcd m n,     -- ... and coprime: primitive triple
                    let c = circum m n,
                    l >= c]
        primitive = foldl' (count l) Map.empty cand

main :: IO ()
main = print $ solve 1500000

Any advice and/or tips welcome, as I'm quite new to Haskell.

Best regards.
Daniel Fischer | Mon, 11 Oct 2010 15:54:53 +0000 (UTC)
On Monday 11 October 2010 16:14:11, Thorsten Hater wrote:
> Not only is the actual memory consumption quite high, but the
> GC/computation ratio is almost 1:1.

You construct a largish map (about 350,000 entries), doing a lot (more than
a million) of insertWith's.
That means frequent rebalancing (for new circumferences) and many updates 
of already present values (which in all likelihood are heap allocated 
Ints), both give the garbage collector a lot of work. Further, you 
construct a lot of lists, which need to be GCed too.
Also, the memory overhead of Maps is nontrivial, I think five or six words
per item, so a Map Int Int needs about size * 8 (or 7) * sizeof(Int) bytes.

You can reduce the GC time by specifying a larger allocation area (e.g.
+RTS -A16M -RTS, the default size is 512K) thus making GC less frequent.

> I assume that there is something to be done regarding strictness,

Nothing obvious.
But for the computation pattern you use, since the ratio of possible values
to occurring values is not too high, using accumArray (on an Unboxed array)
is far better.

> but a series of experiments with
> BangPatterns yielded no improvement at all (and using foldr instead of
> foldl' produces a stack overflow)

Yes, using foldr with insert(With)s into Maps is a Bad Idea™.
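
For concreteness, here is a minimal sketch of what an accumArray version of
Thorsten's program might look like (an editorial illustration; neither
message shows the rewritten code):

-- Editorial sketch: the same computation with accumArray on an unboxed
-- array instead of Data.Map.  One counter per possible circumference,
-- all built in a single pass.
import Data.Array.Unboxed (UArray, accumArray, elems)

circum :: Int -> Int -> Int
circum a b = 2*a*(a+b)

solve :: Int -> Int
solve l = length . filter (== 1) . elems $ counts
  where
    lmt  = floor $ sqrt $ fromIntegral l / 2
    cand = [c | m <- [2..lmt], n <- [1..m],
                odd (m+n), 1 == gcd m n,
                let c = circum m n, l >= c]
    -- accumArray replaces all the insertWith' calls: every multiple k*c
    -- of a primitive circumference contributes 1 to its counter
    counts :: UArray Int Int
    counts = accumArray (+) 0 (1, l)
                        [(k*c, 1) | c <- cand, k <- [1 .. l `div` c]]

main :: IO ()
main = print $ solve 1500000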
Thorsten Hater | Mon, 11 Oct 2010 21:29:37 +0000 (UTC)
On 11.10.2010 17:54, Daniel Fischer wrote:
> On Monday 11 October 2010 16:14:11, Thorsten Hater wrote:
>> Not only is the actual memory consumption quite high, but the
>> GC/computation ratio is almost 1:1.
> You construct a largish map (about 350,000 entries), doing a lot (more than
> a million) of insertWith's.
> That means frequent rebalancing (for new circumferences) and many updates 
> of already present values (which in all likelihood are heap allocated 
> Ints), both give the garbage collector a lot of work. Further, you 
> construct a lot of lists, which need to be GCed too.
> Also, the memory overhead of Maps is nontrivial, I think five or six words
> per item, so a Map Int Int needs about size * 8 (or 7) * sizeof(Int) bytes.
>
> You can reduce the GC time by specifying a larger allocation area (e.g.
> +RTS -A16M -RTS, the default size is 512K) thus making GC less frequent.
>
>> I assume that there is something to be done regarding strictness,
> Nothing obvious.
> But for the computation pattern you use, since the ratio of possible values 
> to occurring values is not too high, using accumArray (on an Unboxed array) 
> is far better.
>
>> but a series of experiments with
>> BangPatterns yielded no improvement at all (and using foldr instead of
>> foldl' produces a stack overflow)
> Yes, using foldr with insert(With)s into Maps is a Bad Idea™.

Thank you for pointing me to Unboxed Arrays and accumArray; it works
like a charm.
The new statistics:

      35,180,328 bytes allocated in the heap
          13,608 bytes copied during GC
      12,027,912 bytes maximum residency (2 sample(s))
         594,520 bytes maximum slop
              13 MB total memory in use (0 MB lost due to fragmentation)

  Generation 0:    43 collections,     0 parallel,  0.00s,  0.00s elapsed
  Generation 1:     2 collections,     0 parallel,  0.00s,  0.00s elapsed

  INIT  time    0.00s  (  0.00s elapsed)
  MUT   time    0.14s  (  0.17s elapsed)
  GC    time    0.00s  (  0.00s elapsed)
  EXIT  time    0.00s  (  0.00s elapsed)
  Total time    0.14s  (  0.17s elapsed)

  %GC time       0.0%  (0.6% elapsed)

  Alloc rate    251,288,057 bytes per MUT second

  Productivity 100.0% of total user, 82.9% of total elapsed

I don't think there is much left to do in terms of optimization.
So I'm left with the question of when to use Data.Map, if accumArray
is so much better even for heavy number crunching like this,
especially when Data.Map is advertised as efficient.
(Even without Unboxed, normal Arrays beat Maps by a factor of 2.)
Best regards.
Daniel Fischer | Mon, 11 Oct 2010 22:52:47 +0000 (UTC)
On Monday 11 October 2010 23:29:30, Thorsten Hater wrote:
>
> Thank you for pointing me to Unboxed Arrays and accumarray, it works
> like a charm.
> The new statistics:
>
>       35,180,328 bytes allocated in the heap
>           13,608 bytes copied during GC
>       12,027,912 bytes maximum residency (2 sample(s))
>          594,520 bytes maximum slop
>               13 MB total memory in use (0 MB lost due to fragmentation)
>
>   Generation 0:    43 collections,     0 parallel,  0.00s,  0.00s elapsed
>   Generation 1:     2 collections,     0 parallel,  0.00s,  0.00s elapsed
>
>   INIT  time    0.00s  (  0.00s elapsed)
>   MUT   time    0.14s  (  0.17s elapsed)
>   GC    time    0.00s  (  0.00s elapsed)
>   EXIT  time    0.00s  (  0.00s elapsed)
>   Total time    0.14s  (  0.17s elapsed)
>
>   %GC time       0.0%  (0.6% elapsed)
>
>   Alloc rate    251,288,057 bytes per MUT second
>
>   Productivity 100.0% of total user, 82.9% of total elapsed

Yes, that's more like it :)

> I don't think there is much left to do in terms of optimization.

Generate the primitive triangles by a more efficient method; all those
gcd's take a lot of time (one such method is sketched below).

> So I'm left with the question of when to use Data.Map, if accumArray
> is so much better even for heavy number crunching like this.

That's not really heavy number crunching, it's a picnic.
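
Daniel doesn't name a method, but one well-known gcd-free approach is the
Berggren/Barning ternary tree, in which every primitive Pythagorean triple
is reached exactly once starting from (3,4,5). A minimal sketch (an
editorial illustration, not code from the thread):

-- Berggren/Barning tree: each primitive triple (a,b,c) has exactly three
-- primitive children, and perimeters strictly increase down the tree, so
-- we can prune as soon as the perimeter exceeds the limit.  The output is
-- not sorted, but for counting purposes the order doesn't matter.
primPerimeters :: Int -> [Int]
primPerimeters lmt = go (3, 4, 5)
  where
    go (a, b, c)
      | p > lmt   = []
      | otherwise = p : concatMap go
          [ ( a - 2*b + 2*c,  2*a - b + 2*c,  2*a - 2*b + 3*c)
          , ( a + 2*b + 2*c,  2*a + b + 2*c,  2*a + 2*b + 3*c)
          , (-a + 2*b + 2*c, -2*a + b + 2*c, -2*a + 2*b + 3*c)
          ]
      where p = a + b + c

primPerimeters l would then replace the cand list (and circum entirely),
with no gcd tests at all.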

There are several considerations.

- sparse (irregular) data
If only one in 100 possible data points actually occurs, an array is a huge 
waste of space. You quickly lose cache-locality then, and Maps are often 
faster in those cases.

- unknown bounds
That rules out arrays.

- pattern of computation
For this problem, accumArray is perfectly suited, but that is not always
the case. Consider problems where you update your Map using previously
stored data; there accumArray isn't applicable. In such cases Maps give you
nice code and usually decent performance, although if your data is not
sparse, an unboxed mutable array (STUArray, sketched after this list) will
usually give much better performance, at the cost of uglier code. A boxed
mutable array (STArray) will in many cases also perform better than a Map,
but usually not as well as an unboxed one.
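
A sketch of the STUArray pattern mentioned above (an editorial
illustration, not code from the thread), for updates that depend on
previously stored entries, where accumArray doesn't fit:

-- Editorial sketch: a dynamic-programming table in a mutable unboxed
-- array.  Each entry is computed from entries written earlier, the kind
-- of update-using-stored-data that accumArray cannot express.
import Control.Monad (forM_)
import Data.Array.ST (newArray, readArray, runSTUArray, writeArray)
import Data.Array.Unboxed (UArray)

-- ways to write each n in [0..lmt] as a sum of the given coin values
-- (order of summands irrelevant); assumes lmt >= 0
countSums :: Int -> [Int] -> UArray Int Int
countSums lmt coins = runSTUArray $ do
    ways <- newArray (0, lmt) 0          -- all counts start at zero
    writeArray ways 0 1                  -- one way to make 0: no coins
    forM_ coins $ \c ->
        forM_ [c .. lmt] $ \n -> do
            a <- readArray ways n        -- previously stored value ...
            b <- readArray ways (n - c)  -- ... combined with another entry
            writeArray ways n (a + b)
    return ways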


Of course, arrays can't be used (or only with great contortions) for stuff
like Map String a, where the key can't be mapped easily to an array index.

> Especially when Data.Map is advertised as efficient.

It is efficient for what it does (of course, there's always room for 
improvement), but it's not the appropriate data structure for everything.
By default, Haskell's data structures are immutable, thus if you modify a 
Map, you get a new one, which typically requires copying of O(log size) 
nodes (and the bulk of the Map is shared, otherwise performance would be 
abysmal).
But still, that is a nontrivial operation, much more costly than modifying 
a mutable map in an impure language (where such a modification is often 
just relinking a few pointers).
So if you have an algorithm that requires frequent updates, the cost of 
these operations makes itself felt.

But what happens if you use an array instead of a Map?
If the keys map easily to array indices, that is very easy to do, but if 
you use immutable arrays, each update requires copying the entire array (in 
case of boxed arrays only the pointers to the values are copied, but still, 
allocating a new memory block and copying 10 million pointers likely takes 
much longer than copying 20-25 nodes of a Map). If you use a mutable array, 
such a modification is cheap, only one array slot needs to be changed. But 
that ties you to a mutability monad (ST s or IO, mostly), which has its 
downsides too.
And if data is sparse, you again waste a lot of space.
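
To make that cost concrete, a one-slot "update" of an immutable array looks
like this (editorial sketch; (//) from Data.Array builds a whole new array):

import Data.Array (Array, listArray, (!), (//))

-- each bump allocates and copies the entire array: O(size) per update
bump :: Array Int Int -> Int -> Array Int Int
bump arr i = arr // [(i, arr ! i + 1)]

example :: Array Int Int
example = bump (listArray (0, 9) (replicate 10 0)) 3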

And if the keys don't map easily to indices, well, you can implement a Map 
as an Array Int (Key,Value) or a pair (Array Int Key, Array Int Value), 
keeping the entries sorted by the keys.
In fact, if you build the map once and use it later for a lot of lookups, 
that is often better than a Map (regarding performance).
Both give you O(log size) lookup, but the array solution can have less 
space overhead.
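
A sketch of the lookup side of that scheme (an editorial illustration;
lookupSorted is a hypothetical helper name):

import Data.Array (Array, bounds, (!))

-- binary search over parallel key/value arrays sorted by key:
-- O(log size) lookup with very little space overhead
lookupSorted :: Ord k => Array Int k -> Array Int v -> k -> Maybe v
lookupSorted keys vals key = go lo0 hi0
  where
    (lo0, hi0) = bounds keys             -- keys must be sorted ascending
    go lo hi
      | lo > hi   = Nothing
      | otherwise = case compare key (keys ! mid) of
          LT -> go lo (mid - 1)
          EQ -> Just (vals ! mid)
          GT -> go (mid + 1) hi
      where mid = (lo + hi) `div` 2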
However, inserting a new element or deleting an element becomes an O(size)
operation, so if you have to do that a lot, this is a terrible solution and
you should use a Map.

> (Even without Unboxed, normal Arrays beat Maps by a factor of 2.)
> Best regards.