[Python-Dev] mmap & munmap loop (Was: Compact ordered set


Maybe pickle is inefficient in its memory management and causes a lot
of memory fragmentation?

It's hard to write an efficient memory allocator :-( My notes on memory:

* "Excessive peak memory consumption by the Python parser"
   https://bugs.python.org/issue26415
* https://pythondev.readthedocs.io/memory.html
* https://vstinner.readthedocs.io/heap_fragmentation.html

Sometimes I would like to be able to use a separate memory allocator
for one function, to not pollute the global allocator and so avoid
"punching holes" in global memory pools or in the heap memory. The
problem is tracking the lifetime of objects. If the allocated objects
live longer than the function, PyObject_Free() must be able to find
the allocator used by the memory block. pymalloc is already able to
check if it manages a memory block using its address. If the block was
not allocated by pymalloc, PyObject_Free() falls back to libc free().
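
The address check described above can be pictured with a toy model. This
is only a sketch: real pymalloc works on raw C pointers inside
obmalloc.c, and the names ToyPymalloc, add_arena, owns and
toy_object_free are made up for illustration. The 256 KiB arena size is
assumed to match CPython 3.7.

```python
# Toy model only: addresses are plain ints, nothing is really allocated.
ARENA_SIZE = 256 * 1024  # assumed arena size (CPython 3.7)


class ToyPymalloc:
    """Hypothetical stand-in for pymalloc's arena bookkeeping."""

    def __init__(self):
        self.arenas = []  # base addresses of the arenas this allocator owns

    def add_arena(self, base):
        self.arenas.append(base)

    def owns(self, address):
        # A block belongs to this allocator if it lies inside one of its arenas.
        return any(base <= address < base + ARENA_SIZE for base in self.arenas)


def toy_object_free(allocator, address):
    """Return the name of the allocator that would release the block."""
    if allocator.owns(address):
        return "pymalloc"
    return "libc free"  # fallback for blocks pymalloc does not manage


pymalloc = ToyPymalloc()
pymalloc.add_arena(0x100000)
print(toy_object_free(pymalloc, 0x100000 + 64))          # inside the arena
print(toy_object_free(pymalloc, 0x100000 + ARENA_SIZE))  # just past the arena
```

A per-function allocator would need the same kind of lookup, so that a
block outliving the function can still be routed to the right free
routine.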

The Python parser already uses PyArena which is a custom memory
allocator. It uses PyMem_Malloc() to allocate memory. In Python 3.7,
PyMem_Malloc() uses pymalloc:
https://docs.python.org/dev/c-api/memory.html#default-memory-allocators

Victor

On Wed, Feb 27, 2019 at 12:36, INADA Naoki <songofacandy at gmail.com> wrote:
>
> It happened quite by accident.  Since a venv is used,
> many paths in the interpreter are changed, so how memory
> is used changes too.
>
> Let's reproduce the accident.
>
> $ cat m2.py
> import pickle, sys
>
> LIST = pickle.dumps([[0]*10 for _ in range(10)], pickle.HIGHEST_PROTOCOL)
>
> N = 1000
> z = [[0]*10 for _ in range(N)]
>
> if '-c' in sys.argv:
>     sys._debugmallocstats()
>     sys.exit()
>
> for _ in range(100000):
>     pickle.loads(LIST)
>
> $ /usr/bin/time python3 m2.py
> 0.42user 0.00system 0:00.43elapsed 99%CPU (0avgtext+0avgdata 9100maxresident)k
> 0inputs+0outputs (0major+1139minor)pagefaults 0swaps
>
> There are only 1139 faults.  It is less than 100000.
>
> $ /usr/bin/time python3 m2.py -c
> ...
> 14 unused pools * 4096 bytes       =               57,344
> ...
>
> Adjust N in m2.py until it shows "0 unused pools".
> In my case, N=1390.
>
> $ /usr/bin/time python3 m2.py
> 0.51user 0.33system 0:00.85elapsed 99%CPU (0avgtext+0avgdata 9140maxresident)k
> 0inputs+0outputs (0major+201149minor)pagefaults 0swaps
>
> 200000 faults!
> It seems there are two page faults per loop iteration (2 pools are used and returned).
>
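
The fault count reported by /usr/bin/time can also be read from inside
the process, which makes the per-loop figure easier to see. This is a
minimal sketch, assuming a Unix system where the resource module is
available; the helper names minor_faults and faults_during are made up
here, and the loop count is reduced to 1000 to keep the run short.

```python
import pickle
import resource  # Unix only


def minor_faults():
    # Minor page faults taken by this process so far.
    return resource.getrusage(resource.RUSAGE_SELF).ru_minflt


def faults_during(loops, payload):
    # Count the minor faults that occur while unpickling `payload` repeatedly.
    before = minor_faults()
    for _ in range(loops):
        pickle.loads(payload)
    return minor_faults() - before


LIST = pickle.dumps([[0] * 10 for _ in range(10)], pickle.HIGHEST_PROTOCOL)
LOOPS = 1000
faults = faults_during(LOOPS, LIST)
print("minor page faults:", faults, "per loop:", faults / LOOPS)
```

The per-loop number depends on how many free pools the interpreter holds
when the loop starts, so it will only approach 2 in the "0 unused pools"
situation described above (201149 faults / 100000 loops).
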
>
> On Wed, Feb 27, 2019 at 7:51 PM Victor Stinner <vstinner at redhat.com> wrote:
>>
>> Sorry, I didn't get a coffee yet: more *often* in a venv.
>>
>> On Wed, Feb 27, 2019 at 11:32, Victor Stinner <vstinner at redhat.com> wrote:
>> >
>> > Any idea why Python calls mmap+munmap more even in a venv?
>> >
>> > Victor
>> >
>> > On Wed, Feb 27, 2019 at 10:00, INADA Naoki <songofacandy at gmail.com> wrote:
>> > >
>> > > >
>> > > > > Ah, another interesting point: this huge slowdown happens only when bm_pickle.py
>> > > > > is executed through pyperformance.  When it is run directly, the slowdown is
>> > > > > not so large.
>> > > >
>> > > > pyperformance runs benchmarks in a virtual environment. I don't know
>> > > > if it has any impact on bm_pickle.
>> > > >
>> > > > Most pyperformance benchmarks can be run outside a virtual env if the
>> > > > required modules are installed on the system. (bm_pickle only requires
>> > > > the stdlib and perf.)
>> > > >
>> > >
>> > > Bingo!
>> > >
>> > > Without venv:
>> > >
>> > > unpickle: Mean +- std dev: 26.9 us +- 0.0 us
>> > > % time     seconds  usecs/call     calls    errors syscall
>> > > ------ ----------- ----------- --------- --------- ----------------
>> > >  28.78    0.000438           0      1440           read
>> > >  27.33    0.000416           1       440        25 stat
>> > >   9.72    0.000148           1       144           mmap
>> > > ...
>> > >   0.79    0.000012           1        11           munmap
>> > >
>> > > With venv:
>> > >
>> > > % time     seconds  usecs/call     calls    errors syscall
>> > > ------ ----------- ----------- --------- --------- ----------------
>> > >  57.12    0.099023           2     61471           munmap
>> > >  41.87    0.072580           1     61618           mmap
>> > >   0.23    0.000395           1       465        27 stat
>> > >
>> > > unpickle and unpickle_list create a massive number of same-sized objects, then all
>> > > of the objects are removed.  If all pools in an arena are freed, munmap is called.
>> > >
>> > > I think we should save some arenas to reuse.  On recent Linux,
>> > > we may be able to use MADV_FREE instead of munmap.
>> > >
>> >
>> >
>> > --
>> > Night gathers, and now my watch begins. It shall not end until my death.
>>
>>
>>
>
>
>
> --
> INADA Naoki  <songofacandy at gmail.com>


