An overview of how Python allocates and releases memory
Small objects (up to 512 bytes) are allocated through three levels of structure: arenas, pools, and blocks.
Larger objects are handed to the standard C allocator.
Block
Each block is a chunk of memory whose size is a multiple of 8 bytes, ranging from 8 to 512 bytes. A request is rounded up to the nearest block size.
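The rounding rule can be sketched in a few lines. This is only an illustration of the arithmetic, assuming the 8-byte step and 512-byte limit given above; the names ALIGNMENT, SMALL_REQUEST_THRESHOLD, block_size and size_class are chosen for this sketch and are not a Python-level API.

```python
ALIGNMENT = 8                    # step between block sizes, as stated in the text
SMALL_REQUEST_THRESHOLD = 512    # largest request served from blocks


def block_size(nbytes):
    """Return the block size used for a request, or None if it is not a small request."""
    if nbytes <= 0 or nbytes > SMALL_REQUEST_THRESHOLD:
        return None  # in this sketch, left to the standard C allocator
    # Round up to the next multiple of ALIGNMENT.
    return (nbytes + ALIGNMENT - 1) // ALIGNMENT * ALIGNMENT


def size_class(nbytes):
    """Return the index of the size class (0 for 8 bytes, 1 for 16 bytes, ...)."""
    size = block_size(nbytes)
    return None if size is None else size // ALIGNMENT - 1


for request in (1, 8, 9, 100, 512, 513):
    print(request, block_size(request), size_class(request))
```

With these assumptions there are 64 size classes, and a 9-byte request occupies a 16-byte block.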
Pool
A pool is a collection of blocks of a single size. Pools that serve the same block size are linked in a doubly linked list. The size of a pool is the size of a memory page, e.g. 4 KB. Limiting each pool to one fixed block size helps with fragmentation: if an object gets destroyed, the memory manager can fill its space with a new object of the same size. A pool also keeps track of its used blocks and of the index of the arena in which it was created.
Each block in a pool is in one of 3 states:
untouched: a portion of memory that has not been allocated
free: a portion of memory that was allocated but later made “free” by CPython and that no longer contains relevant data
allocated: a portion of memory that actually contains relevant data
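The three states and the reuse of freed space can be modelled with a toy pool. This is a simplified sketch, not CPython's actual data structure: ToyPool and its methods are hypothetical names, blocks are identified by index rather than by address, and the space taken by the pool header is ignored.

```python
POOL_SIZE = 4096  # one memory page, as in the text (header overhead ignored)


class ToyPool:
    """Toy model of a pool that serves blocks of a single size."""

    def __init__(self, block_size):
        self.block_size = block_size
        self.capacity = POOL_SIZE // block_size
        self.next_untouched = 0   # index of the first block never handed out
        self.free_list = []       # indices of blocks released after use
        self.allocated = set()    # indices of blocks currently in use

    def alloc(self):
        """Hand out a block index, preferring previously freed blocks."""
        if self.free_list:
            idx = self.free_list.pop()
        elif self.next_untouched < self.capacity:
            idx = self.next_untouched
            self.next_untouched += 1
        else:
            return None  # the pool is full
        self.allocated.add(idx)
        return idx

    def free(self, idx):
        """Release a block so that a later request of the same size can reuse it."""
        self.allocated.remove(idx)
        self.free_list.append(idx)

    def state(self, idx):
        if idx in self.allocated:
            return "allocated"
        if idx in self.free_list:
            return "free"
        return "untouched"


pool = ToyPool(block_size=32)
first = pool.alloc()
second = pool.alloc()
pool.free(first)
print(pool.state(first), pool.state(second), pool.state(2))
reused = pool.alloc()  # takes the freed block before touching new memory
print(reused == first)
```

Because every block in the pool has the same size, the freed slot fits the next request exactly, which is the fragmentation benefit described above.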
Arena
An arena is a 256 KB chunk of memory allocated on the heap, which provides memory for 64 pools of 4 KB each.
All arenas are linked in a doubly linked list.
The freepools field points to the linked list of available pools.
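The sizes quoted in this overview fit together as follows. This is only a back-of-the-envelope calculation using the figures from the text; max_blocks_per_pool is a hypothetical helper, and its result is an upper bound because the pool header is ignored.

```python
ARENA_SIZE = 256 * 1024   # 256 KB per arena, as stated in the text
POOL_SIZE = 4 * 1024      # 4 KB per pool (one memory page)

pools_per_arena = ARENA_SIZE // POOL_SIZE
print(pools_per_arena)


def max_blocks_per_pool(block_size):
    """Upper bound on the number of blocks of one size that fit in a pool."""
    return POOL_SIZE // block_size


for size in (8, 64, 512):
    print(size, max_blocks_per_pool(size))
```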
GC
Object
Every object carries a type, a value, and a reference count.
Reference counting
This algorithm is straightforward: each object keeps a count of the references that point to it. Once the count reaches 0, the object is freed immediately.
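E.g.
    a = []      # the list has one reference
    b = a       # two references
    b = None    # back to one reference, so the list stays alive
    a = None    # the count reaches 0 and the list is freed
Rebinding ‘b’ alone collects nothing, because ‘a’ still refers to the list. (A list is used here instead of 1 because CPython caches small integers.)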
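The count can be observed with sys.getrefcount. The sketch below compares counts relative to a baseline instead of printing absolute numbers, since getrefcount itself holds a temporary reference while it runs; Payload is a placeholder class introduced only for this example.

```python
import sys


class Payload:
    """Placeholder object whose reference count we watch."""


obj = Payload()
base = sys.getrefcount(obj)           # baseline, includes getrefcount's own argument

alias = obj                           # a second name for the same object
after_alias = sys.getrefcount(obj)

alias = None                          # drop the extra reference again
after_release = sys.getrefcount(obj)

print(after_alias - base)             # change caused by the alias
print(after_release - base)           # change after releasing it
```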
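The reference cycle problem
    a = []
    a.append(a)
Reference counting alone will never collect this list: even after ‘a’ is deleted, the list still holds a reference to itself, so its count never reaches 0.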
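The cyclic collector in the gc module exists to reclaim exactly this kind of garbage. The following sketch builds a self-referencing object, confirms through a weak reference that it survives the loss of its last name, and then runs gc.collect() by hand. Node and make_cycle are names made up for this example, and automatic collection is switched off only so that the timing is predictable.

```python
import gc
import weakref


class Node:
    """Simple object that can hold a reference to itself."""


def make_cycle():
    n = Node()
    n.me = n                  # the object now references itself
    return weakref.ref(n)     # a probe that does not keep the object alive


gc.disable()                  # keep the automatic collector out of the way
try:
    probe = make_cycle()      # the only name for the object is gone here
    alive_before = probe() is not None
    unreachable = gc.collect()            # run a full cyclic collection
    alive_after = probe() is not None
finally:
    gc.enable()

print(alive_before, unreachable, alive_after)
```

The object is still alive before the collection even though no name refers to it, and it is gone afterwards.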
Generation-based GC
Python's GC is generation based, with three generations: 0, 1, and 2. Lower generations hold “younger” objects, which are collected more frequently. An object that survives a collection is moved to the next higher generation.
Each generation has its own counter and threshold that trigger a collection. If more than one threshold is exceeded at the same time, the oldest such generation is collected, and collecting a generation also collects every generation younger than it. The thresholds can be read with gc.get_threshold().
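The counters and thresholds can be inspected at runtime. This short sketch only reads the values through gc.get_threshold() and gc.get_count(); the actual numbers depend on the Python version and on what the program has allocated so far, so none are shown here.

```python
import gc

thresholds = gc.get_threshold()   # one threshold per generation
counts = gc.get_count()           # current counter of each generation

for generation, (count, threshold) in enumerate(zip(counts, thresholds)):
    print(f"generation {generation}: count={count}, threshold={threshold}")
```

gc.set_threshold() accepts the same three values if the collection frequency needs tuning.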
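Use weakref to offload GC
When dealing with linked lists or graph structures, weakref.ref or weakref.proxy can be used to reference neighbouring nodes.
A weak reference does not increase the reference count of its target, so it neither keeps the target alive nor creates a cycle of strong references. The lifetime of the target then depends only on its strong references, which can be managed explicitly.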
Sample code:
    import sys
    import weakref


    class Node:
        pass


    def alive_obj(o):
        # Calling a weakref.ref returns the target, or None once it is gone.
        if o() is not None:
            print('[ref]obj still alive')
        else:
            print('[ref]obj died')
        return o


    def alive_proxy(p):
        try:
            bool(p)  # any use of a dead proxy raises ReferenceError
            print('[proxy]obj still alive')
        except ReferenceError:
            print('[proxy] died')
        return p


    if __name__ == "__main__":
        a = Node()
        print(sys.getrefcount(a))  # 2
        b1 = weakref.ref(a)
        b2 = weakref.proxy(a)
        print(sys.getrefcount(a))  # still 2
        b1 = alive_obj(b1)
        b2 = alive_proxy(b2)
        # output:
        # [ref]obj still alive
        # [proxy]obj still alive
        del a
        b1 = alive_obj(b1)
        b2 = alive_proxy(b2)
        # output:
        # [ref]obj died
        # [proxy] died
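To connect weak references back to linked structures, here is a minimal sketch of a doubly linked list in which the forward pointer is strong and the backward pointer is weak, so that no cycle of strong references is formed. ListNode, link and the prev property are hypothetical names for this example; the immediate release after del relies on CPython's reference counting.

```python
import weakref


class ListNode:
    def __init__(self, value):
        self.value = value
        self.next = None     # strong reference to the following node
        self._prev = None    # weak reference to the preceding node

    @property
    def prev(self):
        """Return the preceding node, or None if there is none or it has been freed."""
        return self._prev() if self._prev is not None else None


def link(first, second):
    """Connect two nodes without creating a cycle of strong references."""
    first.next = second
    second._prev = weakref.ref(first)


head = ListNode("head")
tail = ListNode("tail")
link(head, tail)

linked_while_alive = tail.prev is head   # the weak back-pointer resolves to head

del head                                 # last strong reference to the first node
prev_after_del = tail.prev               # the weak back-pointer is now dead

print(linked_while_alive, prev_after_del)
```

With a strong back-pointer instead, the two nodes would keep each other alive until the cyclic collector ran.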