Dynamic Storage Management

Lecture Notes for CS 140
Winter 2012
John Ousterhout

  • Readings for this topic from Operating System Concepts: none.
  • Static memory allocation is simple and convenient, but it's not sufficient for everything.
  • Two basic operations in dynamic storage management:
    • Allocate a given number of bytes
    • Free a previously allocated block
  • Two general approaches to dynamic storage allocation:
    • Stack allocation (hierarchical): restricted, but simple and efficient.
    • Heap allocation: more general, but less efficient, more difficult to implement.

Stack Allocation

  • A stack can be used when memory allocation and freeing are partially predictable: memory is freed in opposite order from allocation. If alloc(A) then alloc(B) then alloc(C), then it will be free(C) then free(B) then free(A).
  • Example: procedure call. X calls Y calls Y again.
  • Stacks are also useful for lots of other things: tree traversal, expression evaluation, top-down recursive descent parsers, etc.
  • A stack-based organization keeps all the free space together in one place.

Heap Allocation

  • Heap allocation must be used when allocation and release are unpredictable
  • Memory consists of allocated areas and free areas (or holes). Inevitably end up with lots of holes.
  • Goal: reuse the space in holes to keep the number of holes small, keep their size large.
  • Fragmentation: inefficient use of memory due to holes that are too small to be useful. Stack allocation is perfect: only one hole.
  • Typically, heap allocation schemes use a free list to keep track of the storage that is not in use.
  • Best fit: keep linked list of free blocks, search the whole list on each allocation, choose block that comes closest to matching the needs of the allocation, save the excess for later. During release operations, merge adjacent free blocks.
  • First fit: just scan list for the first hole that is large enough. Free excess. Also merge on releases. Most first fit implementations are rotating first fit.
  • Best fit is not necessarily better than first fit!
  • First fit tends to leave "average" size holes, while best fit tends to leave some very large ones, some very small ones. The very small ones can't be used very easily.
  • Problem: over time, holes tend to fragment, approaching the size of the smallest objects allocated
  • Bit map: used for managing storage that comes in fixed-size chunks (e.g. disk blocks, or 32-byte chunks).
    • Keep a large array of bits, one for each chunk.
    • If bit is 0 it means chunk is in use, if bit is 1 it means chunk is free.
  • Pools: keep a separate linked list for each popular size.
    • Allocation is fast, no fragmentation.

Storage Reclamation

  • How do we know when dynamically-allocated memory can be freed?
    • Easy when a chunk is only used in one place.
    • Reclamation is hard when information is shared: it can't be recycled until all of the users are finished.
    • Usage is indicated by the presence of pointers to the data. Without a pointer, can't access (can't find it).
  • Two problems in reclamation:
    • Dangling pointers: better not recycle storage while it's still being used.
    • Memory leaks: storage gets "lost" because no one freed it even though it can't ever be used again.
  • Reference counts: keep count of the number of outstanding pointers to each chunk of memory. When this becomes zero, free the memory. Example: Smalltalk, file descriptors in Unix. Works fine for hierarchical structures.
  • Garbage collection: storage isn't freed explicitly (using free operation), but rather implicitly: just delete pointers.
    • When the system needs storage, it searches through all of the pointers (must be able to find them all!) and collects things that aren't used.
    • If structures are circular then this is the only way to reclaim space.
    • Garbage collectors typically compact memory, moving objects to coalesce all free space.
  • One way to implement garbage collection: mark and sweep:
    • Must be able to find all objects.
    • Must be able to find all pointers to objects.
    • Pass 1: mark. Go through all statically-allocated and procedure-local variables, looking for pointers. Mark each object pointed to, and recursively mark all objects it points to. The compiler has to cooperate by saving information about where the pointers are within structures.
    • Pass 2: sweep. Go through all objects, free up those that aren't marked.
  • Garbage collection is often expensive: 10-20% or all CPU time in systems that use it.