The problem
Memory managers are easy to write incorrectly:
I set my GC header bit to zero.
Other runtime system code runs.
My GC header bit is now set to one.
The distinction matters: this is a memory safety error, not a static bug. Bugs are statically possible violations; errors are runtime violations of dynamic semantics. A bug need not produce an error, and an error need not trace to an obvious bug.
Why current solutions fall short
The standard toolkit — printf debugging, code editors, static control-flow analysis, mailing lists — addresses bugs, not errors. Memory safety errors in memory managers require:
- Knowing the current configuration of memory
- Knowing what code is running
- Knowing what memory that code is allowed to access
Documentation is a programming-language concern, not just a comment in a source file. Weeks of manual debugging typically follow from the false assumption that all errors are the result of statically-catchable bugs.
PL approach: every problem is a language
Frame the problem formally. Define:
- M(VM) — the set of valid memory configurations: Is address 0x0000…0xFFFF ∈ M(VM)?
- C(VM) — the set of valid (code, memory-type) pairs: Is (Code(ip), MemoryType(esi)) ∈ C(VM)?
Specify M(VM) and C(VM) explicitly, then check membership dynamically.
Prior work: dynamic binary instrumentation (Permchecker)
- Specify M(VM) and C(VM) imperatively, in terms of the runtime
- Half-portable — requires VM-specific alloc() and free() hooks
- Shared Permchecker API across VMs
Ongoing work: Floorplan (FLP) specification language
Memory manager memory has three structural properties that Permchecker exploits:
- Layered: Page → Space → Block → Cell
- Bifurcated: alloc-path vs. free-path
- Owned: metadata at each layer, free chunks at each layer
Floorplan (FLP) is a language for specifying M(VM) separately from the memory manager implementation — making the layout explicit so tools can reason about it.
FLP toolchain
- Parser implemented in Haskell
- Transpiler in progress: tiles FLP layers onto C structs or C++ classes
- Target implementations: HotSpot, JikesRVM, GHC
- Pin and Valgrind DBI backend
- Take-away: a language for specifying memory layout capable of describing a wide variety of VM implementations (Java, Haskell, JavaScript, …)
Future work: specifying C(VM)
Integrate permission specifications with the host language or a framework:
- Target C++ class methods (portability: lowest common denominator)
- Target Rust (safety: stronger static guarantees)
- Integrate with LLVM (automation)