A Type Theory for Memory Allocation and Data Layout
Leaf Petersen, Robert Harper, Karl Crary and Frank Pfenning
Carnegie Mellon University
Views of data
• High-level languages
  – Abstract view of data, characterized by operations
  – e.g. pairs:
    • Introduction: (e1,e2) : t1 x t2
    • Elimination: fst e : t1, snd e : t2
• Low-level languages
  – Concrete view of data, characterized by layout in memory
  – e.g. C structs:
    • Contiguous layout
    • Memory size determined by type
Data layout
• Usually programmers don’t care
  – But sometimes have to
  – Marshalling, interaction with low-level devices, precise control of initialization, interoperability
  – Generally no type safety
• Compilers have to care
  – Represent high-level data abstractions
  – Allocation and initialization code
(3,(4,5)) : int x (int x int)
[Figure: memory layout diagrams for (3,(4,5))]
Type theory for data layout
• Expose the fine structure
  – Expose memory layout in types
  – Implementation choices explicit
  – High-level object types defined in terms of low-level memory types
  – High-level operations on objects broken down into low-level operations on memory
• What is the fine structure of memory?
Initialization
• Data objects
  – Created by initializing raw memory.
• Initialization changes types
  – e.g. from ns to int
• Commonly dealt with via linearity
  – New memory is linear
  – No aliases
  – Linear type theory handles re-typing
Adjacency
• Memory provides a primitive notion of adjacent items: e.g. 3 next to 4.
• Large objects composed of adjacent smaller objects
• Sub-objects referenced by offsets or interior pointers.
[Figure: two adjacent memory cells holding 3 and 4]
Associativity
• Adjacency is associative: the same memory layout is described by:
  – (3 next to 4) next to 5
  – 3 next to (4 next to 5)
• But not commutative!
  – 3 next to 4 ≠ 4 next to 3
[Figure: three adjacent memory cells holding 3, 4 and 5]
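The two properties above can be checked on a toy model. This is only a sketch: the `Fuse` node and the `layout` function are hypothetical names introduced here, and a Python list of cells stands in for a run of adjacent memory.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Fuse:
    """Hypothetical node meaning 'left next to right'."""
    left: "Item"
    right: "Item"


Item = Union[int, Fuse]


def layout(item: Item) -> List[int]:
    """Flatten an adjacency tree into the sequence of cells it occupies."""
    if isinstance(item, Fuse):
        return layout(item.left) + layout(item.right)
    return [item]


left_nested = Fuse(Fuse(3, 4), 5)   # (3 next to 4) next to 5
right_nested = Fuse(3, Fuse(4, 5))  # 3 next to (4 next to 5)

# Associativity: both nestings occupy the same cells in the same order.
same_layout = layout(left_nested) == layout(right_nested)

# Non-commutativity: swapping the operands changes the layout.
swapped_differs = layout(Fuse(3, 4)) != layout(Fuse(4, 3))
```

Both `same_layout` and `swapped_differs` evaluate to `True` in this model.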
Indirection
• Not all objects are adjacent
• Memory supports a notion of indirection (pointers or labels).
  – Refer to non-adjacent data via indirection
  – 3 next to (pointer to (4 next to 5))
[Figure: a cell holding 3 next to a pointer to adjacent cells holding 4 and 5]
Ordered Type Theory
• Linear type theory handles initialization
  – Doesn’t capture other memory properties
• Ordered type theory
  – Variables used exactly once (linear)
  – Variables may not be permuted.
  – Adjacent variables remain adjacent
  – No weakening, contraction, or exchange.
• Claim: Ordered constructs admit a natural interpretation as adjacency and indirection.
Variables and Resources
• Typing judgments:
• Ordering of x’s does not matter.
  – Unrestricted variables, bound to small objects
• Ordering and usage of a’s does matter.
  – Bound to memory
  – Adjacent variables bound to adjacent memory
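A sketch of the judgment form, assuming the usual two-zone presentation of ordered type theory. The metavariables Γ, Ω, M, τ and σ are assumptions introduced here for illustration; only the variable names x and a come from the bullets above.

```latex
\[
\underbrace{x_1{:}\tau_1, \ldots, x_n{:}\tau_n}_{\Gamma\ \text{(unrestricted)}}
\;;\;
\underbrace{a_1{:}\sigma_1, \ldots, a_k{:}\sigma_k}_{\Omega\ \text{(ordered)}}
\;\vdash\; M : \tau
\]
```

Under this reading, Γ may be reordered and its variables reused freely, while Ω must be consumed exactly once and in order.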
Ordered product
• Ordered product (fuse):
• Ordered products model adjacency
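For reference, the textbook introduction rule for fuse in ordered logic is sketched below. It is an assumption that the rule in this system has this shape, and the actual system may refine it; the metavariables Γ, Ω, M and τ are illustrative names.

```latex
\[
\frac{\Gamma;\,\Omega_1 \vdash M_1 : \tau_1
      \qquad
      \Gamma;\,\Omega_2 \vdash M_2 : \tau_2}
     {\Gamma;\,\Omega_1, \Omega_2 \vdash M_1 \bullet M_2 : \tau_1 \bullet \tau_2}
\]
```

The ordered context is split without permutation, so the memory bound in Ω₁ stays to the left of the memory bound in Ω₂.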
Adjacency
• 3 next to 4
  – 3 • 4 : int • int
• 3 next to 4 next to 5
  – 3 • (4 • 5) : int • (int • int)
  – (3 • 4) • 5 : (int • int) • int
[Figure: adjacent memory cells holding 3, 4 and adjacent memory cells holding 3, 4, 5]
Memory properties
• Associativity:
  – (1 • 2) • 3 and 1 • (2 • 3) are isomorphic
  – Functions witness isomorphism
• Non-commutativity:
  – 1 • 2 and 2 • 1 are not isomorphic
  – No function mapping one to the other (in general)
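The witnessing functions can be sketched as follows, with Python tuples standing in for ordered products. The names `assoc_right` and `assoc_left` are hypothetical. Python does not enforce ordering, so this only illustrates the shape of the isomorphism, not the typing discipline.

```python
def assoc_right(value):
    """Map (a . b) . c to a . (b . c) without reordering any component."""
    (a, b), c = value
    return (a, (b, c))


def assoc_left(value):
    """Map a . (b . c) to (a . b) . c without reordering any component."""
    a, (b, c) = value
    return ((a, b), c)


left_nested = ((1, 2), 3)
right_nested = (1, (2, 3))

# The two functions are mutually inverse, which is what "isomorphic" asks for.
round_trip_left = assoc_left(assoc_right(left_nested))
round_trip_right = assoc_right(assoc_left(right_nested))
```

A swap function is trivially writable in Python. The point of the ordered discipline is that no such term type-checks there, because it would need the exchange rule.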
Indirection
• Ordered modality models indirection
  – !M : !τ corresponds to a pointer to M
  – Non-linear, un-ordered term
(3,(4,5)) : int x (int x int)
int x (int x int) ⇝ !(int • !(int • int))
(3,(4,5)) ⇝ !(3 • !(4 • 5))
(3,(4,5)) : int x (int x int)
int x (int x int) ⇝ !(!int • !(!int • !int))
(3,(4,5)) ⇝ !(!3 • !(!4 • !5))
(3,(4,5)) : int x (int x int)
int x (int x int) ⇝ !(int • (int • int))
(3,(4,5)) ⇝ !(3 • (4 • 5))
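The three representations of (3,(4,5)) can be compared on a toy heap. This is a sketch under stated assumptions: `Heap`, `Ptr`, and the builder functions `nested`, `boxed` and `flat` are hypothetical names. Each call to `alloc` models one use of `!`, that is, one separately allocated object whose cells are adjacent.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class Ptr:
    """A pointer to a heap object, standing in for the ! modality."""
    addr: int


Cell = Union[int, Ptr]


class Heap:
    def __init__(self) -> None:
        self.objects: Dict[int, List[Cell]] = {}

    def alloc(self, *cells: Cell) -> Ptr:
        """Store the given cells adjacently as one new object."""
        addr = len(self.objects)
        self.objects[addr] = list(cells)
        return Ptr(addr)

    def ints(self, cell: Cell) -> List[int]:
        """Read back the integers reachable from a cell, left to right."""
        if isinstance(cell, Ptr):
            return [n for c in self.objects[cell.addr] for n in self.ints(c)]
        return [cell]


def nested(h: Heap) -> Ptr:   # !(3 . !(4 . 5))
    return h.alloc(3, h.alloc(4, 5))


def boxed(h: Heap) -> Ptr:    # !(!3 . !(!4 . !5))
    return h.alloc(h.alloc(3), h.alloc(h.alloc(4), h.alloc(5)))


def flat(h: Heap) -> Ptr:     # !(3 . (4 . 5))
    return h.alloc(3, 4, 5)


summary = {}
for build in (nested, boxed, flat):
    heap = Heap()
    root = build(heap)
    # Record (number of heap objects, integers read back through the root).
    summary[build.__name__] = (len(heap.objects), heap.ints(root))
```

All three read back as 3, 4, 5 but differ in the number of heap objects, which is exactly the implementation choice the types make explicit.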
Explicit Allocation
• Ordered type theory
  – Fine structure of data layout
  – But not allocation
• For example: !(x • x)
  – Each time x is instantiated, new object
  – Initialized atomically
• Make allocation explicit
  – Remove !M from syntax
  – Add allocation primitives to introduce !
Memory Allocation
• A well-known allocation protocol for copying garbage collectors:
  – Reserve: obtain raw, un-initialized space.
  – Initialize: assign values to individual locations.
  – Allocate: baptize some or all as valid objects.
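The three steps can be sketched as a toy bump allocator. The class `CopyingHeap` and its fields are hypothetical names, `None` stands in for uninitialized contents, and the checks here are dynamic, whereas the talk's type system is meant to rule out protocol violations statically.

```python
from typing import List, Optional


class CopyingHeap:
    """Toy model of the reserve / initialize / allocate protocol."""

    def __init__(self, size: int) -> None:
        self.cells: List[Optional[int]] = [None] * size  # None = uninitialized
        self.alloc_ptr = 0    # everything below this index is a valid object
        self.reserved_to = 0  # end of the reserved region

    def reserve(self, n: int) -> int:
        """Obtain n raw cells; a real collector might trigger a GC here."""
        if self.reserved_to + n > len(self.cells):
            raise MemoryError("heap exhausted")
        start = self.reserved_to
        self.reserved_to += n
        return start

    def initialize(self, addr: int, value: int) -> None:
        """Assign a value to one reserved, not yet allocated cell."""
        if not (self.alloc_ptr <= addr < self.reserved_to):
            raise ValueError("address is not in reserved space")
        self.cells[addr] = value

    def allocate(self, n: int) -> int:
        """Bless the next n cells as a valid object and return its address."""
        start = self.alloc_ptr
        block = self.cells[start:start + n]
        if start + n > self.reserved_to or any(c is None for c in block):
            raise ValueError("cells must be reserved and initialized first")
        self.alloc_ptr += n
        return start


heap = CopyingHeap(8)
base = heap.reserve(2)        # Reserve
heap.initialize(base, 1)      # Initialize
heap.initialize(base + 1, 2)
pair = heap.allocate(2)       # Allocate
```

Calling `allocate` before every cell is initialized raises an error in this model.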
Example: Memory Allocation
[Figure: a heap with AP and LP pointers; uninitialized cells (?) are filled in over three steps (Reserve, Initialize, Allocate) to build x = (0,(1,2))]
Memory Allocation
• Type system separates terms and expressions
  – Terms M: no effects
  – Expressions E: have effects
• Allocation is an effect
  – Allocation primitives are expressions
Allocating a Pair
• Allocate (1,2):
  – Reserve space at a.
  – Create names for parts. Resource a is used up!
  – Initialize a1, using it up. Re-introduce b1:int.
  – Fuse parts and allocate.
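The steps for allocating (1,2) can be mimicked with single-use tokens. This is a sketch, not the calculus from the talk: `Resource`, `reserve`, `split`, `assign`, `fuse` and `alloc` are hypothetical names, and linearity is checked at run time here instead of by the type system.

```python
NS = None  # stand-in for the contents of uninitialized memory


class Resource:
    """Hypothetical linear token for a run of cells; usable exactly once."""

    def __init__(self, cells):
        self.cells = list(cells)
        self.live = True

    def consume(self):
        if not self.live:
            raise RuntimeError("linear resource used twice")
        self.live = False
        return self.cells


def reserve(n):
    """Reserve n uninitialized cells."""
    return Resource([NS] * n)


def split(a):
    """Create names for the parts of a, using a up."""
    return [Resource([cell]) for cell in a.consume()]


def assign(a, value):
    """Initialize a, using it up, and re-introduce it at its new type."""
    a.consume()
    return Resource([value])


def fuse(left, right):
    """Put two adjacent resources back together, in order."""
    return Resource(left.consume() + right.consume())


def alloc(b):
    """Turn fully initialized memory into an immutable object."""
    cells = b.consume()
    if any(cell is NS for cell in cells):
        raise ValueError("cannot allocate uninitialized memory")
    return tuple(cells)


a = reserve(2)             # Reserve space at a.
a1, a2 = split(a)          # Create names for parts; a is used up.
b1 = assign(a1, 1)         # Initialize a1, using it up; re-introduce b1.
b2 = assign(a2, 2)
pair = alloc(fuse(b1, b2)) # Fuse parts and allocate.
```

Touching `a` or `a1` again after this point raises an error in the model, which corresponds to a type error in a linear setting.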
Coalescing Reservation
Allocate two pairs: (1,2) and (3,4)
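One plausible reading of coalescing, sketched on a toy heap: two reservations of two cells each are merged into a single reservation of four cells, and both pairs are then initialized inside it. The names `Heap`, `pair_at`, `separate` and `coalesced` are hypothetical, and it is an assumption that this is the transformation the slide illustrates.

```python
class Heap:
    """Toy heap that counts how often space is reserved."""

    def __init__(self):
        self.cells = []
        self.reserve_calls = 0

    def reserve(self, n):
        self.reserve_calls += 1
        start = len(self.cells)
        self.cells.extend([None] * n)  # None = uninitialized
        return start


def pair_at(heap, addr, first, second):
    """Initialize two adjacent cells and return the pair's address."""
    heap.cells[addr] = first
    heap.cells[addr + 1] = second
    return addr


def separate(heap):
    """Reserve once per pair."""
    p = pair_at(heap, heap.reserve(2), 1, 2)
    q = pair_at(heap, heap.reserve(2), 3, 4)
    return p, q


def coalesced(heap):
    """Reserve once for both pairs, then carve the space up."""
    base = heap.reserve(4)
    p = pair_at(heap, base, 1, 2)
    q = pair_at(heap, base + 2, 3, 4)
    return p, q


h1, h2 = Heap(), Heap()
result_separate = separate(h1)
result_coalesced = coalesced(h2)
```

Both versions produce the same heap contents and the same pair addresses. Only the number of reservation calls differs.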
Summary
• Type theory for describing data layout
  – Adjacency requirements.
  – Precise control over representations.
• Type system for allocation:
  – Allocate raw memory.
  – Initialize, destructively changing types.
  – Ensures correct use of allocation protocol.
  – Permits code motion optimizations.
What I’m not telling you
• It’s more subtle than it seems.
  – Plain ordered λ-calculus doesn’t work.
  – Need notion of size-preserving terms, other refinements.
• For details see the paper
  – Technical presentation and examples.
  – Interpretation of a calculus with pairs.
Current and Future Work
• POPL paper
  – Only finite products
• Technical Report:
  – Sums, recursive types, ordered functions.
  – Extended coercion language.
• Ongoing
  – Dynamic extent (arrays)
  – Other allocation models
Conclusion
• Ordered type theory is a natural framework for modeling data layout.
  – Low-level issues dealt with entirely realistically in a λ-calculus setting.
  – Correctness of allocation and initialization protocols can be captured in the type system.