Type Systems and Alignment
Computer programming languages use type systems for two distinct purposes:
Defining the physical representation of values in a computer
Helping human programmers think abstractly, without regard to that representation
In a perfect world, each type would serve both purposes equally well, in a harmonious union of form and function. For example, a value of the Rust type i32 may be viewed as both:
A sequence of exactly thirty-two bits, in two's complement
An integer between -2,147,483,648 and 2,147,483,647, inclusive
Sadly, the purposes are sometimes at odds. A classic example is the ordering of an aggregate type's member variables. To serve the first purpose—specifying physical representation—the language should lay out data in exactly the order specified by the programmer. To serve the second, though, the language should allow the programmer to declare variables in whatever order makes logical sense, without regard to the physical implementation of the type.
The C programming language heavily favors the first purpose. For example, consider a C struct having two 16-bit integers, interleaved with two 32-bit integers:
#include <stddef.h> | |
#include <stdint.h> | |
#include <stdio.h> | |
struct CStruct { | |
int16_t a; // 2 bytes | |
int32_t b; // 4 bytes | |
int16_t c; // 2 bytes | |
int32_t d; // 4 bytes | |
}; | |
int main() { | |
puts("C struct:\n"); | |
printf(" offset of a: %2ld bytes\n", offsetof(struct CStruct, a)); | |
printf(" offset of b: %2ld bytes\n", offsetof(struct CStruct, b)); | |
printf(" offset of c: %2ld bytes\n", offsetof(struct CStruct, c)); | |
printf(" offset of d: %2ld bytes\n", offsetof(struct CStruct, d)); | |
printf(" total size: %2ld bytes\n", sizeof(struct CStruct)); | |
return 0; | |
} |
You might think the total size of the struct would be the sum of the sizes of the component values: 2 + 4 + 2 + 4 = 12 bytes. However, each two-byte field must be followed by two more bytes of “padding” to meet the alignment requirements of the subsequent four-byte fields. (“Alignment requirements” mean that some objects can only live at certain addresses in the computer’s memory.) The struct thus has four bytes of wasted space, making its total size 16 bytes:
$ cc -std=c11 -pedantic -Wall -Wextra main.c && ./a.out | |
C struct: | |
offset of a: 0 bytes | |
offset of b: 4 bytes | |
offset of c: 8 bytes | |
offset of d: 12 bytes | |
total size: 16 bytes |
The compiler could avoid the wasted space by ordering the fields differently, but it won’t, because C makes a point of doing exactly what you tell it, for better or worse. C (and C++) programmers must thus either accept suboptimal run-time resource requirements, or else explicitly order field declarations to avoid padding (usually in order of decreasing size), regardless of whether that order makes any logical sense to someone reading the code.
An arguably better approach is to let programmers declare fields in whatever order makes the code clearest, then let the compiler reorder them for maximal run-time efficiency. This is Rust’s (default) approach:
struct RustStruct { | |
a: u16, // 2 bytes | |
b: u32, // 4 bytes | |
c: u16, // 2 bytes | |
d: u32, // 4 bytes | |
} | |
fn address<T>(r: &T) -> isize { | |
r as *const _ as isize | |
} | |
fn main() { | |
let object = RustStruct { a: 0, b: 0, c: 0, d: 0 }; | |
let offset = |p| p - address(&object); | |
println!("Rust struct:\n"); | |
println!(" offset of a: {:2} bytes", offset(address(&object.a))); | |
println!(" offset of b: {:2} bytes", offset(address(&object.b))); | |
println!(" offset of c: {:2} bytes", offset(address(&object.c))); | |
println!(" offset of d: {:2} bytes", offset(address(&object.d))); | |
println!(" total size: {:2} bytes", std::mem::size_of::<RustStruct>()); | |
} |
Even though the source code lists the fields in the logical order a, b, c, d, the Rust compiler produces machine code that orders them b, d, a, c, obviating the wasted padding and minimizing the structure’s size:
$ cargo run | |
Rust struct: | |
offset of a: 8 bytes | |
offset of b: 0 bytes | |
offset of c: 10 bytes | |
offset of d: 4 bytes | |
total size: 12 bytes |
In a systems programming language, there should also be a way for programmers to force a specific layout, even if it’s suboptimal. Rust lets you do that by specifying ‘repr(C),’ meaning the type’s physical representation should be C-compatible:
#[repr(C)] | |
struct CRepr { | |
a: u16, // 2 bytes | |
b: u32, // 4 bytes | |
c: u16, // 2 bytes | |
d: u32, // 4 bytes | |
} | |
fn address<T>(r: &T) -> isize { | |
r as *const _ as isize | |
} | |
fn main() { | |
let object = CRepr { a: 0, b: 0, c: 0, d: 0 }; | |
let offset = |p| p - address(&object); | |
println!("Rust struct with C-like layout:\n"); | |
println!(" offset of a: {:2} bytes", offset(address(&object.a))); | |
println!(" offset of b: {:2} bytes", offset(address(&object.b))); | |
println!(" offset of c: {:2} bytes", offset(address(&object.c))); | |
println!(" offset of d: {:2} bytes", offset(address(&object.d))); | |
println!(" total size: {:2} bytes", std::mem::size_of::<CRepr>()); | |
} |
Of course, C compatibility comes with a size (and potentially performance) penalty:
$ cargo run | |
Rust struct with C-like layout: | |
offset of a: 0 bytes | |
offset of b: 4 bytes | |
offset of c: 8 bytes | |
offset of d: 12 bytes | |
total size: 16 bytes |
The moral of the story here is not that language X is better than language Y, but that type systems must juggle the demands of wildly different use cases. Being aware of a particular type system’s faults and virtue can be key to using it effectively.