When I need a quick low-level programming "fix", I browse the archives at PageTable.com. Last week I read the post on Readable and Maintainable Bitfields in C, which argued the merits of using bitfields over bitmasks and macros. Although I agree with the post's points, I think it omitted one important detail: the danger of using bitfields with hardware registers.
Hardware registers can be mapped into a processor's memory space and
accessed with standard memory read/write instructions. Therefore the
temptation is to define a bitfield type representing a register's
structure and set a pointer to its base memory address. For example,
assuming register bar is four bytes wide, has five bit-fields, and is
located at memory address 0xBAADF00D, one might be tempted to do the
following:

    typedef struct {
        unsigned int f1:8;
        unsigned int f2:4;
        unsigned int f3:8;
        unsigned int f4:4;
        unsigned int f5:8;
    } registerBar_t;

    // Set pointer to register's memory address
    registerBar_t *pBar = (registerBar_t *)0xBAADF00D;

    // Use pointer to access register
    pBar->f1 = 0xFE;

One problem with this approach is that many registers are designed to
be accessed only at their full size - all accesses must be aligned to
the register's base address and read/write the whole
thing. Unfortunately, setting f1 in the above example may produce a
partial register write that can lead to unexpected and unintended
results.
Another challenge when dealing with registers is that, unlike main memory, register accesses can have side effects. Even register reads can cause the hardware to initiate action or clear information. Consider the following example:

    ...
    tmpf1 = pBar->f1;
    tmpf2 = pBar->f2;
    ...

If reads of register bar are destructive, causing its contents to be
cleared, then the read of f1 may clear the contents of f2 before it is
read by the subsequent pointer dereference. The resulting loss of
information could cause the driver, hardware, or both to behave
unexpectedly.
Alternatively, if reads of register bar cause the hardware to initiate
action, then spurious activity may occur if the f1 and f2 accesses are
done separately.
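One way to avoid both problems is to read the register exactly once into a local copy and decode the fields from that copy. The following is only a sketch: sim_bar and sim_read_bar() are hypothetical stand-ins that imitate a clear-on-read register with an ordinary variable, and the BAR_F1/BAR_F2 macros are illustrative names for the f1 and f2 layout described above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for register bar: an ordinary variable plus a
 * read helper that clears it, imitating clear-on-read hardware. */
static uint32_t sim_bar = 0x00000EEFu;

static uint32_t sim_read_bar(void)
{
    uint32_t value = sim_bar;
    sim_bar = 0;                /* the read destroys the contents */
    return value;
}

/* Layout from the example register: f1 = bits 7:0, f2 = bits 11:8. */
#define BAR_F1(v) ((uint32_t)((v) & 0xFFu))
#define BAR_F2(v) ((uint32_t)(((v) >> 8) & 0xFu))

/* Read the register once, then decode both fields from the snapshot. */
static void read_f1_f2(uint32_t *f1, uint32_t *f2)
{
    uint32_t snapshot;

    assert(f1 && f2);
    snapshot = sim_read_bar();  /* the only access to the "register" */
    *f1 = BAR_F1(snapshot);
    *f2 = BAR_F2(snapshot);
}
```

Because the fields are extracted from the snapshot rather than from the register, the second field survives even though the single read has already cleared the simulated hardware.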
Because of these granularity and side-effect issues, I was advised early in my career to avoid using bitfields with hardware registers. This is, I think, an important point that is captured in the PageTable.com post's comments but not in the post itself, which is an unfortunate omission.
After reading the PageTable.com post, I realized that I had always taken this advice on faith and had never looked at the instructions generated by bitfield accesses, so I decided to do a quick experiment. Below is a short program that accesses a bitfield with fields of varying size and alignment.

    #include <stdio.h>

    typedef union {
        struct {
            unsigned int f1:8;  // Bits 07:00
            unsigned int f2:4;  // Bits 11:08
            unsigned int f3:8;  // Bits 19:12
            unsigned int f4:4;  // Bits 23:20
            unsigned int f5:8;  // Bits 31:24
        };
        unsigned int raw;
    } bitfield_t;

    int main()
    {
        bitfield_t bitfield;
        unsigned int tmp;

        bitfield.raw = 0x0;

        // Set bit field f1
        bitfield.f1 = 0xEF;
        tmp = bitfield.f1;
        printf(" After f1: F1(0x%02x) RAW(0x%08x)\n", tmp, bitfield.raw);

        // Set bit field f2
        bitfield.f2 = 0xE;
        tmp = bitfield.f2;
        printf(" After f2: F2(0x%02x) RAW(0x%08x)\n", tmp, bitfield.raw);

        // Set bit field f3
        bitfield.f3 = 0xDB;
        tmp = bitfield.f3;
        printf(" After f3: F3(0x%02x) RAW(0x%08x)\n", tmp, bitfield.raw);

        // Set bit field f4
        bitfield.f4 = 0xA;
        tmp = bitfield.f4;
        printf(" After f4: F4(0x%02x) RAW(0x%08x)\n", tmp, bitfield.raw);

        // Set bit field f5
        bitfield.f5 = 0xDE;
        tmp = bitfield.f5;
        printf(" After f5: F5(0x%02x) RAW(0x%08x)\n", tmp, bitfield.raw);

        // Set with raw
        bitfield.raw = 0xDECAFBAD;
        tmp = bitfield.raw;
        printf("After raw: RAW(0x%08x)\n", tmp);

        return 0;
    }

Compiling and running this program on an Ubuntu system results in the expected output.

    jcardent@ubuntu:~/tmp$ gcc -g -o foo foo.c
    jcardent@ubuntu:~/tmp$ ./foo
     After f1: F1(0xef) RAW(0x000000ef)
     After f2: F2(0x0e) RAW(0x00000eef)
     After f3: F3(0xdb) RAW(0x000dbeef)
     After f4: F4(0x0a) RAW(0x00adbeef)
     After f5: F5(0xde) RAW(0xdeadbeef)
    After raw: RAW(0xdecafbad)

Running the command

    jcardent@ubuntu:~/tmp$ objdump -d -S foo

reveals the instructions generated to access the bit-fields. Looking at
the f1 write and read sequence shows:

    // Set bit field f1
    bitfield.f1 = 0xEF;
     80483dc:  c6 45 f8 ef     movb   $0xef,-0x8(%ebp)
    tmp = bitfield.f1;
     80483e0:  0f b6 45 f8     movzbl -0x8(%ebp),%eax
     80483e4:  0f b6 c0        movzbl %al,%eax
     80483e7:  89 45 f4        mov    %eax,-0xc(%ebp)

The first thing to note from this disassembly fragment is that bitfield
is located on the stack eight bytes below %ebp (offset -0x8). Likewise,
tmp is located twelve bytes below %ebp (offset -0xc).
From this example it's clear that the write to f1 uses a single byte
move instruction. If bitfield had been mapped to a hardware register,
this would have resulted in an aligned but too short write access that
could have produced unintended behavior.
The read of f1 is less clear until the movzbl instruction is understood
to be a zero-extending move from a single byte to a long word, four
bytes in this case. So here again, if bitfield had been mapped to a
register, the single-byte access may have resulted in unintended
behavior like data loss (top three bytes cleared) or spurious action
(if subsequent reads are done to other fields for the same operation).
Looking at the f2 write and read sequence shows:

    // Set bit field f2
    bitfield.f2 = 0xE;
     8048404:  0f b6 45 f9     movzbl -0x7(%ebp),%eax
     8048408:  83 e0 f0        and    $0xfffffff0,%eax
     804840b:  83 c8 0e        or     $0xe,%eax
     804840e:  88 45 f9        mov    %al,-0x7(%ebp)
    tmp = bitfield.f2;
     8048411:  0f b6 45 f9     movzbl -0x7(%ebp),%eax
     8048415:  83 e0 0f        and    $0xf,%eax
     8048418:  0f b6 c0        movzbl %al,%eax
     804841b:  89 45 f4        mov    %eax,-0xc(%ebp)

In this case, setting the four-bit-wide field f2 results in a byte-wide
read-modify-write sequence aligned with the second byte of bitfield,
evidenced by the offset of -0x7 instead of -0x8. Similarly, reading f2
results in a byte-wide read aligned with the second byte of bitfield.
Both accesses are too short and misaligned with respect to the start of
bitfield.
Since f3 spans bytes 2 and 3 of bitfield, its access sequence results
in aligned, four-byte-wide mov instructions.

    bitfield.f3 = 0xDB;
     8048438:  8b 45 f8        mov    -0x8(%ebp),%eax
     804843b:  25 ff 0f f0 ff  and    $0xfff00fff,%eax
     8048440:  0d 00 b0 0d 00  or     $0xdb000,%eax
     8048445:  89 45 f8        mov    %eax,-0x8(%ebp)
    tmp = bitfield.f3;
     8048448:  8b 45 f8        mov    -0x8(%ebp),%eax
     804844b:  c1 e8 0c        shr    $0xc,%eax
     804844e:  80 e4 ff        and    $0xff,%ah
     8048451:  0f b6 c0        movzbl %al,%eax
     8048454:  89 45 f4        mov    %eax,-0xc(%ebp)

Although the accesses themselves are well-formed, unintended behaviors
can still result if f3 is only one of multiple fields that must be set
for a single operation.
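When several fields belong to one operation, the usual remedy is to compose the complete register value in a local variable and issue a single full-width write. The sketch below assumes a simulated register: sim_bar, sim_write_bar(), and sim_write_count are hypothetical names used only to make the number of writes observable, and the shift and mask macros follow the f3 and f4 positions from the example layout.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t sim_bar;          /* hypothetical stand-in for register bar */
static unsigned sim_write_count;  /* number of writes the "hardware" has seen */

static void sim_write_bar(uint32_t value)
{
    sim_bar = value;
    sim_write_count++;
}

/* Layout from the example register: f3 = bits 19:12, f4 = bits 23:20. */
#define BAR_F3_SHIFT 12
#define BAR_F3_MASK  (0xFFu << BAR_F3_SHIFT)
#define BAR_F4_SHIFT 20
#define BAR_F4_MASK  (0xFu << BAR_F4_SHIFT)

/* Update f3 and f4 together: one full-width read, one full-width write. */
static void set_f3_f4(uint32_t f3, uint32_t f4)
{
    uint32_t value;

    assert(f3 <= 0xFFu && f4 <= 0xFu);          /* values must fit their fields */
    value = sim_bar;                            /* single read */
    value &= ~(BAR_F3_MASK | BAR_F4_MASK);      /* clear both fields */
    value |= f3 << BAR_F3_SHIFT;
    value |= f4 << BAR_F4_SHIFT;
    sim_write_bar(value);                       /* single write */
}
```

With separate bitfield assignments, the simulated hardware would see one write per field; here it sees exactly one write carrying both new values.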
Since the structure of bitfield is symmetrical, the accesses to fields
f4 and f5 produce instructions similar to those for f2 and f1
respectively, albeit with different offsets.
Finally, the accesses to raw produce aligned, full-width instructions
as expected.

    // Set with raw
    bitfield.raw = 0xDECAFBAD;
     80484cd:  c7 45 f8 ad fb ca de  movl   $0xdecafbad,-0x8(%ebp)
    tmp = bitfield.raw;
     80484d4:  8b 45 f8              mov    -0x8(%ebp),%eax
     80484d7:  89 45 f4              mov    %eax,-0xc(%ebp)

This last example illustrates a tempting workaround for "safely" using bitfields to manage register accesses. Consider the following, which assumes registerBar_t has been redefined as a union with a raw member, like bitfield_t above:

    registerBar_t *pBar = (registerBar_t *)0xBAADF00D;
    registerBar_t tmpBar;

    // Set field f1 to 0xff
    tmpBar.raw = pBar->raw;
    tmpBar.f1 = 0xff;
    pBar->raw = tmpBar.raw;

While this approach works, it runs the risk of an uninformed future maintainer "optimizing out" the temporary variable and just using the bitfield method directly. In this regard, it may be more maintainable to use bitmasks and macros for register accesses.
Of course, problems can arise regardless of the method used if "uninformed" developers are allowed to change the code. The only prevention here is to make sure there are suitable training and disciplined code reviews.
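For comparison, here is a minimal sketch of what the bitmask-and-macro style might look like for the example register. All names (FIELD_MASK, reg_read32, reg_write32, reg_get_field, reg_set_field, and the BAR_* constants) are illustrative assumptions rather than anything from the PageTable.com post. The sketch also assumes a target where a volatile uint32_t access compiles to a single 32-bit load or store, which is typical but should be confirmed from the disassembly as was done above.

```c
#include <assert.h>
#include <stdint.h>

/* Layout from the example register: f1 = bits 7:0, f5 = bits 31:24. */
#define BAR_F1_SHIFT 0
#define BAR_F1_WIDTH 8
#define BAR_F5_SHIFT 24
#define BAR_F5_WIDTH 8

/* Mask for a field narrower than 32 bits. */
#define FIELD_MASK(shift, width) ((((uint32_t)1 << (width)) - 1u) << (shift))

/* The only two places that touch the register, always at full width. */
static inline uint32_t reg_read32(volatile uint32_t *reg)
{
    return *reg;
}

static inline void reg_write32(volatile uint32_t *reg, uint32_t value)
{
    *reg = value;
}

static inline uint32_t reg_get_field(volatile uint32_t *reg,
                                     unsigned shift, unsigned width)
{
    return (reg_read32(reg) & FIELD_MASK(shift, width)) >> shift;
}

static inline void reg_set_field(volatile uint32_t *reg,
                                 unsigned shift, unsigned width,
                                 uint32_t field)
{
    uint32_t mask = FIELD_MASK(shift, width);
    uint32_t value = reg_read32(reg);           /* one full-width read */

    assert((field & ~(mask >> shift)) == 0);    /* field must fit its width */
    value = (value & ~mask) | (field << shift);
    reg_write32(reg, value);                    /* one full-width write */
}
```

Funnelling every access through reg_read32() and reg_write32() leaves no narrower bitfield path for a later maintainer to "simplify" back to, which is the maintainability argument made above.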