When I need a quick low-level programming "fix", I browse the archives
at PageTable.com. Last week I read the post "Readable and Maintainable
Bitfields in C", which argued the merits of using bitfields over
bitmasks and macros. Although I agree with the post's points, I think
it omitted one important detail: the danger of using bitfields with
hardware registers.
Hardware registers can be mapped into a processor's memory space and
accessed with standard memory read/write instructions. Therefore the
temptation is to define a bitfield type representing a register's
structure and set a pointer to its base memory address. For example,
assuming register bar is four bytes wide, has five bit-fields, and is
located at memory address 0xBAADF00D, one might be tempted to do the
following:
typedef struct {
    unsigned int f1:8;
    unsigned int f2:4;
    unsigned int f3:8;
    unsigned int f4:4;
    unsigned int f5:8;
} registerBar_t;
// Set pointer to register's memory address (the cast is required in C)
volatile registerBar_t *pBar = (volatile registerBar_t *)0xBAADF00D;
// Use pointer to access register
pBar->f1 = 0xFE;
One problem with this approach is that many registers are designed to
be accessed only at their full size: all accesses must be aligned to
the register's base address and read or write the whole thing.
Unfortunately, setting f1 in the above example may produce a partial
register write that can lead to unexpected and unintended results.
Another challenge when dealing with registers is that, unlike main
memory, register accesses can have side effects. Even register reads
can cause the hardware to initiate action or clear information. Consider
the following example:
...
tmpf1 = pBar->f1;
tmpf2 = pBar->f2;
...
If reads of register bar are destructive, causing its contents to be
cleared, then the read of f1 may clear the contents of f2 before it is
read by the subsequent pointer dereference. The resulting loss of
information could cause the driver, hardware, or both to behave
unexpectedly.
Alternatively, if reads of register bar cause the hardware to initiate
action, then spurious activity may occur if the f1 and f2 accesses are
done separately.
Because of these granularity and side-effect issues, I was advised
early in my career to avoid using bitfields with hardware registers.
This is, I think, an important point that is captured in the
PageTable.com post's comments but not in the post itself, which is an
unfortunate omission.
After reading the PageTable.com post, I realized that I had always taken
this advice on faith and never looked at the instructions generated by
bitfield accesses. So I decided to do a quick experiment. Below is a
short program that accesses a bitfield with fields of varying size and
alignment.
#include <stdio.h>
typedef union {
    struct {
        unsigned int f1:8; // Bits 07:00
        unsigned int f2:4; // Bits 11:08
        unsigned int f3:8; // Bits 19:12
        unsigned int f4:4; // Bits 23:20
        unsigned int f5:8; // Bits 31:24
    };
    unsigned int raw;
} bitfield_t;
int main(void)
{
    bitfield_t bitfield;
    unsigned int tmp;
    bitfield.raw = 0x0;
    // Set bit field f1
    bitfield.f1 = 0xEF;
    tmp = bitfield.f1;
    printf(" After f1: F1(0x%02x) RAW(0x%08x)\n",
           tmp,
           bitfield.raw);
    // Set bit field f2
    bitfield.f2 = 0xE;
    tmp = bitfield.f2;
    printf(" After f2: F2(0x%02x) RAW(0x%08x)\n",
           tmp,
           bitfield.raw);
    // Set bit field f3
    bitfield.f3 = 0xDB;
    tmp = bitfield.f3;
    printf(" After f3: F3(0x%02x) RAW(0x%08x)\n",
           tmp,
           bitfield.raw);
    // Set bit field f4
    bitfield.f4 = 0xA;
    tmp = bitfield.f4;
    printf(" After f4: F4(0x%02x) RAW(0x%08x)\n",
           tmp,
           bitfield.raw);
    // Set bit field f5
    bitfield.f5 = 0xDE;
    tmp = bitfield.f5;
    printf(" After f5: F5(0x%02x) RAW(0x%08x)\n",
           tmp,
           bitfield.raw);
    // Set with raw
    bitfield.raw = 0xDECAFBAD;
    tmp = bitfield.raw;
    printf("After raw: RAW(0x%08x)\n",
           tmp);
    return 0;
}
Compiling and running this program on an Ubuntu system results in the
expected output.
jcardent@ubuntu:~/tmp$ gcc -g -o foo foo.c
jcardent@ubuntu:~/tmp$ ./foo
 After f1: F1(0xef) RAW(0x000000ef)
 After f2: F2(0x0e) RAW(0x00000eef)
 After f3: F3(0xdb) RAW(0x000dbeef)
 After f4: F4(0x0a) RAW(0x00adbeef)
 After f5: F5(0xde) RAW(0xdeadbeef)
After raw: RAW(0xdecafbad)
Running the command
jcardent@ubuntu:~/tmp$ objdump -d -S foo
reveals the instructions generated to access the bit-fields. Looking at
the f1 write and read sequence shows:
// Set bit field f1
bitfield.f1 = 0xEF;
80483dc: c6 45 f8 ef movb $0xef,-0x8(%ebp)
tmp = bitfield.f1;
80483e0: 0f b6 45 f8 movzbl -0x8(%ebp),%eax
80483e4: 0f b6 c0 movzbl %al,%eax
80483e7: 89 45 f4 mov %eax,-0xc(%ebp)
The first thing to note from this disassembly fragment is that bitfield
is located on the stack eight bytes below %ebp. Likewise, tmp is located
twelve bytes (0xC) below %ebp.
From this example it's clear that the write to f1 uses a single-byte
move instruction. If bitfield had been mapped to a hardware register,
this would have resulted in an aligned but too-short write access that
could have produced unintended behavior.
The read of f1 is less clear until the movzbl instruction is understood
to be a zero-extending move from a single byte to a long, four bytes in
this case. So here again, if bitfield had been mapped to a register, the
single-byte access may have resulted in unintended behavior like data
loss (top three bytes cleared) or spurious action (if subsequent reads
are done to other fields for the same operation).
Looking at the f2 write and read sequence shows:
// Set bit field f2
bitfield.f2 = 0xE;
8048404: 0f b6 45 f9 movzbl -0x7(%ebp),%eax
8048408: 83 e0 f0 and $0xfffffff0,%eax
804840b: 83 c8 0e or $0xe,%eax
804840e: 88 45 f9 mov %al,-0x7(%ebp)
tmp = bitfield.f2;
8048411: 0f b6 45 f9 movzbl -0x7(%ebp),%eax
8048415: 83 e0 0f and $0xf,%eax
8048418: 0f b6 c0 movzbl %al,%eax
804841b: 89 45 f4 mov %eax,-0xc(%ebp)
In this case, setting the four-bit-wide field f2 results in a byte-wide
read-modify-write sequence aligned with the second byte of bitfield,
evidenced by the offset of 0x7 instead of 0x8. Similarly, reading f2
results in a byte-wide read aligned with the second byte of bitfield.
Both accesses are too short and misaligned.
Since f3 spans bytes 2 and 3 of bitfield, its access sequence results in
aligned, four-byte-wide mov instructions.
bitfield.f3 = 0xDB;
8048438: 8b 45 f8 mov -0x8(%ebp),%eax
804843b: 25 ff 0f f0 ff and $0xfff00fff,%eax
8048440: 0d 00 b0 0d 00 or $0xdb000,%eax
8048445: 89 45 f8 mov %eax,-0x8(%ebp)
tmp = bitfield.f3;
8048448: 8b 45 f8 mov -0x8(%ebp),%eax
804844b: c1 e8 0c shr $0xc,%eax
804844e: 80 e4 ff and $0xff,%ah
8048451: 0f b6 c0 movzbl %al,%eax
8048454: 89 45 f4 mov %eax,-0xc(%ebp)
Although the accesses themselves are well-formed, unintended behaviors
can still result if f3 is only one of multiple fields that must be set
for a single operation.
Since the structure of bitfield is symmetrical, the accesses to fields
f4 and f5 produce instructions similar to those for f2 and f1
respectively, albeit with different offsets.
Finally, the accesses to raw produce aligned, full-width instructions as
expected.
// Set with raw
bitfield.raw = 0xDECAFBAD;
80484cd: c7 45 f8 ad fb ca de movl $0xdecafbad,-0x8(%ebp)
tmp = bitfield.raw;
80484d4: 8b 45 f8 mov -0x8(%ebp),%eax
80484d7: 89 45 f4 mov %eax,-0xc(%ebp)
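This last example illustrates a tempting workaround for "safely" using
bitfields to manage register accesses. Consider the following, which
assumes registerBar_t has been redefined as a union with a raw member,
like bitfield_t above:
volatile registerBar_t *pBar = (volatile registerBar_t *)0xBAADF00D;
registerBar_t tmpBar;
// Set field f1 to 0xff
tmpBar.raw = pBar->raw;
tmpBar.f1 = 0xff;
pBar->raw = tmpBar.raw;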
While this approach works, it runs the risk of an uninformed future
maintainer "optimizing out" the temporary variable and just using the
bitfield method directly. In this regard, it may be more maintainable
to use bitmasks and macros for register accesses.
Of course, problems can arise regardless of the method used if
"uninformed" developers are allowed to change the code. The only
prevention here is to make sure there are suitable training and
disciplined code reviews.