This chapter describes several extensions to the C99 standard implemented by CompCert: compiler pragmas (section 6.1), attributes (section 6.2), built-in functions (section 6.3), code annotation mechanisms (sections 6.4 and 6.5), and GCC-style extended inline assembly (section 6.6).
This section describes the pragmas supported by CompCert C. The compiler emits a warning for an unrecognized pragma.
On PowerPC: | R14, R15, …, R31 (general-purpose registers) |
F14, F15, …, F31 (float registers) | |
On ARM: | R4, R5, …, R11 (integer registers) |
F8, F9, …, F15 (float registers) | |
On x86-32: | EBX, ESI, EDI, EBP (32-bit integer registers) |
On x86-64: | RBX, RBP, R12, R13, R14, R15 (64-bit integer registers) |
On RISC-V: | X8, X9; X18, X19, …, X27 (integer registers) |
F8, F9; F18, F19, …, F27 (float registers) | |
On AArch64: | X19, X20, …, X29 (integer registers) |
D8, D9, …, D15 (float registers) |
Functions and global variables can be explicitly placed into user-defined sections using the #pragma use_section directive (see below). Another, more indirect, less recommended way is to modify the characteristics of the default sections in which the compiler automatically place function code, global variables, and auxiliary compiler-generated data. These default sections are as follows:
Internal name | What is put there |
DATA | global, non-const variables of size greater than N bytes. N defaults to 0 and can be set using the -fsmall-data command-line option. |
CONST | global, const variables of size greater than N bytes. N defaults to 0 and can be set using the -fsmall-const command-line option. |
SDATA | global, non-const variables of size less than or equal to N bytes. N defaults to 0 and can be set using the -fsmall-data command-line option. |
SCONST | global, const variables of size less than or equal to N bytes. N defaults to 0 and can be set using the -fsmall-const command-line option. |
STRING | string literals |
CODE | machine code for function definitions |
LITERAL | constants (e.g. floating-point literals) referenced from function code |
JUMPTABLE | jump tables generated for switch statements |
A simpler, alternate way to control placement into sections is to use the section attribute.
Example: explicit placement into user-defined sections. We first define four new compiler sections.
#pragma section MYDATA "mydata_i" "mydata_u" standard RW #pragma section MYCONST "myconst" "myconst" standard R #pragma section MYSDA "mysda_i" "mysda_u" near-data RW #pragma section MYCODE "mycode" "mycode" standard RX
We then use the #pragma use_section to place variables and functions in these sections, then define the variables and functions in question.
#pragma use_section MYDATA a b int a; // uninitialized; goes into linker section "mydata_u" double b = 3.1415; // initialized; goes into linker section "mydata_i" #pragma use_section MYCONST c const short c = 42; // goes into linker section "myconst" #pragma use_section MYSDA d e int d; // goes into linker section "mysda_u" double e = 2.718; // goes into linker section "mysda_i" #pragma use_section MYCODE f int f(void) { return a + d; } // goes into linker section "mycode" // accesses d via small data relative addressing // accesses a via absolute addressing
Example: implicit placement by modifying the compiler default sections. In this example, we assume that the options -fsmall-data 8 and -fsmall-const 0 are given.
#pragma section DATA "mydata_i" "mydata_u" standard RW
#pragma section CONST "myconst" "myconst" standard R
#pragma section SDATA "mysda_i" "mysda_u" near-data RW
#pragma section CODE "mycode" "mycode" standard RX
#pragma section LITERAL "mylit" "mylit" standard R
int a; // small data, uninitialized; goes into "mysda_u"
char b[16]; // big data, uninitialized; goes into "mydata_u"
const int c = 42; // big const data, initialized; goes into "myconst"
double f(void) { return c + 3.14; }
// code goes into "mycode"
// literal 3.14 goes into "mylit"
Caveat: when using non-standard sections, the linker must, in general, be informed of these sections and how to place them in the final executable, typically by providing an appropriate mapping file to the linker.
Like the GCC compiler, the CompCert C compiler allows the programmer to attach attributes to various parts of C source code.
An attribute qualifier is of the form __attribute((attribute-list)) or __attribute__((attribute-list)), where attribute-list is a possibly empty, comma-separated lists of attributes. Each attribute is of the following form:
attribute | ::= | ident |
∣ | ident(attr-arg,…,attr-arg) | |
attr-arg | ::= | ident |
∣ | "string-literal" | |
∣ | const-expr |
Each attribute carries a name and zero or more arguments, which can be identifiers, string literals, or C expressions of integer types that are compile-time constants.
Attribute names can be specified with __ (two underscore characters) preceding and following each name. For instance, __aligned__(N) and aligned(N) denote the same attribute.
For compatibility with other C compilers, the keyword __packed__ is also recognized as an attribute:
__packed__(params) | is equivalent to | __attribute((packed(params))) |
In C source files, attribute qualifiers can occur anywhere a standard type qualifier (const, volatile) can occur, and also just after the struct and union keywords. For partial compatibility with GCC, the CompCert parser allows attributes to occur in several other places, but may silently ignore them.
Warning. Some standard C libraries, when used in conjunction with CompCert, deactivate the __attribute__ keyword: the standard includes, or CompCert itself, define __attribute__ as a macro that erases its argument. This is the case for the Glibc standard library under Linux, and the Xcode header files under macOS. For this reason, please use the __attribute keyword in preference to the __attribute__ keyword.
The following attributes are recognized by CompCert. For unrecognized attributes CompCert issues a warning.
The packed attribute applies to a struct declarations and affects the memory layout of the members of this struct. Zero, one, two or three integer arguments can be provided.
If the max-member-alignment parameter is provided, the natural alignment of every member (field) of the structure is reduced to at most max-member-alignment. In particular, if max-member-alignment = 1, members are not aligned, and no padding is ever inserted between members.
If the min-struct-alignment parameter is provided, the natural alignment of the whole structure is increased to at least min-struct-alignment.
If the byte-swap parameter is provided and equal to 1, accesses to structure members of integer or pointer types are performed using the opposite endianness than that of the target platform. For PowerPC, accesses are done in little-endian mode instead of the natural big-endian mode; for x86 and RISC-V accesses are done in big-endian mode instead of the natural little-endian mode; for ARM — where the endianness is configuration dependent — accesses are done in big-endian mode when configured as little-endian and vice versa.
Examples:
struct __attribute__((packed(1))) s1 { // suppress all padding char c; // at offset 0 short s; // at offset 1 int i; // at offset 3 }; // total size is 7, structure alignment is 1 struct __attribute__((packed(4,16,1))) s2 { char c; // at offset 0 short s; // at offset 4 (because 4-aligned); byte-swap at access int i; // at offset 8 (because 4-aligned); byte-swap at access }; // total size is 16, structure alignment is 16 struct s3 { // default layout char c; // at offset 0 short s; // at offset 2 (because 2-aligned) int i; // at offset 4 }; // total size is 8, structure alignment is 4
Limitations: For a byte-swapped structure, all members should be of integer or pointer types, or arrays of those types.
Reduced member alignment should not be used on the ARM platform, since unaligned memory accesses on ARM can crash the program or silently produce incorrect results. Only the PowerPC, x86 and RISC-V platforms support unaligned memory accesses.
Bit fields are not supported inside byte-swapped structures and reduced-alignment structures.
The #pragma section and #pragma use_section directives can be used to obtain finer control on user-defined sections (section 6.1).
Interpretation of attributes. Attributes are attached to parts of the program in a way that depends on the kind of the attribute:
For example, consider
__attribute((noreturn, section("foo"))) void (*fptr) (int x);
The section attributes applies to variable fptr while the noreturn attribute applies to the function type. The meaning, therefore, is that fptr is a variable in section foo that has type “pointer to non-returning functions from int to void”.
__attribute((aligned(16))) char * p; typedef __attribute((aligned(16))) char char16; char16 * q;
The aligned attribute applies to the variable name p and to the type name char16. Consequently, p is a 16-aligned pointer to naturally-aligned char, while q is a naturally-aligned pointer to 16-aligned char.
Built-in functions are functions that are predefined in the initial environment — the environment in which the compilation of a source file starts. In other words, these built-in functions are always available: there is no need to include header files.
Built-in functions give access to interesting capabilities of the target processor that cannot be exploited in standard C code. For example, most processors provide an instruction to quickly compute the absolute value of a floating-point number. In CompCert C, this instruction is made available to the programmer via the __builtin_fabs function. It provides a faster alternative to the fabs library function from <math.h>.
Invocations of built-in functions are automatically inlined by the compiler at point of use. It is a compile-time error to take a pointer to a built-in function.
Some built-in functions are available on all target platforms supported by CompCert. Others are specific to a particular platform.
Integer arithmetic
Floating-point arithmetic
Conditionals
Block copy with known size and alignment
__builtin_memcpy_aligned(&dst, &src, sizeof(dst), _Alignof(type_dst));where dst and src are two objects of the same complete type type_dst. An invocation of __builtin_memcpy_aligned(dst,src,sz,al) behaves like memcpy(dst,src,sz) as defined in the <string.h> standard library. Knowing the size and alignment at compile-time enables the compiler to generate very efficient inline code for the copy.
Synchronization
Optimization hints
if (__builtin_expect(p == NULL, 0)) { ... } else { ... }This tells the compiler that p is usually not NULL and, therefore, that the else branch is more likely to be executed than the then branch.
Integer arithmetic
Floating-point arithmetic
Memory accesses with reversed endianness
Synchronization instructions
Cache control instructions
Access to special registers
Atomic (sequentially-consistent) memory operations
Note: CompCert does not model multi-threaded execution and only gives the same guarantees for the atomic memory operations as it does for all other built-in functions. The built-in itself expand to a sequence of PowerPC instructions that are suitable to implement a sequentially-consistent variant of these atomic memory operations.
Note: CompCert uses the lwarx and stwcx. instructions in the expansion of the atomic builtins. Therefore, memory areas accessed by the builtins must be naturally aligned and should be in pages that are marked as WIMGE = 001xx. Otherwise, exceptions may occur.
Floating-point arithmetic
Synchronization instructions
Floating-point arithmetic
Integer arithmetic
Floating-point arithmetic
CompCert C provides a mechanism to annotate C source code with information for AbsInt’s a3 framework (a3 is an acronym for AbsInt Advanced Analyzer) combining various analyzers modules like aiT, TimingProfiler, StackAnalyzer, or ValueAnalyzer.
The analyzer modules of a3 typically operate on machine code in fully linked executables and usually expect external information for improved analysis precision to be also presented on this rather low level.
With CompCert’s annotation mechanism it is now possible to reliably annotate on C source code level and reason about C variables instead of using code addresses or processor registers. Furthermore it allows to add annotations in the AIS specification language of a3 (refer [4]). These annotations are automatically carried through the compilation chain and the linked executable into a3 without the need of external annotation files.
The annotation mechanism is available for all target architectures with ELF binary format.
Concept CompCert collects all annotations of a compilation unit and stores them in encoded form in a special section named __compcert_ais_annotations in the object file. a3 can automatically extract them from this section and utilize them in analyses.
For CompCert, annotations via __builtin_ais_annot look like a call to an external variadic function similar to printf: The first argument contains the AIS annotation and is also a format string. It can contain format specifiers (section 6.4.2) that are replaced with the AIS representation of the additional arguments to __builtin_ais_annot. In contrast to printf most format specifiers are tagged with numbers and refer to a specific argument independent of the order. It is also possible to refer to an argument more than once.
In the context of optimizations, the builtin is also a robust mechanism to attach annotations at specific code locations. It does not rely on the rather ambiguous line information of the DWARF debug information but rather utilizes the label mechanism of the assembler and linker to generate annotations with actual code addresses.
With the builtin-mechanism it is possible for CompCert to e.g. remove unreachable code together with the contained annotations or do code transformations like reordering code blocks without breaking the annotations. Consider the following C snippet:
static void func(int count)
{
int i;
for (int i = 0; i < count; i++) {
__builtin_ais_annot("try loop %here bound: %e1;", count);
...
}
...
}
If constant propagation can prove that count is always zero, CompCert can remove the whole loop since it will never be executed. In such a situation the annotation will also be removed. In contrast to this, a conventional source code annotation via special formatted C comments (e.g. // ai: …) would remain visible and probably cause problems since a3 collects such annotations by scanning the source code. The same is true, when source code uses the C preprocessor for conditional compilation: CompCert can remove not used annotations while conventional source code annotations will remain visible for a3.
The following paragraphs depict some motivating examples on how to use the annotation mechanism with a3 in, e.g. a software library that will be integrated in an embedded system.
Label definitions AIS labels can be used to create robust references to code locations, e.g. for the location of the different branches in an if-statement:
void funcExecTask(int) { __builtin_ais_annot("label 'funcExecTask_before_cond': %here;"); if (condition) { __builtin_ais_annot("label 'funcExecTask_cond_then': %here;"); ... } else { __builtin_ais_annot("label 'funcExecTask_cond_else': %here;"); ... } ... }
A user of a3 can then easily use the defined label references for his analysis specific annotations:
## User ais annotations for specific analyses# annotations for code within then and else branches try instruction label('funcExecTask_cond_then') ...; try instruction label('funcExecTask_cond_else') ...;
# annotation for the actual branch instruction try instruction label('funcExecTask_before_cond') -> conditional(1) ...;
Excluding critical code snippets Labels can also be used to exclude critical code snippets from analysis. First a label is defined to reference the beginning of a critical code section. Behind the critical code, a second annotation excludes the code snippet between the defined label and the current address (%here) from analysis:
void funcExecTask(int taskNo) { ... // Label before critical code section __builtin_ais_annot("label 'execTask_begin_critical': %here;"); // The critical code, not suitable for analysis ... // End of critical code: cut it away and annotate execution time, etc. // depending on function parameter __builtin_ais_annot("try instruction label('execTask_begin_critical')\
n" " snippet {\
n" " continue at: %here;\
n" " not analyzed;\
n" " takes: %e1 * 42 cycles;\
n" " ...\
n" " }", taskNo); ... }
Loop and recursion annotations A (probably overestimated) annotation can be specified in the source code of a common library routine to ensure that a3 can compute reasonable results without annotation effort or to increase analysis precision at specific code loations. If necessary, a user of a3 can improve this annotation by giving more specific annotations for the actual analysis context. For example:
Specifying a bound for a data dependent loop:
int strncmp_x(char s[], char t[], int len)
{
int i;
for (i = 0; i < len && ...; i++) {
__builtin_ais_annot("try loop %here bound: 0..%e1;", len);
...
}
return 0;
}
Providing unrolling hints for loops for improved precision in a3:
void strcpy_x(char s[], char t[])
{
int i = 0;
while (( s[i] = t[i] ) != '\
0') {
__builtin_ais_annot("try loop %here mapping { default unroll: 50; }");
...
}
}
Specifying a time bound for a busy waiting loop:
void openCanSocket(volatile struct device_t* device)
{
...
// Busy wait for ACK. Assume a worst-case timing of 23 us
while(device->bus_data != 0x00) {
__builtin_ais_annot("try loop %here takes: 23 us;");
}
...
}
Specifying a recursion bound and an incarnation bound for a recursive routine.
void errorHook(unsigned char err_code) { __builtin_ais_annot("try routine %e1 {\
n" " recursion bound: 1;\
n" " incarnation limit: 1;\
n" "}", &errorHook); ... }
Refining values In this example a double value with a known range is converted to an integer and used as an index, e.g. to access a data array. a3 has currently no knowledge of floating point values and needs help to restrict the range of i (and derived from it, a memory access) to a small range:
double func(double x)
{
double data[10] = { ... };
// x is known to be always >= 0.0 and < 10.0
int i = x;
// Refine the value in the location holding variable i
__builtin_ais_annot("try instruction %here { enter with: %l1 = 0..9; };",
i);
return data[i];
}
This will result in the following PowerPC assembly code, where \%here will be replaced by CompCert together with the assembler/linker with the address of label .L118:
; int i = x; fctiwz f13, f1 stfdu f13, -8(r1) lwz r5, 4(r1) addi r1, r1, 8 .L117: ; __builtin_ais_annot("try instruction %here { enter with: %l1 = 0..9; };", ; i); .L118: .L119: ; return data[i]; addi r3, r1, 16 rlwinm r4, r5, 3, 0, 28 ; 0xfffffff8 lfdx f1, r3, r4 .L120: addi r1, r1, 96 .L121: blr ... .section "__compcert_ais_annotations",,n .ascii "# file:test.c line:25 function:func\
ntry instruction " .byte 7,4 .4byte .L118 .ascii " { enter with: reg("r5") = 0..9; };" .ascii "\
n"
The annotation, as extracted by a3 will then look as follows:
# test.c line:25 function:func
instruction 0x10013c { enter with: reg("r5") = 0..9; };
Another possibility for refinement of values and improved analysis precision in a3 is to insert assert annotations about known value ranges of variables or function parameters. a3 utilizes those assertions and may also be able to report violations.
int func(int a, int b, int c) { __builtin_ais_annot("try instruction %here {\
n" " assert always enter with: %e1 in (0..7);\
n" " assert always enter with: %e2 in (0..7);\
n" " assert always enter with: %e3 in (0..7);\
n" "};", a, b, c); ... }
Areas The contents of memory areas can also be specified for a3.
The following example assumes an embedded device that is rather slow when executing code located in ROM, and faster when running code from RAM. It is a common pattern in such a context to dynamically copy (e.g. once at system boot time) tiny hotspot functions from ROM into RAM buffers and execute the code from those buffers instead of calling the actual routine. Without further annotations a3 usually cannot know which code will be executed and hence may report unresolved computed branches in function ISR1_hwcheck.
A source level annotation can specify for a3 which code is copied into RAM buffers:
typedef unsigned char(*CryptFktPtr_t)(unsigned char);
// Assume this routine to be a hotspot and copied to crypt_ram buffer
unsigned char crypt_function (unsigned char input)
{
return (input ^ 0xCAFE);
}
// Code buffer for execution from RAM
volatile char crypt_ram[50];
// Setup routine at boot time
void init_crypt_ram()
{
__builtin_ais_annot("copy area %e1 width %e2 from %e3;",
&crypt_ram,
sizeof(crypt_ram),
&crypt_function);
memcpy((void *)crypt_ram, &crypt_function, 50);
}
void ISR1_hwcheck(void)
{
CryptFktPtr_t f = (CryptFktPtr_t)((char *)crypt_ram);
...
cs = f(s);
}
Memory areas that are used for communication with external devices can be marked accordingly:
volatile char* buffer_in[128]; void init_device() { __builtin_ais_annot("area %e1 width %e2 {\
n" " readable: true;\
n" " writable: false;\
n" " volatile;\
n" "};", &buffer_in, sizeof(buffer_in)); ... }
Format specifiers __builtin_ais_annot supports the following format specifiers that are replaced with information in AIS-Syntax on export:
__builtin_ais_annot("instruction %here -> call(1) { ... };");
a3 will see this as:
# file:... line:... function:...
instruction 0x1000 -> call(1) { ... };
Note, that no debug information is necessary for the %here-replacement. Instead it relies only on the label mechanism of the assembler and linker.
For parameters that cannot be expressed in one of the three forms, CompCert will issue a warning and ignore the annotation. Apart from this restriction, %l and %e produce the same output.
Extraction order When using the __builtin_ais_annot feature, CompCert collects all annotations contained in a compilation unit and stores them in encoded form in a special section of the object file (__compcert_ais_annotations). While creating the final executable, the linker collects all annotation sections from the object files, concatenates them, and stores the result in the executable.
The order in which annotations are exported into the final executable is explicitly undefined. This applies to the extraction within each compilation unit as well as the merging process done by the linker. It is therefore not possible to rely on a specific order in which a3 will see the annotations.
Semantics Semantically, CompCert treats __builtin_ais_annot as a call to an external function. CompCert will not generate actual code for a call to an external function, but the parameters of the builtin will be evaluated, as it is mandatory in C semantics.
If needed a architecture specific nop instructions will be inserted in order to prevent merging of a generated label with a following label. For example if an annotation is directly followed by a do-while and the annotation label would be otherwise merged with the loop head.
If all additional arguments are non-volatile C variables or compile-time constant expressions, it is guaranteed that no additional code will be generated for __builtin_ais_annot. Moreover, the location displayed as a replacement for the %e or %l sequence is guaranteed to be the location where the corresponding variable resides. If one or several additional arguments are complex expressions (neither non-volatile variables nor constant expressions), useless code is generated to compute their values and leave them in temporary registers or stack locations. The location displayed as a replacement for the %e or %l sequence is that of the corresponding temporary.
It should also be noted that using symbols in parameter expressions to __builtin_ais_annot may also have an effect on the liveness of those symbols and hence prevent some optimizations. Furthermore, since __builtin_ais_annot is considered a call to an external function it also acts as a barrier for many optimizations.
In the current implementation, __builtin_ais_annot can only be used at places where C statements are valid, i.e. within a function definition, but not on the toplevel.
Lastly, CompCert has no knowledge about the AIS syntax and does neither a syntactical nor a semantical check on the annotations.
Avoiding side effects Annotations with __builtin_ais_annot should be pure code, i.e. side effects with the application code should be avoided in the parameter expressions of the builtin. This improves the clarity of the application code and also allows compatibility with other compilers, e.g. via conditional compilation.
// Bad style: a builtin-call modifying 'len'
__builtin_ais_annot("loop %here bound: 0..%e1;", ++len);"
Multiline annotations AIS annotations often span multiple lines. The simplest way to reproduce this visually with the builtin is by utilizing the automatic concatenation of string literals in C. Newline characters have to be inserted as an escape sequence \n. For example, consider the annotation:
__builtin_ais_annot("# a comment" " and " "another comment\
n" "annotation line 1\
n" "annotation line 2");
This is equivalent to an AIS file containing:
# a comment and another comment
annotation line 1
annotation line 2
String literals and quoting AIS annotations itself can also contain string literals, e.g. to refer to symbol names. Translating this directly to a string literal for use with __builtin_ais_annot would require quoting of all special characters. It is therefore recommended to use single quoted string literals in AIS annotations where possible. Example:
routine "name" { ... };
Directly translating this would result in:
__builtin_ais_annot("routine\
"name\
" { ... };");
Note the escape characters \
which are necessary to encode "
into
the C string literal. Using single quotes on AIS-level results in a more
readable annotation:
__builtin_ais_annot("routine 'name' { ... };");
There are two levels of quoting: the first one on C-level to encode the AIS content as a string literal. The second one is the quoting on AIS-level to encode special characters in the AIS-syntax.
Separate analyses of tasks Currently a3 can extract either all annotations embedded in an executable or none. Analyses that cover only a portion of the binary code – e.g. when doing a separate analysis for each task of the executable – may therefore issue warnings for annotations of unreachable program points. The try keyword of AIS can be used to suppress such warnings:
__builtin_ais_annot("try instruction 'name' -> computed(1) { ... };");
If name is not reachable from the starting point of a specific analysis, no warning will be generated by a3.
In addition to the a3 specific program annotations described in chapter 6.4 CompCert C provides a more general but lower-level mechanism to attach free-form annotations (text messages possibly mentioning the values of variables) to C program points, and have these annotations transported throughout compilation, all the way to the generated assembly code. Analysis tools can extract the annotation information from comments in the generated assembly code and process them further for their needs.
The general annotation mechanism is presented as a pseudo built-in function called __builtin_annot, taking as arguments a string literal and zero or more local variables. For example, the binary search routine bsearch in the following C code snippet can be annotated with two important pieces of information: First, the while loop executes at most ⌈ log2(100) ⌉ = 7 iterations. Second, the array index m is always between 0 and 99, ensuring that the memory access tbl[m] is always within bounds.
int bsearch(int tbl[100], int v) { int l = 0, u = 99; __builtin_annot("loop bound: 0..7"); while (l <= u) { int m = (l + u) / 2; __builtin_annot("%1 = 0..99", m); if (tbl[m] < v) l = m + 1; else if (tbl[m] > v) u = m - 1; else return m; } return -1; }
After compiling this function using ccomp -Wa,-a -c bsearch.c the generated assembly listing looks as follows:
... 4 bsearch: 5 0000 9421FFF0 stwu r1, -16(r1) 6 0004 7C0802A6 mflr r0 7 0008 90010008 stw r0, 8(r1) 8 000c 39200000 addi r9, r0, 0 9 0010 39400063 addi r10, r0, 99 10 # annotation: loop bound: 0..7 11 .L100: 12 0014 7C095000 cmpw cr0, r9, r10 13 0018 41810040 bt 1, .L101 14 001c 7CE95214 add r7, r9, r10 15 0020 7CE50E70 srawi r5, r7, 1 16 0024 7CA50194 addze r5, r5 17 # annotation: r5 = 0..99 18 0028 54A8103A rlwinm r8, r5, 2, 0, 29 # 0xfffffffc 19 002c 7CC3402E lwzx r6, r3, r8 20 0030 7C062000 cmpw cr0, r6, r4 21 0034 4180001C bt 0, .L102 ...
One can see that CompCert does not generate machine code for the two __builtin_annot source statements. Instead, CompCert produces assembly comments at the exact program points corresponding to those in the source function. These comments consist of the string literal carried by the annotation, where positional parameters %1, %2, etc, are replaced by the locations (processor registers or stack slots) of the corresponding variable arguments.
Generalizing from the example above, we see that __builtin_annot statements offer a flexible mechanism to mark program points and local variables in C source files, then track them all the way to assembly language. Besides helping with static analysis at the machine code level, this mechanism also facilitates manual traceability analysis between C and assembly code.
Reference description CompCert’s annotation mechanism is presented by the following two pseudo built-in functions.
The formal semantics of __builtin_annot is that of a pro forma “print” statement: the compiler assumes that every time a __builtin_annot is executed, the message and the values of the additional arguments are displayed on an hypothetical output device. In other words, an invocation of __builtin_annot is treated as an observable event in the program execution. The compiler, therefore, guarantees that __builtin_annot statements are never removed, duplicated, nor reordered; instead, they always execute at the times prescribed by the semantics of the source program, and in the same order relative to other observable events such as calls to I/O functions or volatile memory accesses.
As described in the motivational example above, the actual effect of a __builtin_annot statement is simply to generate a comment in the assembly code. This comment consists of the message carried by the annotation, where %n sequences are replaced by the machine location containing the value of the n-th additional argument, or by its value if the n-th additional argument is a compile-time constant expression of numerical type.
The location of an argument is either a machine register, an integer or floating-point constant, the name of a global variable, or a stack memory location of the form mem(sp + offset, size) where sp is the stack pointer register, offset a byte offset relative to the value of sp, and size the size in bytes of the argument. For example, on the PowerPC, register locations are R3…R31 and F0…F31, and stack locations are of the form mem(R1 + offset, size).
If all additional arguments are non-volatile C variables or compile-time constant expressions, it is guaranteed that no code (other than the assembly comment) will be generated for __builtin_annot. Moreover, the location displayed as a replacement for the %n sequence is guaranteed to be the location where the corresponding variable resides.
If one or several additional arguments are complex expressions (neither non-volatile variables nor constant expressions), useless code is generated to compute their values and leave them in temporary registers or stack locations. The location displayed as a replacement for the %n sequence is that of the corresponding temporary. This behavior is not particularly useful for static analysis at the machine level, and can generate useless code. It is therefore highly recommended to use only non-volatile variable names or constant expressions as additional parameters to __builtin_annot.
int x = tbl[__builtin_annot_intval("%1 = 0..99", (lo + hi)/ 2)];
The formal semantics of __builtin_annot_intval is also the pro forma effect of displaying the message msg and the value of the integer argument x on a hypothetical output device. In addition, the value of the second argument x is returned unchanged as the result of __builtin_annot_intval.
In the compiled code, __builtin_annot_intval evaluates its argument x to some temporary register, then inserts an assembly comment equal to the text msg where occurrences of %1 are replaced by the name of this register.
Like in GCC and Clang, inline assembly code using the asm statement can be parameterized by C expressions as operands. The actual locations of the operands (registers and memory locations), as determined during compilation, are inserted in the given assembly code fragment.
Warning: Indiscriminate use of asm statements can ruin all the semantic preservation guarantees of CompCert. For this reason, support for asm statements is turned off by default and must be activated through the -finline-asm or -fall options. For the generated code to behave properly, it is the responsibility of the programmer to list (in the asm statement) the registers and memory that are modified by the assembly code, and to avoid modifying memory that is private to the compiled code, such as return addresses stored in the stack.
Examples Here is how to use the PowerPC mulhw instruction to compute the high 32 bits of the 64-bit integer product x * y:
int prod; asm ("mulhw %0,%1,%2" : "=r"(prod) : "r"(x), "r"(y));
The two arguments x and y are evaluated into registers, which get substituted for %1 and %2 (respectively) in the assembly template. Likewise, %0 is substituted by the register associated with variable prod. The three r constraints indicate that all three operands are expected in registers. The = constraint in =r means that the operand is an output.
To designate operands, symbolic names can be used instead of operand numbers. Using named operands, the mulhw example above can be written as:
asm ("mulhw %[res],%[arg1],%[arg2]" : [res]"=r"(prod) : [arg1]"r"(x), [arg2]"r"(y));
Sometimes, inline assembly code has no result value, but modifies memory. An example is the dcba instruction of PowerPC, which allocates a cache block without reading its contents from main memory:
asm volatile ("dcba 0, %[addr]" : : [addr]"r"(p) : "memory");
Note the absence of output operand, the volatile modifier indicating that the assembly code has side effects, and the "memory" annotation indicating that the assembly code modifies memory in ways that are not predictable by the compiler. CompCert treats all asm statements as volatile and clobbering memory, but other compilers need these annotations to produce correct code.
Some instructions operate over memory locations instead of registers. In this case, the m constraint should be used instead of r. For example, the x86 instruction fnstsw stores the FP control word in a memory location:
unsigned short cw; asm ("fnstsw %0" : "=m" (cw));
Note that this asm statement has no inputs, hence the second colon can be omitted.
Other instructions demand arguments that are integer constants, instead of registers or memory locations. In this case, the i constraint should be used, and the corresponding argument must be a compile-time constant expression. For example, here is how to use the PowerPC mfspr and mtspr instructions to read and write special registers:
#define read_spr(regno,res) \ asm ("mfspr %0,%1" : "=r"(res) : "i"(regno)) #define write_spr(regno,val) \ asm volatile ("mtspr %0,%1" : : "i"(regno), "r"(val))
We already used the "memory" annotation telling the compiler that the assembly code “clobbers” (modifies unpredictably) the memory state. Likewise, if the assembly code modifies registers other than that of the output operand, the names of those registers must be given in the clobber list. For example, here is an implementation of atomic test-and-set using the x86 locked-exchange instruction:
int testandset(int * p) { // store 1 in *p and return the previous value of *p int res; asm volatile ("movl $1, %%eax\n\t" "xchgl (%[addr]), %%eax\n\t" "movl %%eax, %[oldval]" : [oldval]"=r"(res) : [addr]"r"(p) : "eax", "memory"); return res; }
Note that the assembly template can contain several instructions, using \n\t in the template to separate the instructions. Also note the use of %% in the assembly template to stand for a single % character in the generated assembly code.
Since the template uses eax as a temporary register, we must list register eax as clobbered. Besides informing the compiler that the previous contents of eax are destroyed, this clobber annotation also ensures that none of the asm operands (res and p) are allocated to eax.
On 32-bit target platforms, register operands of type long long or unsigned long long (64-bit integers) need special handling, since CompCert allocates them into pairs of 32-bit registers. The assembly template must use %R to refer to the most significant half of the register pair, and %Q to refer to the least significant half. For example, here is how to use the x86 rdtsc instruction to read the time stamp counter as an unsigned 64-bit integer:
unsigned long long rdtsc(void) { unsigned long long res; asm("rdtsc\n\t" "movl %%eax, %Q0\n\t" "movl %%edx, %R0\n\t" : "=r" (res) : : "eax", "edx"); return res; }
Reference description The syntax of extended inline assembly is as follows:
|
An asm statement carries an assembly template (a string literal) and up to 3 lists of operands separated by colons: a list of zero or one output expressions (results produced by the assembly code); a list of zero, one or several input expressions (arguments used by the assembly code); and a list of zero, one or several resources (e.g. processor registers) that are “clobbered” by the assembly code.
The assembly template is a string literal possibly containing placeholders marked with a % (percent) character: either numbered placeholders (%0, %1, etc) or named placeholders (%[name]). During compilation, the placeholders are replaced by the locations of the corresponding operands, then the resulting text is given to the assembler as is. A %% sequence in the template is translated to a single % character in the text. Operands are numbered consecutively starting with %0 for the first output operand (if any) and continuing with the input operands. Instead of numbers, the template can also refer to operands by their optional names, using the %[name] notation.
The assembly outputs and the inputs are comma-separated, possibly empty lists of C expressions preceded by an optional name between brackets and a mandatory constraint (a string literal). CompCert can handle zero or one output; two outputs or more cause a compile-time error. For inputs, the following constraints are supported:
Input constraint | Meaning |
"r" | Register. The expression is evaluated into a processor register, whose name is inserted in the assembly template. |
"m" | Memory location. The expression is evaluated into a memory location, whose address is inserted (as a valid processor addressing mode) in the assembly template. |
"i" | Integer immediate. The expression must be a compile-time constant. Its value is inserted in the assembly template as a decimal literal. |
For outputs, the C expressions must be l-values, and the following constraints are supported:
Output constraint | Meaning |
"=r" | Register. This is either the register allocated to the output expression (if it is a local variable allocated to a register), or a temporary register chosen by CompCert, whose value is assigned to the expression just after the assembly code. |
"=m" | Memory location. This is the memory location of the output expression. Its address is inserted (as a valid processor addressing mode) in the assembly template. |
The fourth, optional component of an asm statement is a comma-separated list of resources that are “clobbered” by the assembly code fragment, i.e. set to unpredictable values. The resources, given as string literals, are either processor register names or the special resources "memory" and "cc", denoting unpredictable changes to the memory and the processor’s condition codes, respectively. CompCert always assumes that inline assembly can modify memory and condition codes in unpredictable ways, even if the "memory" and "cc" clobbers are not specified. The register names are case-insensitive and depend on the target processor, as follows:
Processor | Register names |
ARM | integer registers: "r0", "r1", …, "r12", "r14" |
VFP registers: "f0", "f1", …, "f15" | |
PowerPC | integer registers: "r0", "r3", …, "r12", "r14", …, "r31" |
FP registers: "f0", "f1", "f2", …, "f31" | |
x86 32 bits | integer registers: "eax", "ebx", "ecx", "edx", "esi", "edi", "ebp" |
XMM registers: "xmm0", "xmm1", …, "xmm7" | |
x86 64 bits | integer registers: "rax", "rbx", "rcx", "rdx", "rsi", "rdi", "rbp", |
integer registers: "r8", …, "r15" | |
XMM registers: "xmm0", "xmm1", …, "xmm15" | |
RISC-V | integer registers: "x5", "x6", …, "x29", "x30" |
FP registers: "f0", "f1", …, "f30", "f31" | |
AArch64 | integer registers: "x0", "x1", …, "x30" |
FP registers: "d0", "d1", …, "d31" |
Registers not listed above are reserved for use by the compiler and must not be modified by inline assembly code. For example, the stack pointer register (r13 for ARM, r1 for PowerPC, esp/rsp for x86, x2 for RISC-V) must be preserved.