The whole compiler and its proof of semantic preservation
Libraries.
Require Import String.
Require Import Coqlib Errors.
Require Import AST Linking Smallstep.
Languages (syntax and semantics).
Require Ctypes Csyntax Csem Cstrategy Cexec.
Require Clight.
Require Csharpminor.
Require Cminor.
Require CminorSel.
Require RTL.
Require LTL.
Require Linear.
Require Mach.
Require Asm.
Translation passes.
Require Initializers.
Require SimplExpr.
Require SimplLocals.
Require Cshmgen.
Require Cminorgen.
Require Selection.
Require RTLgen.
Require Tailcall.
Require Inlining.
Require Renumber.
Require Constprop.
Require CSE.
Require Deadcode.
Require Unusedglob.
Require Allocation.
Require Tunneling.
Require Linearize.
Require CleanupLabels.
Require Debugvar.
Require Stacking.
Require Asmgen.
Proofs of semantic preservation.
Require SimplExprproof.
Require SimplLocalsproof.
Require Cshmgenproof.
Require Cminorgenproof.
Require Selectionproof.
Require RTLgenproof.
Require Tailcallproof.
Require Inliningproof.
Require Renumberproof.
Require Constpropproof.
Require CSEproof.
Require Deadcodeproof.
Require Unusedglobproof.
Require Allocproof.
Require Tunnelingproof.
Require Linearizeproof.
Require CleanupLabelsproof.
Require Debugvarproof.
Require Stackingproof.
Require Asmgenproof.
Command-line flags.
Require Import Compopts.
Pretty-printers (defined in Caml).
Parameter print_Clight:
Clight.program ->
unit.
Parameter print_Cminor:
Cminor.program ->
unit.
Parameter print_RTL:
Z ->
RTL.program ->
unit.
Parameter print_LTL:
LTL.program ->
unit.
Parameter print_Mach:
Mach.program ->
unit.
Local Open Scope string_scope.
Composing the translation passes
We first define useful monadic composition operators,
along with funny (but convenient) notations.
Definition apply_total (
A B:
Type) (
x:
res A) (
f:
A ->
B) :
res B :=
match x with Error msg =>
Error msg |
OK x1 =>
OK (
f x1)
end.
Definition apply_partial (
A B:
Type)
(
x:
res A) (
f:
A ->
res B) :
res B :=
match x with Error msg =>
Error msg |
OK x1 =>
f x1 end.
Notation "a @@@ b" :=
(
apply_partial _ _
a b) (
at level 50,
left associativity).
Notation "a @@ b" :=
(
apply_total _ _
a b) (
at level 50,
left associativity).
Definition print {
A:
Type} (
printer:
A ->
unit) (
prog:
A) :
A :=
let unused :=
printer prog in prog.
Definition time {
A B:
Type} (
name:
string) (
f:
A ->
B) :
A ->
B :=
f.
Definition total_if {
A:
Type}
(
flag:
unit ->
bool) (
f:
A ->
A) (
prog:
A) :
A :=
if flag tt then f prog else prog.
Definition partial_if {
A:
Type}
(
flag:
unit ->
bool) (
f:
A ->
res A) (
prog:
A) :
res A :=
if flag tt then f prog else OK prog.
We define three translation functions for whole programs: one
starting with a C program, one with a Cminor program, one with an
RTL program. The three translations produce Asm programs ready for
pretty-printing and assembling.
Definition transf_rtl_program (
f:
RTL.program) :
res Asm.program :=
OK f
@@
print (
print_RTL 0)
@@
total_if Compopts.optim_tailcalls (
time "Tail calls" Tailcall.transf_program)
@@
print (
print_RTL 1)
@@@
time "Inlining" Inlining.transf_program
@@
print (
print_RTL 2)
@@
time "Renumbering" Renumber.transf_program
@@
print (
print_RTL 3)
@@
total_if Compopts.optim_constprop (
time "Constant propagation" Constprop.transf_program)
@@
print (
print_RTL 4)
@@
total_if Compopts.optim_constprop (
time "Renumbering" Renumber.transf_program)
@@
print (
print_RTL 5)
@@@
partial_if Compopts.optim_CSE (
time "CSE" CSE.transf_program)
@@
print (
print_RTL 6)
@@@
partial_if Compopts.optim_redundancy (
time "Redundancy elimination" Deadcode.transf_program)
@@
print (
print_RTL 7)
@@@
time "Unused globals" Unusedglob.transform_program
@@
print (
print_RTL 8)
@@@
time "Register allocation" Allocation.transf_program
@@
print print_LTL
@@
time "Branch tunneling" Tunneling.tunnel_program
@@@
time "CFG linearization" Linearize.transf_program
@@
time "Label cleanup" CleanupLabels.transf_program
@@@
partial_if Compopts.debug (
time "Debugging info for local variables" Debugvar.transf_program)
@@@
time "Mach generation" Stacking.transf_program
@@
print print_Mach
@@@
time "Asm generation" Asmgen.transf_program.
Definition transf_cminor_program (
p:
Cminor.program) :
res Asm.program :=
OK p
@@
print print_Cminor
@@@
time "Instruction selection" Selection.sel_program
@@@
time "RTL generation" RTLgen.transl_program
@@@
transf_rtl_program.
Definition transf_clight_program (
p:
Clight.program) :
res Asm.program :=
OK p
@@
print print_Clight
@@@
time "Simplification of locals" SimplLocals.transf_program
@@@
time "C#minor generation" Cshmgen.transl_program
@@@
time "Cminor generation" Cminorgen.transl_program
@@@
transf_cminor_program.
Definition transf_c_program (
p:
Csyntax.program) :
res Asm.program :=
OK p
@@@
time "Clight generation" SimplExpr.transl_program
@@@
transf_clight_program.
Force Initializers and Cexec to be extracted as well.
Definition transl_init :=
Initializers.transl_init.
Definition cexec_do_step :=
Cexec.do_step.
The following lemmas help reason over compositions of passes.
Lemma print_identity:
forall (
A:
Type) (
printer:
A ->
unit) (
prog:
A),
print printer prog =
prog.
Proof.
intros;
unfold print.
destruct (
printer prog);
auto.
Qed.
Lemma compose_print_identity:
forall (
A:
Type) (
x:
res A) (
f:
A ->
unit),
x @@
print f =
x.
Proof.
Relational specification of compilation
Definition match_if {
A:
Type} (
flag:
unit ->
bool) (
R:
A ->
A ->
Prop):
A ->
A ->
Prop :=
if flag tt then R else eq.
Lemma total_if_match:
forall (
A:
Type) (
flag:
unit ->
bool) (
f:
A ->
A) (
rel:
A ->
A ->
Prop) (
prog:
A),
(
forall p,
rel p (
f p)) ->
match_if flag rel prog (
total_if flag f prog).
Proof.
Lemma partial_if_match:
forall (
A:
Type) (
flag:
unit ->
bool) (
f:
A ->
res A) (
rel:
A ->
A ->
Prop) (
prog tprog:
A),
(
forall p tp,
f p =
OK tp ->
rel p tp) ->
partial_if flag f prog =
OK tprog ->
match_if flag rel prog tprog.
Proof.
Global Instance TransfIfLink {
A:
Type} {
LA:
Linker A}
(
flag:
unit ->
bool) (
transf:
A ->
A ->
Prop) (
TL:
TransfLink transf)
:
TransfLink (
match_if flag transf).
Proof.
unfold match_if.
destruct (
flag tt).
-
auto.
-
red;
intros.
subst tp1 tp2.
exists p;
auto.
Qed.
This is the list of compilation passes of CompCert in relational style.
Each pass is characterized by a match_prog relation between its
input code and its output code. The mkpass and ::: combinators,
defined in module Linking, ensure that the passes are composable
(the output language of a pass is the input language of the next pass)
and that they commute with linking (property TransfLink, inferred
by the type class mechanism of Coq).
Local Open Scope linking_scope.
Definition CompCert's_passes :=
mkpass SimplExprproof.match_prog
:::
mkpass SimplLocalsproof.match_prog
:::
mkpass Cshmgenproof.match_prog
:::
mkpass Cminorgenproof.match_prog
:::
mkpass Selectionproof.match_prog
:::
mkpass RTLgenproof.match_prog
:::
mkpass (
match_if Compopts.optim_tailcalls Tailcallproof.match_prog)
:::
mkpass Inliningproof.match_prog
:::
mkpass Renumberproof.match_prog
:::
mkpass (
match_if Compopts.optim_constprop Constpropproof.match_prog)
:::
mkpass (
match_if Compopts.optim_constprop Renumberproof.match_prog)
:::
mkpass (
match_if Compopts.optim_CSE CSEproof.match_prog)
:::
mkpass (
match_if Compopts.optim_redundancy Deadcodeproof.match_prog)
:::
mkpass Unusedglobproof.match_prog
:::
mkpass Allocproof.match_prog
:::
mkpass Tunnelingproof.match_prog
:::
mkpass Linearizeproof.match_prog
:::
mkpass CleanupLabelsproof.match_prog
:::
mkpass (
match_if Compopts.debug Debugvarproof.match_prog)
:::
mkpass Stackingproof.match_prog
:::
mkpass Asmgenproof.match_prog
:::
pass_nil _.
Composing the match_prog relations above, we obtain the relation
between CompCert C sources and Asm code that characterize CompCert's
compilation.
Definition match_prog:
Csyntax.program ->
Asm.program ->
Prop :=
pass_match (
compose_passes CompCert's_passes).
The transf_c_program function, when successful, produces
assembly code that is in the match_prog relation with the source C program.
Theorem transf_c_program_match:
forall p tp,
transf_c_program p =
OK tp ->
match_prog p tp.
Proof.
Semantic preservation
We now prove that the whole CompCert compiler (as characterized by the
match_prog relation) preserves semantics by constructing
the following simulations:
-
Forward simulations from Cstrategy to Asm
(composition of the forward simulations for each pass).
-
Backward simulations for the same languages
(derived from the forward simulation, using receptiveness of the source
language and determinacy of Asm).
-
Backward simulation from Csem to Asm
(composition of two backward simulations).
Remark forward_simulation_identity:
forall sem,
forward_simulation sem sem.
Proof.
intros.
apply forward_simulation_step with (
fun s1 s2 =>
s2 =
s1);
intros.
-
auto.
-
exists s1;
auto.
-
subst s2;
auto.
-
subst s2.
exists s1';
auto.
Qed.
Lemma match_if_simulation:
forall (
A:
Type) (
sem:
A ->
semantics) (
flag:
unit ->
bool) (
transf:
A ->
A ->
Prop) (
prog tprog:
A),
match_if flag transf prog tprog ->
(
forall p tp,
transf p tp ->
forward_simulation (
sem p) (
sem tp)) ->
forward_simulation (
sem prog) (
sem tprog).
Proof.
Theorem cstrategy_semantic_preservation:
forall p tp,
match_prog p tp ->
forward_simulation (
Cstrategy.semantics p) (
Asm.semantics tp)
/\
backward_simulation (
atomic (
Cstrategy.semantics p)) (
Asm.semantics tp).
Proof.
Theorem c_semantic_preservation:
forall p tp,
match_prog p tp ->
backward_simulation (
Csem.semantics p) (
Asm.semantics tp).
Proof.
Correctness of the CompCert compiler
Combining the results above, we obtain semantic preservation for two
usage scenarios of CompCert: compilation of a single monolithic program,
and separate compilation of multiple source files followed by linking.
In the monolithic case, we have a whole C program p that is
compiled in one run of CompCert to a whole Asm program tp.
Then, tp preserves the semantics of p, in the sense that there
exists a backward simulation of the dynamic semantics of p
by the dynamic semantics of tp.
Theorem transf_c_program_correct:
forall p tp,
transf_c_program p =
OK tp ->
backward_simulation (
Csem.semantics p) (
Asm.semantics tp).
Proof.
Here is the separate compilation case. Consider a nonempty list c_units
of C source files (compilation units), C1 ,,, Cn. Assume that every
C compilation unit Ci is successfully compiled by CompCert, obtaining
an Asm compilation unit Ai. Let asm_unit be the nonempty list
A1 ... An. Further assume that the C units C1 ... Cn can be linked
together to produce a whole C program c_program. Then, the generated
Asm units can be linked together, producing a whole Asm program
asm_program. Moreover, asm_program preserves the semantics of
c_program, in the sense that there exists a backward simulation of
the dynamic semantics of asm_program by the dynamic semantics of c_program.
Theorem separate_transf_c_program_correct:
forall c_units asm_units c_program,
nlist_forall2 (
fun cu tcu =>
transf_c_program cu =
OK tcu)
c_units asm_units ->
link_list c_units =
Some c_program ->
exists asm_program,
link_list asm_units =
Some asm_program
/\
backward_simulation (
Csem.semantics c_program) (
Asm.semantics asm_program).
Proof.