Library Parmov
Translation of parallel moves into sequences of individual moves
The ``parallel move'' problem, also known as ``parallel assignment'',
is the following. We are given a list of (source, destination) pairs
of locations. The goal is to find a sequence of elementary
moves (
The parallel move problem is trivial if the sources and destinations do not overlap. For instance,
However, arbitrary overlap is allowed between sources and destinations. This requires some care in ordering the individual moves, as in
Worse, cycles (permutations) can require the use of temporaries, as in
An amazing fact is that for any parallel move problem, at most one temporary (or in our case one integer temporary and one float temporary) is needed.
The development in this section was designed by Laurence Rideau and Bernard Serpette. It is described in the paper ``Tilting at windmills with Coq: Formal verification of a compilation algorithm for parallel moves'', http://hal.inria.fr/inria-00176007
loc <- loc
assignments) such that, at the end of the sequence,
location dst
contains the value of location src
at the beginning of
the sequence, for each (src
, dst
) pairs in the given problem.
Moreover, other locations should keep their values, except one register
of each type, which can be used as temporaries in the generated sequences.
The parallel move problem is trivial if the sources and destinations do not overlap. For instance,
(R1, R2) <- (R3, R4) becomes R1 <- R3; R2 <- R4
However, arbitrary overlap is allowed between sources and destinations. This requires some care in ordering the individual moves, as in
(R1, R2) <- (R3, R1) becomes R2 <- R1; R1 <- R3
Worse, cycles (permutations) can require the use of temporaries, as in
(R1, R2, R3) <- (R2, R3, R1) becomes T <- R1; R1 <- R2; R2 <- R3; R3 <- T;
An amazing fact is that for any parallel move problem, at most one temporary (or in our case one integer temporary and one float temporary) is needed.
The development in this section was designed by Laurence Rideau and Bernard Serpette. It is described in the paper ``Tilting at windmills with Coq: Formal verification of a compilation algorithm for parallel moves'', http://hal.inria.fr/inria-00176007
Require Import Relations.
Require Import Coqlib.
Require Recdef.
Section PARMOV.
The development is parameterized by the type of registers,
equipped with a decidable equality.
Variable reg: Type.
Variable reg_eq : forall (r1 r2: reg), {r1=r2} + {r1<>r2}.
The
temp
function maps every register r
to the register that
should be used to save the value of r
temporarily while performing
a cyclic assignment involving r
. In the simplest case, there is
only one such temporary register rtemp
and temp
is the constant
function fun r => rtemp
. In more realistic cases, different temporary
registers can be used to hold different values. In the case of Compcert,
we have two temporary registers, one for integer values and the other
for floating-point values, and temp
returns the appropriate temporary
depending on the type of its argument.
Variable temp: reg -> reg.
Definition is_temp (r: reg): Prop := exists s, r = temp s.
A set of moves is a list of pairs of registers, of the form
(source, destination).
Definition moves := (list (reg * reg))%type.
Definition srcs (m: moves) := List.map (@fst reg reg) m.
Definition dests (m: moves) := List.map (@snd reg reg) m.
The dynamic semantics of moves is given in terms of environments.
An environment is a mapping of registers to their current values.
Variable val: Type.
Definition env := reg -> val.
Lemma env_ext:
forall (e1 e2: env),
(forall r, e1 r = e2 r) -> e1 = e2.
Proof (extensionality reg val).
The main operation over environments is update: it assigns
a value
v
to a register r
and preserves the values of other
registers.
Definition update (r: reg) (v: val) (e: env) : env :=
fun r' => if reg_eq r' r then v else e r'.
Lemma update_s:
forall r v e, update r v e r = v.
Lemma update_o:
forall r v e r', r' <> r -> update r v e r' = e r'.
Lemma update_ident:
forall r e, update r (e r) e = e.
Lemma update_commut:
forall r1 v1 r2 v2 e,
r1 <> r2 ->
update r1 v1 (update r2 v2 e) = update r2 v2 (update r1 v1 e).
Lemma update_twice:
forall r v e,
update r v (update r v e) = update r v e.
A list of moves
(src1, dst1), ..., (srcN, dstN)
can be given
three different semantics, as environment transformers.
The first semantics corresponds to a parallel assignment:
dst1, ..., dstN
are set to the initial values of src1, ..., srcN
.
Fixpoint exec_par (m: moves) (e: env) {struct m}: env :=
match m with
| nil => e
| (s, d) :: m' => update d (e s) (exec_par m' e)
end.
The second semantics corresponds to a sequence of individual
assignments: first,
dst1
is set to the initial value of src1
;
then, dst2
is set to the current value of src2
after the first
assignment; etc.
Fixpoint exec_seq (m: moves) (e: env) {struct m}: env :=
match m with
| nil => e
| (s, d) :: m' => exec_seq m' (update d (e s) e)
end.
The third semantics is also sequential, but executes the moves
in reverse order, starting with
srcN, dstN
and finishing with
src1, dst1
.
Fixpoint exec_seq_rev (m: moves) (e: env) {struct m}: env :=
match m with
| nil => e
| (s, d) :: m' =>
let e' := exec_seq_rev m' e in
update d (e' s) e'
end.
Rideau and Serpette's parallel move compilation algorithm is first
specified as a non-deterministic set of rewriting rules operating
over triples
(mu, sigma, tau)
of moves. We define these rules
as an inductive predicate transition
relating triples of moves,
and its reflexive transitive closure transitions
.
Inductive state: Type :=
State (mu: moves) (sigma: moves) (tau: moves) : state.
Definition no_read (mu: moves) (d: reg) : Prop :=
~In d (srcs mu).
Inductive transition: state -> state -> Prop :=
| tr_nop: forall mu1 r mu2 sigma tau,
transition (State (mu1 ++ (r, r) :: mu2) sigma tau)
(State (mu1 ++ mu2) sigma tau)
| tr_start: forall mu1 s d mu2 tau,
transition (State (mu1 ++ (s, d) :: mu2) nil tau)
(State (mu1 ++ mu2) ((s, d) :: nil) tau)
| tr_push: forall mu1 d r mu2 s sigma tau,
transition (State (mu1 ++ (d, r) :: mu2) ((s, d) :: sigma) tau)
(State (mu1 ++ mu2) ((d, r) :: (s, d) :: sigma) tau)
| tr_loop: forall mu sigma s d tau,
transition (State mu (sigma ++ (s, d) :: nil) tau)
(State mu (sigma ++ (temp s, d) :: nil) ((s, temp s) :: tau))
| tr_pop: forall mu s0 d0 s1 d1 sigma tau,
no_read mu d1 -> d1 <> s0 ->
transition (State mu ((s1, d1) :: sigma ++ (s0, d0) :: nil) tau)
(State mu (sigma ++ (s0, d0) :: nil) ((s1, d1) :: tau))
| tr_last: forall mu s d tau,
no_read mu d ->
transition (State mu ((s, d) :: nil) tau)
(State mu nil ((s, d) :: tau)).
Definition transitions: state -> state -> Prop :=
clos_refl_trans state transition.
A state
mu, sigma, tau
is well-formed if the following properties hold:
-
mu
concatenated withsigma
is a ``mill'', that is, no destination appears twice (predicateis_mill
below). - No temporary register appears in
mu
(predicatemove_no_temp
). - No temporary register appears in
sigma
except perhaps as the source of the last move insigma
(predicatetemp_last
). -
sigma
is a ``path'', that is, the source of a move is the destination of the following move.
Definition is_mill (m: moves) : Prop :=
list_norepet (dests m).
Definition is_not_temp (r: reg) : Prop :=
forall d, r <> temp d.
Definition move_no_temp (m: moves) : Prop :=
forall s d, In (s, d) m -> is_not_temp s /\ is_not_temp d.
Definition temp_last (m: moves) : Prop :=
match List.rev m with
| nil => True
| (s, d) :: m' => is_not_temp d /\ move_no_temp m'
end.
Definition is_first_dest (m: moves) (d: reg) : Prop :=
match m with
| nil => True
| (s0, d0) :: _ => d = d0
end.
Inductive is_path: moves -> Prop :=
| is_path_nil:
is_path nil
| is_path_cons: forall s d m,
is_first_dest m s ->
is_path m ->
is_path ((s, d) :: m).
Inductive state_wf: state -> Prop :=
| state_wf_intro:
forall mu sigma tau,
is_mill (mu ++ sigma) ->
move_no_temp mu ->
temp_last sigma ->
is_path sigma ->
state_wf (State mu sigma tau).
Some trivial properties of srcs and dests.
Lemma dests_append:
forall m1 m2, dests (m1 ++ m2) = dests m1 ++ dests m2.
Lemma dests_decomp:
forall m1 s d m2, dests (m1 ++ (s, d) :: m2) = dests m1 ++ d :: dests m2.
Lemma srcs_append:
forall m1 m2, srcs (m1 ++ m2) = srcs m1 ++ srcs m2.
Lemma srcs_decomp:
forall m1 s d m2, srcs (m1 ++ (s, d) :: m2) = srcs m1 ++ s :: srcs m2.
Lemma srcs_dests_combine:
forall s d,
List.length s = List.length d ->
srcs (List.combine s d) = s /\
dests (List.combine s d) = d.
Some properties of
is_mill
and dests_disjoint
.
Definition dests_disjoint (m1 m2: moves) : Prop :=
list_disjoint (dests m1) (dests m2).
Lemma dests_disjoint_sym:
forall m1 m2,
dests_disjoint m1 m2 <-> dests_disjoint m2 m1.
Lemma dests_disjoint_cons_left:
forall m1 s d m2,
dests_disjoint ((s, d) :: m1) m2 <->
dests_disjoint m1 m2 /\ ~In d (dests m2).
Lemma dests_disjoint_cons_right:
forall m1 s d m2,
dests_disjoint m1 ((s, d) :: m2) <->
dests_disjoint m1 m2 /\ ~In d (dests m1).
Lemma dests_disjoint_append_left:
forall m1 m2 m3,
dests_disjoint (m1 ++ m2) m3 <->
dests_disjoint m1 m3 /\ dests_disjoint m2 m3.
Lemma dests_disjoint_append_right:
forall m1 m2 m3,
dests_disjoint m1 (m2 ++ m3) <->
dests_disjoint m1 m2 /\ dests_disjoint m1 m3.
Lemma is_mill_cons:
forall s d m,
is_mill ((s, d) :: m) <->
is_mill m /\ ~In d (dests m).
Lemma is_mill_append:
forall m1 m2,
is_mill (m1 ++ m2) <->
is_mill m1 /\ is_mill m2 /\ dests_disjoint m1 m2.
Some properties of
move_no_temp
.
Lemma move_no_temp_append:
forall m1 m2,
move_no_temp m1 -> move_no_temp m2 -> move_no_temp (m1 ++ m2).
Lemma move_no_temp_rev:
forall m, move_no_temp (List.rev m) -> move_no_temp m.
Some properties of
temp_last
.
Lemma temp_last_change_last_source:
forall s d s' sigma,
temp_last (sigma ++ (s, d) :: nil) ->
temp_last (sigma ++ (s', d) :: nil).
Lemma temp_last_push:
forall s1 d1 s2 d2 sigma,
temp_last ((s2, d2) :: sigma) ->
is_not_temp s1 -> is_not_temp d1 ->
temp_last ((s1, d1) :: (s2, d2) :: sigma).
Lemma temp_last_pop:
forall s1 d1 sigma s2 d2,
temp_last ((s1, d1) :: sigma ++ (s2, d2) :: nil) ->
temp_last (sigma ++ (s2, d2) :: nil).
Some properties of
is_path
.
Lemma is_path_pop:
forall s d m,
is_path ((s, d) :: m) -> is_path m.
Lemma is_path_change_last_source:
forall s s' d sigma,
is_path (sigma ++ (s, d) :: nil) ->
is_path (sigma ++ (s', d) :: nil).
Lemma path_sources_dests:
forall s0 d0 sigma,
is_path (sigma ++ (s0, d0) :: nil) ->
List.incl (srcs (sigma ++ (s0, d0) :: nil))
(s0 :: dests (sigma ++ (s0, d0) :: nil)).
Lemma no_read_path:
forall d1 sigma s0 d0,
d1 <> s0 ->
is_path (sigma ++ (s0, d0) :: nil) ->
~In d1 (dests (sigma ++ (s0, d0) :: nil)) ->
no_read (sigma ++ (s0, d0) :: nil) d1.
To benefit from some proof automation, we populate a rewrite database
with some of the properties above.
Lemma notin_dests_cons:
forall x s d m,
~In x (dests ((s, d) :: m)) <-> x <> d /\ ~In x (dests m).
Lemma notin_dests_append:
forall d m1 m2,
~In d (dests (m1 ++ m2)) <-> ~In d (dests m1) /\ ~In d (dests m2).
Hint Rewrite is_mill_cons is_mill_append
dests_disjoint_cons_left dests_disjoint_cons_right
dests_disjoint_append_left dests_disjoint_append_right
notin_dests_cons notin_dests_append: pmov.
Lemma transition_preserves_wf:
forall st st',
transition st st' -> state_wf st -> state_wf st'.
Lemma transitions_preserve_wf:
forall st st', transitions st st' -> state_wf st -> state_wf st'.
We define the semantics of states as follows.
For a triple
mu, sigma, tau
, we perform the moves tau
in
reverse sequential order, then the moves mu ++ sigma
in parallel.
Definition statemove (st: state) (e: env) :=
match st with
| State mu sigma tau => exec_par (mu ++ sigma) (exec_seq_rev tau e)
end.
In preparation to showing that transitions preserve semantics,
we prove various properties of the parallel and sequential interpretations
of moves.
Lemma exec_par_outside:
forall m e r, ~In r (dests m) -> exec_par m e r = e r.
Lemma exec_par_lift:
forall m1 s d m2 e,
~In d (dests m1) ->
exec_par (m1 ++ (s, d) :: m2) e = exec_par ((s, d) :: m1 ++ m2) e.
Lemma exec_par_ident:
forall m1 r m2 e,
is_mill (m1 ++ (r, r) :: m2) ->
exec_par (m1 ++ (r, r) :: m2) e = exec_par (m1 ++ m2) e.
Lemma exec_seq_app:
forall m1 m2 e,
exec_seq (m1 ++ m2) e = exec_seq m2 (exec_seq m1 e).
Lemma exec_seq_rev_app:
forall m1 m2 e,
exec_seq_rev (m1 ++ m2) e = exec_seq_rev m1 (exec_seq_rev m2 e).
Lemma exec_seq_exec_seq_rev:
forall m e,
exec_seq_rev m e = exec_seq (List.rev m) e.
Lemma exec_seq_rev_exec_seq:
forall m e,
exec_seq m e = exec_seq_rev (List.rev m) e.
Lemma exec_par_update_no_read:
forall s d m e,
no_read m d ->
~In d (dests m) ->
exec_par m (update d (e s) e) =
update d (e s) (exec_par m e).
Lemma exec_par_append_eq:
forall m1 m2 m3 e2 e3,
exec_par m2 e2 = exec_par m3 e3 ->
(forall r, In r (srcs m1) -> e2 r = e3 r) ->
exec_par (m1 ++ m2) e2 = exec_par (m1 ++ m3) e3.
Lemma exec_par_combine:
forall e sl dl,
List.length sl = List.length dl ->
list_norepet dl ->
let e' := exec_par (combine sl dl) e in
List.map e' dl = List.map e sl /\
(forall l, ~In l dl -> e' l = e l).
env_equiv
is an equivalence relation between environments
capturing the fact that two environments assign the same values to
non-temporary registers, but can disagree on the values of temporary
registers.
Definition env_equiv (e1 e2: env) : Prop :=
forall r, is_not_temp r -> e1 r = e2 r.
Lemma env_equiv_refl:
forall e, env_equiv e e.
Lemma env_equiv_refl':
forall e1 e2, e1 = e2 -> env_equiv e1 e2.
Lemma env_equiv_sym:
forall e1 e2, env_equiv e1 e2 -> env_equiv e2 e1.
Lemma env_equiv_trans:
forall e1 e2 e3, env_equiv e1 e2 -> env_equiv e2 e3 -> env_equiv e1 e3.
Lemma exec_par_env_equiv:
forall m e1 e2,
move_no_temp m ->
env_equiv e1 e2 ->
env_equiv (exec_par m e1) (exec_par m e2).
The proof that transitions preserve semantics (up to the values of
temporary registers) follows.
Lemma transition_preserves_semantics:
forall st st' e,
transition st st' -> state_wf st ->
env_equiv (statemove st' e) (statemove st e).
Lemma transitions_preserve_semantics:
forall st st' e,
transitions st st' -> state_wf st ->
env_equiv (statemove st' e) (statemove st e).
Lemma state_wf_start:
forall mu,
move_no_temp mu ->
is_mill mu ->
state_wf (State mu nil nil).
The main correctness result in this section is the following:
if we can transition repeatedly from an initial state
mu, nil, nil
to a final state nil, nil, tau
, then the sequential execution
of the moves rev tau
is semantically equivalent to the parallel
execution of the moves mu
.
Theorem transitions_correctness:
forall mu tau,
move_no_temp mu ->
is_mill mu ->
transitions (State mu nil nil) (State nil nil tau) ->
forall e, env_equiv (exec_seq (List.rev tau) e) (exec_par mu e).
We now define a deterministic variant
dtransition
of the
transition relation transition
. dtransition
is deterministic
in the sense that at most one transition applies to a given state.
Inductive dtransition: state -> state -> Prop :=
| dtr_nop: forall r mu tau,
dtransition (State ((r, r) :: mu) nil tau)
(State mu nil tau)
| dtr_start: forall s d mu tau,
s <> d ->
dtransition (State ((s, d) :: mu) nil tau)
(State mu ((s, d) :: nil) tau)
| dtr_push: forall mu1 d r mu2 s sigma tau,
no_read mu1 d ->
dtransition (State (mu1 ++ (d, r) :: mu2) ((s, d) :: sigma) tau)
(State (mu1 ++ mu2) ((d, r) :: (s, d) :: sigma) tau)
| dtr_loop_pop: forall mu s r0 d sigma tau,
no_read mu r0 ->
dtransition (State mu ((s, r0) :: sigma ++ (r0, d) :: nil) tau)
(State mu (sigma ++ (temp r0, d) :: nil) ((s, r0) :: (r0, temp r0) :: tau))
| dtr_pop: forall mu s0 d0 s1 d1 sigma tau,
no_read mu d1 -> d1 <> s0 ->
dtransition (State mu ((s1, d1) :: sigma ++ (s0, d0) :: nil) tau)
(State mu (sigma ++ (s0, d0) :: nil) ((s1, d1) :: tau))
| dtr_last: forall mu s d tau,
no_read mu d ->
dtransition (State mu ((s, d) :: nil) tau)
(State mu nil ((s, d) :: tau)).
Definition dtransitions: state -> state -> Prop :=
clos_refl_trans state dtransition.
Every deterministic transition corresponds to a sequence of
non-deterministic transitions.
Lemma transition_determ:
forall st st',
dtransition st st' ->
state_wf st ->
transitions st st'.
Lemma transitions_determ:
forall st st',
dtransitions st st' ->
state_wf st ->
transitions st st'.
The semantic correctness of deterministic transitions follows.
Theorem dtransitions_correctness:
forall mu tau,
move_no_temp mu ->
is_mill mu ->
dtransitions (State mu nil nil) (State nil nil tau) ->
forall e, env_equiv (exec_seq (List.rev tau) e) (exec_par mu e).
We now define a function that takes a well-formed parallel move
mu
and returns a sequence of elementary moves tau
that is semantically
equivalent. We start by defining a number of auxiliary functions.
Function split_move (m: moves) (r: reg) {struct m} : option (moves * reg * moves) :=
match m with
| nil => None
| (s, d) :: tl =>
if reg_eq s r
then Some (nil, d, tl)
else match split_move tl r with
| None => None
| Some (before, d', after) => Some ((s, d) :: before, d', after)
end
end.
Function is_last_source (r: reg) (m: moves) {struct m} : bool :=
match m with
| nil => false
| (s, d) :: nil =>
if reg_eq s r then true else false
| (s, d) :: tl =>
is_last_source r tl
end.
Function replace_last_source (r: reg) (m: moves) {struct m} : moves :=
match m with
| nil => nil
| (s, d) :: nil => (r, d) :: nil
| s_d :: tl => s_d :: replace_last_source r tl
end.
Definition final_state (st: state) : bool :=
match st with
| State nil nil _ => true
| _ => false
end.
Definition parmove_step (st: state) : state :=
match st with
| State nil nil _ => st
| State ((s, d) :: tl) nil l =>
if reg_eq s d
then State tl nil l
else State tl ((s, d) :: nil) l
| State t ((s, d) :: b) l =>
match split_move t d with
| Some (t1, r, t2) =>
State (t1 ++ t2) ((d, r) :: (s, d) :: b) l
| None =>
match b with
| nil => State t nil ((s, d) :: l)
| _ =>
if is_last_source d b
then State t (replace_last_source (temp d) b) ((s, d) :: (d, temp d) :: l)
else State t b ((s, d) :: l)
end
end
end.
Here are the main correctness properties of these functions.
Lemma split_move_charact:
forall m r,
match split_move m r with
| Some (before, d, after) => m = before ++ (r, d) :: after /\ no_read before r
| None => no_read m r
end.
Lemma is_last_source_charact:
forall r s d m,
if is_last_source r (m ++ (s, d) :: nil)
then s = r
else s <> r.
Lemma replace_last_source_charact:
forall s d s' m,
replace_last_source s' (m ++ (s, d) :: nil) =
m ++ (s', d) :: nil.
Lemma parmove_step_compatible:
forall st,
final_state st = false ->
dtransition st (parmove_step st).
The compilation function is obtained by iterating the
parmov_step
function. To show that this iteration always terminates, we introduce
the following measure over states.
Open Scope nat_scope.
Definition measure (st: state) : nat :=
match st with
| State mu sigma tau => 2 * List.length mu + List.length sigma
end.
Lemma measure_decreasing_1:
forall st st',
dtransition st st' -> measure st' < measure st.
Lemma measure_decreasing_2:
forall st,
final_state st = false ->
measure (parmove_step st) < measure st.
Compilation function for parallel moves.
Function parmove_aux (st: state) {measure measure st} : moves :=
if final_state st
then match st with State _ _ tau => tau end
else parmove_aux (parmove_step st).
Lemma parmove_aux_transitions:
forall st,
dtransitions st (State nil nil (parmove_aux st)).
Definition parmove (mu: moves) : moves :=
List.rev (parmove_aux (State mu nil nil)).
This is the main correctness theorem: the sequence of elementary
moves
parmove mu
is semantically equivalent to the parallel move
mu
if the latter is well-formed.
Theorem parmove_correctness:
forall mu,
move_no_temp mu -> is_mill mu ->
forall e,
env_equiv (exec_seq (parmove mu) e) (exec_par mu e).
Here is an alternate formulation of
parmove
, where the
parallel move problem is given as two separate lists of sources
and destinations.
Definition parmove2 (sl dl: list reg) : moves :=
parmove (List.combine sl dl).
Theorem parmove2_correctness:
forall sl dl,
List.length sl = List.length dl ->
list_norepet dl ->
(forall r, In r sl -> is_not_temp r) ->
(forall r, In r dl -> is_not_temp r) ->
forall e,
let e' := exec_seq (parmove2 sl dl) e in
List.map e' dl = List.map e sl /\
forall r, ~In r dl -> is_not_temp r -> e' r = e r.
We now show an additional property of the sequence of moves
generated by
parmove mu
: any such move s -> d
is either
already present in mu
, or of the form temp s -> d
or s -> temp s
for some s -> d
in mu
. This syntactic property is useful
to show that parmove
preserves register classes, for instance.
Section PROPERTIES.
Variable initial_move: moves.
Inductive wf_move: reg -> reg -> Prop :=
| wf_move_same: forall s d,
In (s, d) initial_move -> wf_move s d
| wf_move_temp_left: forall s d,
wf_move s d -> wf_move (temp s) d
| wf_move_temp_right: forall s d,
wf_move s d -> wf_move s (temp s).
Definition wf_moves (m: moves) : Prop :=
forall s d, In (s, d) m -> wf_move s d.
Lemma wf_moves_cons: forall s d m,
wf_moves ((s, d) :: m) <-> wf_move s d /\ wf_moves m.
Lemma wf_moves_append: forall m1 m2,
wf_moves (m1 ++ m2) <-> wf_moves m1 /\ wf_moves m2.
Hint Rewrite wf_moves_cons wf_moves_append: pmov.
Inductive wf_state: state -> Prop :=
| wf_state_intro: forall mu sigma tau,
wf_moves mu -> wf_moves sigma -> wf_moves tau ->
wf_state (State mu sigma tau).
Lemma dtransition_preserves_wf_state:
forall st st',
dtransition st st' -> wf_state st -> wf_state st'.
Lemma dtransitions_preserve_wf_state:
forall st st',
dtransitions st st' -> wf_state st -> wf_state st'.
End PROPERTIES.
Lemma parmove_wf_moves:
forall mu, wf_moves mu (parmove mu).
Lemma parmove2_wf_moves:
forall sl dl, wf_moves (List.combine sl dl) (parmove2 sl dl).
As a corollary, we show that all sources of
parmove mu
are sources of mu
or temporaries,
and likewise all destinations of parmove mu
are destinations of mu
or temporaries.
Remark wf_move_initial_reg_or_temp:
forall mu s d,
wf_move mu s d ->
(In s (srcs mu) \/ is_temp s) /\ (In d (dests mu) \/ is_temp d).
Lemma parmove_initial_reg_or_temp:
forall mu s d,
In (s, d) (parmove mu) ->
(In s (srcs mu) \/ is_temp s) /\ (In d (dests mu) \/ is_temp d).
Remark in_srcs:
forall mu s, In s (srcs mu) -> exists d, In (s, d) mu.
Remark in_dests:
forall mu d, In d (dests mu) -> exists s, In (s, d) mu.
Lemma parmove_srcs_initial_reg_or_temp:
forall mu s,
In s (srcs (parmove mu)) -> In s (srcs mu) \/ is_temp s.
Lemma parmove_dests_initial_reg_or_temp:
forall mu d,
In d (dests (parmove mu)) -> In d (dests mu) \/ is_temp d.
As a second corollary, we show that
parmov
preserves register
classes, in the sense made precise below.
Section REGISTER_CLASSES.
Variable class: Type.
Variable regclass: reg -> class.
Hypothesis temp_preserves_class: forall r, regclass (temp r) = regclass r.
Definition is_class_compatible (mu: moves) : Prop :=
forall s d, In (s, d) mu -> regclass s = regclass d.
Lemma parmove_preserves_register_classes:
forall mu,
is_class_compatible mu ->
is_class_compatible (parmove mu).
End REGISTER_CLASSES.
We now extend the previous results to the case where distinct
registers can partially overlap, so that assigning to one register
changes the value of the other. We asuume given a disjointness relation
disjoint
between registers.
Variable disjoint: reg -> reg -> Prop.
Hypothesis disjoint_sym:
forall r1 r2, disjoint r1 r2 -> disjoint r2 r1.
Hypothesis disjoint_not_equal:
forall r1 r2, disjoint r1 r2 -> r1 <> r2.
Two registers partially overlap if they are different and not disjoint.
For the Coq development, it is easier to define the complement:
two registers do not partially overlap if they are identical or disjoint.
Definition no_overlap (r1 r2: reg) : Prop :=
r1 = r2 \/ disjoint r1 r2.
Lemma no_overlap_sym:
forall r1 r2, no_overlap r1 r2 -> no_overlap r2 r1.
We axiomatize the effect of assigning a value to a register over an
execution environment. The target register is set to the given value
(property
weak_update_s
), and registers disjoint from the target
keep their previous values (property weak_update_d
). The values of
other registers are undefined after the assignment.
Variable weak_update: reg -> val -> env -> env.
Hypothesis weak_update_s:
forall r v e, weak_update r v e r = v.
Hypothesis weak_update_d:
forall r1 v e r2, disjoint r1 r2 -> weak_update r1 v e r2 = e r2.
Fixpoint weak_exec_seq (m: moves) (e: env) {struct m}: env :=
match m with
| nil => e
| (s, d) :: m' => weak_exec_seq m' (weak_update d (e s) e)
end.
Definition disjoint_list (r: reg) (l: list reg) : Prop :=
forall r', In r' l -> disjoint r r'.
Inductive pairwise_disjoint: list reg -> Prop :=
| pairwise_disjoint_nil:
pairwise_disjoint nil
| pairwise_disjoint_cons: forall r l,
disjoint_list r l ->
pairwise_disjoint l ->
pairwise_disjoint (r :: l).
Definition disjoint_temps (r: reg) : Prop :=
forall t, is_temp t -> disjoint r t.
Section OVERLAP.
We consider a parallel move problem
mu
that satisfies the following
conditions, which are stronger, overlap-aware variants of the
move_no_temp mu
and is_mill mu
conditions used previously.
Variable mu: moves.
Sources and destinations are disjoint from all temporary registers.
Hypothesis mu_no_temporaries_src:
forall s, In s (srcs mu) -> disjoint_temps s.
Hypothesis mu_no_temporaries_dst:
forall d, In d (dests mu) -> disjoint_temps d.
Destinations are pairwise disjoint.
Hypothesis mu_dest_pairwise_disjoint:
pairwise_disjoint (dests mu).
Sources and destinations do not partially overlap.
Hypothesis mu_src_dst_no_overlap:
forall s d, In s (srcs mu) -> In d (dests mu) -> no_overlap s d.
Distinct temporaries do not partially overlap.
Hypothesis temps_no_overlap:
forall t1 t2, is_temp t1 -> is_temp t2 -> no_overlap t1 t2.
The following lemmas show that
mu
is a windmill and does not
contain temporary registers.
Lemma disjoint_list_notin:
forall r l, disjoint_list r l -> ~In r l.
Lemma pairwise_disjoint_norepet:
forall l, pairwise_disjoint l -> list_norepet l.
Lemma disjoint_temps_not_temp:
forall r, disjoint_temps r -> is_not_temp r.
Lemma mu_is_mill:
is_mill mu.
Lemma mu_move_no_temp:
move_no_temp mu.
We define the ``adherence'' of the problem
mu
as the set of
registers that partially overlap with one of the registers
possibly assigned by the parallel move: destinations and temporaries.
Again, we define the complement of the ``adherence'' set, which is
more convenient for Coq reasoning.
Definition no_adherence (r: reg) : Prop :=
forall x, In x (dests mu) \/ is_temp x -> no_overlap x r.
As a consequence of the hypotheses on
mu
, none of the destination
registers, source registers, and temporaries belong to the adherence.
Lemma no_overlap_pairwise:
forall r1 r2 m, pairwise_disjoint m -> In r1 m -> In r2 m -> no_overlap r1 r2.
Lemma no_adherence_dst:
forall d, In d (dests mu) -> no_adherence d.
Lemma no_adherence_src:
forall s, In s (srcs mu) -> no_adherence s.
Lemma no_adherence_tmp:
forall t, is_temp t -> no_adherence t.
The relation
env_match
holds between two environments e1
and e2
if they assign the same values to all registers not in the adherence set.
Definition env_match (e1 e2: env) : Prop :=
forall r, no_adherence r -> e1 r = e2 r.
The following lemmas relate the effect of executing moves
using normal, overlap-unaware update and weak, overlap-aware update.
Lemma weak_update_match:
forall e1 e2 s d,
(In s (srcs mu) \/ is_temp s) ->
(In d (dests mu) \/ is_temp d) ->
env_match e1 e2 ->
env_match (update d (e1 s) e1)
(weak_update d (e2 s) e2).
Lemma weak_exec_seq_match:
forall m e1 e2,
(forall s, In s (srcs m) -> In s (srcs mu) \/ is_temp s) ->
(forall d, In d (dests m) -> In d (dests mu) \/ is_temp d) ->
env_match e1 e2 ->
env_match (exec_seq m e1) (weak_exec_seq m e2).
End OVERLAP.
These lemmas imply the following correctness theorem for the
parmove2
function, taking partial register overlap into account.
Theorem parmove2_correctness_with_overlap:
forall sl dl,
List.length sl = List.length dl ->
(forall r, In r sl -> disjoint_temps r) ->
(forall r, In r dl -> disjoint_temps r) ->
pairwise_disjoint dl ->
(forall s d, In s sl -> In d dl -> no_overlap s d) ->
(forall t1 t2, is_temp t1 -> is_temp t2 -> no_overlap t1 t2) ->
forall e,
let e' := weak_exec_seq (parmove2 sl dl) e in
List.map e' dl = List.map e sl /\
forall r, disjoint_list r dl -> disjoint_temps r -> e' r = e r.
End PARMOV.