Library Parmov

Translation of parallel moves into sequences of individual moves

The ``parallel move'' problem, also known as ``parallel assignment'', is the following. We are given a list of (source, destination) pairs of locations. The goal is to find a sequence of elementary moves (loc <- loc assignments) such that, at the end of the sequence, location dst contains the value of location src at the beginning of the sequence, for each (src, dst) pairs in the given problem. Moreover, other locations should keep their values, except one register of each type, which can be used as temporaries in the generated sequences.

The parallel move problem is trivial if the sources and destinations do not overlap. For instance,
  (R1, R2) <- (R3, R4)     becomes    R1 <- R3; R2 <- R4


However, arbitrary overlap is allowed between sources and destinations. This requires some care in ordering the individual moves, as in
  (R1, R2) <- (R3, R1)     becomes    R2 <- R1; R1 <- R3


Worse, cycles (permutations) can require the use of temporaries, as in
  (R1, R2, R3) <- (R2, R3, R1)   becomes   T <- R1; R1 <- R2; R2 <- R3; R3 <- T;


An amazing fact is that for any parallel move problem, at most one temporary (or in our case one integer temporary and one float temporary) is needed.

The development in this section was designed by Laurence Rideau and Bernard Serpette. It is described in the paper ``Tilting at windmills with Coq: Formal verification of a compilation algorithm for parallel moves'', http://hal.inria.fr/inria-00176007

Require Import Relations.
Require Import Coqlib.
Require Recdef.

Section PARMOV.

Registers, moves, and their semantics


The development is parameterized by the type of registers, equipped with a decidable equality.

Variable reg: Type.
Variable reg_eq : forall (r1 r2: reg), {r1=r2} + {r1<>r2}.

The temp function maps every register r to the register that should be used to save the value of r temporarily while performing a cyclic assignment involving r. In the simplest case, there is only one such temporary register rtemp and temp is the constant function fun r => rtemp. In more realistic cases, different temporary registers can be used to hold different values. In the case of Compcert, we have two temporary registers, one for integer values and the other for floating-point values, and temp returns the appropriate temporary depending on the type of its argument.

Variable temp: reg -> reg.

Definition is_temp (r: reg): Prop := exists s, r = temp s.

A set of moves is a list of pairs of registers, of the form (source, destination).

Definition moves := (list (reg * reg))%type.
Definition srcs (m: moves) := List.map (@fst reg reg) m.
Definition dests (m: moves) := List.map (@snd reg reg) m.

Semantics of moves


The dynamic semantics of moves is given in terms of environments. An environment is a mapping of registers to their current values.

Variable val: Type.
Definition env := reg -> val.

Lemma env_ext:
  forall (e1 e2: env),
  (forall r, e1 r = e2 r) -> e1 = e2.
Proof (extensionality reg val).

The main operation over environments is update: it assigns a value v to a register r and preserves the values of other registers.

Definition update (r: reg) (v: val) (e: env) : env :=
  fun r' => if reg_eq r' r then v else e r'.

Lemma update_s:
  forall r v e, update r v e r = v.


Lemma update_o:
  forall r v e r', r' <> r -> update r v e r' = e r'.


Lemma update_ident:
  forall r e, update r (e r) e = e.


Lemma update_commut:
  forall r1 v1 r2 v2 e,
  r1 <> r2 ->
  update r1 v1 (update r2 v2 e) = update r2 v2 (update r1 v1 e).


Lemma update_twice:
  forall r v e,
  update r v (update r v e) = update r v e.


A list of moves (src1, dst1), ..., (srcN, dstN) can be given three different semantics, as environment transformers. The first semantics corresponds to a parallel assignment: dst1, ..., dstN are set to the initial values of src1, ..., srcN.

Fixpoint exec_par (m: moves) (e: env) {struct m}: env :=
  match m with
  | nil => e
  | (s, d) :: m' => update d (e s) (exec_par m' e)
  end.

The second semantics corresponds to a sequence of individual assignments: first, dst1 is set to the initial value of src1; then, dst2 is set to the current value of src2 after the first assignment; etc.

Fixpoint exec_seq (m: moves) (e: env) {struct m}: env :=
  match m with
  | nil => e
  | (s, d) :: m' => exec_seq m' (update d (e s) e)
  end.

The third semantics is also sequential, but executes the moves in reverse order, starting with srcN, dstN and finishing with src1, dst1.

Fixpoint exec_seq_rev (m: moves) (e: env) {struct m}: env :=
  match m with
  | nil => e
  | (s, d) :: m' =>
      let e' := exec_seq_rev m' e in
      update d (e' s) e'
  end.

Specification of the non-deterministic algorithm


Rideau and Serpette's parallel move compilation algorithm is first specified as a non-deterministic set of rewriting rules operating over triples (mu, sigma, tau) of moves. We define these rules as an inductive predicate transition relating triples of moves, and its reflexive transitive closure transitions.

Inductive state: Type :=
  State (mu: moves) (sigma: moves) (tau: moves) : state.

Definition no_read (mu: moves) (d: reg) : Prop :=
  ~In d (srcs mu).

Inductive transition: state -> state -> Prop :=
  | tr_nop: forall mu1 r mu2 sigma tau,
      transition (State (mu1 ++ (r, r) :: mu2) sigma tau)
                 (State (mu1 ++ mu2) sigma tau)
  | tr_start: forall mu1 s d mu2 tau,
      transition (State (mu1 ++ (s, d) :: mu2) nil tau)
                 (State (mu1 ++ mu2) ((s, d) :: nil) tau)
  | tr_push: forall mu1 d r mu2 s sigma tau,
      transition (State (mu1 ++ (d, r) :: mu2) ((s, d) :: sigma) tau)
                 (State (mu1 ++ mu2) ((d, r) :: (s, d) :: sigma) tau)
  | tr_loop: forall mu sigma s d tau,
      transition (State mu (sigma ++ (s, d) :: nil) tau)
                 (State mu (sigma ++ (temp s, d) :: nil) ((s, temp s) :: tau))
  | tr_pop: forall mu s0 d0 s1 d1 sigma tau,
      no_read mu d1 -> d1 <> s0 ->
      transition (State mu ((s1, d1) :: sigma ++ (s0, d0) :: nil) tau)
                 (State mu (sigma ++ (s0, d0) :: nil) ((s1, d1) :: tau))
  | tr_last: forall mu s d tau,
      no_read mu d ->
      transition (State mu ((s, d) :: nil) tau)
                 (State mu nil ((s, d) :: tau)).

Definition transitions: state -> state -> Prop :=
  clos_refl_trans state transition.

Well-formedness properties of states


A state mu, sigma, tau is well-formed if the following properties hold:

  • mu concatenated with sigma is a ``mill'', that is, no destination appears twice (predicate is_mill below).
  • No temporary register appears in mu (predicate move_no_temp).
  • No temporary register appears in sigma except perhaps as the source of the last move in sigma (predicate temp_last).
  • sigma is a ``path'', that is, the source of a move is the destination of the following move.

Definition is_mill (m: moves) : Prop :=
  list_norepet (dests m).

Definition is_not_temp (r: reg) : Prop :=
  forall d, r <> temp d.

Definition move_no_temp (m: moves) : Prop :=
  forall s d, In (s, d) m -> is_not_temp s /\ is_not_temp d.

Definition temp_last (m: moves) : Prop :=
  match List.rev m with
  | nil => True
  | (s, d) :: m' => is_not_temp d /\ move_no_temp m'
  end.

Definition is_first_dest (m: moves) (d: reg) : Prop :=
  match m with
  | nil => True
  | (s0, d0) :: _ => d = d0
  end.

Inductive is_path: moves -> Prop :=
  | is_path_nil:
      is_path nil
  | is_path_cons: forall s d m,
      is_first_dest m s ->
      is_path m ->
      is_path ((s, d) :: m).

Inductive state_wf: state -> Prop :=
  | state_wf_intro:
      forall mu sigma tau,
      is_mill (mu ++ sigma) ->
      move_no_temp mu ->
      temp_last sigma ->
      is_path sigma ->
      state_wf (State mu sigma tau).

Some trivial properties of srcs and dests.

Lemma dests_append:
  forall m1 m2, dests (m1 ++ m2) = dests m1 ++ dests m2.


Lemma dests_decomp:
  forall m1 s d m2, dests (m1 ++ (s, d) :: m2) = dests m1 ++ d :: dests m2.


Lemma srcs_append:
  forall m1 m2, srcs (m1 ++ m2) = srcs m1 ++ srcs m2.


Lemma srcs_decomp:
  forall m1 s d m2, srcs (m1 ++ (s, d) :: m2) = srcs m1 ++ s :: srcs m2.


Lemma srcs_dests_combine:
  forall s d,
  List.length s = List.length d ->
  srcs (List.combine s d) = s /\
  dests (List.combine s d) = d.


Some properties of is_mill and dests_disjoint.

Definition dests_disjoint (m1 m2: moves) : Prop :=
  list_disjoint (dests m1) (dests m2).

Lemma dests_disjoint_sym:
  forall m1 m2,
  dests_disjoint m1 m2 <-> dests_disjoint m2 m1.


Lemma dests_disjoint_cons_left:
  forall m1 s d m2,
  dests_disjoint ((s, d) :: m1) m2 <->
  dests_disjoint m1 m2 /\ ~In d (dests m2).


Lemma dests_disjoint_cons_right:
  forall m1 s d m2,
  dests_disjoint m1 ((s, d) :: m2) <->
  dests_disjoint m1 m2 /\ ~In d (dests m1).


Lemma dests_disjoint_append_left:
  forall m1 m2 m3,
  dests_disjoint (m1 ++ m2) m3 <->
  dests_disjoint m1 m3 /\ dests_disjoint m2 m3.


Lemma dests_disjoint_append_right:
  forall m1 m2 m3,
  dests_disjoint m1 (m2 ++ m3) <->
  dests_disjoint m1 m2 /\ dests_disjoint m1 m3.


Lemma is_mill_cons:
  forall s d m,
  is_mill ((s, d) :: m) <->
  is_mill m /\ ~In d (dests m).


Lemma is_mill_append:
  forall m1 m2,
  is_mill (m1 ++ m2) <->
  is_mill m1 /\ is_mill m2 /\ dests_disjoint m1 m2.


Some properties of move_no_temp.

Lemma move_no_temp_append:
  forall m1 m2,
  move_no_temp m1 -> move_no_temp m2 -> move_no_temp (m1 ++ m2).


Lemma move_no_temp_rev:
  forall m, move_no_temp (List.rev m) -> move_no_temp m.


Some properties of temp_last.

Lemma temp_last_change_last_source:
  forall s d s' sigma,
  temp_last (sigma ++ (s, d) :: nil) ->
  temp_last (sigma ++ (s', d) :: nil).


Lemma temp_last_push:
  forall s1 d1 s2 d2 sigma,
  temp_last ((s2, d2) :: sigma) ->
  is_not_temp s1 -> is_not_temp d1 ->
  temp_last ((s1, d1) :: (s2, d2) :: sigma).


Lemma temp_last_pop:
  forall s1 d1 sigma s2 d2,
  temp_last ((s1, d1) :: sigma ++ (s2, d2) :: nil) ->
  temp_last (sigma ++ (s2, d2) :: nil).


Some properties of is_path.

Lemma is_path_pop:
  forall s d m,
  is_path ((s, d) :: m) -> is_path m.


Lemma is_path_change_last_source:
  forall s s' d sigma,
  is_path (sigma ++ (s, d) :: nil) ->
  is_path (sigma ++ (s', d) :: nil).


Lemma path_sources_dests:
  forall s0 d0 sigma,
  is_path (sigma ++ (s0, d0) :: nil) ->
  List.incl (srcs (sigma ++ (s0, d0) :: nil))
            (s0 :: dests (sigma ++ (s0, d0) :: nil)).


Lemma no_read_path:
  forall d1 sigma s0 d0,
  d1 <> s0 ->
  is_path (sigma ++ (s0, d0) :: nil) ->
  ~In d1 (dests (sigma ++ (s0, d0) :: nil)) ->
  no_read (sigma ++ (s0, d0) :: nil) d1.


To benefit from some proof automation, we populate a rewrite database with some of the properties above.

Lemma notin_dests_cons:
  forall x s d m,
  ~In x (dests ((s, d) :: m)) <-> x <> d /\ ~In x (dests m).


Lemma notin_dests_append:
  forall d m1 m2,
  ~In d (dests (m1 ++ m2)) <-> ~In d (dests m1) /\ ~In d (dests m2).


Hint Rewrite is_mill_cons is_mill_append
  dests_disjoint_cons_left dests_disjoint_cons_right
  dests_disjoint_append_left dests_disjoint_append_right
  notin_dests_cons notin_dests_append: pmov.

Transitions preserve well-formedness


Lemma transition_preserves_wf:
  forall st st',
  transition st st' -> state_wf st -> state_wf st'.


Lemma transitions_preserve_wf:
  forall st st', transitions st st' -> state_wf st -> state_wf st'.


Transitions preserve semantics


We define the semantics of states as follows. For a triple mu, sigma, tau, we perform the moves tau in reverse sequential order, then the moves mu ++ sigma in parallel.

Definition statemove (st: state) (e: env) :=
  match st with
  | State mu sigma tau => exec_par (mu ++ sigma) (exec_seq_rev tau e)
  end.

In preparation to showing that transitions preserve semantics, we prove various properties of the parallel and sequential interpretations of moves.

Lemma exec_par_outside:
  forall m e r, ~In r (dests m) -> exec_par m e r = e r.


Lemma exec_par_lift:
  forall m1 s d m2 e,
  ~In d (dests m1) ->
  exec_par (m1 ++ (s, d) :: m2) e = exec_par ((s, d) :: m1 ++ m2) e.


Lemma exec_par_ident:
  forall m1 r m2 e,
  is_mill (m1 ++ (r, r) :: m2) ->
  exec_par (m1 ++ (r, r) :: m2) e = exec_par (m1 ++ m2) e.


Lemma exec_seq_app:
  forall m1 m2 e,
  exec_seq (m1 ++ m2) e = exec_seq m2 (exec_seq m1 e).


Lemma exec_seq_rev_app:
  forall m1 m2 e,
  exec_seq_rev (m1 ++ m2) e = exec_seq_rev m1 (exec_seq_rev m2 e).


Lemma exec_seq_exec_seq_rev:
  forall m e,
  exec_seq_rev m e = exec_seq (List.rev m) e.


Lemma exec_seq_rev_exec_seq:
  forall m e,
  exec_seq m e = exec_seq_rev (List.rev m) e.


Lemma exec_par_update_no_read:
  forall s d m e,
  no_read m d ->
  ~In d (dests m) ->
  exec_par m (update d (e s) e) =
  update d (e s) (exec_par m e).


Lemma exec_par_append_eq:
  forall m1 m2 m3 e2 e3,
  exec_par m2 e2 = exec_par m3 e3 ->
  (forall r, In r (srcs m1) -> e2 r = e3 r) ->
  exec_par (m1 ++ m2) e2 = exec_par (m1 ++ m3) e3.


Lemma exec_par_combine:
  forall e sl dl,
  List.length sl = List.length dl ->
  list_norepet dl ->
  let e' := exec_par (combine sl dl) e in
  List.map e' dl = List.map e sl /\
  (forall l, ~In l dl -> e' l = e l).


env_equiv is an equivalence relation between environments capturing the fact that two environments assign the same values to non-temporary registers, but can disagree on the values of temporary registers.

Definition env_equiv (e1 e2: env) : Prop :=
  forall r, is_not_temp r -> e1 r = e2 r.

Lemma env_equiv_refl:
  forall e, env_equiv e e.


Lemma env_equiv_refl':
  forall e1 e2, e1 = e2 -> env_equiv e1 e2.


Lemma env_equiv_sym:
  forall e1 e2, env_equiv e1 e2 -> env_equiv e2 e1.


Lemma env_equiv_trans:
  forall e1 e2 e3, env_equiv e1 e2 -> env_equiv e2 e3 -> env_equiv e1 e3.


Lemma exec_par_env_equiv:
  forall m e1 e2,
  move_no_temp m ->
  env_equiv e1 e2 ->
  env_equiv (exec_par m e1) (exec_par m e2).


The proof that transitions preserve semantics (up to the values of temporary registers) follows.

Lemma transition_preserves_semantics:
  forall st st' e,
  transition st st' -> state_wf st ->
  env_equiv (statemove st' e) (statemove st e).


Lemma transitions_preserve_semantics:
  forall st st' e,
  transitions st st' -> state_wf st ->
  env_equiv (statemove st' e) (statemove st e).


Lemma state_wf_start:
  forall mu,
  move_no_temp mu ->
  is_mill mu ->
  state_wf (State mu nil nil).


The main correctness result in this section is the following: if we can transition repeatedly from an initial state mu, nil, nil to a final state nil, nil, tau, then the sequential execution of the moves rev tau is semantically equivalent to the parallel execution of the moves mu.

Theorem transitions_correctness:
  forall mu tau,
  move_no_temp mu ->
  is_mill mu ->
  transitions (State mu nil nil) (State nil nil tau) ->
  forall e, env_equiv (exec_seq (List.rev tau) e) (exec_par mu e).


Determinisation of the transition relation


We now define a deterministic variant dtransition of the transition relation transition. dtransition is deterministic in the sense that at most one transition applies to a given state.

Inductive dtransition: state -> state -> Prop :=
  | dtr_nop: forall r mu tau,
      dtransition (State ((r, r) :: mu) nil tau)
                  (State mu nil tau)
  | dtr_start: forall s d mu tau,
      s <> d ->
      dtransition (State ((s, d) :: mu) nil tau)
                  (State mu ((s, d) :: nil) tau)
  | dtr_push: forall mu1 d r mu2 s sigma tau,
      no_read mu1 d ->
      dtransition (State (mu1 ++ (d, r) :: mu2) ((s, d) :: sigma) tau)
                  (State (mu1 ++ mu2) ((d, r) :: (s, d) :: sigma) tau)
  | dtr_loop_pop: forall mu s r0 d sigma tau,
      no_read mu r0 ->
      dtransition (State mu ((s, r0) :: sigma ++ (r0, d) :: nil) tau)
                  (State mu (sigma ++ (temp r0, d) :: nil) ((s, r0) :: (r0, temp r0) :: tau))
  | dtr_pop: forall mu s0 d0 s1 d1 sigma tau,
      no_read mu d1 -> d1 <> s0 ->
      dtransition (State mu ((s1, d1) :: sigma ++ (s0, d0) :: nil) tau)
                  (State mu (sigma ++ (s0, d0) :: nil) ((s1, d1) :: tau))
  | dtr_last: forall mu s d tau,
      no_read mu d ->
      dtransition (State mu ((s, d) :: nil) tau)
                  (State mu nil ((s, d) :: tau)).

Definition dtransitions: state -> state -> Prop :=
  clos_refl_trans state dtransition.

Every deterministic transition corresponds to a sequence of non-deterministic transitions.

Lemma transition_determ:
  forall st st',
  dtransition st st' ->
  state_wf st ->
  transitions st st'.


Lemma transitions_determ:
  forall st st',
  dtransitions st st' ->
  state_wf st ->
  transitions st st'.


The semantic correctness of deterministic transitions follows.

Theorem dtransitions_correctness:
  forall mu tau,
  move_no_temp mu ->
  is_mill mu ->
  dtransitions (State mu nil nil) (State nil nil tau) ->
  forall e, env_equiv (exec_seq (List.rev tau) e) (exec_par mu e).


The compilation function


We now define a function that takes a well-formed parallel move mu and returns a sequence of elementary moves tau that is semantically equivalent. We start by defining a number of auxiliary functions.

Function split_move (m: moves) (r: reg) {struct m} : option (moves * reg * moves) :=
  match m with
  | nil => None
  | (s, d) :: tl =>
      if reg_eq s r
      then Some (nil, d, tl)
      else match split_move tl r with
           | None => None
           | Some (before, d', after) => Some ((s, d) :: before, d', after)
           end
  end.

Function is_last_source (r: reg) (m: moves) {struct m} : bool :=
  match m with
  | nil => false
  | (s, d) :: nil =>
      if reg_eq s r then true else false
  | (s, d) :: tl =>
      is_last_source r tl
  end.

Function replace_last_source (r: reg) (m: moves) {struct m} : moves :=
  match m with
  | nil => nil
  | (s, d) :: nil => (r, d) :: nil
  | s_d :: tl => s_d :: replace_last_source r tl
  end.

Definition final_state (st: state) : bool :=
  match st with
  | State nil nil _ => true
  | _ => false
  end.

Definition parmove_step (st: state) : state :=
  match st with
  | State nil nil _ => st
  | State ((s, d) :: tl) nil l =>
      if reg_eq s d
      then State tl nil l
      else State tl ((s, d) :: nil) l
  | State t ((s, d) :: b) l =>
      match split_move t d with
      | Some (t1, r, t2) =>
          State (t1 ++ t2) ((d, r) :: (s, d) :: b) l
      | None =>
          match b with
          | nil => State t nil ((s, d) :: l)
          | _ =>
              if is_last_source d b
              then State t (replace_last_source (temp d) b) ((s, d) :: (d, temp d) :: l)
              else State t b ((s, d) :: l)
          end
      end
  end.

Here are the main correctness properties of these functions.

Lemma split_move_charact:
  forall m r,
  match split_move m r with
  | Some (before, d, after) => m = before ++ (r, d) :: after /\ no_read before r
  | None => no_read m r
  end.


Lemma is_last_source_charact:
  forall r s d m,
  if is_last_source r (m ++ (s, d) :: nil)
  then s = r
  else s <> r.


Lemma replace_last_source_charact:
  forall s d s' m,
  replace_last_source s' (m ++ (s, d) :: nil) =
  m ++ (s', d) :: nil.


Lemma parmove_step_compatible:
  forall st,
  final_state st = false ->
  dtransition st (parmove_step st).


The compilation function is obtained by iterating the parmov_step function. To show that this iteration always terminates, we introduce the following measure over states.

Open Scope nat_scope.

Definition measure (st: state) : nat :=
  match st with
  | State mu sigma tau => 2 * List.length mu + List.length sigma
  end.

Lemma measure_decreasing_1:
  forall st st',
  dtransition st st' -> measure st' < measure st.


Lemma measure_decreasing_2:
  forall st,
  final_state st = false ->
  measure (parmove_step st) < measure st.


Compilation function for parallel moves.

Function parmove_aux (st: state) {measure measure st} : moves :=
  if final_state st
  then match st with State _ _ tau => tau end
  else parmove_aux (parmove_step st).


Lemma parmove_aux_transitions:
  forall st,
  dtransitions st (State nil nil (parmove_aux st)).


Definition parmove (mu: moves) : moves :=
  List.rev (parmove_aux (State mu nil nil)).

This is the main correctness theorem: the sequence of elementary moves parmove mu is semantically equivalent to the parallel move mu if the latter is well-formed.

Theorem parmove_correctness:
  forall mu,
  move_no_temp mu -> is_mill mu ->
  forall e,
  env_equiv (exec_seq (parmove mu) e) (exec_par mu e).


Here is an alternate formulation of parmove, where the parallel move problem is given as two separate lists of sources and destinations.

Definition parmove2 (sl dl: list reg) : moves :=
  parmove (List.combine sl dl).

Theorem parmove2_correctness:
  forall sl dl,
  List.length sl = List.length dl ->
  list_norepet dl ->
  (forall r, In r sl -> is_not_temp r) ->
  (forall r, In r dl -> is_not_temp r) ->
  forall e,
  let e' := exec_seq (parmove2 sl dl) e in
  List.map e' dl = List.map e sl /\
  forall r, ~In r dl -> is_not_temp r -> e' r = e r.


Additional syntactic properties


We now show an additional property of the sequence of moves generated by parmove mu: any such move s -> d is either already present in mu, or of the form temp s -> d or s -> temp s for some s -> d in mu. This syntactic property is useful to show that parmove preserves register classes, for instance.

Section PROPERTIES.

Variable initial_move: moves.

Inductive wf_move: reg -> reg -> Prop :=
  | wf_move_same: forall s d,
      In (s, d) initial_move -> wf_move s d
  | wf_move_temp_left: forall s d,
      wf_move s d -> wf_move (temp s) d
  | wf_move_temp_right: forall s d,
      wf_move s d -> wf_move s (temp s).

Definition wf_moves (m: moves) : Prop :=
  forall s d, In (s, d) m -> wf_move s d.

Lemma wf_moves_cons: forall s d m,
  wf_moves ((s, d) :: m) <-> wf_move s d /\ wf_moves m.


Lemma wf_moves_append: forall m1 m2,
  wf_moves (m1 ++ m2) <-> wf_moves m1 /\ wf_moves m2.


Hint Rewrite wf_moves_cons wf_moves_append: pmov.

Inductive wf_state: state -> Prop :=
  | wf_state_intro: forall mu sigma tau,
      wf_moves mu -> wf_moves sigma -> wf_moves tau ->
      wf_state (State mu sigma tau).

Lemma dtransition_preserves_wf_state:
  forall st st',
  dtransition st st' -> wf_state st -> wf_state st'.


Lemma dtransitions_preserve_wf_state:
  forall st st',
  dtransitions st st' -> wf_state st -> wf_state st'.


End PROPERTIES.

Lemma parmove_wf_moves:
  forall mu, wf_moves mu (parmove mu).


Lemma parmove2_wf_moves:
  forall sl dl, wf_moves (List.combine sl dl) (parmove2 sl dl).


As a corollary, we show that all sources of parmove mu are sources of mu or temporaries, and likewise all destinations of parmove mu are destinations of mu or temporaries.

Remark wf_move_initial_reg_or_temp:
  forall mu s d,
  wf_move mu s d ->
  (In s (srcs mu) \/ is_temp s) /\ (In d (dests mu) \/ is_temp d).


Lemma parmove_initial_reg_or_temp:
  forall mu s d,
  In (s, d) (parmove mu) ->
  (In s (srcs mu) \/ is_temp s) /\ (In d (dests mu) \/ is_temp d).


Remark in_srcs:
  forall mu s, In s (srcs mu) -> exists d, In (s, d) mu.


Remark in_dests:
  forall mu d, In d (dests mu) -> exists s, In (s, d) mu.


Lemma parmove_srcs_initial_reg_or_temp:
  forall mu s,
  In s (srcs (parmove mu)) -> In s (srcs mu) \/ is_temp s.


Lemma parmove_dests_initial_reg_or_temp:
  forall mu d,
  In d (dests (parmove mu)) -> In d (dests mu) \/ is_temp d.


As a second corollary, we show that parmov preserves register classes, in the sense made precise below.

Section REGISTER_CLASSES.

Variable class: Type.
Variable regclass: reg -> class.
Hypothesis temp_preserves_class: forall r, regclass (temp r) = regclass r.

Definition is_class_compatible (mu: moves) : Prop :=
  forall s d, In (s, d) mu -> regclass s = regclass d.

Lemma parmove_preserves_register_classes:
  forall mu,
  is_class_compatible mu ->
  is_class_compatible (parmove mu).


End REGISTER_CLASSES.

Extension to partially overlapping registers


We now extend the previous results to the case where distinct registers can partially overlap, so that assigning to one register changes the value of the other. We asuume given a disjointness relation disjoint between registers.

Variable disjoint: reg -> reg -> Prop.

Hypothesis disjoint_sym:
  forall r1 r2, disjoint r1 r2 -> disjoint r2 r1.

Hypothesis disjoint_not_equal:
  forall r1 r2, disjoint r1 r2 -> r1 <> r2.

Two registers partially overlap if they are different and not disjoint. For the Coq development, it is easier to define the complement: two registers do not partially overlap if they are identical or disjoint.

Definition no_overlap (r1 r2: reg) : Prop :=
  r1 = r2 \/ disjoint r1 r2.

Lemma no_overlap_sym:
  forall r1 r2, no_overlap r1 r2 -> no_overlap r2 r1.


We axiomatize the effect of assigning a value to a register over an execution environment. The target register is set to the given value (property weak_update_s), and registers disjoint from the target keep their previous values (property weak_update_d). The values of other registers are undefined after the assignment.

Variable weak_update: reg -> val -> env -> env.

Hypothesis weak_update_s:
  forall r v e, weak_update r v e r = v.

Hypothesis weak_update_d:
  forall r1 v e r2, disjoint r1 r2 -> weak_update r1 v e r2 = e r2.

Fixpoint weak_exec_seq (m: moves) (e: env) {struct m}: env :=
  match m with
  | nil => e
  | (s, d) :: m' => weak_exec_seq m' (weak_update d (e s) e)
  end.

Definition disjoint_list (r: reg) (l: list reg) : Prop :=
  forall r', In r' l -> disjoint r r'.

Inductive pairwise_disjoint: list reg -> Prop :=
  | pairwise_disjoint_nil:
      pairwise_disjoint nil
  | pairwise_disjoint_cons: forall r l,
      disjoint_list r l ->
      pairwise_disjoint l ->
      pairwise_disjoint (r :: l).

Definition disjoint_temps (r: reg) : Prop :=
  forall t, is_temp t -> disjoint r t.

Section OVERLAP.

We consider a parallel move problem mu that satisfies the following conditions, which are stronger, overlap-aware variants of the move_no_temp mu and is_mill mu conditions used previously.

Variable mu: moves.

Sources and destinations are disjoint from all temporary registers.

Hypothesis mu_no_temporaries_src:
  forall s, In s (srcs mu) -> disjoint_temps s.

Hypothesis mu_no_temporaries_dst:
  forall d, In d (dests mu) -> disjoint_temps d.

Destinations are pairwise disjoint.

Hypothesis mu_dest_pairwise_disjoint:
  pairwise_disjoint (dests mu).

Sources and destinations do not partially overlap.

Hypothesis mu_src_dst_no_overlap:
  forall s d, In s (srcs mu) -> In d (dests mu) -> no_overlap s d.

Distinct temporaries do not partially overlap.

Hypothesis temps_no_overlap:
  forall t1 t2, is_temp t1 -> is_temp t2 -> no_overlap t1 t2.

The following lemmas show that mu is a windmill and does not contain temporary registers.

Lemma disjoint_list_notin:
  forall r l, disjoint_list r l -> ~In r l.


Lemma pairwise_disjoint_norepet:
  forall l, pairwise_disjoint l -> list_norepet l.


Lemma disjoint_temps_not_temp:
  forall r, disjoint_temps r -> is_not_temp r.


Lemma mu_is_mill:
  is_mill mu.


Lemma mu_move_no_temp:
  move_no_temp mu.


We define the ``adherence'' of the problem mu as the set of registers that partially overlap with one of the registers possibly assigned by the parallel move: destinations and temporaries. Again, we define the complement of the ``adherence'' set, which is more convenient for Coq reasoning.

Definition no_adherence (r: reg) : Prop :=
  forall x, In x (dests mu) \/ is_temp x -> no_overlap x r.

As a consequence of the hypotheses on mu, none of the destination registers, source registers, and temporaries belong to the adherence.

Lemma no_overlap_pairwise:
  forall r1 r2 m, pairwise_disjoint m -> In r1 m -> In r2 m -> no_overlap r1 r2.


Lemma no_adherence_dst:
  forall d, In d (dests mu) -> no_adherence d.


Lemma no_adherence_src:
  forall s, In s (srcs mu) -> no_adherence s.


Lemma no_adherence_tmp:
  forall t, is_temp t -> no_adherence t.


The relation env_match holds between two environments e1 and e2 if they assign the same values to all registers not in the adherence set.

Definition env_match (e1 e2: env) : Prop :=
  forall r, no_adherence r -> e1 r = e2 r.

The following lemmas relate the effect of executing moves using normal, overlap-unaware update and weak, overlap-aware update.

Lemma weak_update_match:
  forall e1 e2 s d,
  (In s (srcs mu) \/ is_temp s) ->
  (In d (dests mu) \/ is_temp d) ->
  env_match e1 e2 ->
  env_match (update d (e1 s) e1)
            (weak_update d (e2 s) e2).


Lemma weak_exec_seq_match:
  forall m e1 e2,
  (forall s, In s (srcs m) -> In s (srcs mu) \/ is_temp s) ->
  (forall d, In d (dests m) -> In d (dests mu) \/ is_temp d) ->
  env_match e1 e2 ->
  env_match (exec_seq m e1) (weak_exec_seq m e2).


End OVERLAP.

These lemmas imply the following correctness theorem for the parmove2 function, taking partial register overlap into account.

Theorem parmove2_correctness_with_overlap:
  forall sl dl,
  List.length sl = List.length dl ->
  (forall r, In r sl -> disjoint_temps r) ->
  (forall r, In r dl -> disjoint_temps r) ->
  pairwise_disjoint dl ->
  (forall s d, In s sl -> In d dl -> no_overlap s d) ->
  (forall t1 t2, is_temp t1 -> is_temp t2 -> no_overlap t1 t2) ->
  forall e,
  let e' := weak_exec_seq (parmove2 sl dl) e in
  List.map e' dl = List.map e sl /\
  forall r, disjoint_list r dl -> disjoint_temps r -> e' r = e r.


End PARMOV.