-- This module defines a simple programming language with a static -- type system and proves that this language is type safe, i.e. that -- all programs, which are accepted by the type checker, will never -- cause a runtime error when executed. -- -- The language is very simple supporting only expressions built from -- constant values (booleans and naturals) and some primitive -- operations (+, <, ∧). For example the program -- -- ((2 + 3) < 5) ∧ true -- -- evaluates to the constant boolean value -- -- false -- -- In particular, the language does not support variables or -- functions, which will be covered in the first chapter of the second -- part of the PLFA book. -------------------------------------------------------------------------------- -- 1. Syntax -- The Syntax of a language describes the structure of its programs. -- -- In the literature, this is usually presented as a grammar in ENBF -- notation, which for our language would look like this: -- -- n ::= 0 | suc n (Natural Numbers) -- b ::= false | true (Booleans) -- e ::= n | b | e + e | e < e | e ∧ e (Expressions) -- -- In Agda, the EBNF notations directly translate to inductive datatypes: -- Natural Numbers data ℕ : Set where zero : ℕ suc : ℕ → ℕ -- Booleans data 𝔹 : Set where false : 𝔹 true : 𝔹 -- Expressions data Expr : Set where nat : ℕ → Expr bool : 𝔹 → Expr _+'_ : Expr → Expr → Expr _<'_ : Expr → Expr → Expr _∧'_ : Expr → Expr → Expr {-# BUILTIN NATURAL ℕ #-} -- allows us to write `2` instead of `suc (suc zero)` -- Example: The syntax tree representing the expression "(2 + 3) < 5" ex₁ : Expr ex₁ = (nat 2 +' nat 3) <' nat 5 -------------------------------------------------------------------------------- -- 2. Semantics -- The semantics of a language describes how its programs are -- evaluated. -- -- There are multiple techniques of defining the semantics of a -- language, which yield different trade-offs in how easy it is to -- prove certain properties about the languages. -- -- In this tutorial, we specify the semantics as a so called -- *small-step operational semantics*. The key idea here is to view -- the evaluation of a program `e` as a sequence of small computation -- steps -- -- e ↪ e₁ ↪ e₂ ↪ e₃ ↪ … -- -- where each step `eₙ ↪ eₙ₊₁` transforms the program into a -- potentially simpler program. This either goes on forever -- (describing a non-terminating program) or after a finite amount of -- steps a program is reached to which no further computation steps can be -- applied. The latter case then can mean one of two things: either -- - the program has successfully terminated by being transformed into -- a value, like `true` or `5`; or -- - the program reached an invalid state like `true + 5` where no -- computation rule can be applied, which in an implementation would -- cause a runtime error. -- -- For example in our language, we want the following sequence of -- computation steps to be possible -- -- ((2 + 3) < 5) ↪ (5 < 5) ↪ false -- -- To model this in Agda, we define a binary relation between our -- programs -- -- _↪_ : Expr → Expr → Set -- -- such that `e₁ ↪ e₂` holds if `e₁` can be transformed into `e₂` by -- performing one computation step. -- The `Value` Relation defines which expressions we consider "values", -- i.e. results of successful computations. data Value : Expr → Set where -- Natural numbers are values v-nat : ∀ {n : ℕ} → Value (nat n) -- Booleans are values v-bool : ∀ {b : 𝔹} → Value (bool b) -- Expressions like `(nat 2 +' nat 3)` are *not* values, since they -- represent unfinished computations. -- Next we define the semantics of our three operators _+'_, _<'_ and _∧'_ -- since we need that to define _↪_ _+_ : ℕ → ℕ → ℕ zero + n = n suc m + n = suc (m + n) _<_ : ℕ → ℕ → 𝔹 zero < zero = false zero < suc n = true suc m < zero = false suc m < suc n = m < n _∧_ : 𝔹 → 𝔹 → 𝔹 true ∧ true = true _ ∧ _ = false -- The actual semantics relation is defined by two kinds of rules: data _↪_ : Expr → Expr → Set where -- The β-rules describe how the actual computation happens. This is -- where our helper functions _+_, _<_ and _∧_ are used. ↪-β-+ : ∀ {n₁ n₂ : ℕ} → (nat n₁ +' nat n₂) ↪ nat (n₁ + n₂) ↪-β-< : ∀ {n₁ n₂ : ℕ} → (nat n₁ <' nat n₂) ↪ bool (n₁ < n₂) ↪-β-∧ : ∀ {b₁ b₂ : 𝔹} → (bool b₁ ∧' bool b₂) ↪ bool (b₁ ∧ b₂) -- The ξ-rules allow for the β-rules to be applied in sub -- expressions (see examples below). -- -- We also use the ξ-rules to specify an evaluation order, i.e. that -- in `e₁ +' e₂` we always evaluate `e₁` first before we evaluate -- `e₂`. This is achieved through the `Value e₁` assumptions in -- some of the rules. Specifiying an evaluation order like this, -- also implies that `_↪_` is deterministic, i.e. that whenever -- `e ↪ e₁` and `e ↪ e₂` then `e₁ ≡ e₂`. ↪-ξ-+₁ : ∀ {e₁ e₂ e₁' : Expr} → e₁ ↪ e₁' → (e₁ +' e₂) ↪ (e₁' +' e₂) ↪-ξ-+₂ : ∀ {e₁ e₂ e₂' : Expr} → Value e₁ → e₂ ↪ e₂' → (e₁ +' e₂) ↪ (e₁ +' e₂') ↪-ξ-<₁ : ∀ {e₁ e₂ e₁' : Expr} → e₁ ↪ e₁' → (e₁ <' e₂) ↪ (e₁' <' e₂) ↪-ξ-<₂ : ∀ {e₁ e₂ e₂' : Expr} → Value e₁ → e₂ ↪ e₂' → (e₁ <' e₂) ↪ (e₁ <' e₂') ↪-ξ-∧₁ : ∀ {e₁ e₂ e₁' : Expr} → e₁ ↪ e₁' → (e₁ ∧' e₂) ↪ (e₁' ∧' e₂) ↪-ξ-∧₂ : ∀ {e₁ e₂ e₂' : Expr} → Value e₁ → e₂ ↪ e₂' → (e₁ ∧' e₂) ↪ (e₁ ∧' e₂') -- Examples -- Here the computation happens on the outside, so we can directly -- apply a β-rule. _ : (nat 2 +' nat 2) ↪ nat 4 _ = ↪-β-+ -- Here the computation happens inside the left sub expression, so we -- need a ξ-rule to lift the β-rule inside the sub expression. _ : ((nat 2 +' nat 0) +' nat 2) ↪ (nat 2 +' nat 2) _ = ↪-ξ-+₁ ↪-β-+ -------------------------------------------------------------------------------- -- 3. Type System -- The syntax of types data Type : Set where Nat : Type Bool : Type -- The typing relation: `e ⦂ t` states that expression `e` has type `t`. data _⦂_ : Expr → Type → Set where -- All natural number values have type `Nat`. ⦂-nat : ∀ {n : ℕ} → ----------- nat n ⦂ Nat -- All boolean values have type `Bool`. ⦂-bool : ∀ {b : 𝔹} → ----------- bool b ⦂ Bool -- An expression `e₁ +' e₂` has type `Nat`, -- if both sub expressions `e₁` and `e₂` have type `Nat`. ⦂-+ : ∀ {e₁ e₂ : Expr} → e₁ ⦂ Nat → e₂ ⦂ Nat → ---------------- (e₁ +' e₂) ⦂ Nat -- Analogous to ⦂-+ ⦂-∧ : ∀ {e₁ e₂ : Expr} → e₁ ⦂ Bool → e₂ ⦂ Bool → ---------------- (e₁ ∧' e₂) ⦂ Bool -- Analogous to ⦂-+ ⦂-< : ∀ {e₁ e₂ : Expr} → e₁ ⦂ Nat → e₂ ⦂ Nat → ---------------- (e₁ <' e₂) ⦂ Bool -- Examples _ : nat 2 ⦂ Nat _ = ⦂-nat _ : (nat 2 +' nat 3) ⦂ Nat _ = ⦂-+ ⦂-nat ⦂-nat _ : ((nat 2 +' nat 3) <' nat 5) ⦂ Bool _ = ⦂-< (⦂-+ ⦂-nat ⦂-nat) ⦂-nat -- However, the following statements are not true: -- -- (nat 2 +' bool true) ⦂ Nat -- (nat 2 +' bool true) ⦂ Bool -------------------------------------------------------------------------------- -- 4. Type Safety -- Now that we defined both the languages semantics and type system, -- the question arises if the type system is actually correct, i.e. if -- it really rejects all programs that cause runtime errors, and that -- the programs it accepts actually behave accordingly to their type. -- -- In other words: if e ⦂ t then either e will either run forever, or -- after some number of computation steps it has been transformed into -- a value e' with e' ⦂ t. -- -- Instead of proving this statement directly, people usually prove -- the following two lemmas instead, which easily allow to conclude the -- actual statement: -- -- - `preservation` states that if a program has a type and we apply a -- single computation step to it, then the resulting program still -- has the same type; -- -- - `progress` states that if a program has a type, then either it is -- a value or it is possible to perform another computation step. -- -- So if e ⦂ t, then due to `progress` we know that either e is -- already a value of the right type, or we can take another -- computation step e ↪ e'. In the latter case, we can apply -- `preservation` to get `e' ⦂ t`, on which we then can apply -- `progress` again, and so on. preservation : ∀ {e e' t} → e ⦂ t → e ↪ e' → -------- e' ⦂ t -- The proof is by induction on `e ↪ e'` followed by case analysis of `e ⦂ t`: preservation (⦂-+ e₁⦂Nat e₂⦂Nat) ↪-β-+ = ⦂-nat preservation (⦂-< e₁⦂Nat e₂⦂Nat) ↪-β-< = ⦂-bool preservation (⦂-∧ e₁⦂Bool e₂⦂Bool) ↪-β-∧ = ⦂-bool preservation (⦂-+ e₁⦂Nat e₂⦂Nat) (↪-ξ-+₁ e₁↪e₁') = ⦂-+ (preservation e₁⦂Nat e₁↪e₁') e₂⦂Nat preservation (⦂-+ e₁⦂Nat e₂⦂Nat) (↪-ξ-+₂ val-e₁ e₂↪e₂') = ⦂-+ e₁⦂Nat (preservation e₂⦂Nat e₂↪e₂') preservation (⦂-< e₁⦂Nat e₂⦂Nat) (↪-ξ-<₁ e₁↪e₁') = ⦂-< (preservation e₁⦂Nat e₁↪e₁') e₂⦂Nat preservation (⦂-< e₁⦂Nat e₂⦂Nat) (↪-ξ-<₂ val-e₁ e₂↪e₂') = ⦂-< e₁⦂Nat (preservation e₂⦂Nat e₂↪e₂') preservation (⦂-∧ e₁⦂Bool e₂⦂Bool) (↪-ξ-∧₁ e₁↪e₁') = ⦂-∧ (preservation e₁⦂Bool e₁↪e₁') e₂⦂Bool preservation (⦂-∧ e₁⦂Bool e₂⦂Bool) (↪-ξ-∧₂ val-e₁ e₂↪e₂') = ⦂-∧ e₁⦂Bool (preservation e₂⦂Bool e₂↪e₂') data Progress : Expr → Set where is-value : ∀ {e : Expr} → Value e → Progress e is-reducible : ∀ {e e' : Expr} → e ↪ e' → Progress e progress : ∀ {e t} → e ⦂ t → ---------- Progress e progress ⦂-nat = is-value v-nat progress ⦂-bool = is-value v-bool progress (⦂-+ e₁⦂Nat e₂⦂Nat) with progress e₁⦂Nat | progress e₂⦂Nat ... | is-value v-nat | is-value v-nat = is-reducible ↪-β-+ ... | is-value val-e₁ | is-reducible e₂↪e₂' = is-reducible (↪-ξ-+₂ val-e₁ e₂↪e₂') ... | is-reducible e₁↪e₁' | _ = is-reducible (↪-ξ-+₁ e₁↪e₁') progress (⦂-< e₁⦂Nat e₂⦂Nat) with progress e₁⦂Nat | progress e₂⦂Nat ... | is-value v-nat | is-value v-nat = is-reducible ↪-β-< ... | is-value val-e₁ | is-reducible e₂↪e₂' = is-reducible (↪-ξ-<₂ val-e₁ e₂↪e₂') ... | is-reducible e₁↪e₁' | _ = is-reducible (↪-ξ-<₁ e₁↪e₁') progress (⦂-∧ e₁⦂Bool e₂⦂Bool) with progress e₁⦂Bool | progress e₂⦂Bool ... | is-value v-bool | is-value v-bool = is-reducible ↪-β-∧ ... | is-value val-e₁ | is-reducible e₂↪e₂' = is-reducible (↪-ξ-∧₂ val-e₁ e₂↪e₂') ... | is-reducible e₁↪e₁' | _ = is-reducible (↪-ξ-∧₁ e₁↪e₁')