De Bruijn indices, in TAPL or PLFA style?

Posted on 2024-03-20

Much research on de Bruijn indices focuses on the definition of the subst function and how it commutes with the shift/rename operations, mostly in the context of the untyped lambda calculus.

In this article, we look at it from a different perspective: how to define and generalise the weakening lemma, and how the generalisation affects the design choices about shifting and contexts.

We demonstrate two different approaches in STLC using Agda with well-scoped syntax. The reason for choosing Agda is the author’s familiarity; the reason for choosing well-scoped syntax is that it is less error-prone to write code in, without losing scalability. One more virtue, perhaps, is that the types of many operations are self-explanatory.

Common Syntax

We first introduce the syntax that does not differ between the two approaches.

Types have two constructors: Int and the function type.

data Type : Set where
  Int    : Type
  _`→_   : (A : Type) → (B : Type) → Type

Terms are indexed by a number, which specifies the number of free variables that can be used in the term. There are four constructors:

  • literal numbers: a literal is a term with an arbitrary number of free variables;
  • variables: the variable index must be less than n; we use Fin n to enforce this;
  • lambdas: the body has one more free variable than the whole expression, since one variable is introduced by the binder;
  • applications: both subterms have the same number of free variables.
data Term : ℕ → Set where
  lit   : (i : ℕ) → Term n
  `_    : (x : Fin n) → Term n
  ƛ_⇒_  : (A : Type) → (e : Term (1 + n)) → Term n
  _·_   : (e₁ : Term n) → (e₂ : Term n) → Term n
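
For a concrete feel of the indices, here are two example terms (a sketch; the names idInt and openTm are mine, and #0/#S are assumed to be the Fin pattern synonyms used throughout this post). The identity function is closed, while a term whose body mentions a variable beyond its own binder needs one free variable:

-- the identity function on Int: closed, so it lives in Term 0
idInt : Term 0
idInt = ƛ Int ⇒ ` #0

-- the body uses the bound variable #0 and one free variable #S #0
openTm : Term 1
openTm = ƛ Int ⇒ (` #0 · ` (#S #0))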

Typing Contexts

We now reach a point where we have some options. We could define contexts as a datatype with two constructors, closest to the paper presentation:

Datatype Definition

  • an empty context has no free variables;
  • a context followed by a type has one more free variable.

Contexts are accessed by a number; the lookup is defined by simultaneous induction on the context and the number, until we reach #0. Note that the type Context n → Fin n → Type ensures that we can always find the type of a variable in a context.

infix 6 _,_

data Context : ℕ → Set where
  ∅     : Context 0
  _,_   : Context n → (A : Type) → Context (1 + n)

infix 3 _∋_⦂_
data _∋_⦂_ : Context n → Fin n → Type → Set where
  Z : ∀ {Γ : Context n} {A}
    → Γ , A ∋ #0 ⦂ A
  S : ∀ {Γ : Context n} {k A B}
    → Γ ∋ k ⦂ A
    → Γ , B ∋ #S k ⦂ A
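
To make the remark about the lookup type concrete, a total lookup function of type Context n → Fin n → Type can be written as follows (a sketch; the relation above is what the rest of the post uses):

-- total: a Fin n index can never run past the end of a Context n
lookup : Context n → Fin n → Type
lookup (Γ , A) #0     = A
lookup (Γ , A) (#S k) = lookup Γ k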

Vector Definition

We could use a Vec (the length-indexed version of List), which is similar to the datatype definition; a minimal sketch is shown below.
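
Assuming Vec from the standard library, the sketch could be:

open import Data.Vec using (Vec)

Context : ℕ → Set
Context n = Vec Type n

-- lookup from Data.Vec (giving a Type for every Fin n) plays the
-- role of lookup, and _∷_ plays the role of extension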

Function Definition

Contexts appear in several situations:

  • Lookup: given a number, we need to find its associated type, which is used in the typing rule for variables;
  • Extension: we need to insert a type into the context, which is used in the typing rule for lambdas.

These two operations can be functionalised as:

context : ℕ → Set
context n = Fin n → Type

Then lookup is function application, and extension wraps this function with more arguments. A similar idea can be found in Software Foundations.
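
For illustration, here is one way to realise extension, reusing the _,_ notation from the datatype option (a sketch, in a separate module to avoid a name clash, and assuming the #0/#S pattern synonyms above): the new context returns the inserted type at index zero and defers to the old context otherwise.

_,_ : context n → Type → context (1 + n)
(Γ , A) #0     = A      -- the newly inserted type
(Γ , A) (#S x) = Γ x    -- everything else comes from the old context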

Typing

Typing rules differ slightly depending on the context definition. We use the datatype definition here to demonstrate.

There are four rules:

  • literals are always of type Int;
  • variables have type A if looking up x in the context yields A;
  • lambdas have type A → B if the body has type B under the context extended with A;
  • applications have type B if the function has type A → B and the argument has type A.
data _⊢_⦂_ : Context n → Term n → Type → Set where

  ⊢lit : ∀ {i}
    → Γ ⊢ (lit i) ⦂ Int

  ⊢var : ∀ {x A}
    → Γ ∋ x ⦂ A
    → Γ ⊢ ` x ⦂ A

  ⊢abs : ∀ {e A B}
    → Γ , A ⊢ e ⦂ B
    → Γ ⊢ ƛ A ⇒ e ⦂ A `→ B

  ⊢app : ∀ {e₁ e₂ A B}
    → Γ ⊢ e₁ ⦂ A `→ B
    → Γ ⊢ e₂ ⦂ A
    → Γ ⊢ e₁ · e₂ ⦂ B
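
For instance, using the idInt example from earlier (a sketch, assuming the ∅ constructor of Context), the identity function typechecks in the empty context:

⊢idInt : ∅ ⊢ idInt ⦂ Int `→ Int
⊢idInt = ⊢abs (⊢var Z)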

Weakening

Now we reach the decision-making point: how to define the general version of the weakening lemma, and how it affects the design of the shift operation.

The most concrete definition is:

weaken0 : ∀ {Γ : Context n} {e A B}
  → Γ ⊢ e ⦂ A
  → Γ , B ⊢ ↑tm0 e ⦂ A

If a term e has type A in the context Γ, then the shifted term ↑tm0 e has the same type A in the extended context Γ , B. Specifically, ↑tm0 e shifts e by one at the lowest level: each free variable in e is increased by one. This is intuitive: we extend the context at the end, so each free variable pointing into the context becomes one more “mile” away than its original distance.

However, weaken0 is too specific to prove directly; we usually need a general version of it, which can differ in two ways:

  • generalise the level at which we increase the indices: we could allow the term to be shifted at level k, which means only free variables greater than or equal to k are increased by one. This design corresponds to inserting the type B into the middle of the context, at position k, so only types after this position are affected.

  • generalise the operation with which we manipulate the indices: we could define an operation ρ : ℕ → ℕ to change the indices, but still at level 0. This corresponds to “how many types” we insert at the end of the context (an almost correct explanation).

These two design choices affect the definition of the shift operation. I call them the TAPL style and the PLFA style to distinguish them.

TAPL style of Shift and Weakening Lemma

The shift operation is defined as below. As mentioned, we only shift indices greater than or equal to k. punchIn is an Agda standard library function with behaviour punchIn i j = if j ≥ i then j + 1 else j.

↑tm : Fin (suc n) → Term n → Term (suc n)
↑tm k (lit i)    = lit i
↑tm k (` x)      = ` (punchIn k x)
↑tm k (ƛ A ⇒ e)  = ƛ A ⇒ (↑tm (#S k) e)
↑tm k (e₁ · e₂)  = ↑tm k e₁ · ↑tm k e₂

↑tm0 : Term n → Term (suc n)
↑tm0 = ↑tm #0
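
As a quick sanity check (a sketch; ex is a name of mine, and propositional equality is assumed to be imported), shifting at level #S #0 leaves #0 untouched and moves #S #0 out of the way:

open import Relation.Binary.PropositionalEquality using (_≡_; refl)

-- shift the two-variable term from scope 2 into scope 3, at level 1
ex : Term 3
ex = ↑tm (#S #0) (` #0 · ` (#S #0))

_ : ex ≡ ` #0 · ` (#S (#S #0))
_ = refl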

The weakening lemma further requires an operation that inserts a type into the kth position.

infix 4 [_<<_]_
[_<<_]_ : Context n → Type → Fin (suc n) → Context (suc n)
[ Γ << B ] #0 = Γ , B
[ Γ , A << B ] #S k = ([ Γ << B ] k) , A

Then the generalised weakening lemma is defined as:

weaken : ∀ {Γ : Context n} {k : Fin (suc n)} {e A B}
  → Γ ⊢ e ⦂ A
  → [ Γ << B ] k ⊢ ↑tm k e ⦂ A
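
The concrete weaken0 is then just the instance at the lowest position, since [ Γ << B ] #0 reduces to Γ , B and ↑tm0 is ↑tm #0 by definition (a sketch, with a primed name to avoid clashing with the earlier declaration):

weaken0′ : ∀ {Γ : Context n} {e A B}
  → Γ ⊢ e ⦂ A
  → Γ , B ⊢ ↑tm0 e ⦂ A
weaken0′ = weaken {k = #0}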

PLFA style of Shift and Weakening Lemma

The PLFA style, on the other hand, generalises the operation with which we manipulate the indices; we call this operation a Renaming. The rename function is then the generalised form of shifting, which applies the Renaming recursively to the term.

Renaming : ℕ → ℕ → Set
Renaming n m = Fin n → Fin m

rename : Renaming n m → Term n → Term m
rename ρ (lit i) = lit i
rename ρ (` x) = ` (ρ x)
rename ρ (ƛ A ⇒ e) = ƛ A ⇒ rename (ext ρ) e
rename ρ (e₁ · e₂) = rename ρ e₁ · rename ρ e₂
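
rename pushes the renaming under a binder using an ext helper, which is not shown above. A standard definition, following PLFA (an assumption about what is intended here), keeps the newly bound #0 fixed and shifts the renamed variables past the binder:

ext : Renaming n m → Renaming (1 + n) (1 + m)
ext ρ #0     = #0           -- the fresh bound variable stays put
ext ρ (#S x) = #S (ρ x)     -- old variables are renamed, then shifted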

We can recover our ↑tm0 by instantiating ρ. The signature is very useful here: we shift a term from level n to level 1 + n.

ρ-incr : Renaming n (1 + n)
ρ-incr x = #S x

↑tm0 : Term n → Term (1 + n)
↑tm0 = rename ρ-incr
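
In fact, the two styles visibly meet here: the TAPL-style shift at level k is rename applied to punchIn k. The statement can be written down like the lemmas above (a statement sketch of mine; it is provable by induction on e, using that ext (punchIn k) and punchIn (#S k) agree pointwise):

↑tm-as-rename : ∀ (k : Fin (suc n)) (e : Term n)
  → ↑tm k e ≡ rename (punchIn k) e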

However, the trouble we then encounter is that it is hard to state the general version of the weakening lemma with the datatype definition of contexts. The intuition is that the new context should be the old context extended with a sequence of types reflecting the ρ operation.

The easiest approach, which I learnt from coq-community/autosubst, is to define contexts as functions; then the new context is just the composition of the old context with ρ. The details can be found here: ty_ren.
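
Under the function definition (context n = Fin n → Type), and assuming the typing judgment were re-defined over such contexts, the generalised lemma could be stated roughly as below (a sketch of the statement only; the name weaken-ρ is mine):

weaken-ρ : ∀ {Γ : context n} {Δ : context m} (ρ : Renaming n m) {e A}
  → (∀ x → Δ (ρ x) ≡ Γ x)    -- Δ composed with ρ agrees with Γ
  → Γ ⊢ e ⦂ A
  → Δ ⊢ rename ρ e ⦂ A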

Conclusion

I believe the two styles just generalise the shift function in two dimensions. Neither is better than the other, but under different conditions we need to make a choice. For example, if I often need to use the weakening/strengthening lemmas and to directly manipulate the context, then the TAPL style is more suitable for me. However, if I need to frequently commute subst/rename operations, then the PLFA style is more suitable for me.

Note: in the PLFA book, the weakening lemma can be defined directly on the datatype definition, but I believe it benefits from the intrinsically-typed approach, which does not carry over to the plain or well-scoped approaches.

Appendix: Substitution

The substitution operations of the two styles are put here for reference.

-- TAPL style
_[_:=_] : Term (suc n) → Term n → Fin (suc n) → Term n
lit i [ v := k ] = lit i
(` x) [ v := k ] with k #≟ x
... | yes refl = v
... | no ¬p    = ` (punchOut {i = k} {j = x} ¬p)
(ƛ A ⇒ e) [ v := k ] = ƛ A ⇒ (e [ ↑tm0 v := #S k ])
(e₁ · e₂) [ v := k ] = e₁ [ v := k ] · e₂ [ v := k ]

_[_] : Term (suc n) → Term n → Term n
e [ v ] = _[_:=_] e v #0

-- PLFA style
-- (the Substitution type is assumed: a substitution maps variables
--  in scope n to terms in scope m)
Substitution : ℕ → ℕ → Set
Substitution n m = Fin n → Term m

exts : Substitution n m → Substitution (1 + n) (1 + m)
exts σ #0 = ` #0
exts σ (#S x) = rename ρ-incr (σ x)

subst : Substitution n m → Term n → Term m
subst σ (lit i) = lit i
subst σ (` x) = σ x
subst σ (ƛ A ⇒ e) = ƛ A ⇒ subst (exts σ) e
subst σ (e₁ · e₂) = subst σ e₁ · subst σ e₂

subst-zero : Term n → Substitution (1 + n) n
subst-zero e #0 = e
subst-zero e (#S x) = ` x
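
The single substitution used in β-reduction is then recovered by instantiating σ with subst-zero, mirroring the TAPL-style _[_] above (following PLFA):

_[_] : Term (1 + n) → Term n → Term n
e [ v ] = subst (subst-zero v) e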

References and Further Reading

We list the choices that some formalisations take:

Side Story of de Bruijn

POPLmark gives some opinions on the de Bruijn approach at the beginning:

Another popular concrete representation is de Bruijn’s nameless representation. De Bruijn indices are easy to understand and support the full range of induction principles needed to reason over terms. In our experience, however, de Bruijn representations have two major flaws. First, the statements of theorems require complicated clauses involving “shifted” terms and contexts. These extra clauses make it difficult to see the correspondence between informal and formal versions of the same theorem—there is no question of simply typesetting the formal statement and pasting it into a paper. Second, while the notational clutter is manageable for “toy” examples of the size of the simply-typed lambda calculus, we have found it becomes quite a heavy burden even for fairly small languages like F<:.

However, in the POPLmark 15 Year Retrospective Panel, survey results showed that more and more people seem to be sticking with “good old deBruijn”.