The Why of Juvix: Ingredients & Architecture
*Image source: Aditya via Wikimedia Commons, CC-BY-SA 3.0
In addition to protocol development, in the past several months METASTATE has embarked upon a new project: research & development of a novel smart contract language, Juvix.
Juvix is designed to address the problems that we have experienced while trying to write & deploy decentralised applications and that we observe in the ecosystem at large: the difficulty of effective verification, the ceiling of compositional complexity, the illegibility of execution costs, and the lock-in to particular backends.
In order to do so, Juvix draws upon and aims to productionise a deep reservoir of prior academic research in programming language design & type theory which we believe has a high degree of applicability to these problems.
There should be a substantial bar to meet before electing to write a new language. After investigating many simpler approaches and developing distributed ledgers & smart contracts ourselves, we've decided that this bar, for the use-case of smart contracts on public ledgers, is met — there are many unique, fundamentally difficult problems which can be convincingly solved at the language level, but only by designing & engineering a language and compiler stack from scratch.
This post, the second part of a two-part series, enumerates the various theoretical ingredients in Juvix that we think will enable it to meet these challenges, describes the current state of specification and implementation, and provides a list of further resources & instructions should you wish to learn more.
Key ingredients
- How does Juvix aim to realise these goals?
- Which key ingredients are drawn from the theoretical literature to provide the requisite capabilities?
Dependent types
The first and most essential ingredient is that of dependent types: a typesystem expressive enough to allow terms written in Juvix to express properties about the computation behaviour of other terms.
For example (not yet in Juvix), the fairness of Tendermint proposer election has been partially verified in Idris, which has the same kind of dependent typesystem:
fairlyProportional :
(idA : ProposerId) -> (idB : ProposerId) ->
(wA : ProposerWeight) -> (wB : ProposerWeight) ->
(pA : ProposerPriority) -> (pB: ProposerPriority) ->
(n : Nat) ->
(wA >= 0 = True) -> (wB >= 0 = True) ->
(abs(pA - pB) <= (wA + wB) = True) ->
((count idA (snd (incrementElectMany n ((idA, wA, pA), (idB, wB, pB)))))
>= ((n * (wA / (wA + wB))) - 1) = True,
(count idA (snd (incrementElectMany n ((idA, wA, pA), (idB, wB, pB)))))
<= ((n * (wA / (wA + wB))) + 1) = True)
incrementElectMany
, the proposer-election function, is the term whose behaviour we want to reason about, and inhabiting the type of fairlyProportional
with a valid term proves that it behaves in the fashion we intend.
In English, this proof could be read as "a validator, in a sequence of proposer elections where no other power changes take place, proposes no fewer blocks than the total blocks in the epoch multiplied by its fraction of stake less one, and proposes no more blocks than the total blocks in the epoch multiplied by its fraction of stake plus one".
Dependent types as implemented in Juvix satisfy all three dimensions of the lambda cube: terms can depend on types, types can depend on types, and types can depend on terms. In accordance with the Curry-Howard correspondence, types in Juvix can express propositions, or properties of terms, and programs which inhabit those types serve as proofs of the propositions.
There are several different formal verification techniques, and they each have different trade-offs. Dependent types are quite expressive, since the conversion rule is just beta reduction, and allow for property verification in the same language, which removes the need for an error-prone translation between a specification and implementation. They allow for progressive verification — contracts can initially be written "unverified", and proofs can be added over time as the logic of an application-in-development solidifies. We have found that dependent types are elegant and become reasonably intuitive with practice, but more tooling support will undoubtedly be necessary to smooth the learning curve and ease some of the proof bureaucracy, which can be onerous at times.
Usage quantisation
The second theoretical ingredient Juvix brings to the table, used both to allow developers to
reason even more precisely about the behaviour of their programs and to ensure that using dependent types for formal verification incurs no runtime cost, is usage quantisation, drawn from the Quantitative Type Theory paper and extended by us.
Typing judgements in Juvix are annotated with precise usages:
x_1 \overset{ρ_1}{:} S_1, \dots , x_n \overset{ρ_n}{:} S_n \vdash\ M \overset{σ}{:} T
These usage annotations indicate how many times the subject of the judgement can be "computed with", or used. This is a form of linear types, but much more expressive: usage annotations are numerical and can depend on terms just like types can. Usage annotations are checked at compile-time for correctness. Since proofs of properties need not be computed over and can be constructed in the zero-usage fragment, Juvix can then erase proofs at compile time after checking them, ensuring that property verification creates no runtime overhead. Detailed usage accounting also allows Juvix to avoid garbage collection, optimise stack usage, and safely modify values in-place, in various situations depending on the types and the granularity of usage annotations provided.
Whole-program optimisation & efficient execution
Juvix takes advantage of the unique trade-offs of the smart contract use case — small source-code sizes and infrequent deployment, which allow for longer compile times and wider optimisation searches — to enable several kinds of whole-program optimisations. Whole-program optimisation encapsulates a collection of techniques & optimisation passes rather than any specific one, drawn from multiple inspirations including GRIN, STOKE and Morte, including compile-time evaluation, inlining, fusion, case simplification, common sub-expression elimination, various dead code elimination checks, and rewriting provably equivalent terms.
For certain backends, Juvix can evaluate lambda calculus terms using the abstract model of interaction nets, as outlined by Lamping, Asperti, and Guerrini and inspired by prior work.
Interaction nets minimise the number of beta reductions by sharing evaluations, perform runtime fusion, support automatic parallelism, and handle large closures efficiently. They come with some overhead, but this can be mitigated with bespoke encoding — generating custom rewrite rules at compile-time — for small, performance-critical functions.
Resource verification
Leveraging the dependent typechecker already in place, Juvix uses techniques from a 2008 POPL paper to check computational resource consumption of terms without executing them by encapsulating computations in a cost monad, where costs can depend on terms in the usual dependent fashion. This requires some annotations by the developer, but tooling can be improved over time, and cost bounds can be added in a progressive fashion just like proofs of properties.
In the future, if ledgers elect to integrate a limited version of the typechecker directly into their state machines, this resource verification mechanism can be used to eliminate runtime gas metering entirely: each contract would have a cost function, computable depending on arguments to the call, and the gas (exactly computable) could be deducted at the beginning of the function prior to execution. This resource verification model can also easily incorporate different denominations of resources (compute, storage, etc.) by tracking several costs at once.
Backend parameterisation
Juvix parameterises the type theory & core language over a set of primitive data types and primitive values, which can include native data types such as strings, integers, or sets, and native functions such as addition, subtraction, string concatenation, set membership, etc.
The language & typechecker can then be instantiated over a particular backend which provides concrete sets of primitives and a primitive type-checking relation. Backends are integrated into the language itself as special values, so that the standard library can nicely abstract primitives with different underlying implementations into common typeclasses for numbers, strings, etc. and resolve operators to the correct backend at compile time as instructed by the developer.
This will allow contract authors to write a single contract in the frontend language and target different backends with compiler switches or compile to more efficient upgraded versions of a virtual machine later on, with minimal or no code changes, and also eases the path towards integrated typechecking of applications with components on multiple backends (perhaps multiple ledgers and a zkSNARK circuit, for example).
Current state of affairs
Engineering a language & compiler stack from scratch is tricky business, and undoubtedly we'll make mistakes and change our mind along the way. In order to avoid following too many false leads, we've taken a specification-first approach with Juvix, trying first to reason through all the thorny problems and write down exactly what we plan to do and how the language will work. Specification-wise, Juvix currently has a detailed language reference, the start of a type theory formalisation in Agda, and detailed plans for future work on resource verification, whole-program optimisation, and other features. Implementation-wise, Juvix has a typechecker for the first version of the type theory, a core language and several intermediate forms, erasure of zero-usage terms, a backend parameterisation system, and the start of two backends: one for Michelson and one for LLVM / WASM (through LLVM).
Juvix is not yet ready for end-user or developer use. We expect further work on the frontend language and an initial public release in Q1/Q2 2020. Contributions & critiques alike are welcome anytime.
Learn more
- Juvix repository
- Juvix language reference
- Learn more about dependent types in Agda
- Two other interesting projects applying type theory to distributed ledger use-cases: Formality & Statebox
Written by Christopher Goes, co-founder of Metastate.
Metastate has ceased its activities on the 31st of January 2021. The team members have joined a new company and working on a new project. If you're interested in programming language theory, functional programming, type theory, formal verification, or compiler engineering positions, check out the open positions at Heliax.
If you have any feedback or questions on this article, please do not hesitate to contact us: hello@heliax.dev.