---
title: Talk at Formal Methods for Statistical Software workshop, Martin Bodin
---
[Martin Bodin](https://vtss.doc.ic.ac.uk/people/bodin.html) gave a talk “A Trustworthy Mechanized Formalization of R“ at the 
[Formal Methods for Statistical Software (FMfSS 2019)](https://samate.nist.gov/FMSwVRodeo/FMfSS2019.html) 
workshop, to present his work on big-step operational semantics for R.  The abstract for the talk is below. 

The FMfSS workshop aims to identify formal methods, tools, and techniques that 
can be used to gain higher assurance of statistical software. The workshop is 
co-located with the [High Confidence Software and Systems Conference](https://cps-vo.org/group/hcss_conference) 
and is organised by the U.S. Census Bureau and the [National Institute of Standards and Technology (NIST)]( https://www.nist.gov/)

Abstract: The R programming language is very popular for developing statistical 
software and data analysis, thanks to rich libraries, concise and expressive syntax, 
and support for interactive programming. Yet, the semantics of R is fairly complex, 
contains many subtle corner cases, and is not formally specified. This makes 
it difficult to reason about R programs.
In this work, we develop a big-step operational semantics for R in the form of 
an interpreter written in the Coq proof assistant. We ensure the trustworthiness 
of the formalization by introducing a monadic encoding that allows the Coq interpreter, 
CoqR, to be in direct visual correspondence with the reference R interpreter, 
GNU R. Additionally, we provide a testing framework that supports systematic 
comparison of CoqR and GNU R. In its current state, CoqR covers the nucleus 
of the R language as well as numerous additional features, making it pass 
a significant number of realistic test cases from the GNU R and FastR projects. 
To exercise the formal specification, we prove in Coq the preservation 
of memory invariants in selected parts of the interpreter. 
This work is an important first step towards a robust environment for 
formal verification of R programs.