Rust’s Safety Net: A Formally Verified Compiler Takes Shape

Author: Denis Avetisyan

Researchers have built RustCompCert, a compiler that uses formal verification to guarantee memory safety and semantics preservation for a sequential subset of the Rust programming language.

RustCompCert is an end-to-end verified compiler built on CompCert, formalizing key Rust features like drop elaboration and borrow checking to ensure code correctness.

Despite growing demand for memory-safe systems, ensuring the correctness of modern compilers remains a significant challenge. The paper ‘RustCompCert: A Verified and Verifying Compiler for a Sequential Subset of Rust’ details the development of an end-to-end verified compiler built upon the formally verified CompCert framework. RustCompCert provides both semantics preservation (a guarantee that the compiled code exhibits only behaviors permitted by the source program) and memory safety through verified compilation, including a formalized borrow checking pass. By enabling rigorous verification of key compilation stages, can this approach unlock new levels of trust and reliability in safety-critical Rust applications?

The Inevitable Decay of Systems: A Necessary Pursuit

Contemporary software systems, increasingly integrated into critical infrastructure and daily life, require demonstrably higher levels of reliability than ever before. Traditional testing methodologies, while valuable, fundamentally struggle to provide this assurance due to the inherent difficulty of achieving exhaustive coverage. The sheer complexity of modern applications – millions of lines of code with intricate interactions – creates a vast state space that is practically impossible to fully explore through testing alone. Even with sophisticated test suites and extensive automation, subtle bugs and edge cases inevitably remain hidden, posing potential risks ranging from minor inconveniences to catastrophic failures. This limitation drives the need for complementary approaches, such as formal verification, which aim to mathematically prove the correctness of software rather than relying on empirical observation, though these approaches come with their own set of challenges regarding scalability and practical application.

Formal verification, a technique employing rigorous mathematical methods to prove that software meets its specification, presents a compelling, though difficult, route to guaranteed correctness. Unlike traditional testing, which can only demonstrate the presence of errors, verification aims to establish that a program adheres to its specification on every execution; the guarantee is only as strong as the specification and the proof checker it rests on. However, the application of formal methods is often hampered by significant scalability and complexity challenges; as systems grow in size and intricacy, the computational resources and human effort required to perform verification increase dramatically. This is due to the state-space explosion problem (the number of possible execution paths grows exponentially with program size) and the difficulty of formally specifying complex system requirements. Consequently, while promising for critical systems where failure is unacceptable, practical application of formal verification often necessitates compromises, such as verifying only specific components or simplifying the system model to a manageable scale.

Memory safety – the absence of vulnerabilities like buffer overflows and dangling pointers – represents a critical challenge in modern software development. While techniques like borrow checking, prominently featured in languages such as Rust, offer significant improvements by enforcing memory safety at compile time, they are not without limitations. These systems often struggle with complex data structures and dynamic behavior, requiring developers to navigate a sometimes-restrictive set of rules or resort to unsafe code blocks. Because the analysis runs entirely at compile time, it adds no runtime overhead, but it can lengthen compilation and, being conservative, it rejects some programs that are in fact safe. Current approaches frequently demand significant programmer effort to satisfy the borrow checker and may lack the expressiveness needed for some real-world patterns, motivating a continued search for more scalable and flexible solutions that guarantee memory safety without compromising performance or usability.

Polonius: Charting a More Precise Course to Memory Safety

Polonius is a reformulation of Rust’s borrow checker, developed as a prospective successor to the Non-Lexical Lifetimes (NLL) checker used by current Rust compilers, and it is the algorithm on which RustCompCert’s borrow checking pass is modeled. Its goal is the same as that of any borrow checker: to guarantee memory safety by ruling out dangling pointers and conflicting accesses to the same memory. What differs is the analytical method. Like NLL, Polonius is a purely static, compile-time analysis. Rather than computing a lifetime as the set of program points over which a reference must be valid, it tracks which loans (borrows) each reference may depend on and which of those loans are still live at each program point, and it reports an error when a statement accesses a place in a way that conflicts with a live loan. This formulation allows more flexible and precise checking, and it lends itself to a formal proof that every accepted program accesses memory validly throughout its execution.

Polonius utilizes alias analysis as a core component of its borrow checking process to establish relationships between distinct memory regions. This analysis determines whether multiple pointers or references could potentially point to the same memory location at any given time. By precisely identifying these aliasing relationships, Polonius can accurately track data flow and prevent data races or invalid memory access. The system constructs an alias graph representing these connections, enabling it to reason about the lifetimes and mutability of data without requiring the programmer to provide explicit lifetime annotations in many cases. This foundational step allows Polonius to verify memory safety by ensuring that accesses to memory are always valid based on the established relationships and ownership rules.

Polonius distinguishes itself from Non-Lexical Lifetimes (NLL) borrow checking through greater precision in identifying valid memory access patterns. NLL, as its name indicates, is already based on control flow rather than lexical scope: it computes the region of the program over which each reference must remain valid. Polonius instead employs a finer-grained analysis that relates each reference to the specific loans it may alias. This approach reduces false positives (situations where NLL rejects code that is in fact safe), so more valid programs compile without being restructured. Because borrow checking happens entirely at compile time, the benefit of this precision is expressiveness rather than runtime speed; accepted programs carry no runtime checks under either algorithm.

Under the Hood: The Mechanics of Formal Verification

The Polonius borrow checker utilizes a Union-Find structure, also known as a disjoint-set data structure, to track memory region equivalence. Each memory region is initially considered a separate set. During analysis, when a potential alias relationship is detected, the corresponding sets are merged using the ‘union’ operation. The ‘find’ operation returns the representative element, or root, of the set a given region belongs to. This allows Polonius to quickly ascertain whether two memory regions are treated as aliases, meaning they belong to the same set, which is crucial for detecting conflicting accesses and ensuring memory safety. With path compression and union by rank, the amortized cost of each operation is nearly constant (bounded by the inverse Ackermann function), enabling efficient scaling to large codebases.
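The following is a minimal sketch of the data structure just described, not code from RustCompCert or Polonius. The type name `Regions`, its method names, and the choice to identify regions by integer index are all assumptions made for illustration.

```rust
/// Disjoint-set (union-find) over region indices `0..n`.
/// Hypothetical sketch: `Regions` and its methods are illustrative names.
pub struct Regions {
    parent: Vec<usize>,
    rank: Vec<u8>,
}

impl Regions {
    /// Every region starts in its own singleton set.
    pub fn new(n: usize) -> Self {
        Regions { parent: (0..n).collect(), rank: vec![0; n] }
    }

    /// Returns the representative of `r`'s set, compressing the path.
    pub fn find(&mut self, r: usize) -> usize {
        // First pass: walk up to the root.
        let mut root = r;
        while self.parent[root] != root {
            root = self.parent[root];
        }
        // Second pass: point every node on the path directly at the root.
        let mut cur = r;
        while self.parent[cur] != root {
            let next = self.parent[cur];
            self.parent[cur] = root;
            cur = next;
        }
        root
    }

    /// Merges the sets containing `a` and `b` (union by rank).
    pub fn union(&mut self, a: usize, b: usize) {
        let ra = self.find(a);
        let rb = self.find(b);
        if ra == rb {
            return;
        }
        match self.rank[ra].cmp(&self.rank[rb]) {
            std::cmp::Ordering::Less => self.parent[ra] = rb,
            std::cmp::Ordering::Greater => self.parent[rb] = ra,
            std::cmp::Ordering::Equal => {
                self.parent[rb] = ra;
                self.rank[ra] += 1;
            }
        }
    }

    /// Two regions are treated as equivalent when they share a root.
    pub fn same(&mut self, a: usize, b: usize) -> bool {
        self.find(a) == self.find(b)
    }
}
```

Calling `union(0, 1)` and then `same(0, 1)` on a fresh `Regions::new(5)` returns `true`, while region 2 stays in its own set until it is merged explicitly.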

The Polonius borrow checker utilizes the Kildall framework, a classical formulation of dataflow analysis that CompCert also provides as a verified library. The framework treats analysis as iteratively propagating abstract information about program state along control-flow edges. Where paths merge, the information arriving from each path is combined with a lattice operation (a meet or a join, depending on how the lattice is oriented). Iteration continues until a fixed point is reached, meaning that further propagation yields no new information; the program’s properties are then checked against the computed state. Provided the transfer functions are monotone and the lattice has finite height, the iteration terminates and the result is sound: it over-approximates every possible execution. It is not complete, however, since a conservative analysis may still reject some programs that are in fact safe.
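A worklist iteration of this kind can be sketched as follows. This is a generic illustration under stated assumptions, not the verified Kildall library from CompCert: the `Cfg` type, the `solve` function, and the choice of `u64` bit sets joined by union are hypothetical simplifications.

```rust
use std::collections::VecDeque;

/// Hypothetical control-flow graph: `succs[n]` lists the successors of node `n`.
pub struct Cfg {
    pub succs: Vec<Vec<usize>>,
}

/// Forward worklist solver in the style of Kildall's algorithm.
/// Facts are bit sets (`u64`) combined with union, so the lattice has
/// finite height and the loop terminates for any monotone `transfer`.
/// Returns the fact on entry to each node; `None` marks unreachable nodes.
pub fn solve<F>(cfg: &Cfg, entry: usize, init: u64, transfer: F) -> Vec<Option<u64>>
where
    F: Fn(usize, u64) -> u64,
{
    let mut input: Vec<Option<u64>> = vec![None; cfg.succs.len()];
    input[entry] = Some(init);
    let mut work = VecDeque::new();
    work.push_back(entry);

    while let Some(n) = work.pop_front() {
        if let Some(fact) = input[n] {
            let out = transfer(n, fact);
            for &s in &cfg.succs[n] {
                // Join the outgoing fact into the successor's entry fact.
                let joined = match input[s] {
                    None => out,
                    Some(old) => old | out,
                };
                // Re-queue the successor only if its entry fact changed.
                if input[s] != Some(joined) {
                    input[s] = Some(joined);
                    work.push_back(s);
                }
            }
        }
    }
    input
}
```

The loop stops when no entry fact changes, which is exactly the fixed point described above. A production analysis would replace the `u64` with a domain-specific lattice, but the control structure stays the same.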

Transfer functions are central to Polonius’s borrow checking process, formalizing the effect of each Rust language construct on the abstract state that represents variable ownership and mutability. Each function accepts a pre-state, which encapsulates information about the borrows live at a specific program point, and produces a post-state reflecting the changes introduced by the current instruction. Specifically, transfer functions model how borrows are acquired, released, and potentially conflict, updating the abstract state accordingly. The correctness of the borrow checker rests on the soundness of these functions: a sound function never overlooks a conflicting borrow, so only valid borrow patterns are accepted and dangling pointers are ruled out. Precision is a separate concern; overly conservative functions remain sound but reject valid code unnecessarily.
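To make the pre-state/post-state idea concrete, here is a toy transfer function over a four-statement mini-IR. Everything in it is an assumption for illustration: the `Stmt` variants, the `State` representation (live loans keyed by a loan id), and the conflict rule (a write or mutable borrow conflicts with any live loan on the same place; a read or shared borrow conflicts only with a mutable loan). It does not model accesses made through the reference itself, nor nested places.

```rust
use std::collections::BTreeMap;

/// Hypothetical mini-IR; the names are illustrative, not RustCompCert's.
#[derive(Clone, Debug, PartialEq)]
pub enum Stmt {
    /// Create loan `id` on `place`, shared or mutable.
    Borrow { id: u32, place: &'static str, mutable: bool },
    /// Loan `id` is no longer live (its reference is dead).
    Release { id: u32 },
    /// Direct read of `place`.
    Read(&'static str),
    /// Direct write to `place`.
    Write(&'static str),
}

/// Abstract state: live loans, keyed by loan id -> (place, is_mutable).
pub type State = BTreeMap<u32, (&'static str, bool)>;

/// Finds a live loan on `place` that conflicts with the requested access.
fn conflict(state: &State, place: &str, exclusive: bool) -> Option<u32> {
    state
        .iter()
        .find(|(_, (p, m))| *p == place && (exclusive || *m))
        .map(|(id, _)| *id)
}

/// Maps a pre-state to a post-state, or reports the conflicting loan id.
pub fn transfer(pre: &State, stmt: &Stmt) -> Result<State, u32> {
    let mut post = pre.clone();
    match stmt {
        Stmt::Borrow { id, place, mutable } => {
            if let Some(l) = conflict(pre, *place, *mutable) {
                return Err(l);
            }
            post.insert(*id, (*place, *mutable));
        }
        Stmt::Release { id } => {
            post.remove(id);
        }
        Stmt::Read(place) => {
            if let Some(l) = conflict(pre, *place, false) {
                return Err(l);
            }
        }
        Stmt::Write(place) => {
            if let Some(l) = conflict(pre, *place, true) {
                return Err(l);
            }
        }
    }
    Ok(post)
}
```

With a shared loan `1` live on `"x"`, `transfer` accepts `Read("x")` but returns `Err(1)` for `Write("x")`; after `Release { id: 1 }` the write is accepted.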

RustCompCert: A Step Towards Provably Correct Compilation

RustCompCert represents a significant step toward building a fully verified Rust compiler, utilizing the Polonius framework as a central component to achieve this ambitious goal. This system doesn’t simply check for common errors; it aims to mathematically prove that the compiled code exhibits only behaviors permitted by the original Rust source, a property known as semantics preservation. By rigorously verifying each stage of the compilation process – from Rustlight, a formally defined subset of Rust, down to Clight, the C-like language at the front of CompCert’s pipeline – RustCompCert ensures that the safety guarantees inherent in Rust’s design are maintained throughout. The successful demonstration of full semantics preservation signifies a major advancement in compiler correctness, offering a pathway towards building highly reliable and secure software systems where code behavior can be formally guaranteed, not just empirically tested.

RustCompCert employs a strategy of compiling a rigorously defined subset of Rust, known as Rustlight, into Clight, a simplified C dialect with a formal semantics that serves as a front-end language of CompCert. This deliberate reduction in complexity isn’t a limitation, but rather a cornerstone of the project’s approach to achieving a fully verified compiler. By targeting Clight, the system reuses CompCert’s existing verified passes from Clight down to assembly, circumventing the need to build an entirely new verification infrastructure from scratch. Rustlight captures the essential safety features of Rust while simplifying the compilation process, enabling a tractable path towards proving the correctness of the entire compilation pipeline, from source code to executable assembly. This approach ensures that any program successfully compiled by RustCompCert from Rustlight will behave as expected, adhering to the formal semantics defined for the language.

RustCompCert’s architecture strategically integrates CompCert, a formally verified optimizing compiler, to bridge the gap between formally proven Rustlight and machine code. This approach allows the system to leverage CompCert’s established correctness guarantees during the crucial code generation phase, ensuring that optimizations do not introduce errors. Furthermore, the incorporation of CompCertO facilitates compositional compilation, meaning that the compiler can be broken down into independently verifiable components. This modular design significantly simplifies the verification process and enhances confidence in the overall system’s reliability, ultimately generating verified assembly code with a strong foundation in formal methods and compositional reasoning.

A critical component of RustCompCert’s verification strategy is backward simulation, a proof technique used to guarantee that program semantics are preserved throughout the compilation pipeline. It establishes that if RustCompCert transforms a source program $M_s$ into a target program $M_t$, then the behaviors of $M_t$ refine those of $M_s$, written $\mathrm{RustCompCert}(M_s) = M_t \Rightarrow ⟦M_t⟧ \leqslant ⟦M_s⟧$. In other words, every behavior the compiled program can exhibit is one the source program permits; compilation introduces no new behaviors. The name refers to the direction of the simulation argument rather than to the order in which passes are examined: each step taken by the target program must be matched by corresponding steps of the source program. Establishing this for every compilation stage and composing the results provides a strong foundation for trusting the correctness and safety of the generated machine code.
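One way to build intuition for the refinement relation $⩽$ is to model a program’s meaning as a set of observable behaviors and read refinement as set inclusion. The sketch below does exactly that; it is a deliberately simplified model, with hypothetical names (`Outcome`, `Behaviors`, `refines`, `safe`), and it omits features of CompCert’s actual behavior model such as divergence and the special treatment of source programs that go wrong.

```rust
use std::collections::BTreeSet;

/// Hypothetical model: a behavior is a finite trace of observable events
/// plus a final outcome.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Outcome {
    Terminates(i32),
    GoesWrong,
}

pub type Behavior = (Vec<&'static str>, Outcome);
pub type Behaviors = BTreeSet<Behavior>;

/// `target ⩽ source`: every target behavior is also a source behavior.
pub fn refines(target: &Behaviors, source: &Behaviors) -> bool {
    target.is_subset(source)
}

/// A program is safe when none of its behaviors goes wrong.
pub fn safe(b: &Behaviors) -> bool {
    b.iter().all(|(_, outcome)| *outcome != Outcome::GoesWrong)
}
```

In this model, safety transfers along refinement: if `refines(&target, &source)` and `safe(&source)` both hold, then `safe(&target)` holds too, because the target has no behavior that the source lacks.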

A critical component of RustCompCert’s verification process lies in the formalization of Rust’s borrow checking mechanism, which ensures memory safety throughout compilation. The central claim is that if the borrow checker accepts a program $M$, written $\mathrm{BorrowCheck}(M)\,\checkmark$, then the behavior of its Rust intermediate representation, $⟦M⟧_{\mathrm{RustIR}}$, refines that of a formally specified version, $⟦M⟧_{\mathrm{RustIRspec}}$, and the specified version is safe, meaning it adheres to the stated safety properties. This relationship, expressed as $\mathrm{BorrowCheck}(M)\,\checkmark \Rightarrow ⟦M⟧_{\mathrm{RustIR}} \leqslant ⟦M⟧_{\mathrm{RustIRspec}} \land \mathrm{safe}(⟦M⟧_{\mathrm{RustIRspec}})$, provides a rigorous guarantee that memory access is controlled and dangling pointers are ruled out, validating the safety of the compiled code on a formal, mathematical basis.

The Path Forward: Refining and Expanding Formal Guarantees

Despite the sophistication of Rust’s borrow checking system, enhanced by algorithms like Polonius, complete program safety remains an ongoing pursuit. While borrow checking effectively prevents data races and many memory safety errors at compile time, it offers only partial safety, because failures outside its scope, such as integer overflows or panics, can still occur. In standard Rust these are defined failures rather than undefined behavior (arithmetic overflow panics in debug builds and wraps by default in release builds), but they can still halt a program or corrupt its logic, and the borrow checker does not address them. Therefore, even rigorously checked Rust code requires careful consideration of these remaining failure modes, highlighting the need for complementary static analysis and runtime monitoring techniques to achieve truly robust and secure software.
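As a small illustration of handling such cases explicitly, the standard library offers checked variants of arithmetic and indexing that return `None` instead of panicking or wrapping. The helper names `total_len` and `nth` below are hypothetical; `checked_add`, `get`, and `copied` are standard-library methods.

```rust
/// Adds two lengths, reporting overflow as `None` instead of panicking
/// (debug builds) or wrapping (default release builds).
pub fn total_len(a: u32, b: u32) -> Option<u32> {
    a.checked_add(b)
}

/// Indexes without panicking: out-of-range access yields `None`.
pub fn nth(xs: &[i32], i: usize) -> Option<i32> {
    xs.get(i).copied()
}
```

Neither function can fail silently: the caller must decide what to do with `None`, which moves the failure mode from run time into the type signature.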

The construction of the Mid-level Intermediate Representation (MIR) and its formal specification, RustIRspec, represent a pivotal step towards enabling rigorous verification of Rust programs. By establishing a well-defined and unambiguous intermediate representation, developers gain a crucial foothold for applying formal methods. RustIRspec doesn’t merely describe the MIR; it provides a precise, mathematical semantics that allows automated tools to reason about program behavior. This formalized representation serves as a bridge between the high-level Rust source code and the eventual machine code, enabling the systematic checking of safety properties at each stage of compilation. Consequently, errors can be detected and addressed much earlier in the development cycle, increasing confidence in the reliability and security of compiled software, and facilitating the creation of certified, trustworthy systems.

Ongoing investigations are dedicated to refining formal verification methodologies to address the challenges posed by increasingly complex software. Current research explores techniques to improve the precision of these methods, reducing the incidence of false positives and ensuring that only genuine safety violations are flagged. Simultaneously, significant effort is directed towards scalability – enabling the verification of larger and more intricate systems that were previously intractable. This includes the development of novel algorithms, data structures, and abstraction techniques to manage the state space explosion often encountered during verification. Ultimately, the goal is to move beyond verifying small, isolated components and towards providing comprehensive, end-to-end assurance for real-world software applications, fostering greater reliability and security in critical systems.

Move checking is a vital complement to Rust’s borrow checking, addressing resource safety concerns that borrow checking alone does not resolve. While borrow checking manages aliasing, it does not by itself track ownership transfer, that is, the precise point at which a resource passes from one variable to another. Move checking monitors these transfers and guarantees that a value is never used after it has been moved (unless it is reinitialized first) and that each resource is released exactly once, thereby preventing use-after-free and double-free errors. This complementary approach, with borrow checking establishing safe access and move checking tracking ownership, creates a robust system for preventing a wider range of memory safety issues and enhances the overall reliability of Rust programs, particularly in complex scenarios involving nested data structures.
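The core of the idea can be sketched as tracking the set of initialized variables. The example below is a toy under stated assumptions: the `Op` type and `check_moves` function are hypothetical, and the checker handles only straight-line code, whereas a real move checker runs as a dataflow analysis over a control-flow graph with branches and loops.

```rust
use std::collections::BTreeSet;

/// Hypothetical straight-line IR for illustrating move checking.
pub enum Op {
    /// `x = <new value>`: (re)initializes `x`.
    Init(&'static str),
    /// `dst = move src`: `src` becomes uninitialized, `dst` initialized.
    Move { src: &'static str, dst: &'static str },
    /// Any read of `x`.
    Use(&'static str),
}

/// Returns the index and variable of the first use of a moved-from or
/// uninitialized variable, or `Ok(())` if there is none.
pub fn check_moves(ops: &[Op]) -> Result<(), (usize, &'static str)> {
    let mut init: BTreeSet<&'static str> = BTreeSet::new();
    for (i, op) in ops.iter().enumerate() {
        match op {
            Op::Init(x) => {
                init.insert(*x);
            }
            Op::Move { src, dst } => {
                // Moving out of `src` is itself a use of `src`.
                if !init.remove(src) {
                    return Err((i, *src));
                }
                init.insert(*dst);
            }
            Op::Use(x) => {
                if !init.contains(x) {
                    return Err((i, *x));
                }
            }
        }
    }
    Ok(())
}
```

For the sequence `Init("a")`, `Move { src: "a", dst: "b" }`, `Use("a")`, the checker returns `Err((2, "a"))`, flagging the use after move; inserting another `Init("a")` before the use makes the sequence valid again.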

A key achievement lies in the demonstrated end-to-end result, formally expressed as $\mathrm{RustCompCert}(M_s) = M_t \Rightarrow \mathrm{safe}(⟦M_s⟧_{\mathrm{RustIRspec}}) \Rightarrow \mathrm{safe}(⟦M_t⟧_{\mathrm{Asm}})$. It states that compilation preserves safety across the entire pipeline: if RustCompCert compiles a source program $M_s$ to a target program $M_t$, and $M_s$ is safe under the RustIRspec semantics, then the generated assembly $M_t$ is safe under the $\mathrm{Asm}$ semantics. The two programs are not equal as programs; what links them is the semantics preservation established for each pass. This chain of verification, from source through the intermediate representation to machine code, establishes a high degree of confidence in the reliability and security of compiled Rust programs, minimizing the risk of undefined behavior manifesting in deployed software.

The development of RustCompCert, as detailed within, mirrors a fundamental principle of system evolution: every iteration refines, but never truly halts, the march toward robustness. It’s a commitment to graceful decay, ensuring that even as the compiler transforms code, semantics preservation remains paramount. This pursuit resonates with a remark attributed to Henri Poincaré: “Mathematics is the art of giving reasons.” In this case, the ‘reasons’ are formally verified, providing a firm foundation for the compiler’s transformations and addressing the critical challenge of ensuring memory safety through meticulous formalization of key compilation passes like borrow checking. Each version builds upon the last, creating a record of increasingly trustworthy code.

What Lies Ahead?

RustCompCert represents a significant, though predictably constrained, step toward a fully trustworthy software stack. The choice to focus on a sequential subset, while pragmatic, underscores an inherent tension: simplification always incurs a future cost. Each abstraction laid down is a debt accruing interest, and the true measure of this work will be the complexity of extending it to a concurrent world, one where memory safety is not merely about ownership, but about temporal reasoning. The compiler isn’t simply checking code; it’s remembering the guarantees made on its behalf.

Formal verification, even at this scale, is not a destination but a process. The semantics preservation arguments detailed within are, in essence, a careful accounting of transformations. As the system evolves, maintaining this accounting will require constant vigilance. The challenge isn’t simply to verify more features, but to devise methodologies for verifying change itself – to build compilers that can reason about their own evolution. The ideal isn’t perfection, but graceful decay.

Ultimately, RustCompCert serves as a potent reminder that the pursuit of correctness is a continuous negotiation with complexity. It doesn’t solve the problem of software reliability; it merely shifts the burden – from runtime errors to the meticulous labor of formal proof. The question, then, isn’t whether such systems are possible, but whether the accrued technical debt will ever outweigh the benefits of increased assurance.

Original article: https://arxiv.org/pdf/2602.07455.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/