Beyond Software Trust: Securing Embedded Systems from the Ground Up

Author: Denis Avetisyan

A new hardware-software co-design eliminates the need for a runtime software trusted computing base, offering a fundamentally more secure approach to embedded system protection.

Skadi’s architecture compartmentalizes subsystems and uses a specialized loader to establish its security model. Trusted components, including a fault handler designed for controlled system halts and time-limited trapdoor/proxy mechanisms, operate alongside memory-mapped I/O capabilities delegated by the loader. The flow of calls between subsystems shows how privilege and access are managed to preserve system integrity.

This paper details Skadi and Bredi, a capability-based architecture achieving subsystem isolation and secure boot without a runtime TCB.

Embedded systems increasingly face sophisticated threats targeting application software, operating system kernels, and peripheral devices, yet existing security approaches typically address only a subset of these vectors. This paper, ‘Trust Nothing: RTOS Security without Run-Time Software TCB (Extended Version)’, introduces Skadi and Bredi, a novel hardware-software co-design that achieves robust security by eliminating the need for a runtime software trusted computing base (TCB). Through a capability-based architecture and a disaggregated, isolated real-time operating system built on Zephyr, all runtime subsystems (including the scheduler, allocator, DMA drivers, and peripherals) are treated as untrusted components. Does this approach represent a viable path toward building truly secure-by-design embedded systems for critical applications?

The Expanding Attack Surface: Beyond Conventional Defenses

The proliferation of interconnected embedded systems – from critical infrastructure and automotive components to medical devices and consumer electronics – has dramatically expanded the attack surface for malicious actors. No longer confined to traditional IT networks, these systems are increasingly targeted by sophisticated threats ranging from data breaches and denial-of-service attacks to physical manipulation and complete system compromise. The growing complexity of these devices, coupled with often limited resources and infrequent security updates, creates a fertile ground for exploitation. Consequently, a proactive and multi-layered approach to security is essential, demanding robust measures that extend beyond conventional software-based protections and address vulnerabilities at every level of the system architecture. The stakes are particularly high, as successful attacks can have far-reaching consequences, impacting safety, privacy, and economic stability.

Contemporary digital security frequently depends on layered software architectures, but this complexity inadvertently creates substantial risk. Each component within these stacks – operating systems, libraries, and applications – represents a potential attack surface, and vulnerabilities in even a single layer can compromise the entire system. Moreover, maintaining these sprawling software ecosystems presents ongoing challenges; patching flaws, addressing compatibility issues, and responding to newly discovered exploits demand significant resources and introduce further opportunities for error. This constant cycle of updates and revisions can inadvertently introduce regressions or create new vulnerabilities, making long-term security a moving target. The inherent difficulty in formally verifying such intricate systems highlights the need for fundamentally different approaches to security that minimize reliance on complex software and prioritize simplicity and provability.

The escalating complexity of modern systems necessitates a reimagining of security foundations, moving beyond software-centric defenses. Increasingly, researchers and developers are recognizing that true trustworthiness demands integrating security directly into the hardware architecture. This involves establishing a system where access to resources isn’t granted based on identity, but on capability – a secure token proving the right to perform a specific action. By minimizing the trusted computing base and reducing the attack surface, capability-based systems dramatically limit the potential damage from compromised software. This approach, fortified by hardware enforcement, creates a more resilient and predictable security posture, offering a pathway towards building systems that are inherently more secure and reliable against evolving threats, rather than constantly patching vulnerabilities in complex software stacks.

The Bredi Capability Operations module utilizes a finite state machine, interfacing with the calling device through memory-mapped I/O (MMIO) registers to manage input and output.

Northcape: A Foundation Built on Capability

The Northcape architecture implements access control through capabilities, which represent unforgeable tokens granting specific rights to resources. Unlike traditional access control lists (ACLs) or role-based access control (RBAC), capabilities are held by the subject requiring access, not centrally managed by an administrator. This decentralized approach allows for fine-grained control, as each capability precisely defines permissible actions on a given resource. A capability isn’t simply an identifier; it is the authority to act. These capabilities are typically small, opaque values, and possession of a capability implicitly grants the associated right, eliminating the need for runtime checks against a central policy database. This model supports delegation of access, where a subject can transfer a capability to another, enabling complex workflows while maintaining strict control over resource access.
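The delegation model described above can be illustrated with a small software sketch. The `Capability` class, the `Rights` bits, and the `restrict` method are hypothetical names chosen for illustration, not Northcape's actual format or operations. Python cannot make a token unforgeable, so this only models the rule that a delegated capability may narrow its parent's range and rights but never widen them.

```python
from dataclasses import dataclass
from enum import Flag


class Rights(Flag):
    """Hypothetical permission bits; the source does not list Northcape's rights."""
    READ = 1
    WRITE = 2


@dataclass(frozen=True)
class Capability:
    """A token naming a memory range together with the rights it carries."""
    base: int
    length: int
    rights: Rights

    def restrict(self, offset: int, length: int, rights: Rights) -> "Capability":
        """Derive a capability for delegation; it may shrink but never grow."""
        if offset < 0 or length < 0 or offset + length > self.length:
            raise PermissionError("derived range exceeds parent range")
        if rights & ~self.rights:
            raise PermissionError("derived rights exceed parent rights")
        return Capability(self.base + offset, length, rights)

    def allows(self, addr: int, needed: Rights) -> bool:
        """True if addr lies in range and every needed right is present."""
        in_range = self.base <= addr < self.base + self.length
        return in_range and (needed & self.rights) == needed


# A holder delegates a read-only window onto part of its buffer.
parent = Capability(base=0x1000, length=0x100, rights=Rights.READ | Rights.WRITE)
child = parent.restrict(offset=0x10, length=0x20, rights=Rights.READ)
```

Here `child` can read addresses `0x1010` through `0x102F` and nothing else, and any attempt to derive a writable capability from it raises `PermissionError`.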

Traditional operating system security relies on complex software permission checks performed at runtime before a resource is accessed. This process introduces significant overhead and creates opportunities for vulnerabilities if checks are incomplete or improperly implemented. Northcape’s capability-based architecture removes these software checks by binding access rights directly to the capability itself. Because possession of a capability is what authorizes access, and the hardware enforces that possession, no trusted software has to evaluate permissions dynamically. This simplifies the security model and shrinks the attack surface by removing a substantial class of exploits that target permission-handling logic.

The Northcape Translation Lookaside Buffer (NTLB) is a dedicated hardware cache designed to accelerate the translation of capabilities (the pointers used for access control) into physical addresses. By caching recent capability translations, the NTLB reduces the latency of accessing protected resources. This keeps overhead low compared to software-based permission checking, which requires traversing access control lists or performing complex comparisons for each resource access. When the hit rate is high, most capability translations resolve directly from the cache, which allows efficient operation even in systems with many protected resources and frequent accesses.
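The caching idea can be sketched in software as a small cache placed in front of a slower lookup. The source does not describe the NTLB's size or replacement policy, so the least-recently-used eviction, the `TranslationCache` name, and the dictionary standing in for the slow path are all assumptions made for illustration.

```python
from collections import OrderedDict


class TranslationCache:
    """Small LRU cache in front of a slower capability-to-address lookup."""

    def __init__(self, backing_table: dict, capacity: int = 4):
        self.backing_table = backing_table  # stand-in for the slow lookup path
        self.capacity = capacity
        self.entries = OrderedDict()        # cap_id -> physical address
        self.hits = 0
        self.misses = 0

    def translate(self, cap_id: int) -> int:
        if cap_id in self.entries:
            self.hits += 1
            self.entries.move_to_end(cap_id)  # mark as most recently used
            return self.entries[cap_id]
        self.misses += 1
        phys = self.backing_table[cap_id]     # KeyError models an invalid capability
        self.entries[cap_id] = phys
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        return phys


table = {1: 0x8000, 2: 0x9000, 3: 0xA000}
cache = TranslationCache(table, capacity=2)
for cap_id in (1, 1, 2, 3, 1):
    cache.translate(cap_id)
```

The `hits` and `misses` counters make the effect of locality visible: repeated use of the same capability is served from the cache, while a working set larger than the capacity forces slow-path lookups.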

The Northcape system integrates diverse capabilities, including perception, planning, and control, to achieve robust performance.

Bredi SoC: Hardware Enforcement for Provable Security

The Bredi System on Chip (SoC) is a dedicated hardware implementation of the Northcape capability architecture. It uses custom hardware components to directly enforce the security policies inherent in the Northcape design. Specifically, the Bredi SoC includes dedicated logic for managing and validating capabilities, preventing unauthorized access to system resources. In contrast to software-only implementations, this approach offers a smaller attack surface and greater resistance to exploits that attempt to bypass operating system security mechanisms. The hardware design allows for fine-grained access control and isolation, supporting the core principles of capability-based security.

The Bredi SoC implements capability restrictions by enforcing access control at the hardware level, limiting the privileges of software components to only those resources explicitly authorized. This is achieved through a hardware-enforced capability system that prevents unauthorized access to system memory and peripherals. Specifically, the architecture mitigates Direct Memory Access (DMA) attacks by requiring all DMA transfers to be explicitly authorized via these capabilities; any attempt to initiate a DMA transfer without a valid capability is blocked by the hardware, preventing malicious or faulty components from accessing sensitive data or disrupting system operation. This hardware enforcement provides a robust defense against DMA-based attacks that bypass traditional operating system security mechanisms.
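A software model can make the DMA rule concrete: every transfer is checked against the capabilities granted to the requesting device, and anything outside a granted range is refused. `DmaCapability`, `DmaGate`, and the device IDs below are hypothetical; in Bredi the equivalent check is performed by hardware, and its actual interface is not given in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DmaCapability:
    """Hypothetical grant: a device may touch [base, base + length)."""
    device_id: int
    base: int
    length: int
    writable: bool


class DmaGate:
    """Models the check that sits between a DMA-capable device and memory."""

    def __init__(self, memory_size: int):
        self.memory = bytearray(memory_size)
        self.grants = []

    def grant(self, cap: DmaCapability) -> None:
        if cap.base < 0 or cap.base + cap.length > len(self.memory):
            raise ValueError("grant lies outside memory")
        self.grants.append(cap)

    def _authorized(self, device_id: int, addr: int, size: int, write: bool) -> bool:
        # The whole transfer must fit inside a single grant held by this device.
        return any(
            g.device_id == device_id
            and g.base <= addr
            and addr + size <= g.base + g.length
            and (g.writable or not write)
            for g in self.grants
        )

    def dma_write(self, device_id: int, addr: int, data: bytes) -> None:
        if not self._authorized(device_id, addr, len(data), write=True):
            raise PermissionError("DMA write blocked: no valid capability")
        self.memory[addr:addr + len(data)] = data

    def dma_read(self, device_id: int, addr: int, size: int) -> bytes:
        if not self._authorized(device_id, addr, size, write=False):
            raise PermissionError("DMA read blocked: no valid capability")
        return bytes(self.memory[addr:addr + size])


gate = DmaGate(memory_size=64)
gate.grant(DmaCapability(device_id=7, base=16, length=8, writable=True))
gate.dma_write(7, 16, b"abcd")  # inside the grant, so it is allowed
```

A transfer that starts inside the granted window but runs past its end is rejected as a whole, so a faulty or malicious device cannot spill into neighboring memory.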

Performance testing of the Bredi SoC implementation indicates soft real-time capability, with an interrupt response time of 110 microseconds. This latency is consistent with baseline performance metrics established prior to hardware implementation, indicating minimal overhead introduced by the security features. Field-Programmable Gate Array (FPGA) resource utilization supports the practicality of the design: it uses 38% of the available logic resources, 22% of memory resources, and 15% of DSP slices, which fits within typical FPGA constraints and leaves room for additional functionality.

The Bredi Capability Resolver uses a three-stage pipeline (ID hashing, CMT lookup, and CMT entry parsing) to resolve capabilities.
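The three stages named in the caption can be sketched as three functions composed in order. Everything concrete here is an assumption made for illustration: the table size, the multiplicative hash, the 36-bit entry layout, and the helper names `pack_entry` and `install` are hypothetical and do not reflect Bredi's real CMT format.

```python
CMT_SLOTS = 8  # hypothetical table size


def hash_id(cap_id: int) -> int:
    """Stage 1: map a capability ID to a table slot (toy multiplicative hash)."""
    return ((cap_id * 0x9E3779B1) & 0xFFFFFFFF) % CMT_SLOTS


def cmt_lookup(cmt: list, slot: int, cap_id: int) -> int:
    """Stage 2: fetch the raw entry and confirm it belongs to this ID."""
    entry = cmt[slot]
    if entry is None or entry[0] != cap_id:
        raise LookupError(f"no CMT entry for capability {cap_id}")
    return entry[1]


def parse_entry(raw: int) -> dict:
    """Stage 3: unpack the assumed layout [valid:1][rights:3][base:16][length:16]."""
    if not (raw >> 35) & 0x1:
        raise LookupError("capability entry is not valid")
    return {
        "rights": (raw >> 32) & 0x7,
        "base": (raw >> 16) & 0xFFFF,
        "length": raw & 0xFFFF,
    }


def pack_entry(valid: bool, rights: int, base: int, length: int) -> int:
    """Helper for the example: build a raw entry in the assumed layout."""
    return (int(valid) << 35) | ((rights & 0x7) << 32) | ((base & 0xFFFF) << 16) | (length & 0xFFFF)


def install(cmt: list, cap_id: int, raw: int) -> None:
    """Helper for the example: place an entry, refusing to overwrite a collision."""
    slot = hash_id(cap_id)
    if cmt[slot] is not None:
        raise ValueError("slot already occupied")
    cmt[slot] = (cap_id, raw)


def resolve(cmt: list, cap_id: int) -> dict:
    """Run the three stages in order."""
    return parse_entry(cmt_lookup(cmt, hash_id(cap_id), cap_id))


cmt = [None] * CMT_SLOTS
install(cmt, 42, pack_entry(True, 0b101, 0x2000, 0x40))
resolved = resolve(cmt, 42)
```

Storing the ID next to the raw entry lets the lookup stage reject an ID that merely hashes to an occupied slot, which is one simple way to keep hash collisions from resolving to the wrong capability.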

Skadi RTOS: Capability-Based Compartmentalization in Practice

Skadi represents a novel operating system constructed upon the foundations of the Northcape architecture, fundamentally prioritizing security through capability-based compartmentalization. This approach eschews traditional access control lists in favor of a system where every component – be it an application, driver, or kernel module – operates within a strictly defined scope of permitted actions. Access to system resources isn’t granted based on identity, but rather on possession of unforgeable ‘capabilities’ – essentially, tokens that explicitly authorize specific operations. This design drastically reduces the attack surface, as a compromise of one component doesn’t automatically grant access to the entire system; instead, the malicious actor is confined to the limits defined by the compromised component’s capabilities. By meticulously controlling interactions between components and limiting their privileges, Skadi aims to establish a highly resilient and secure computing environment where the impact of vulnerabilities is significantly contained.

Skadi RTOS employs a robust system of subsystem calls and ID restrictions to govern communication between its isolated components. Rather than relying on traditional inter-process communication methods, each component can only interact with others through explicitly authorized channels. This is achieved by requiring all inter-component communication to occur via subsystem calls – requests directed to a specific subsystem responsible for handling that type of interaction. Critically, these calls are further restricted by subsystem IDs, ensuring that a component can only invoke operations on subsystems it has been granted permission to access. This fine-grained control minimizes the attack surface, as even a compromised component lacks the ability to arbitrarily interact with the entire system, thus confining potential breaches and bolstering overall system security.
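The call-restriction rule can be sketched as a router that refuses any call whose caller and callee pair has not been explicitly permitted. The `SubsystemRouter` class, the numeric subsystem IDs, and the operation names are hypothetical; the article does not describe Skadi's actual call interface, and in the real system the restriction is enforced by hardware capabilities rather than by a software table.

```python
APP, NETWORK, ALLOCATOR = 1, 2, 3  # hypothetical subsystem IDs


class SubsystemRouter:
    """Routes calls between isolated subsystems, checking caller/callee ID pairs."""

    def __init__(self):
        self.operations = {}  # callee ID -> {operation name: handler}
        self.allowed = set()  # permitted (caller ID, callee ID) pairs

    def register(self, subsystem_id: int, operations: dict) -> None:
        self.operations[subsystem_id] = dict(operations)

    def permit(self, caller_id: int, callee_id: int) -> None:
        self.allowed.add((caller_id, callee_id))

    def call(self, caller_id: int, callee_id: int, operation: str, *args):
        if (caller_id, callee_id) not in self.allowed:
            raise PermissionError(f"subsystem {caller_id} may not call {callee_id}")
        handler = self.operations[callee_id].get(operation)
        if handler is None:
            raise LookupError(f"subsystem {callee_id} has no operation {operation!r}")
        return handler(*args)


router = SubsystemRouter()
router.register(ALLOCATOR, {"alloc": lambda size: {"size": size}})
router.register(NETWORK, {"send": lambda payload: len(payload)})
router.permit(APP, NETWORK)        # the application may use the network stack
router.permit(NETWORK, ALLOCATOR)  # the network stack may allocate buffers
```

With this configuration the application can reach the network subsystem but has no path to the allocator, so a compromised application cannot call it directly.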

Skadi RTOS trades some performance for security while still achieving sub-millisecond execution for the majority of its core operations. Capability-based compartmentalization and subsystem calls introduce overhead in specific scenarios, notably ICMP and zperf, where execution times increase by a factor of three to seven. This overhead is the cost of eliminating the conventional runtime software Trusted Computing Base (TCB): the functional prototype mitigates threats from malicious software, compromised kernels, and rogue devices without relying on a traditional, software-based TCB.

Skadi’s operating system is decomposed into modular components to enhance flexibility and maintainability.

The pursuit of absolute security, as demonstrated by Skadi and Bredi, necessitates a departure from traditional, runtime-reliant methods. The architecture prioritizes provable correctness over empirical testing, recognizing that vulnerabilities often emerge from the complexity of software TCBs. This echoes Henri Poincaré’s sentiment: “Mathematics is the art of giving reasons.” The system’s reliance on a Capability Architecture and hardware-software co-design isn’t merely about building a secure system; it’s about establishing a mathematically rigorous foundation where security isn’t assumed, but proven through design. The elimination of the runtime TCB, a core achievement of the research, aligns with this principle of mathematical certainty, minimizing the potential for unpredictable behavior and bolstering the system’s resilience against malicious actors.

The Road Ahead

The presented work, while demonstrating a pathway toward eliminating the runtime software TCB, does not, of course, achieve perfection. The inherent complexities of capability-based systems – the management of rights, the prevention of unintended sharing, and the formal verification of policies – remain substantial challenges. Future investigations must address the scalability of Skadi and Bredi beyond the demonstrative examples. Asymptotic analysis of the overhead associated with fine-grained access control, particularly in resource-constrained embedded environments, is crucial; empirical validation is insufficient.

Furthermore, the reliance on secure boot and a trusted hardware root, while pragmatic, introduces its own set of assumptions. A truly robust system requires formal guarantees not merely about the absence of runtime vulnerabilities, but also about the integrity of the initial conditions. Exploration of techniques for remote attestation, verifiable build processes, and supply chain security therefore becomes not a tangential concern but a fundamental necessity.

Ultimately, the elimination of the runtime TCB is less a destination and more a shifting of the problem. The core difficulty – establishing and maintaining trust in a complex system – remains. The future likely lies in a synthesis of formal methods, hardware specialization, and a relentless pursuit of minimal, provable correctness – a tedious, unforgiving path, but the only one worthy of the name ‘security’.

Original article: https://arxiv.org/pdf/2603.08400.pdf

Contact the author: https://www.linkedin.com/in/avetisyan/
