The CPS-IoT Week 2026 event is pleased to announce the availability of student travel awards supported by the ACM Special Interest Group on Embedded Systems (SIGBED).
Application Deadline: March 25, 2026, at 23:59 AoE
Notification Deadline: March 28, 2026
Application form: https://forms.gle/NSPA3L586ZxWGN6M7
The travel grant awards funded by ACM SIGBED are open to both US-based and non-US-based students to attend ACM conferences.
Student recipients will be reimbursed (based on valid receipts) up to $2,000 for their expenses (registration, airfare, and hotel) after the conference.
Notes:
The reimbursement may be adjusted based on the budget and the applicant’s location.
This travel award is designed to attract new participants and give them exposure to research in CPS and IoT. Accordingly, student authors presenting papers, or whose papers acknowledge grants, will be given lower priority. Students who have attended CPS-IoT Week in the past 2 years will also receive low priority.
The applicant must be a SIGBED member (https://sigbed.org/join-us/) before submitting their application. Priority will be given to applicants (i) whose advisor is also a SIGBED member, and (ii) who work on a CPS-themed funded research project OR have published in any top-tier CS conference or journal.
All recipients of the student travel grant awards are required to register for the main conference program by the awardee acceptance deadline, at the full or the student registration rate. The final award is contingent upon the recipient showing proof of registration at CPS-IoT Week 2026.
[Required by deadline] To apply, complete the application form with the following information. The form may be accessed via the following link: https://forms.gle/NSPA3L586ZxWGN6M7
Applicant Information: Full Name, Email Address, Affiliation (university/department), Location (country and city), Current Academic Program (undergraduate/master’s/PhD), Year in the Academic Program, SIGBED membership ID, ACM membership ID (optional).
Advisor’s Information: Full Name, Email Address, Affiliation, Is the advisor a SIGBED Member (along with membership ID)?
Research Information: Brief description of research area (4 lines), How is the research related to CPS/IoT and if it is not then is there a plan to extend it to the CPS-IoT domain (4 lines), Are you currently working on a funded CPS-related research project (yes/no)? If yes: provide project title, funding agency, project ID, and duration of the project.
Publications: List of top-three publications (title, author list, venue, year).
Participation in CPS-IoT Week 2026: Are you an author of a paper accepted to CPS-IoT Week 2026 (yes/no)? If yes: paper title, conference/workshop, Does the accepted paper acknowledge any research grants (yes/no)?
Previous CPS-IoT Week Attendance: Have you attended CPS-IoT Week in the past two years (yes/no)? If yes: which year(s), and did you have an accepted paper?
Motivation Statement: Why do you want to attend CPS-IoT Week 2026? How will you contribute to the CPS/IoT community? Limit your answer to 6 sentences.
Estimated Travel Budget and Plan: Airfare, Hotel Cost, Registration Cost, and travel dates.
Upload full CV, including full list of publications.
Agreement and acknowledgement: (a) I confirm that I am a SIGBED member before submitting this application. (b) I understand that the award is reimbursed after the conference with valid receipts. (c) I understand that I must register for the main conference program if selected. (d) I agree to provide proof of registration by the acceptance deadline.
Awardees will be notified by email and might be asked to present a poster on their ongoing research at CPS-IoT Week 2026.
Please save all the original receipts. You need to upload them later.
Contact: Samarjit Chakraborty, Pavithra Prabhakar, or Zhishan Guo if there are any questions (Emails: samarjit AT cs dot unc dot edu, ppavithra AT unm dot edu, zguo32 AT ncsu dot edu)
Call for Papers: Formal Methods in System Design
Special Issue on the Theoretical Foundations and Applications of Counterexample Guided Abstraction Refinement (CEGAR)
Submission Deadline: February 28, 2026
Springer FMSD link: https://link.springer.com/collections/jcchhbfcgh
More details: https://lnkd.in/gFQcuAaW
PDF CFP: FMSD_CEGAR_2026
Guest editors:
Orna Grumberg, Technion University, Israel (email: [email protected])
Samarjit Chakraborty, UNC Chapel Hill, USA (email: [email protected])
Somesh Jha, UW-Madison, USA (email: [email protected])

We are delighted to announce that the 2025 SIGBED Early Career Researcher Award has been awarded to Dr. Fanxin Kong, an Assistant Professor at University of Notre Dame. This award, established by ACM SIGBED in 2017, recognizes contributions by junior researchers in the area of embedded, real-time, and cyber-physical systems.
The 2025 Award Selection Committee was chaired by Professor Wanli Chang (Hunan University, China) and included Professor Jian-Jia Chen (TU Dortmund, Germany), Professor Andreas Gerstlauer (UT Austin, USA), Professor Pi-Cheng Hsiu (Academia Sinica, Taiwan), Professor Pavithra Prabhakar (University of New Mexico, USA), and Professor Marilyn Wolf (University of Nebraska-Lincoln, USA). The committee carefully considered the nominees and selected Dr. Fanxin Kong in light of his contributions.
Dr. Fanxin Kong’s research centers on assured and intelligent cyber-physical systems, with a focus on physical AI and cyber-physical security and safety, and with applications to autonomous systems such as unmanned vehicles and robotic systems. He has published over 80 research papers at top venues including RTSS, RTAS, EMSOFT, ICCPS, and DAC, in various IEEE/ACM transactions, and in books and book chapters. His research is supported by NSF, AFRL, AFOSR, and DARPA. He has received multiple awards, including the NSF CAREER Award (2025), the AFRL Summer Faculty Extension Award (2022, 2024), and champion of the Embedded System Software Competition at ESWEEK (2024). He has served as an organizing committee member and program committee member for many workshops and conferences.
On behalf of ACM SIGBED, we extend our warmest congratulations to Dr. Fanxin Kong for this distinction and his contributions to embedded, real-time and cyber-physical systems.

We are delighted to announce that the 2025 ACM SIGBED Technical Achievement Award has been awarded to Professor Reinhard Wilhelm of Saarland University. This award, established by ACM SIGBED in 2022, recognizes significant and sustained contributions to research and system implementations in embedded, real-time, and cyber-physical systems. It honors technical achievement whose impact has been long-lasting and deeply felt across the SIGBED-relevant domains.
Although the award was scheduled to be presented at ESWEEK 2025 in Taiwan, Prof. Wilhelm was unfortunately unable to attend the ceremony. SIGBED nevertheless proudly recognizes his towering and enduring contributions.
The 2025 Award Selection Committee was chaired by Professor Tei-Wei Kuo (Delta Electronics & National Taiwan University, Taiwan) and included Professor Nikil Dutt (University of California, Irvine, USA), Professor Joerg Henkel (Karlsruhe Institute of Technology, Germany), Professor Insup Lee (University of Pennsylvania, USA), and Professor Lothar Thiele (ETH Zurich, Switzerland). The committee carefully considered the nominees and selected Professor Wilhelm in light of his outstanding career.
Professor Wilhelm is being honored for his foundational and transformative work on worst-case execution time (WCET) analysis, timing predictability and static program analysis in embedded and cyber-physical systems. His research laid the theoretical and practical basis for estimating safe upper bounds on execution times of tasks in safety-critical real-time systems, a capability without which scheduling and certification of these systems would simply not be possible. In particular, it is widely recognized that almost all research results and techniques for timing analysis and timing certification of safety-critical systems rely fundamentally on accurate and sound WCET estimates. The work of Professor Wilhelm’s group, which includes the landmark survey “The Worst-Case Execution Time Problem” and his CACM article “Computation Takes Time, But How Much?”, helped clarify the many difficulties of timing analysis on modern processors (which include caches, pipelines, and speculation, i.e., features that accelerate the average case rather than the worst-case timing behavior of programs) and proposed sound methods rooted in abstract interpretation and static analysis. Moreover, he co-founded the company AbsInt GmbH, today a world leader in timing and WCET analysis, which has seen deployment of its aiT WCET analyzer in industrial certification settings (for example aviation systems in several Airbus plane lines). In short, the entire real-time systems research community and the industrial practice of timing verification and certification quite literally stand on the shoulders of his results.
Professor Wilhelm’s career spans decades of leadership and innovation. He held the Chair for Programming Languages and Compiler Construction at Saarland University from 1978 until 2014, and during that time he made major contributions across programming languages, compilers, static program analysis and real-time systems. He was the founding Scientific Director of the Leibniz Center for Informatics (Schloss Dagstuhl) from 1990 to 2014. He is a Fellow of the ACM, recipient of the Konrad-Zuse Medal, and is a member of the German National Academy of Sciences Leopoldina. His research has delivered both deep theoretical advances (for example in shape analysis, attribute grammars, code generation) and broad practical impact through industrial tool exploitation. With more than thirty years of work, numerous highly cited papers and leadership in building the field, he is a model of sustained achievement, both for the SIGBED and also for the broader Computer Science research community.
On behalf of ACM SIGBED, we extend our warmest congratulations to Professor Reinhard Wilhelm for this extraordinary distinction and his far-reaching contributions to embedded, real-time and cyber-physical systems.
The photo credit goes to Raphael Reischuk.

When I first encountered an intelligent system missing a deadline—not because of a hardware fault, but due to a poorly scheduled inference—I knew something had to change. That realization sparked a journey that led to more than a decade of research, a vibrant lab at NC State, and ultimately, the honor of receiving the 2025 SIGBED Early Career Researcher Award.
In this post, I’ll share how that journey evolved—through research that sits at the intersection of real-time scheduling, machine learning, and embedded systems—and how I envision the future of intelligent cyber-physical systems (CPS) that are not only capable, but also certifiable, efficient, and resilient.
CPS applications—from autonomous vehicles to wearable health devices—are becoming increasingly intelligent, but with this added intelligence comes added complexity. These systems must respond in real-time under uncertain conditions, with limited resources, and often in safety-critical scenarios. The traditional design principles for embedded systems struggle to scale when modern workloads include perception, learning, and control, all running concurrently.
To address this, my research focuses on three tightly connected levels: real-time scheduling and resource management at the system level, adaptive and lightweight inference at the algorithmic level, and trust and security at the data level.
My early work focused on enhancing predictability in embedded platforms using mixed-criticality (MC) scheduling models. We designed precise scheduling approaches for varying-speed multiprocessors and host-centric architectures, enabling systems to dynamically adapt to critical workloads while maintaining temporal correctness. We extended these ideas to gang tasks, co-scheduling on CPU-GPU systems, and even to systems with hierarchical pacing and data offloading. One of our more recent ideas, IDK-cascades, explores how uncertain classification outcomes in ML pipelines can be deferred or resolved adaptively to minimize the expected time to successful decision-making.
These techniques collectively form a foundation for designing systems that can reason about timing, uncertainty, and resource constraints in a principled way.
Real-world deployments helped push these ideas further. In the F1-Tenth autonomous racing platform, we built a real-time ROS 2 framework with physics-informed scheduling, preemptable executors, and digital-twin-based safety validation. Our scheduling analysis for ROS 2 with resource contention and latency constraints showed that it’s possible to achieve both timing guarantees and runtime adaptability—something long thought difficult in robotics.
In wearable healthcare, we tackled real-time cardiovascular disease detection using embedded deep learning models. Our work introduced attention-based CNN-LSTM architectures that balance inference latency, accuracy, and memory use on resource-limited platforms. These systems adapt model complexity based on a patient’s heart signals, enabling both real-time responsiveness and power efficiency.
We also applied deep learning to wearable kinematic sensing, using compact IMU sensors and multimodal networks to reconstruct lower-limb motion and joint forces in daily environments. This work is particularly relevant for monitoring gait abnormalities and rehabilitation, where lab-grade motion capture is infeasible.
Another critical direction is ensuring trust and safety when ML models are deployed in embedded CPS. We examined the growing risks of backdoor attacks in pretrained models and proposed defenses using Fisher information to guide purification. We also developed domain adaptation techniques that allow systems to adapt securely to changing environments without needing source data—essential for mobile and privacy-sensitive CPS.
Furthermore, we investigated federated learning in real-time systems, enabling multiple edge devices to collaborate without centralized coordination while still meeting strict deadlines. This line of work has implications for collaborative robotics, smart infrastructure, and multi-party sensing systems.
Our lab continues to explore how AI-enabled systems can be made predictable, secure, and efficient. We’re investigating physics-informed control for real-time robots, real-time federated learning protocols for mobile CPS, and adaptive digital twins that enable continuous verification of complex systems. I’m excited to collaborate with others working on the same grand challenge: making intelligent embedded systems trustworthy by design.
I’m very honored to receive the SIGBED Early Career Award, and I remain committed to pushing the boundaries of what embedded and real-time systems can achieve.
Author: Zhishan Guo is a tenured Associate Professor at the Department of Computer Science of North Carolina State University, where he directs the Real-Time and Intelligent Systems lab and the Cyber-Physical Systems (CPS) research group. His research interests are in real-time scheduling theory and machine learning theory with applications to cyber-physical systems. He is a recipient of multiple highly competitive awards, including Best Paper Awards, Outstanding Paper Awards, and Best Student Paper Awards at EMSOFT, RTAS, and RTSS, as well as the Outstanding Undergraduate Teaching Award from UNC-Chapel Hill.
He has been committed to strengthening the embedded and CPS community, serving on the IEEE TCRTS executive committee since 2020 and as the Treasurer of ACM SIGBED since 2025, where he supports events, student programs, and the continued health of the community’s technical and organizational initiatives. As a co-founder and lead of the CPS Research Focus Group at NC State, he helps bring together about 10 faculty members and more than a hundred students across multiple disciplines to tackle problems at the intersection of sensing, communication, AI, and computation. These roles reflect his belief that technical progress goes hand in hand with community building.
Disclaimer: Any views or opinions represented in this blog are personal, belong solely to the blog post authors, and do not represent those of ACM SIGBED or its parent organization, ACM.
ACM SIGBED SRC is the main student research competition in the real-time, embedded, and cyber-physical systems community. There are two categories: Undergraduate and Graduate. The champions will represent SIGBED and compete against other SIGs in the ACM Grand Finals.
Process:
There are two rounds for the SIGBED SRC. In the first round, a participant submits a 2-page extended abstract describing his/her original work, which may or may not have been published, for review. For the undergraduate category, the participant does not have to be the main contributor of the work, but his/her portion needs to be clearly stated. A participant accepted to the second round will make a live presentation at ESWEEK 2025.
Travel Grant:
Travel grants will be provided to support attendance at ESWEEK 2025.
Awards:
For each category, three overall winners will be recognized. Champions, first runners-up and second runners-up will receive USD $500, $300 and $200, respectively.
Submission and Deadline:
Submission for the first round is already open (https://sigbed25src.hotcrp.com/) and will be closed on 7 September.
Contact:
Questions should be directed to Prof. Wanli Chang ([email protected]).
ACM SIGBED established the Early Career Researcher Award in 2017 to recognize contributions by junior researchers in the area of embedded, real-time, and cyber-physical systems. We are now inviting nominations for the 2025 edition of this award.
The nominee should
The nomination package should consist of a nomination letter summarizing the nominee’s research contributions and explaining their significance and relevance to SIGBED. The letter should also compare the nominee’s research impact and scholarship with others at a similar stage in their career, and make a convincing case for an early career award.
In addition, a summary document, listing in a tabular form, the following information about the nominee, should be included in the nomination package:
Finally, a CV of the nominee is needed, containing a full publication list.
All these three documents (nomination letter, summary, and CV) should be sent as a single PDF file by email to ACM SIGBED ([email protected]), from either the nominator or the nominee. The subject of the email should be “Nomination for ACM SIGBED Early Career Researcher Award 2025”.
Please send in your nomination by 30 June 2025.
The award, consisting of a plaque engraved with the nominee’s name and a USD 2,000 honorarium, will be presented at ESWeek 2025. Funding for the award is provided by ACM SIGBED.
The award committee members are not allowed to make nominations. If their former students or colleagues from the same institution are nominated, then appropriate steps will be taken to ensure that the decision on the award remains impartial.
University of Notre Dame
Boston University
North Carolina State University
University of Freiburg
Carnegie Mellon University
Hong Kong Polytechnic University
Scuola Superiore Sant’Anna
Duke University
Max Planck Institute for Software Systems
From autonomous vehicles navigating city streets to robots managing packages in warehouses, robotics is reshaping industries across the globe. The Robot Operating System 2 (ROS 2) plays a pivotal role in this transformation as a powerful middleware framework that simplifies the development of safe, efficient, and scalable robotic systems. With its intuitive tools for creating seamlessly interconnected components, ROS 2 not only accelerates innovation in fields like autonomous driving and industrial logistics but also serves as a vital platform for advancing robotics research worldwide.
As ROS 2 has evolved over time, its design has largely focused on practical needs such as ease of use and flexibility – qualities that have contributed to its widespread adoption. However, this organic growth has left certain aspects underexplored, particularly when it comes to rigorous theoretical analyses and formal specification of system behavior. While scientific efforts have emerged to analyze ROS 2 systems – especially in the context of real-time communication and scheduling – the complexity of retrofitting formal properties into an established framework remains an ongoing challenge.
This blog tells the story of how we uncovered a subtle yet impactful issue in ROS 2’s scheduler – an issue that highlights the challenges of ensuring predictable behavior in complex systems and shows the need for formal methods and rigorous analysis to build reliable robotic frameworks.
The executor is a fundamental component of ROS 2’s architecture, responsible for determining when tasks are executed. By managing available system resources, it is capable of handling diverse robotic systems without requiring developers to delve deeply into scheduling complexities.
Tasks in ROS 2 fall into two categories:
To manage these tasks, executors use a structure called the wait set, which temporarily stores eligible tasks before they are processed. The scheduling process alternates between two key phases:
Once all tasks have been processed – or no remaining tasks can be executed – a new polling point begins. By default, executors operate using a single thread, and developers can use multiple single-threaded executors to manually distribute workloads across the system. Alternatively, ROS 2 also offers a multi-threaded executor, where multiple threads collaborate to manage and execute tasks simultaneously, enabling parallel processing in complex systems.
The executor is versatile and user-friendly by design, aiming to ensure that all activated tasks eventually execute. However, it has certain limitations – for example, if a timer has a period shorter than the length of processing windows, it may not be executed as frequently as configured. This can result in timing deviations that affect system behavior under specific configurations.
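The timer limitation described above can be illustrated with a toy model. This is only a sketch and not the rclcpp implementation: the function name and parameters are made up for illustration, each cycle is reduced to one polling point followed by a processing window of fixed length, and the timer is assumed to run at most once per window, with missed periods skipped.

```python
def count_timer_runs(timer_period_ms, window_ms, horizon_ms):
    """Count timer executions in a toy single-threaded executor cycle.

    Assumptions (illustrative only): every processing window takes
    window_ms, the timer callback itself takes no time, and a timer that
    missed several periods is executed only once and then re-armed.
    """
    now = 0
    next_release = 0
    runs = 0
    while now < horizon_ms:
        # Polling point: sample whether the timer is currently activated.
        timer_activated = now >= next_release
        # Processing window: run the timer at most once, then the other work.
        if timer_activated:
            runs += 1
            # Re-arm at the next multiple of the period after "now".
            next_release = (now // timer_period_ms + 1) * timer_period_ms
        now += window_ms
    return runs


configured = 1000 // 10                     # a 10 ms timer should fire 100 times in 1 s
observed = count_timer_runs(10, 25, 1000)   # but the windows are 25 ms long
print(configured, observed)                 # prints: 100 40
```

In this model, a timer whose period is longer than the processing window (for example 50 ms with 25 ms windows) still runs at its configured rate; only timers faster than the window fall behind.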
Efficient workflow design in ROS 2 is essential for smooth scheduling and execution. A key principle is to avoid blocking executor threads by waiting synchronously for other tasks or external events.
Blocking behavior can delay ready-to-run tasks due to the executor’s design, reducing system responsiveness and performance. In extreme cases, tasks can block each other indefinitely, causing deadlocks that halt the system entirely.
One example of this challenge is found in robotics applications like autonomous navigation systems that use GPU accelerators for image processing. If a task sends data to the GPU and waits synchronously for results before completing, it prevents other tasks from being scheduled—creating a bottleneck that lowers system efficiency.
To solve this problem, ROS 2 promotes non-blocking designs where dependencies are handled asynchronously. For example, once the GPU finishes processing, another task should retrieve the results without occupying executors unnecessarily.
However, implementing non-blocking designs efficiently introduces new challenges:
In our research, we started by addressing the second challenge: how to poll results efficiently without wasting resources. This led us to uncover the main problem in our work. Let’s first explore possible solutions to this challenge.
Solution 1: Using timers
Our initial approach leveraged ROS 2 timers as part of the executor’s workflow to periodically check whether GPU results were ready. Timers integrate seamlessly into ROS 2’s architecture by activating at predefined intervals without requiring additional signaling mechanisms.
While effective in theory, this solution introduced inherent delays:
To minimize these delays as much as possible, we reduced the timer’s period down to zero – ensuring that it was sampled at every polling point within the executor’s cycle.
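The idea of checking for results at every polling point can be sketched outside of ROS 2 with plain Python. The snippet below is an analogy under stated assumptions, not ROS 2 code: `fake_gpu_job` is a hypothetical stand-in for work offloaded to an accelerator, and each loop iteration plays the role of one executor cycle in which a period-zero timer checks for the result without blocking.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_gpu_job(x):
    """Hypothetical stand-in for work offloaded to an accelerator."""
    time.sleep(0.05)
    return x * 2


def poll_until_done(value):
    """Check for the result once per loop iteration instead of blocking on it."""
    polls = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fake_gpu_job, value)
        while not future.done():   # non-blocking check, like a period-zero timer
            polls += 1
            time.sleep(0.001)      # stands in for other callbacks doing their work
        result = future.result()   # returns immediately, the job has finished
    return result, polls


result, polls = poll_until_done(21)
print(result)  # prints: 42
```

The loop never waits on the result itself, so other work can proceed in every iteration, but it also keeps spinning and re-checking while nothing is ready.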
However, this approach introduced its own inefficiencies:
Recognizing these inefficiencies, we turned to an alternative approach – dedicating a thread specifically for result-checking.
Solution 2: Dedicated thread
To overcome these limitations, we explored an alternative solution using ROS 2’s multi-threaded executor design. Breaking the principle against blocking tasks intentionally, one thread was dedicated exclusively to checking GPU results, while other threads handled unrelated workload tasks concurrently.
This approach improved responsiveness but introduced inefficiencies:
While dedicating threads improves responsiveness, it also introduces concurrency challenges when shared resources are accessed simultaneously. To address these issues effectively, ROS 2 provides callback groups – a mechanism that prevents simultaneous execution of callbacks assigned to the same group. Specifically, while any task of a callback group is being executed by a thread, all other tasks of that callback group are blocked. By constraining parallelism through callback groups, we successfully avoided resource conflicts – but this led us toward discovering another significant issue with ROS 2’s multi-threaded executor: “starvation”.
Building on our use of ROS 2’s multi-threaded executor and callback groups to prevent concurrent access, we began testing various configurations to integrate accelerators like GPUs into ROS 2 systems. Initially, we focused on a simple setup: two timer tasks sharing a single callback group. One task was responsible for checking accelerator result availability (with its period set to zero), while the other handled post-processing operations that accessed the results of GPU computations.
However, during testing, an unexpected behavior emerged – only the timer task with a period of zero executed repeatedly, while the post-processing task was completely ignored.
At first, we dismissed this as one of those “mysterious quirks” often attributed to ROS 2 and moved on without further investigation. A colleague urged us to dig deeper into the issue, but we opted for a quick fix instead: splitting the tasks into separate callback groups. This change temporarily resolved our problem; both tasks executed as expected again.
This seemingly plausible solution – separating tasks into different callback groups – proved problematic and reintroduced old challenges, such as race conditions when accessing shared data structures used by both tasks. These were precisely the problems callback groups were designed to prevent in ROS 2 systems. Faced with this trade-off, our only remaining option was to manually introduce mutexes – a solution that bypasses ROS 2’s framework and shifts concurrency management to developers, reintroducing older problems such as blocking callbacks unnecessarily and increasing the risk of deadlocks.
Ultimately, we reverted to our original configuration where both tasks belonged to one callback group – and found ourselves back at square one: only one task (the timer with a period of zero) was being executed repeatedly while the post-processing task remained untouched.
This persistent issue marked a turning point in our investigation – we conceded to calls for deeper analysis into ROS 2’s multi-threaded executor mechanisms.
Given the design of the single-threaded executor, all activated tasks are guaranteed to eventually execute – behavior that we have confirmed through our experience. Since the multi-threaded executor uses the same underlying structure as the single-threaded version, we expected it to behave similarly. However, our closer examination revealed something surprising: under certain configurations, some activated tasks were consistently ignored by the executor despite being ready for execution. This phenomenon is known as “starvation” – a condition where certain tasks fail to execute due to repeated prioritization of others.
To uncover why certain tasks were being starved in our tests, we developed a complete state machine model to analyze the core mechanisms of ROS 2’s executor (see figure below for an overview).

To focus on the most relevant aspects of this model, we created a simplified version that captures the key operations during its primary phases—polling points and processing windows—and their interaction with tasks in the wait set when callback groups are involved (see figure below).
Polling points occur when either the wait set is empty or all remaining tasks in the wait set are blocked by callback groups. During a polling point (illustrated on the left side of the figure), the executor clears the wait set, determines which tasks are activated, and adds those to the wait set. In the subsequent processing window (shown on the right side of the figure), tasks from this updated wait set are selected for execution, and their corresponding callback groups are marked as blocked during execution.

While these mechanisms generally ensure efficient scheduling, complications arise when multiple threads and callback groups interact – introducing unintended side effects under specific configurations.
To understand how polling points contribute to starvation, we revisit our earlier example involving a single callback group containing two tasks. One thread executes one task from this group while another thread remains idle with the post-processing task still in the wait set.
This process typically works well but breaks down when previously blocked tasks belong to callback groups that remain marked as blocked during polling points.
The following sequence of events illustrates how starvation occurs:
This repetitive cycle causes starvation for certain tasks within systems configured with specific combinations of threads and callback groups – preventing them from ever being executed despite being activated.
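The cycle described above can be reproduced in a small deterministic toy model. It is a sketch under explicit assumptions, not the rclcpp executor: threads take turns in round-robin order, every callback takes exactly one turn to finish, both tasks are always activated, tasks are picked in registration order, and a polling point rebuilds the wait set without tasks whose callback group is currently blocked. The names `Task` and `simulate` are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    group: str  # callback group the task belongs to


def simulate(tasks, num_threads=2, steps=40):
    """Toy model of polling points and processing windows (not rclcpp)."""
    wait_set = []                    # tasks waiting to be picked
    blocked = set()                  # callback groups with a running task
    running = [None] * num_threads   # task each thread is executing
    executed = Counter()
    for step in range(steps):
        tid = step % num_threads     # threads take turns
        if running[tid] is not None:
            # The thread finishes its callback and unblocks the group.
            task = running[tid]
            executed[task.name] += 1
            blocked.discard(task.group)
            running[tid] = None
            continue
        eligible = [t for t in wait_set if t.group not in blocked]
        if not eligible:
            # Polling point: the wait set is empty or fully blocked, so it is
            # cleared and rebuilt. Assumption: tasks of blocked groups are
            # not added, which drops a still-pending task.
            wait_set = [t for t in tasks if t.group not in blocked]
            eligible = list(wait_set)
        if eligible:
            task = eligible[0]       # registration order
            wait_set.remove(task)
            blocked.add(task.group)
            running[tid] = task
    return executed


shared = simulate([Task("poll", "g"), Task("post", "g")])
split = simulate([Task("poll", "g1"), Task("post", "g2")])
single = simulate([Task("poll", "g"), Task("post", "g")], num_threads=1)
print(shared["post"], split["post"] > 0, single["post"] > 0)  # prints: 0 True True
```

With two threads and one shared callback group, the idle thread triggers a polling point while the group is blocked, so "post" is dropped from the wait set every time and never runs. With separate callback groups, or with a single thread, both tasks execute in this model.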
Recognizing the critical impact of starvation in ROS 2’s multi-threaded executor, we set out to raise awareness and find solutions. Our first step was engaging with the ROS 2 community through a GitHub pull request – a direct attempt to notify developers and initiate discussions around our findings. Initial responses suggested that this behavior might either be an intentional design choice of the executor or simply a flaw specific to our system configuration.
Confident that this issue extended beyond our initial setup, we validated our observations by testing various system configurations – including underutilized setups where workloads were minimal and task conditions should not have caused problems. Yet consistently, we observed starvation behavior across our tests – reinforcing our belief that the problem lay deeper within the executor’s design. This raised an important question: was starvation indeed an intentional design choice of the executor, or was it an unexpected behavior?
Despite these results, spreading awareness proved challenging. We seemed to be the only ones encountering – or at least recognizing – this issue within our systems. Moreover, documentation for ROS 2’s executor neither mentioned starvation nor provided sufficient detail for developers to anticipate such behavior during implementation. This lack of visibility made detecting and diagnosing starvation particularly difficult.
Typically, tuning parameters like task priorities or splitting tasks into multiple callback groups can help optimize performance in ROS 2 systems; however, no amount of tuning could resolve this particular issue without fundamental changes in how tasks are managed by the multi-threaded executor:
This trade-off created a frustrating situation for developers – including us – working with ROS 2’s multi-threaded executor, forcing them either to accept inefficiencies or gamble with unpredictable behavior. We thus asked ourselves if this issue could be ignored or if it was too critical to overlook.
However, this issue could lead to catastrophic behavior in safety-critical systems, where missed task execution is unacceptable and compromises safe operation. For instance, if an autonomous vehicle fails to execute collision detection callbacks due to starvation, the consequences could be severe. Given ROS 2’s widespread adoption across industries like autonomous driving and logistics, addressing this problem became even more urgent.
Within our research group, we introduced this problem as part of our research discussions. While there were questions about whether fixing it was worthwhile as an academic pursuit (given that resolving it might only require minor code changes), we believed it was. When reviewing existing research on ROS 2’s multi-threaded executor, we did not find any discussion of starvation-related issues. On the contrary, existing analyses [1,2] implied that starved tasks would eventually execute, which our evidence contradicted.
At this point, tackling this issue became not just worthwhile but necessary – not only from an academic perspective but also to ensure robust real-world deployment of ROS 2 systems across diverse industries. We were now convinced that there was a problem. The question was: how should we begin to fix it?
To address the starvation issue in ROS 2’s multi-threaded executor, we identified two key changes required to eliminate the problem. Each change tackles a specific aspect of the executor’s behavior that contributes to starvation:
These two modifications effectively eliminate starvation across all system configurations, guaranteeing that every task in a callback group will eventually execute after activation.
While achieving starvation freedom is critical for reliable scheduling, it is equally important to evaluate potential drawbacks introduced by our changes – particularly in terms of performance overhead. We assessed whether guarding callback groups with mutexes and modifying how the wait set is handled would significantly impact scheduling efficiency. Our evaluation showed negligible performance impact. The additional overhead introduced by these changes remains insignificant compared to the overall scheduling overhead of the executor. Developers can therefore adopt our solution without concerns about compromising system performance.
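The idea of guarding a callback group with a mutex can be illustrated with a small toy program. This is only a sketch under stated assumptions, not the rclcpp implementation or our actual patch: `CallbackGroup`, `run_executor`, and the FIFO ready queue are hypothetical names made up for illustration. Worker threads pull ready callbacks from a shared queue, and each group's mutex ensures that callbacks of the same group never run concurrently while every queued callback still gets executed.

```python
import threading
from collections import deque


class CallbackGroup:
    """Toy stand-in for a mutually exclusive callback group (hypothetical, not rclcpp)."""

    def __init__(self, name):
        self.name = name
        self.mutex = threading.Lock()  # guards execution of this group's callbacks
        self.active = 0                # callbacks of this group currently running
        self.max_active = 0            # highest concurrency ever observed


def run_executor(ready, num_threads=4):
    """Drain a FIFO queue of (group, callback_name) pairs with several worker threads."""
    queue_lock = threading.Lock()
    results_lock = threading.Lock()
    executed = []

    def worker():
        while True:
            with queue_lock:
                if not ready:
                    return
                group, cb_name = ready.popleft()
            # The group mutex serializes callbacks that belong to the same group.
            with group.mutex:
                group.active += 1
                group.max_active = max(group.max_active, group.active)
                with results_lock:
                    executed.append(cb_name)
                group.active -= 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return executed


if __name__ == "__main__":
    group_a, group_b = CallbackGroup("A"), CallbackGroup("B")
    ready = deque((group_a if i % 2 == 0 else group_b, f"cb{i}") for i in range(6))
    done = run_executor(ready)
    print(sorted(done), group_a.max_active, group_b.max_active)
```

In this toy setting, every callback placed in the queue runs exactly once, and `max_active` never exceeds 1 for either group. The real executor additionally has to decide which callbacks are ready via the wait set, which this sketch deliberately leaves out.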
To validate our approach further, we focused on formally verifying that our proposed executor design is starvation-free. Additionally, we proved that the design is deadlock-free, as the newly introduced mutexes could theoretically lead to deadlocks. To achieve this, we created manual proofs validating both deadlock freedom and starvation freedom and complemented them with model checking using SPIN – a tool for verifying concurrent systems. For this purpose, we developed a custom state machine model representing the new executor design and verified its behavior using SPIN’s capabilities. This combination of manual proofs and automated model checking demonstrates that it is possible to formally verify specific properties of executors in ROS 2.
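To give a flavor of what an explicit-state check for deadlock freedom looks like, the following sketch explores every interleaving of a tiny lock-based model. It is a deliberately simplified stand-in written in Python, not our SPIN model and not Promela: the function `find_deadlocks` and the two example thread programs are assumptions made for illustration. A state is reported as a deadlock if no thread can take a step although at least one thread has not finished.

```python
from collections import deque


def find_deadlocks(programs, num_mutexes):
    """Breadth-first search over all interleavings of small lock-based thread programs.

    Each program is a list of ("acquire", m) / ("release", m) steps.
    Returns every reachable state in which no thread can move but not all have finished.
    """
    start = (tuple(0 for _ in programs), tuple(None for _ in range(num_mutexes)))
    seen = {start}
    frontier = deque([start])
    deadlocks = []
    while frontier:
        pcs, owners = frontier.popleft()
        successors = []
        for tid, prog in enumerate(programs):
            if pcs[tid] == len(prog):
                continue  # this thread has finished
            op, m = prog[pcs[tid]]
            if op == "acquire" and owners[m] is not None:
                continue  # blocked: the mutex is held by some thread
            new_owners = list(owners)
            new_owners[m] = tid if op == "acquire" else None
            new_pcs = list(pcs)
            new_pcs[tid] += 1
            successors.append((tuple(new_pcs), tuple(new_owners)))
        finished = all(pc == len(prog) for pc, prog in zip(pcs, programs))
        if not successors and not finished:
            deadlocks.append((pcs, owners))
        for state in successors:
            if state not in seen:
                seen.add(state)
                frontier.append(state)
    return deadlocks


# Both threads take the mutexes in the same order: no deadlock is reachable.
SAME_ORDER = [
    [("acquire", 0), ("acquire", 1), ("release", 1), ("release", 0)],
    [("acquire", 0), ("acquire", 1), ("release", 1), ("release", 0)],
]
# The second thread takes them in the opposite order: a deadlock is reachable.
OPPOSITE_ORDER = [
    [("acquire", 0), ("acquire", 1), ("release", 1), ("release", 0)],
    [("acquire", 1), ("acquire", 0), ("release", 0), ("release", 1)],
]

if __name__ == "__main__":
    print(find_deadlocks(SAME_ORDER, 2))      # []
    print(find_deadlocks(OPPOSITE_ORDER, 2))  # [((1, 1), (0, 1))]
```

A model checker such as SPIN performs this kind of exhaustive exploration on much richer models and can also check liveness properties such as starvation freedom, which this sketch does not attempt.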
Beyond implementing these fixes internally, we reached out to authors of existing research on ROS 2’s multi-threaded executor to raise awareness about this issue and its resolution. At the time of writing:
By addressing this critical flaw in ROS 2’s multi-threaded executor, we hope developers will benefit from more predictable and reliable task execution – ensuring robust deployments across industries like autonomous driving, logistics, and beyond.
Our work uncovered a subtle yet impactful issue in ROS 2’s scheduling mechanisms – an issue that highlights the challenges of ensuring predictable behavior in complex systems. By modifying how the wait set is managed and introducing safeguards for callback groups, we ensured starvation freedom as a property of ROS 2’s multi-threaded executor.
This journey underscores the clear need for rigorous theoretical foundations and clearly specified properties for ROS 2. Ensuring that fundamental assumptions about task scheduling hold true – and are rigorously verified – is critical as robotic systems grow increasingly complex and applications demand higher levels of reliability. Formal verification played a key role in validating our solution, providing strong assurances about its correctness and reliability across diverse configurations.
As ROS 2 continues to advance, we hope this approach becomes integral to its development, ensuring more predictable and robust deployments – particularly for safety-critical industries like autonomous driving and logistics. We also hope our findings inspire further collaboration within the ROS 2 community to advance its scheduling mechanisms through rigorous study of its properties – paving the way for innovative solutions that meet evolving needs for safety and reliability in modern robotics.
For those interested in exploring our work in greater detail, we invite you to read our full scientific paper, ‘Thread Carefully: Preventing Starvation in the ROS 2 Multi-Threaded Executor’, presented at EMSOFT 2024 [3], which includes all the technical details, formal proofs, performance evaluations, and insights into achieving starvation-free scheduling in ROS 2 systems.
Harun Teper is a research assistant in the Faculty of Computer Science at TU Dortmund University, working with the Design Automation for Embedded Systems Group.
Daniel Kuhse is a research assistant in the Faculty of Computer Science at TU Dortmund University, working with the Design Automation for Embedded Systems Group.
Mario Günzel is a postdoc researcher in the Faculty of Computer Science at TU Dortmund University, working with the Design Automation for Embedded Systems Group.
Georg von der Brüggen is a senior researcher in the Faculty of Computer Science at TU Dortmund University, working with the Design Automation for Embedded Systems Group.
Falk Howar is a professor in the Faculty of Computer Science at TU Dortmund University, leading the Automated Quality Assurance Group.
Jian-Jia Chen is a professor in the Faculty of Computer Science at TU Dortmund University and also works with the Design Automation for Embedded Systems Group. Additionally, he is affiliated with the Lamarr Institute for Machine Learning and Artificial Intelligence.
References
[1] H. Sobhani, H. Choi and H. Kim, “Timing Analysis and Priority-driven Enhancements of ROS 2 Multi-threaded Executors,” IEEE 29th Real-Time and Embedded Technology and Applications Symposium (RTAS), 2023.
[2] X. Jiang, D. Ji, N. Guan, R. Li, Y. Tang and Y. Wang, “Real-Time Scheduling and Analysis of Processing Chains on Multi-threaded Executor in ROS 2,” IEEE Real-Time Systems Symposium (RTSS), 2022.
[3] H. Teper, D. Kuhse, M. Günzel, G. v. d. Brüggen, F. Howar and J.-J. Chen, “Thread Carefully: Preventing Starvation in the ROS 2 Multithreaded Executor,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2024.
Disclaimer: Any views or opinions represented in this blog are personal, belong solely to the blog post authors and do not represent those of ACM SIGBED or its parent organization, ACM.
Call for Nominations
The SIGBED Paul Caspi Memorial Dissertation Award was established in 2013. The award recognizes outstanding doctoral dissertations that significantly advance the state of the art in the science of embedded systems, in the spirit and legacy of Dr. Paul Caspi’s work.
The SIGBED Paul Caspi Memorial Dissertation Award is presented at most once per year, to the author of an outstanding dissertation in the area of Embedded Systems. The author of the winning dissertation is invited to publish a dissertation summary in the ACM SIGBED Newsletter. The award includes an award certificate for the author and an honorarium of 2000 USD sponsored by ACM SIGBED. A public citation for the award paper is placed on the SIGBED website.
The SIGBED Paul Caspi Memorial Dissertation Award is given to an outstanding doctoral dissertation on a topic in embedded, real-time, or cyber-physical systems. For the 2024-2025 award, the dissertation should be dated between 1 April 2023 and 31 March 2025.
The award committee will adjudicate conflicts of interest. Members may serve on the committee for no more than two consecutive years. The award committee shall consist of no fewer than five persons.
Award Committee for 2024-2025
Nominations are solicited annually via the mailing list. A nomination should consist of an email to [email protected] with subject line “Paul Caspi Award Nomination” and the following items in the body of the e-mail:
The following additional documents should be attached to that email:
The primary selection criterion is the quality of the candidate’s work, with the aim to recognize outstanding doctoral dissertations. The award committee may choose to issue no award in a given year.
Nominations for the 2024-2025 award are due on 30 June 2025.
The award will be presented at ESWEEK 2025.
Call for Nominations
ACM SIGBED established the Technical Achievement Award in 2022 to recognize significant and sustained contributions to research and/or system implementations relevant to SIGBED, such as embedded, real-time, and cyber-physical systems. The award is based on the impact of the research and/or system implementations produced by the awardee over their lifetime. It consists of a plaque and a citation.
The nominee should be a SIGBED member and have a research focus on topics of relevance to SIGBED. See the SIGBED webpage for SIGBED sponsored conferences to determine relevance and instructions regarding how to join SIGBED.
The nomination package should consist of a nomination letter from a senior researcher in the field summarizing the nominee’s research contributions and explaining their impacts and relevance to SIGBED. The letter should also compare the nominee’s research impacts and make a convincing case for a technical achievement award.
In addition, the package should include a document listing, in tabular form, the following information about the nominee:
All three documents (nomination letter, summary, and CV) should be sent as a single PDF file by email to the chair of the award committee (Prof. Tei-Wei Kuo). The subject of the email should be “Nomination for ACM SIGBED Technical Achievement Award 2022”. The email should be sent by the nominator to the Chair of the Award Selection Committee, with a copy (cc) to the SIGBED Chair. Please contact the chair with any questions.
The award will be presented at ESWEEK 2025.
Saarland University