My primary research focuses on designing and building resilient critical infrastructure. Many sectors of society depend on critical infrastructure such as the power grid, which makes it essential to the functioning of society. Supervisory Control and Data Acquisition (SCADA) systems provide automated control and remote monitoring of such infrastructure. Traditionally, SCADA systems (equipment and software) were designed to operate in an air-gapped environment, without network interconnectivity or exposure. Over time, however, factors such as cost-effectiveness and scalability drove vendors and utilities to adopt standard Information Technology (IT) platforms, increasing interconnectivity with networks beyond the traditional limits of Operational Technology (OT). As a result, sophisticated attacks increasingly target power grid SCADA systems. These growing threats drive the need for intrusion-tolerant techniques in resilient SCADA systems.
Real-Time Byzantine Resilience
In a world of increasing cyber threats, a compromised protection relay can put power grid resilience at risk by
irreparably damaging costly power assets (tens of thousands of dollars), causing significant disruptions, or leading to an inconsistent state.
This project develops a Byzantine fault-tolerant architecture and protocols
to protect bulk power system components (345kV transformer) even when some of the protection relays guarding those assets are compromised.
Developing Byzantine fault-tolerant protocols is challenging in itself; in addition, a power grid substation's worst-case latency requirement is a quarter of a power cycle,
i.e., about four milliseconds, which adds a real-time response challenge.
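The quarter-cycle figure follows directly from the grid frequency. The short sketch below shows the arithmetic for the two common nominal frequencies; the function name `quarter_cycle_ms` is purely illustrative and not part of any project code.

```python
def quarter_cycle_ms(grid_frequency_hz: float) -> float:
    """Return the duration of a quarter of one power cycle, in milliseconds."""
    if grid_frequency_hz <= 0:
        raise ValueError("grid frequency must be positive")
    full_cycle_ms = 1000.0 / grid_frequency_hz  # one full AC cycle
    return full_cycle_ms / 4.0


if __name__ == "__main__":
    # 60 Hz and 50 Hz are the standard nominal grid frequencies.
    for hz in (60.0, 50.0):
        print(f"{hz:.0f} Hz grid: quarter cycle = {quarter_cycle_ms(hz):.2f} ms")
```

On a 60 Hz grid this gives roughly 4.17 ms, which matches the four-millisecond budget quoted above; on a 50 Hz grid the same fraction of a cycle is 5 ms.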
Spire of the Substation, the first real-time Byzantine-resilient system for power grid substations, is built to maintain correct operation while satisfying performance and latency requirements, even in the face of successful compromises and network attacks. The work uses proactive recovery and diversity to allow the system to survive an unbounded number of compromises over its lifetime, as long as the number of simultaneous compromises does not exceed a certain threshold. Finally, machine learning-based situational awareness modules supplement the intrusion-tolerant system.
The system has been developed and delivered for deployment in testbeds at General Electric, Siemens and Hitachi Energy.
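To make the "threshold of simultaneous compromises" concrete, here is a minimal sketch of replica sizing. It assumes the bound commonly cited in the proactive-recovery literature, n = 3f + 2k + 1, where f is the number of simultaneously compromised replicas tolerated and k is the number of replicas that may be undergoing recovery at once. Whether this project uses exactly that bound is not stated here, and the function names are hypothetical.

```python
def min_replicas(f: int, k: int) -> int:
    """Replicas needed to tolerate f simultaneous compromises while
    k replicas are offline for proactive recovery.

    ASSUMPTION: uses the n = 3f + 2k + 1 bound from the proactive-recovery
    literature; the text above does not confirm the exact formula.
    """
    if f < 0 or k < 0:
        raise ValueError("f and k must be non-negative")
    return 3 * f + 2 * k + 1


def within_threshold(simultaneous_compromises: int, f: int) -> bool:
    """True while the number of simultaneous compromises stays within f."""
    return 0 <= simultaneous_compromises <= f


if __name__ == "__main__":
    for f, k in [(1, 0), (1, 1), (2, 1)]:
        print(f"f={f}, k={k}: at least {min_replicas(f, k)} replicas")
```

Under this assumption, tolerating one compromise while one replica is recovering takes six replicas. Because recovered replicas return in a clean (and diversified) state, the total number of compromises over the system lifetime can be unbounded as long as no more than f are compromised at the same time.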
Severe Impact Resilience
The joint threat of increasingly frequent severe natural disasters followed by sophisticated malicious
cyberattacks is becoming realistic and seriously threatens critical infrastructure systems.
This novel threat model and the impact of such threats on critical infrastructure are not well understood.
The project defines the threat model and develops a framework to assess
the impact of novel compound threats on critical infrastructure, with the aim of developing severe-impact-resilient control systems.
One interesting outcome of the initial work is that existing architectures are not resilient to compound threats. This has led to many interesting research directions in resilient systems for critical infrastructure, where we are exploring architectures that are more dynamic and flexible than traditional ones.
Open Source Software Releases
Open-source software releases from my work in the DSN Lab during my Ph.D.:
Spire is an open-source intrusion-tolerant SCADA system for the power grid. Spire is designed to withstand attacks and compromises at both the system level and the network level, while meeting the timeliness requirements of power grid monitoring and control systems (on the order of 100-200ms update latency).
Spines is a generic messaging infrastructure that provides transparent unicast, multicast and anycast communication over dynamic, multi-hop networking environments without the need for expensive router programming environments or low-level router coding. It provides the automatic reconfiguration and network flexibility required for research and production deployments.
Prime is a Byzantine fault-tolerant replication engine that provides meaningful performance guarantees even after some of the replication servers have been compromised.
Sahiti Bommareddy, Maher Khan, David J Sebastian Cardenas, Carl Miller, Christopher Bonebrake, Yair Amir, Amy Babay
Accepted at International Workshop on Explainability of Real-time Systems and their Analysis at the IEEE Real-Time Systems Symposium (RTSS 2022)
Sahiti Bommareddy, Daniel Qian, Christopher Bonebrake, Paul Skare, Yair Amir
International Symposium on Reliable Distributed Systems, Vienna, Austria, September 2022, pp. 134-144
Sahiti Bommareddy, Benjamin Gilby, Maher Khan, Imes Chiu, Mathaios Panteli, John W. van de Lindt, Yair Amir, Amy Babay
Workshop on Data-Centric Dependability and Security (co-located with IEEE/IFIP DSN), Baltimore, USA 2022
I had the great pleasure and opportunity to serve as Teaching Assistant and Special Help for the following courses:
Advanced Distributed Systems (JHU EN.601.717): The course is run as a few discussion groups, each focused on a selected research topic. Each group investigates far-reaching ideas, and designs and implements a useful semester-long project related to its topic.
Distributed Systems (JHU EN.601.417/617): The course teaches how to design and implement efficient tools, protocols and systems in a distributed environment.
Fall 2021, Fall 2019
Software for Resilient Communities (JHU EN.601.310):This is a project-based course focusing on the design and implementation of practical software systems. Students will work in small teams to design and develop useful open-source software products that support our communities. Students will be paired with community partners and will aim to develop software that can be used after the course ends to solve real problems facing those partners today.
Intermediate Programming (JHU EN.601.220): Programming in C and C++.
Introduction to AI (Duke University, CS270): The course covers algorithms and representations used in artificial intelligence, including an introduction to and implementation of algorithms for search, planning, decision theory, logic, Bayesian networks, robotics and machine learning.
Specialized in application load analysis, performance evaluation, network performance analysis and optimization.
2013 - 2016 @ Aktrix Technologies : Co-Founder and Software Engineer
Led the performance engineering team on multiple client projects.
Ensured application performance met the Service-Level Agreement (SLA) by identifying and resolving performance bottlenecks.
Initiated and led development of an automated performance monitoring system to enable machine learning-based analysis, which reduced the cost and time spent on Root Cause Analysis (RCA) of performance degradation and bottleneck identification by 2x to 5x per issue.
2011 - 2012 @ Deloitte : Performance Engineering Analyst
Reduced application transaction latency to bring it within the Service-Level Agreement (SLA) window by identifying performance bottlenecks.
Received the Applause Award at Deloitte in recognition of my application performance optimization and Root Cause Analysis (RCA) work, which drew on traffic profiling and CPU and memory utilization analysis.
Ensured guaranteed performance (backed by SLAs) in geo-distributed systems.