Course: KR17- Service Continuity Module (SCM)

Service Continuity Module (SCM) in the 7SHIELD Architecture
Outline
1. Short Description
2. Main Purpose and Benefits
3. Main Functions
4. Integrations with other Tools
5. Infrastructure Requirements
6. Operation Manual
Content
1. Short Description
This User Manual presents the main functionalities of the Service Continuity Module (SCM), a software application that allows the operators of Ground Segments (GSs) to assess the operability of their organization when it is subjected to a combination of cyber and/or physical attacks. SCM was developed in Python in the context of Task 5.6, “Development of service continuity scenarios for cyber and/or physical attacks”, and is integrated in the ENGAGE platform to offer GS operators a Graphical User Interface (GUI) for rapid assessment of real or hypothetical disaster scenarios.
2. Main Purpose and Benefits
SCM was developed to allow GS operators to assess the impact of various cyber and/or physical attacks on their organizations in a quantitative and illustrative way. It can be used during the development of comprehensive Business Continuity Plans (BCPs): by simulating hypothetical disaster scenarios, users can better understand the critical components of their business and develop pre-defined response procedures that enhance business survivability. Moreover, SCM may be used as a crisis management tool by feeding it actual downtime data during a real disaster event, allowing GS managers to compare different mitigation actions and ultimately helping them select an optimal recovery strategy.

The failure propagation algorithm supported by SCM is based on the Generic Operational Model (GOM) approach, which involves the discretization of a business into a series of critical physical/cyber Infrastructure Components (ICs) that are interconnected to form the “supply chain” of the organization. Each GS has its own GOM, which can be derived by expert elicitation methods, i.e., through collaboration between (a) GS operators who fully understand the operational workflow of their organization and (b) Business Continuity (BC) experts who can help the former delineate the interconnections between the critical components.
Figure 2‑1 illustrates an example of a GOM for a GS that receives satellite data via an antenna, processes and stores them in data storage racks, and finally makes them available to external users and customers through a website application. Each node on the GOM graph corresponds to a unique critical IC, which may consist of several (sub-)assets. For instance, if the GS employs three antennas for the acquisition of satellite data, then the “Antenna” IC has three assets. More details regarding the development of GOMs and the failure propagation methodology can be found in Deliverable 5.6 of the 7SHIELD project.

Figure 2‑1 – Example of a Generic Operational Model (GOM) for a GS
Each of the critical ICs comprising the GOM of a GS is vulnerable to several cyber/physical threats. For instance, disruptions in the “Antenna” IC of Figure 2‑1 are mostly related to physical threats, such as power shutdowns and damage to the fibre cables or to the structural system of the antenna.
On the other hand, ICs such as the “Website cluster” can also be vulnerable to cyber threats, like Distributed Denial-Of-Service (DDoS) or Man-In-The-Middle (MITM) attacks. Thus, after the derivation of the GOM, the GS operators along with the BC experts shall identify a list of potential threats for each critical component, along with the corresponding expected downtimes needed for the organization to fix the issues.
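
As a rough illustration of the GOM idea, the sketch below stores a small operational model as a directed graph and lists the components that lie downstream of a disrupted IC. The four component names appear in this manual, but the chain of connections between them, the dictionary layout, and the helper downstream_of are assumptions made for illustration. The plain reachability search is a stand-in only; it is not the failure propagation algorithm of SCM, which is documented in Deliverable 5.6.

```python
from collections import deque

# Hypothetical GOM: each key is a critical IC, each value lists the ICs it feeds.
# The chain below is an assumed reading of the example, not the actual SCM model.
GOM_EDGES = {
    "Antenna": ["Demodulator cluster"],
    "Demodulator cluster": ["Data cluster"],
    "Data cluster": ["Website cluster"],
    "Website cluster": [],
}


def downstream_of(gom, start):
    """Return the ICs reachable from `start`, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in gom.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


# ICs that could be affected if the demodulators are disrupted
affected = downstream_of(GOM_EDGES, "Demodulator cluster")
```

In this toy model a disruption of the “Demodulator cluster” can reach the “Data cluster” and the “Website cluster”, while the “Antenna” is unaffected because it lies upstream.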

The construction of the GOM, the identification of the potential threats, and the definition of the expected downtimes comprise the initial steps needed to set up the SCM for a specific GS. The outcome of these three steps is expected to remain unchanged when running service continuity scenarios with SCM, as expert BC practitioners are required to capture the interdependence of essential components and their impact on the operability of the business. Thus, while the tool allows for modifications in the operational structure of the GS, it is recommended that users do not make changes without the guidance of BC experts. On the other hand, SCM gives GS managers the opportunity to rapidly evaluate several hypothetical or actual disaster scenarios and to obtain intuitive graphical results of the impact analyses for efficient comparison and assessment. To support SCM users in the easy and rapid execution of such analyses, a GUI module was developed by SATWAYS and integrated in the ENGAGE platform; it is described in the following sections.

3. Main Functions
SCM provides a single main function. The application receives two input files, one describing the operational structure of the GS (GOM, threats, etc.) and another containing the disaster scenarios, i.e., the case studies to be tested. The failure propagation algorithm is then run for each case study, and a folder is created with images and GIFs showing the results of each analysis. When using the GUI module, the definition of the case studies and the display of the results are handled by the ENGAGE platform.
4. Integrations with other Tools
SCM can be used as a standalone application, in which case the user has to create the two input files and then call the main function of the program. When using the GUI module integrated in the ENGAGE platform, the user graphically selects the threats and downtimes of each component, while the input files and the call of SCM’s main function are handled by ENGAGE. In this sense, ENGAGE mainly serves as a proxy that helps users create the input files needed by SCM and displays the results.
5. Infrastructure Requirements
SCM was developed in Python 3.9.7 and thus it is a cross-platform application that can be run both locally and remotely. When using the GUI module of the ENGAGE platform to visualize the service continuity scenarios, the infrastructure requirements of SCM end up being those of ENGAGE, in which case the users are advised to read the pertinent User Manual.

There are no specific performance requirements to run SCM, as in a typical GS comprising few critical components with simple interconnections, the failure propagation algorithm converges relatively fast. The main time bottleneck of the software is related to the creation of the images and GIFs per impact analysis; however, the users may decrease the number of timesteps for which an image is created to effectively reduce the total runtime of the analyses.
6. Operation Manual
6.1 Set-up
SCM is written in Python 3.9.7 and uses several Python libraries that need to be installed locally. The required packages are listed in the requirements.txt file and can be installed using pip:
pip install -r requirements.txt
When using the GUI module of the ENGAGE platform, no set-up is required by the user.
6.2 Getting Started

After installing the required packages, the application can be executed in standalone mode by calling the main function:
python main.py -i <inputdir> -o <outputdir>
where <inputdir> is the directory in which the input files reside and <outputdir> is the directory used to save the analysis results (a new folder is created if the directory does not exist).
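
A minimal sketch of how a script could handle the -i and -o options described above is given next. It is not the source of SCM's main function: the names parse_args and prepare_output_dir are hypothetical, and only the two flags and the create-if-missing behaviour of the output directory are taken from this manual.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the two directory options (illustrative, not SCM's own parser)."""
    parser = argparse.ArgumentParser(description="Illustrative SCM-style command line")
    parser.add_argument("-i", dest="inputdir", required=True,
                        help="directory holding the input .json files")
    parser.add_argument("-o", dest="outputdir", required=True,
                        help="directory in which the analysis results are saved")
    return parser.parse_args(argv)


def prepare_output_dir(outputdir):
    """Create the output directory if it does not exist and return its path."""
    path = Path(outputdir)
    path.mkdir(parents=True, exist_ok=True)
    return path
```

Passing a list such as ["-i", "inputs", "-o", "results"] to parse_args makes the sketch easy to try without touching the real command line.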
When using the GUI module of the ENGAGE platform, the user has to open the SCM tab, define the disaster scenario by selecting specific threats or downtimes per critical IC, and then click on the “Finish” button to execute the scenario.
6.3 Nominal Operations

6.3.1 Notifications

6.3.2 Data Entry

SCM needs two input files to run the service continuity analyses. Both are written in .json format and reside in the folder <inputdir> given by the user. The first one is BusinessComponents.json, which contains all the data related to the operational model of the GS, namely the GOM, the potential threats, and the expected downtimes. Each critical IC comprising the GOM has several attributes, such as ID, short name, full name, and number of assets (Figure 6‑1(a)). Specifically, the “NumAssets” attribute corresponds to the number of individual assets composing an IC (e.g., if there are four antennas in the GS, the “NumAssets” attribute of the “Antenna” component will be equal to 4). The file also contains a list of the potential cyber/physical threats and a table with the expected downtimes per component (Figure 6‑1(b)). As explained before, this file is expected to be constructed by BC experts and thus users should exercise caution when modifying it.

(a)

(b)

Figure 6‑1 – Example of BusinessComponents.json input file, showing (a) the definition of critical components and (b) the potential cyber/physical threats and expected downtimes.
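
To make the structure more concrete, the following sketch shows what two component entries might look like and how the total number of assets could be computed from them. Only the “NumAssets” attribute name is taken from this manual; the keys “Components”, “ID”, “ShortName”, and “FullName”, as well as the short names, are assumed stand-ins for the attributes listed above, and the actual file layout may differ.

```python
import json

# Hypothetical excerpt of BusinessComponents.json. Only "NumAssets" is a key
# named in the manual; every other key name and the short names are assumptions.
EXAMPLE = """
{
  "Components": [
    {"ID": 1, "ShortName": "ANT", "FullName": "Antenna", "NumAssets": 4},
    {"ID": 2, "ShortName": "DEM", "FullName": "Demodulator cluster", "NumAssets": 5}
  ]
}
"""


def total_assets(business):
    """Sum the NumAssets attribute over all critical ICs."""
    return sum(component["NumAssets"] for component in business["Components"])


business = json.loads(EXAMPLE)
print(total_assets(business))  # 9
```

The total matters because every asset, not every IC, gets its own row in the case study table described in the next section.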
6.3.3 User Inputs

The second input file is CaseStudies.json, which contains a series of independent disaster scenarios (i.e., case studies) requested by the user. Each case study has a specific tag name, a time of occurrence, and a matrix named ENGAGEtable containing the cyber/physical attacks per asset. Figure 6‑2 illustrates an example of a case study named “CS1”, which occurred on 20/05/2022 at 16:00. A snapshot of the considered business model is shown in Figure 6‑1(a), in which the first critical IC is the “Antenna”, comprising four assets. As a result, the first four rows of the ENGAGEtable contain information for these four assets. The next critical component is the “Demodulator cluster”, which consists of five assets, and thus the next five rows of the ENGAGEtable relate to them, and so on.

The first 8 columns of each row correspond to the 8 potential threats that can impact the individual components (Figure 6‑1(b)), with 1 meaning that the asset was impacted by the specific threat and -1 that it was not. For example, in the case study of Figure 6‑2 the first antenna was impacted by the first threat, which corresponds to “Power Supply Shutdown”. The next-to-last column is used when the user wants to directly set the downtime for a specific asset (e.g., in Figure 6‑2, 80 hours for the second antenna, 120 for the third, and 160 for the fourth). Finally, the last column is used for the “Business Continuity” module, when the user wants to assess and compare the effect of different mitigation actions (e.g., how much the loss of service is reduced if the second antenna is down for 40 hours instead of 80, see Figure 6‑2).

Figure 6‑2 – Example of CaseStudies.json input file.
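
The row and column layout described above can be sketched as follows. The sketch assumes exactly ten columns per row (eight threat flags, the direct downtime, and the alternative downtime for the “Business Continuity” module) and uses -1 as the placeholder for a downtime that is not set. The placeholder, the constant names, and the helper functions are assumptions, while the asset counts, the threat flag values, and the downtimes of 80, 120, 160, and 40 hours are taken from the example.

```python
NUM_THREATS = 8                     # threat columns, as in the example
DOWNTIME_COL = NUM_THREATS          # next-to-last column: direct downtime (hours)
ALT_DOWNTIME_COL = NUM_THREATS + 1  # last column: "Business Continuity" alternative
UNSET = -1                          # assumed placeholder for "no value given"

# (IC name, number of assets), in the order the ICs appear in the business model
COMPONENTS = [("Antenna", 4), ("Demodulator cluster", 5)]


def empty_table(components):
    """One row per asset: all threats off (-1) and both downtimes unset."""
    rows = sum(count for _, count in components)
    return [[-1] * NUM_THREATS + [UNSET, UNSET] for _ in range(rows)]


def row_index(components, ic_name, asset_number):
    """Row of the given asset; asset_number starts at 1 within its IC."""
    offset = 0
    for name, count in components:
        if name == ic_name:
            if not 1 <= asset_number <= count:
                raise ValueError("asset number out of range")
            return offset + asset_number - 1
        offset += count
    raise KeyError(ic_name)


table = empty_table(COMPONENTS)
table[row_index(COMPONENTS, "Antenna", 1)][0] = 1  # first threat hits antenna 1
for asset, hours in ((2, 80), (3, 120), (4, 160)):
    table[row_index(COMPONENTS, "Antenna", asset)][DOWNTIME_COL] = hours
# Mitigation scenario: antenna 2 is down for 40 hours instead of 80
table[row_index(COMPONENTS, "Antenna", 2)][ALT_DOWNTIME_COL] = 40
```

With this layout the first demodulator occupies the fifth row of the table, directly after the four antenna rows.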
When the GUI module of the ENGAGE platform is used, the user can graphically select the threats per asset by clicking on the “Threat Configuration” tab and enabling the pertinent checkboxes, as shown in Figure 6‑3. Alternatively, by clicking on the “Downtime Configuration” tab (Figure 6‑4), the user can directly set the downtime per asset, in which case the selections in the “Threat Configuration” tab are ignored for this specific asset. As in the .json files, the first column, named “Downtime”, is used to set the downtime per asset, while the second is used for the “Business Continuity” module, in which different recovery scenarios are compared. Finally, when the user clicks on the “Finish” button, a CaseStudies.json file is created automatically by ENGAGE and SCM is executed.

Figure 6‑3 – Snapshot of the “Threat Configuration” tab in the ENGAGE platform

Figure 6‑4 – Snapshot of the “Downtime Configuration” tab in the ENGAGE platform
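
The precedence rule described above, where a directly configured downtime overrides the threat selections of the same asset, can be sketched as below. The function effective_downtime, the use of -1 for a downtime that is not set, and the choice of taking the largest expected downtime when several threats are selected are all assumptions for illustration; SCM may combine multiple threats differently.

```python
UNSET = -1  # assumed placeholder for "no direct downtime given"


def effective_downtime(threat_flags, expected_hours, direct_hours=UNSET):
    """Downtime in hours for one asset.

    threat_flags   -- one entry per threat, 1 if selected and -1 otherwise
    expected_hours -- expected downtime per threat for the asset's IC
    direct_hours   -- value of the "Downtime" column, or UNSET
    """
    if direct_hours != UNSET:
        # A direct downtime takes precedence; threat selections are ignored.
        return direct_hours
    selected = [hours for flag, hours in zip(threat_flags, expected_hours) if flag == 1]
    # Assumed rule: the longest expected downtime among the selected threats.
    return max(selected, default=0)
```

For instance, with hypothetical expected downtimes of 24 and 48 hours, selecting both threats yields 48 hours, whereas additionally entering 80 hours as a direct downtime yields 80 hours.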
6.3.4 User Output
Once the user has finished configuring the disaster scenarios, SCM is called (automatically in the case of the GUI module) and the impact analyses are executed. For each scenario, a set of images and a GIF are created and saved in the <outputdir> directory given by the user. These results are also displayed in the ENGAGE platform, as shown in Figure 6‑5. The three figures on the left of the screen show the downtime diagrams of the impacted ICs, in this case study the “Antenna”, “Demodulator cluster”, and “Data cluster” components. The graph in the middle of the screen illustrates the GOM and the propagation of failure through the critical components of the organization, which is measured by the InfraIdx and InputIdx indices (for more details see Deliverable 5.6). Finally, the two figures on the right side relate to the impact of the attacks on the functionality of the GS, using two metrics: (a) the loss of satellite data and (b) the loss of service as experienced by the external users. Moreover, a message on the screen informs the user whether the loss of service is Negligible, Low, Moderate, or High, in order to support their decisions.

Figure 6‑5 – Snapshot of the SCM results in the ENGAGE platform
Acronyms
BC      Business Continuity
BCP     Business Continuity Plan
DDoS    Distributed Denial-Of-Service
GOM     Generic Operational Model
GS      Ground Segment
GUI     Graphical User Interface
IC      Infrastructure Component
MITM    Man-In-The-Middle
SCM     Service Continuity Module
Funding