An incident response plan is a set of instructions to help IT staff detect, respond to, and recover from network security incidents. These plans address issues like cybercrime, data loss, and service outages that threaten daily work.
An incident response plan's primary objective is to respond to incidents before they become a significant setback. As the frequency and variety of data breaches increase, the lack of an incident response plan can lead to longer recovery times and increased costs.
The security situation became more critical during the COVID-19 pandemic, as employees switched to working from home. The vast majority of corporate office and cloud infrastructure components were opened to the WAN, which significantly increased organizational security risks.
We have already seen organizations that aren't sufficiently prepared because their incident response plan was outdated, never tested, or even non-existent.
According to Ponemon Institute security research from 2018 and 2020, there are reasons to worry:
Of the 26 percent of organizations that don't practice their plan, 64 percent said the reason they don't practice it is that it's not a priority.
The frequency of insider incidents has tripled since 2016 from one to 3.2 per organization.
The three largest industries affected were financial services, services, and technology and software. Financial services organizations include banking, insurance, investment management, and brokerage companies. Companies in financial services, services, and technology and software incurred average costs of $14.05 million, $12.31 million, and $12.30 million, respectively.
At Plexteq, we assist enterprises in conducting security audits of their custom software and infrastructure and in building a corporate incident response plan. In this article, we share our experience in this domain.
Incident response is the process of responding to every security incident that occurs in an organization. It is the ability to prepare for and respond to events that negatively affect the organization's network, so that disruption to business processes is kept to a minimum. In practice, this means handling incidents proactively or reactively to avoid or minimize their effect on those processes.
In our experience, effective incident response management should include:
A comprehensive plan that covers how to stop the attack and eradicate the underlying cause, recover production systems, and conduct a post-mortem analysis to prevent future attacks.
The right people in place. Usually, the team should consist of an incident response manager, security analyst, IT engineer, threat researcher, legal representative, corporate communications, human resources, risk management, C-level executives, and external security forensic experts. All team members must know what their responsibilities will be in the event of an attack.
Tools to handle security incidents on a large scale. These tools analyze, alert about, and can even help remediate security events that could be missed due to insufficient internal resources. Such tools obtain information for a response via traffic analysis protocols, system logs, endpoint alerts, and identity systems to assess security-related anomalies in the network.
By now, you understand the concept of Incident Response (IR) and know that this methodology is meant to handle breaches, security incidents, ransomware, and similar events. A good incident response plan is well documented, communicated to everyone involved, rehearsed in training, and tested at least annually.
Element #1. Incident Response Plan
To deal with any incident, we need a properly documented approach prepared in advance. We don't reinvent the wheel here. Instead, we rely on the NIST Computer Security Incident Handling Guide (SP 800-61) and the SANS Institute’s Incident Handler's Handbook. The steps below follow the four-phase NIST lifecycle; the SANS handbook covers the same ground in six steps:
Preparation – Advance planning on prevention and handling of incidents or cyber-attacks.
Detection and analysis – Actively and proactively monitoring for anomalies and potential attack vectors, and prioritizing what is found.
Containment, eradication, and recovery – Having a containment tactic, identifying and remediating the systems under attack, and having a recovery plan.
Post-incident activity – Documenting and assessing lessons learned and having a strategy for historical retention.
In our view, proactive prevention and detection and lessons learned are the crucial parts: both require extensive preparation, and the post-mortem is what continuously improves the IR plan.
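The four phases above can be modeled as a small state machine, so that an incident record can only move along sensible paths. This is a minimal sketch, not part of either guide: the `Incident` class, the `ALLOWED_TRANSITIONS` table, and the specific transitions it permits (including the loop from containment back to detection) are assumptions made for illustration.

```python
from enum import Enum


class Phase(Enum):
    PREPARATION = "preparation"
    DETECTION_AND_ANALYSIS = "detection and analysis"
    CONTAINMENT_ERADICATION_RECOVERY = "containment, eradication, and recovery"
    POST_INCIDENT_ACTIVITY = "post-incident activity"


# Allowed moves between phases (an assumption for this sketch):
# containment may loop back to detection when new evidence appears,
# and post-incident activity feeds back into preparation.
ALLOWED_TRANSITIONS = {
    Phase.PREPARATION: {Phase.DETECTION_AND_ANALYSIS},
    Phase.DETECTION_AND_ANALYSIS: {Phase.CONTAINMENT_ERADICATION_RECOVERY},
    Phase.CONTAINMENT_ERADICATION_RECOVERY: {
        Phase.DETECTION_AND_ANALYSIS,
        Phase.POST_INCIDENT_ACTIVITY,
    },
    Phase.POST_INCIDENT_ACTIVITY: {Phase.PREPARATION},
}


class Incident:
    """Hypothetical incident record that only follows allowed transitions."""

    def __init__(self, title: str) -> None:
        self.title = title
        # An incident record is opened at the moment something is detected.
        self.phase = Phase.DETECTION_AND_ANALYSIS
        self.history = [self.phase]

    def advance(self, new_phase: Phase) -> None:
        """Move to new_phase, or raise ValueError if the move is not allowed."""
        if new_phase not in ALLOWED_TRANSITIONS[self.phase]:
            raise ValueError(
                f"cannot move from {self.phase.value!r} to {new_phase.value!r}"
            )
        self.phase = new_phase
        self.history.append(new_phase)
```

With this in place, an attempt to jump from detection straight to post-incident activity raises an error, which makes it harder to close an incident without going through containment and recovery. The `history` list doubles as a simple audit trail for the post-mortem.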
Our recommendations for building a successful IR plan are:
Determine the critical components of your network – Identify which systems and data matter most to your operations. To proactively protect them against major damage, replicate that data and store it in a remote location.
Identify single points of failure in your network and address them – You should have a plan B for every critical component of your network, including hardware, software, and staff roles.
Define a workforce continuity plan – Help ensure employee safety and limit business downtime by enabling employees to work remotely in case of breaches or natural disasters. Build out infrastructure with technologies such as virtual private networks (VPNs) and secure web gateways to support workforce communication.
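The "plan B for every critical component" recommendation can be checked mechanically once the components are written down. The sketch below assumes a hand-maintained inventory that maps each critical component to its fallbacks. The component names and the `single_points_of_failure` helper are hypothetical and only illustrate the idea.

```python
# Hypothetical inventory: each critical component maps to its fallbacks.
# The names are illustrative and not taken from a real environment.
inventory = {
    "primary database": ["replica in remote region"],
    "VPN gateway": [],
    "SIEM server": ["standby SIEM node"],
    "incident response manager": [],  # staff roles need a deputy too
}


def single_points_of_failure(components: dict[str, list[str]]) -> list[str]:
    """Return the components that have no fallback at all, sorted by name."""
    return sorted(name for name, fallbacks in components.items() if not fallbacks)


if __name__ == "__main__":
    for name in single_points_of_failure(inventory):
        print(f"No plan B for: {name}")
```

Keeping staff roles in the same inventory as hardware and software is deliberate: a role with no deputy is as much a single point of failure as a gateway with no standby.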
Element #2. The Incident Response Team
The incident response team's goal is to coordinate team members and resources during a cyber incident to minimize impact and quickly restore operations. This includes:
Analysis – Document the extent, priority, and impact of a breach to see which assets were affected and if the incident requires attention.
Reporting – Inform team members about reporting procedures. Gather relevant trend data to show the importance of the incident response team.
Response – Explore root causes, record findings, carry out recovery strategies, and communicate the status of your organization to team members.
Our five recommendations for an incident response team are:
Isolate exceptions – Technology alone cannot successfully detect security breaches. You should also rely on human insight. The following are a few conditions to watch for daily: traffic anomalies, account access without permission, excessive resource consumption, and suspicious files.
Use a centralized approach – Gather information from security tools and IT systems, and keep it in a central location, such as a Security Information and Event Management system (SIEM).
Assert, don’t assume – Don’t conduct an investigation based on the assumption that an event or incident exists. Instead of making assumptions, make assertions based on a question that you can evaluate and verify. For example, “If I’ve noted alert X on system Y, I should also observe event Z occur nearby.”
Eliminate impossible events – You may not know what you are looking for exactly. On these occasions, eliminate occurrences that could be logically explained. You will then find yourself with the events that have no clear explanation.
Take post-incident measures – Continue monitoring your systems for any unusual behavior to ensure the intruder has not returned. Watch for new incidents and conduct a post-incident review to isolate any problems experienced during the execution of the incident response plan.
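The "assert, don't assume" recommendation translates naturally into a check against collected events: state what else should be observable if the alert is real, then test it. The following is a minimal sketch under stated assumptions. The `Event` record, the `assertion_holds` function, the event names, and the five-minute window are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Event:
    timestamp: datetime
    system: str
    name: str


def assertion_holds(events, alert, expected_name, window=timedelta(minutes=5)):
    """Evaluate the assertion: given this alert, an event called
    expected_name should appear on the same system within the window."""
    return any(
        e.system == alert.system
        and e.name == expected_name
        and abs(e.timestamp - alert.timestamp) <= window
        for e in events
    )


# Illustrative data: alert X on system Y, and the events collected around it.
alert = Event(datetime(2024, 1, 1, 10, 0), "web-01", "malware_detected")
events = [
    Event(datetime(2024, 1, 1, 10, 3), "web-01", "outbound_connection"),
    Event(datetime(2024, 1, 1, 10, 4), "db-01", "new_admin_account"),
]

print(assertion_holds(events, alert, "outbound_connection"))
print(assertion_holds(events, alert, "new_admin_account"))
```

A failed assertion is as useful as a confirmed one: it tells the analyst either that the alert is likely a false positive or that the hypothesis about the attacker's behavior needs to change.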
Element #3. Tools
Incident response is a critical, time-sensitive activity, and in virtually all organizations security analyst time is scarce. It is impossible to manually review and investigate all alerts from modern security tools.
Automating incident response activities can help reduce the time it takes to mitigate a critical incident, preventing malware from spreading or stopping attackers from doing any more damage. It can also save time by allowing security teams to review more security events, and identify and investigate important potential incidents.
There are various tools you may need for organizing the IR process properly:
SIEM tools – Gather and aggregate log data created in the technology infrastructure of the organization, including applications, host systems, network and security devices (e.g., antivirus filters and firewalls). They provide reports on security-related incidents, including malware activity and logins, and send alerts when activity conflicts with existing rule sets, indicating a security issue.
Intrusion Detection Systems (IDS) – Use baselines or attack signatures to issue an alert when suspicious behavior or known attacks take place. A host-based intrusion detection system (HIDS) watches an individual server or host, while a network-based intrusion detection system (NIDS) watches network traffic.
Netflow Analyzers – Look at actual traffic across border gateways and within a network. Netflow is used to track a specific thread of activity, to see what protocols are in use on your network, or to see which assets are communicating with each other.
Vulnerability Scanners – Isolate potential areas of risk, assess the attack surface of your organization for known weaknesses, and provide instructions for remediation. Vulnerabilities may be caused by misconfiguration, bugs in your own applications, or the use of third-party components that can be exploited by attackers.
Availability Monitoring – The aim of incident response is to limit downtime. A service or application outage can be the initial sign of an incident in progress. Availability monitoring helps catch adverse situations early by tracking the uptime of infrastructure components, including apps and servers. It notifies administrators of issues before they impact the organization.
Web Proxies – Control access to websites and log what is being connected to. Many threats operate over HTTP, so the ability to log the remote IP address and the details of the HTTP connection can be essential for forensics and threat tracking.
Log collection software – Gathers logs from different software and hardware components in a single place for quick search and analysis.
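To make the idea of a SIEM rule concrete, here is a minimal sketch of one rule applied to logs that have already been collected from several sources. It is not how any particular SIEM product works. The record fields, the `failed_login_alerts` function, the threshold of three failures, and the one-minute window are assumptions chosen for illustration.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def failed_login_alerts(records, threshold=3, window=timedelta(minutes=1)):
    """Return (source_ip, time) pairs where one IP produced `threshold`
    failed logins within `window`, regardless of which system logged them."""
    recent = defaultdict(deque)  # source_ip -> timestamps of recent failures
    alerts = []
    for rec in sorted(records, key=lambda r: r["time"]):
        if rec["event"] != "login_failed":
            continue
        times = recent[rec["source_ip"]]
        times.append(rec["time"])
        # Drop failures that fell out of the sliding window.
        while rec["time"] - times[0] > window:
            times.popleft()
        if len(times) >= threshold:
            alerts.append((rec["source_ip"], rec["time"]))
            times.clear()  # start counting again after raising an alert
    return alerts


# Illustrative records aggregated from two hypothetical sources.
t0 = datetime(2024, 1, 1, 12, 0, 0)
sample_records = [
    {"source": "vpn", "time": t0, "event": "login_failed", "source_ip": "203.0.113.7"},
    {"source": "app", "time": t0 + timedelta(seconds=10), "event": "login_failed", "source_ip": "203.0.113.7"},
    {"source": "app", "time": t0 + timedelta(seconds=20), "event": "login_ok", "source_ip": "198.51.100.2"},
    {"source": "vpn", "time": t0 + timedelta(seconds=30), "event": "login_failed", "source_ip": "203.0.113.7"},
    {"source": "app", "time": t0 + timedelta(minutes=5), "event": "login_failed", "source_ip": "198.51.100.2"},
]

for ip, when in failed_login_alerts(sample_records):
    print(f"ALERT: repeated failed logins from {ip} at {when}")
```

The point of the example is the aggregation: no single source saw three failures from the same address, but the combined view does. That is the practical benefit of collecting logs in one place before applying rules.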
The Incident Response Team must measure the effectiveness of its detection solution and question everything. Is it detecting most incidents, or are the majority reported by users and system administrators? What are the detection accuracy and the false-positive rate?
We recommend that teams track:
Mean Time to Detect (MTTD) – The average time between the start of an incident and its detection, which reflects the effectiveness of your detection solution.
Mean Time to Respond/Repair (MTTR) – The average time it takes to acknowledge a security concern, identify the impact, determine the course of action, and implement it.
These numbers can vary widely, but over time trends will appear, providing useful insight about where you need to invest for additional protection, remediation, and automation capabilities.
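Both metrics are simple averages over incident timestamps. The sketch below shows one way to compute them, assuming each incident record carries the time it occurred, was detected, and was resolved. The field names and sample data are hypothetical, and measuring MTTR from detection to resolution is an assumption, since teams define the start point differently.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records; in practice these would come from a
# ticketing system or SIEM export.
incidents = [
    {
        "occurred": datetime(2024, 1, 1, 8, 0),
        "detected": datetime(2024, 1, 1, 10, 0),
        "resolved": datetime(2024, 1, 1, 16, 0),
    },
    {
        "occurred": datetime(2024, 1, 2, 9, 0),
        "detected": datetime(2024, 1, 2, 13, 0),
        "resolved": datetime(2024, 1, 2, 15, 0),
    },
]


def mean_hours(records, start_key, end_key):
    """Average elapsed time between two timestamps, in hours."""
    seconds = mean(
        (r[end_key] - r[start_key]).total_seconds() for r in records
    )
    return seconds / 3600


mttd = mean_hours(incidents, "occurred", "detected")  # time to detect
mttr = mean_hours(incidents, "detected", "resolved")  # assumed: detection to resolution

print(f"MTTD: {mttd:.1f} h, MTTR: {mttr:.1f} h")  # MTTD: 3.0 h, MTTR: 4.0 h
```

Recomputing these figures per month or per quarter is what exposes the trends mentioned above. A single number says little on its own.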
Incident Response Plan Example
The image below illustrates the complete lifecycle of an incident, from occurrence through detection to resolution.
An incident response plan can be modified and updated as needs change. Every organization is unique, and so is its plan.
Regulatory requirements also vary from country to country (GDPR, HIPAA) and even from state to state, such as California’s SB1386. All these types of scenarios need to be considered when building your Incident Response Plan.
Incident response matters to any organization because it keeps malicious activity from spreading and causing data leakage, server compromise, unauthorized access, and similar damage. Organizations therefore need to take cybersecurity very seriously. Many fail to do so because top management does not want to invest in it, and we later read in the news about breaches that happened because of that carelessness. There are multiple examples of such hacks.
Feel free to reach out if you need help implementing an efficient Incident Response Plan for your organization, even one of global proportions.