Security Monitoring and Response
Key Control Areas
- Telemetry Collection and Log Management (AU-02, AU-03, AU-04, AU-05, AU-08, AU-12, SI-04): The foundation of detection is comprehensive telemetry. AU-02 defines which events must be logged: authentication events, privilege escalation, data access, configuration changes, process execution, network connections, and failed actions. AU-03 specifies the content of each record: who, what, when, where, outcome, and contextual metadata. AU-04 ensures sufficient storage capacity for log retention -- under-provisioned log storage is a common failure that creates blind spots exactly when forensic data is needed. AU-05 defines responses when logging fails: alerts, failover to backup collectors, and in critical systems, halting operations rather than operating blind. AU-08 requires reliable timestamps on every audit record, which in practice means synchronising all sources to authenticated time sources, typically via NTP -- without accurate timestamps, cross-source correlation is unreliable. AU-12 ensures audit record generation at the point of event occurrence. SI-04 provides real-time system monitoring including network intrusion detection, file integrity monitoring, and host-based anomaly detection. Implementation requires deploying collection agents (EDR, syslog forwarders, cloud log connectors), centralising to a log management platform, and ensuring retention meets both operational and compliance requirements.
- Correlation, Analytics, and Detection Engineering (AU-06, AU-07, SI-04, CA-07, RA-05, PM-16): Raw telemetry becomes intelligence through correlation and analysis. AU-06 mandates review, analysis, and reporting of audit records -- the core SIEM function. This includes rule-based correlation (known attack patterns, sigma rules), statistical baselining (anomaly detection for user behaviour, network traffic, process execution), and threat hunting (hypothesis-driven investigation of telemetry). AU-07 provides audit record reduction and report generation: transforming millions of raw events into prioritised, actionable alerts. SI-04 at the analytics layer applies network behaviour analysis, protocol anomaly detection, and integration with threat intelligence feeds. CA-07 provides continuous monitoring that goes beyond point-in-time assessments to real-time security posture awareness. RA-05 integrates vulnerability scanning data into the correlation engine, enriching alerts with vulnerability context -- an exploit attempt against a known-vulnerable system is higher priority than one targeting a patched system. PM-16 maintains threat awareness: tracking adversary TTPs via MITRE ATT&CK, consuming threat intelligence, and updating detection logic as the threat landscape evolves. Detection engineering must be treated as a continuous discipline, not a one-time configuration.
- Alert Triage, Enrichment, and Orchestration (IR-04, IR-05, IR-07, SI-05, PM-14): Orchestration bridges detection and response. IR-04 handles incident management including automated initial triage: when an alert fires, the SOAR platform enriches it with asset context (is this a crown jewel system?), identity context (is this a privileged account?), vulnerability context (is the target vulnerable?), and threat intelligence (is the indicator known malicious?). This enrichment transforms a raw alert into a decision-ready case. IR-05 provides continuous incident monitoring, correlating related alerts into coherent incidents rather than treating each alert in isolation. IR-07 delivers incident response assistance through automated playbooks and analyst decision support. SI-05 distributes security alerts and advisories from vendors and intelligence sources, integrating them into the triage workflow. PM-14 ensures testing and monitoring programmes validate that the detection and response pipeline works end-to-end, through regular purple team exercises, detection coverage assessments, and tracking of mean-time-to-detect/respond metrics.
- Automated and Manual Response (IR-04, IR-06, IR-08, SC-07, AC-02, SI-04): Response must be fast enough to limit blast radius. IR-04 at the response layer executes containment actions: network isolation of compromised hosts via EDR or SDN policy, credential revocation through identity provider integration, blocking of malicious IPs/domains at firewall and proxy, and quarantine of malicious files. IR-06 automates incident reporting to stakeholders, regulators, and law enforcement where required. IR-08 defines incident response plans with specific playbooks for common scenarios: ransomware, business email compromise, data exfiltration, insider threat, and supply chain compromise. SC-07 enables dynamic boundary reconfiguration for containment -- automatically tightening network segmentation to isolate affected zones. AC-02 supports response through emergency account disablement and forced credential rotation. SI-04 provides ongoing monitoring during response to confirm containment effectiveness and detect attacker pivoting. The balance between automated and manual response depends on confidence: high-confidence, low-risk actions (blocking a known-malicious hash) should be fully automated; high-impact actions (isolating a production server) should require analyst approval.
- Forensics, Evidence Preservation, and Recovery (AU-09, IR-04, CP-09, CP-10, SI-07): Post-incident activities ensure lessons are learned and evidence supports legal proceedings. AU-09 protects audit information from tampering -- immutable log stores, write-once-read-many archives, and cryptographic integrity verification keep forensic evidence trustworthy and support its admissibility. IR-04 includes post-incident analysis: root cause identification, timeline reconstruction, and identification of detection gaps. CP-09 provides system backup that enables recovery to a known-good state after containment. CP-10 covers system recovery and reconstitution: bringing affected systems back online safely, verifying they are clean, and monitoring them closely post-recovery. SI-07 verifies software and firmware integrity during recovery, ensuring that restored systems have not been tampered with. Every incident should produce updated detection rules, improved playbooks, and refined response procedures.
- SOC Operations and Workforce (AT-02, AT-03, IR-02, PS-04, PS-07): People are the irreducible core of security operations. AT-02 provides security awareness across the organisation -- users are sensors too, and trained users report phishing, anomalous behaviour, and policy violations. AT-03 delivers role-based training for SOC analysts: tier 1 triage, tier 2 investigation, tier 3 threat hunting, and incident commander roles. IR-02 provides incident response training including tabletop exercises, simulation drills, and post-incident reviews. PS-04 governs personnel termination procedures, ensuring access revocation is immediate and monitored -- a disgruntled leaver with active credentials is a high-risk insider threat. PS-07 manages third-party personnel security for outsourced SOC functions including managed detection and response (MDR) providers.
- Threat Intelligence Integration (PM-16, RA-03, RA-05, SI-05, SR-10): Threat intelligence transforms reactive detection into proactive defence. PM-16 establishes a threat awareness programme that consumes strategic, tactical, and operational intelligence. RA-03 uses threat intelligence to inform risk assessments: which adversary groups target your sector, what TTPs they use, and which of your assets are most likely targets. RA-05 integrates vulnerability intelligence into monitoring: correlating CVE data with asset inventory to prioritise detection for exploitable vulnerabilities. SI-05 distributes security alerts from vendors, ISACs, and government agencies. SR-10 extends monitoring into the supply chain, detecting compromised components and malicious updates before they reach production systems. Intelligence should feed directly into detection engineering: every new intelligence report should be evaluated for detection opportunities.
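The enrichment step under Alert Triage and the confidence-based split under Automated and Manual Response can be illustrated with a minimal Python sketch. Everything named here is hypothetical: the lookup sets stand in for a CMDB, identity provider, vulnerability scanner, and threat intelligence feed, and the action catalogue and decision labels are illustrative rather than taken from any SOAR product.

```python
from dataclasses import dataclass, field

# Hypothetical lookup tables standing in for real context sources.
CROWN_JEWELS = {"db-prod-01"}                       # asset context
PRIVILEGED_ACCOUNTS = {"svc-backup", "admin.jane"}  # identity context
VULNERABLE_HOSTS = {"web-03"}                       # vulnerability context
KNOWN_BAD_INDICATORS = {"198.51.100.7"}             # threat intelligence

# Hypothetical catalogue rating the business impact of each action.
ACTION_IMPACT = {
    "block_indicator": "low",
    "isolate_host": "high",
    "disable_account": "high",
}


@dataclass
class Alert:
    rule: str
    host: str
    account: str
    indicator: str
    context: dict = field(default_factory=dict)


def enrich(alert: Alert) -> Alert:
    """Attach the four context dimensions so the alert is decision-ready."""
    alert.context = {
        "crown_jewel": alert.host in CROWN_JEWELS,
        "privileged_account": alert.account in PRIVILEGED_ACCOUNTS,
        "target_vulnerable": alert.host in VULNERABLE_HOSTS,
        "known_malicious": alert.indicator in KNOWN_BAD_INDICATORS,
    }
    return alert


def decide(alert: Alert, action: str) -> str:
    """Gate a containment action on confidence and impact.

    High-impact actions always go to an analyst. Low-impact actions are
    automated only when the indicator is known malicious; otherwise the
    alert is queued for normal triage.
    """
    if ACTION_IMPACT[action] == "high":
        return "require_approval"
    if alert.context.get("known_malicious", False):
        return "automate"
    return "queue_for_triage"
```

For example, `decide(enrich(Alert("c2-beacon", "db-prod-01", "admin.jane", "198.51.100.7")), "block_indicator")` returns `"automate"`, while the same alert with `"isolate_host"` returns `"require_approval"`. A real deployment would likely use a graded confidence score rather than a single boolean.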
When to Use
This pattern applies to any organisation that needs to detect and respond to security threats -- which in practice means every organisation. It is particularly critical for organisations in regulated industries (financial services, healthcare, critical infrastructure) where detection and response capabilities are mandated. Organisations with significant cloud workloads need cloud-native detection alongside traditional on-premises monitoring. Those experiencing rapid growth need scalable detection architecture before the estate outgrows ad-hoc monitoring. Any organisation that has suffered a breach in which the attacker went undetected for days or longer should treat this pattern as urgent.
When NOT to Use
Very small organisations (under 20 users) with simple IT environments may find a full SIEM/SOAR deployment disproportionate and should consider managed detection and response (MDR) services as an alternative. Organisations without basic preventive controls (patch management, endpoint protection, access control) should establish foundations before investing in advanced detection -- you need to reduce the noise floor before detection becomes effective. Air-gapped environments with no internet connectivity have a fundamentally different threat model and monitoring approach.
Typical Challenges
Log volume is the first challenge: a single busy web server generates gigabytes of logs daily, and storage costs for multi-year retention are significant. Alert fatigue is endemic -- poorly tuned SIEM rules generate thousands of false positives that desensitise analysts and bury genuine threats. SIEM deployment often stalls at log collection without progressing to meaningful detection engineering. Correlating events across disparate sources requires normalisation that is never as clean as vendors promise. SOC analyst retention is an industry-wide problem: burnout from shift work, alert fatigue, and repetitive triage drives turnover above 30% annually in many organisations. Measuring SOC effectiveness is difficult -- mean-time-to-detect and mean-time-to-respond are useful but can be gamed. Cloud environments generate novel telemetry formats that traditional on-premises SIEM tools handle poorly. Encrypted traffic (TLS 1.3) limits network-based detection, shifting reliance to endpoint and identity telemetry. SOAR playbook maintenance requires ongoing engineering effort that is often underestimated. Executive expectations of 'real-time detection' clash with the reality that sophisticated adversaries operate within normal user behaviour patterns.
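One way to make the alert fatigue problem measurable is to compute, per detection rule, the share of alerts that analysts closed as true positives, and to flag high-volume, low-precision rules for tuning. The sketch below assumes triage dispositions are available as `(rule_name, verdict)` pairs; the verdict labels and the thresholds are illustrative assumptions, not a standard.

```python
from collections import Counter


def rule_precision(dispositions):
    """Return {rule: true positives / total alerts} from analyst verdicts.

    `dispositions` is an iterable of (rule_name, verdict) pairs, where
    verdict is 'true_positive' or 'false_positive' (assumed labels).
    """
    total = Counter()
    hits = Counter()
    for rule, verdict in dispositions:
        total[rule] += 1
        if verdict == "true_positive":
            hits[rule] += 1
    return {rule: hits[rule] / total[rule] for rule in total}


def tuning_candidates(dispositions, min_alerts=20, max_precision=0.1):
    """List rules that fire often but are rarely right, sorted by name."""
    dispositions = list(dispositions)
    volume = Counter(rule for rule, _ in dispositions)
    precision = rule_precision(dispositions)
    return sorted(
        rule
        for rule, p in precision.items()
        if volume[rule] >= min_alerts and p <= max_precision
    )
```

Precision alone can be gamed in the same way as mean-time-to-detect: disabling a noisy rule improves the number while possibly opening a detection gap, so it is best read alongside coverage.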
Threat Resistance
Security Monitoring and Response directly addresses the detection and containment phases of the kill chain. Advanced persistent threats with long dwell times are detected through behavioural analytics and anomaly detection (AU-06, SI-04, CA-07). Ransomware is detected at multiple stages: initial access via phishing detection, lateral movement via network anomaly detection, and pre-encryption staging via endpoint behavioural monitoring (SI-04, IR-04). Insider threats are identified through user behaviour analytics that detect anomalous data access, privilege use, and working patterns (AU-06, AC-02, PS-04). Supply chain compromise is detected through software integrity monitoring and update verification (SI-07, SR-10). Credential theft and abuse are detected through impossible travel, anomalous authentication patterns, and failed logon monitoring (AU-02, AC-07, IA-05). Data exfiltration is detected through DLP integration, network flow analysis, and cloud access monitoring (AC-04, SI-04, AU-06). The key architectural principle is defence in depth for detection: no single detection mechanism catches everything, but layered collection, correlation, and analytics create overlapping detection zones that are extremely difficult for adversaries to evade simultaneously.
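The impossible travel check mentioned above reduces to a distance-over-time calculation. The sketch below assumes each login has already been resolved to a latitude and longitude (for example by geo-IP lookup, which is not shown). The 900 km/h speed ceiling and the 100 km minimum distance are illustrative assumptions that would need tuning, and VPN or proxy egress points are a likely source of false positives.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_impossible_travel(login_a, login_b,
                         max_speed_kmh=900.0, min_distance_km=100.0):
    """Flag two logins whose implied travel speed exceeds the ceiling.

    Each login is a (datetime, latitude, longitude) tuple. Pairs closer
    than `min_distance_km` are ignored to absorb geo-IP imprecision.
    """
    first, second = sorted([login_a, login_b], key=lambda login: login[0])
    distance = haversine_km(first[1], first[2], second[1], second[2])
    if distance < min_distance_km:
        return False
    hours = (second[0] - first[0]).total_seconds() / 3600
    if hours == 0:
        return True  # far apart at the same instant
    return distance / hours > max_speed_kmh
```

A login from London followed one hour later by a login from New York for the same account would be flagged; the same pair ten hours apart would not.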
Assumptions
The organisation has committed to building or procuring security operations capability (in-house SOC, managed SOC, or hybrid). Network architecture supports centralised log collection (agents can reach collectors, bandwidth is sufficient). A log management or SIEM platform is deployed or planned. Endpoints support EDR agent deployment across the estate. Identity providers can emit authentication and authorisation events. Cloud environments expose audit trails via APIs. The organisation has or will develop incident response procedures and playbooks.
Developing Areas
- AI-driven triage accuracy versus analyst trust is the defining tension in modern SOC operations. AI copilots can summarise alerts, suggest investigation steps, and recommend containment actions with impressive speed, but SOC analysts report low confidence in AI recommendations they cannot independently verify. The fundamental challenge is that AI triage operates as a black box -- analysts cannot trace the reasoning chain from raw telemetry to recommendation -- and in a domain where false negatives have severe consequences, trust must be earned through demonstrated accuracy over extended operational periods.
- Detection engineering as a formalised discipline is still being defined. While the concept of treating detection logic with the same engineering rigour as software development (version control, testing, peer review, CI/CD deployment) is gaining acceptance, the profession lacks standardised job descriptions, career paths, training curricula, and certification. The Sigma rules project provides a community-maintained detection corpus, but most organisations still write detections in vendor-specific query languages without systematic coverage analysis or quality assurance processes.
- Security data lake economics are disrupting the traditional SIEM market but creating architectural complexity. The cost per GB of storing security telemetry in cloud object storage (S3, GCS, ADLS) is 10-50x lower than in traditional SIEM platforms, enabling retention of data sources that were previously excluded for cost reasons. However, the trade-off is higher query latency, weaker real-time alerting, and the need to build or buy an analytics layer on top of the raw storage. Most organisations are converging on a hybrid architecture -- hot SIEM for real-time detection and warm data lake for hunting and historical analysis -- but the optimal split between tiers is still being established.
- OCSF and OpenTelemetry convergence for security observability is a developing standards effort that could fundamentally improve cross-platform detection. The Open Cybersecurity Schema Framework (OCSF) provides a vendor-neutral data model for security events, while OpenTelemetry standardises telemetry collection for distributed systems. The convergence of these standards could eliminate the normalisation burden that consumes 30-40% of SIEM engineering effort, but adoption requires vendor commitment that is progressing unevenly, and organisations with existing SIEM investments face migration costs.
- SOAR playbook maintenance burden is generating significant operational debt in organisations that invested heavily in automation. Playbooks that worked at deployment break as integrated APIs change, vendor products update, and the threat landscape evolves. Industry surveys report that the average SOAR deployment has 30-40% of playbooks in a degraded or broken state at any given time, and the engineering resources required for ongoing maintenance were systematically underestimated during procurement. The emerging pattern is fewer, more robust playbooks focused on high-frequency use cases rather than comprehensive automation of the entire incident lifecycle.