AI in Security Operations
Key Control Areas
- AI-Augmented Threat Detection Governance (CA-07, SI-04, CM-02, SA-11): ML-based detection models embedded in SIEM and XDR platforms require the same governance rigour as other critical security infrastructure — but most organisations apply none. CM-02 (Baseline Configuration) mandates a documented baseline for every detection model in production: training data timeframe, feature set, threshold settings, false positive and false negative rates at deployment, and the model version identifier. Without this baseline, there is no basis for detecting model degradation, evaluating the impact of model updates, or rolling back to a known-good state after a bad update. CA-07 (Continuous Monitoring) must be applied reflexively to the monitoring system itself: detection model performance metrics (alert volume, true positive rate as validated by analyst feedback, false negative rate on known-attack simulations) must be tracked continuously, not reviewed annually. SI-04 (System Monitoring) — establish monitoring for anomalous detection model behaviour: sudden drops in alert volume may indicate model evasion or data pipeline failure, not an improvement in security posture. SA-11 (Developer Security Testing) — before any detection model or updated threshold reaches production, validate its performance against a holdout dataset that includes labelled attack scenarios representative of current threat actor TTPs, not just clean traffic baselines.
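The CM-02 baseline record and the CA-07/SI-04 continuous checks described above can be sketched in code. This is an illustrative sketch, not a product integration: the field names, the `check_degradation` helper, and the numeric tolerances are all assumptions a real deployment would replace with its own.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelBaseline:
    """CM-02: documented baseline for a detection model in production."""
    model_version: str
    training_window: tuple        # (start, end) ISO dates of training data
    feature_set: frozenset
    alert_threshold: float
    false_positive_rate: float    # measured at deployment
    false_negative_rate: float    # measured at deployment
    daily_alert_volume: float     # mean daily alerts at deployment

def check_degradation(baseline, observed_alert_volume, observed_tp_rate,
                      volume_drop_ratio=0.5, tp_tolerance=0.05):
    """Return findings that warrant investigation (CA-07/SI-04)."""
    findings = []
    # SI-04: a sudden drop in alert volume may indicate model evasion or
    # a data pipeline failure, not an improved security posture.
    if observed_alert_volume < baseline.daily_alert_volume * volume_drop_ratio:
        findings.append(f"alert volume below {volume_drop_ratio:.0%} of baseline")
    # CA-07: analyst-validated true positive rate, tracked continuously
    # against the rate implied by the deployment-time false negative rate.
    if observed_tp_rate < (1.0 - baseline.false_negative_rate) - tp_tolerance:
        findings.append("true positive rate below baseline tolerance")
    return findings
```

A non-empty result triggers analyst review and, if confirmed, a change-controlled rollback to the recorded `model_version`.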
- Detection Model Data Governance and Adversarial Robustness (SI-10, SC-28, SA-03, CM-08): Detection models are trained on data that adversaries can influence. An attacker who understands a deployed detection model — its feature set, thresholds, and training data sources — can craft activity patterns specifically designed to evade it: staying below anomaly thresholds, mimicking legitimate traffic patterns during intrusion, fragmenting malicious activity across time to defeat temporal correlations, or injecting log entries that confuse feature extraction. If the model retrains on analyst-labelled data, a patient adversary can poison the feedback loop by generating events that analysts consistently label as false positives. SI-10 (Information Input Validation) — validate the integrity and provenance of training data sources; implement anomaly detection on the training pipeline itself, flagging unexpected changes in data distribution or volume that may indicate adversarial manipulation. SC-28 (Protection of Information at Rest) — training datasets for security AI contain sensitive telemetry; encrypt, access-control, and audit them accordingly — a breach of training data reveals the feature set an adversary can target. SA-03 (System Development Life Cycle) — include adversarial robustness testing in the security AI development lifecycle: specifically evaluate whether the detection model's performance can be systematically degraded by a knowledgeable attacker who understands its feature set and thresholds. CM-08 (System Component Inventory) — maintain a complete inventory of all detection models in production, their data provenance, current performance metrics, and dependency on external data sources.
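One concrete form of the SI-10 training-pipeline anomaly detection described above is a check on class proportions in each new training batch against a trusted reference distribution. The function below is a minimal sketch under that assumption; the tolerance value is illustrative.

```python
def label_shift_findings(reference_counts, batch_counts, tolerance=0.10):
    """SI-10 sketch: flag labels whose share of the new training batch
    moved more than `tolerance` from the trusted reference distribution.
    A shift can indicate poisoning of the analyst-feedback loop, a
    pipeline fault, or a genuine environmental change; all three warrant
    human review before retraining proceeds."""
    ref_total = sum(reference_counts.values())
    batch_total = sum(batch_counts.values())
    findings = []
    for label in set(reference_counts) | set(batch_counts):
        ref_share = reference_counts.get(label, 0) / ref_total
        batch_share = batch_counts.get(label, 0) / batch_total
        if abs(batch_share - ref_share) > tolerance:
            findings.append(label)
    return sorted(findings)
```

A proportion check is deliberately crude; it catches blunt manipulation but not subtle feature-space poisoning, which needs the adversarial robustness testing SA-03 calls for.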
- AI-Assisted Incident Triage and Human Accountability (IR-04, AU-02, AU-03, AC-05): AI-assisted triage changes incident response in three ways: classification (AI determines severity and category), recommendation (AI suggests response actions), and execution (AI triggers playbook steps automatically). Each role requires different human oversight. For classification: the AI's severity assessment must be visible to the analyst and overridable without friction — analyst overrides should be logged and fed into model monitoring. For recommendation: the AI must explain its reasoning in terms the analyst can evaluate, cite the evidence that drove the recommendation, and present alternatives. For execution: define explicit blast radius limits on what the AI can do without human approval — isolating a suspected endpoint is high-impact and potentially irreversible in terms of business disruption; blocking a single IP is low-impact and reversible. IR-04 (Incident Handling) — document the AI's role in each triage workflow explicitly, calibrate approval requirements to consequence severity, and include AI failure modes in IR playbooks: what does the analyst do when the AI triage tool is unavailable, produces clearly incorrect output, or is suspected of being manipulated? AU-02 and AU-03 — AI-assisted triage audit trails must capture the AI's classification and confidence score, the evidence cited, the human reviewer identity, and the action taken, to support post-incident review and model improvement. AC-05 (Separation of Duties) — the AI that classifies an incident must not be the sole actor that determines and executes the response for high-severity or high-impact scenarios.
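The blast-radius limits and the AC-05 separation rule above can be expressed as a simple approval gate. The action catalogue, tier assignments, and severity labels below are illustrative assumptions, not a standard taxonomy.

```python
# Illustrative action catalogue; the tier assignments are an assumption.
ACTION_TIER = {
    "block_ip": "low",           # narrow blast radius, easily reversed
    "quarantine_email": "low",
    "isolate_endpoint": "high",  # potentially business-disrupting
    "disable_account": "high",
}

def requires_human_approval(action, ai_severity, ai_is_sole_classifier=True):
    """IR-04/AC-05 sketch: the AI may auto-execute only low-impact,
    reversible actions, and never acts alone when it is both the sole
    classifier and the executor of a high-severity response."""
    tier = ACTION_TIER.get(action, "high")   # unknown actions default high
    if tier == "high":
        return True
    if ai_severity in ("high", "critical") and ai_is_sole_classifier:
        return True
    return False
```

Defaulting unknown actions to the high tier is the important design choice: new playbook steps must be explicitly classified before the AI can run them unattended.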
- Hallucination Risk in Security Decision Support (SI-10, CA-07, SA-11, AU-03): Security AI hallucination has domain-specific consequences. An LLM-based security copilot that fabricates a CVE number sends analysts investigating a non-existent vulnerability. An AI threat attribution system that misidentifies a threat actor based on superficial TTP similarity may direct an entire IR investigation toward the wrong hypothesis, consuming analyst capacity and potentially alerting the real attacker that they have been detected. An AI that summarises an alert with contextual threat intelligence may confidently cite information that is outdated, incorrect, or extrapolated beyond the training data. Controls operate at two levels. Technical: SI-10 (Information Input Validation) applied to AI outputs — implement output validation requirements scaled to the decision stakes of the AI's role. AI threat intelligence summaries must cite their sources; AI alert classifications above a defined severity threshold must be reviewed by an analyst before driving automated response; AI-generated queries and detection rules must be tested before deployment. Operational: establish explicit verification requirements in analyst workflows — the AI output is a starting hypothesis, not a conclusion. SA-11 — include hallucination testing in security AI acceptance testing: probe the model with scenarios where confabulation is likely (novel threat actors, recently disclosed vulnerabilities not in training data, ambiguous alert patterns) and evaluate whether the output is appropriately uncertain or falsely confident. CA-07 — monitor AI output quality metrics over time: analyst override rates, flagged hallucination instances, and cases where AI assessment diverged significantly from post-incident ground truth.
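The output-validation requirement above can be sketched as structural checks on an AI assessment before it is allowed to drive any workflow. The field names are assumptions; note that a format check catches malformed CVE identifiers but not fabricated, well-formed ones, which still require a lookup against an authoritative vulnerability feed.

```python
import re

CVE_ID = re.compile(r"^CVE-\d{4}-\d{4,}$")

def validate_assessment(assessment):
    """SI-10 applied to AI output: reject assessments that lack cited
    sources, contain malformed CVE ids, or pair high confidence with
    thin evidence. Passing these checks does not make the content true;
    it only makes the output reviewable."""
    problems = []
    if not assessment.get("sources"):
        problems.append("no cited sources")
    for cve in assessment.get("cves", []):
        if not CVE_ID.match(cve):
            problems.append("malformed CVE id: " + cve)
    if assessment.get("confidence", 0.0) >= 0.9 and len(assessment.get("sources", [])) < 2:
        problems.append("high confidence without corroborating sources")
    return problems
```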
- AI Threat Intelligence: Automation, Provenance, and Confidence Scoring (SI-04, AU-03, PM-16, RA-03): Automated threat intelligence enrichment — IOC lookup and context, actor attribution, TTP mapping to MITRE ATT&CK, vulnerability correlation — accelerates analyst workflows by eliminating manual lookups. AI-generated threat reports can synthesise large volumes of intelligence into coherent narratives in seconds. The risks are provenance opacity (where did this assessment come from?), knowledge currency (the model's training data has a cutoff, and threat actor TTPs change), and confidence inflation (AI-generated assessments often appear equally confident regardless of the underlying evidence quality). AU-03 (Content of Audit Records) — all AI-generated or AI-enriched threat intelligence must be traceable: what sources were queried, what model or service produced the enrichment, what confidence score was assigned, and when. This supports both forensic review of IR decisions and quality assurance of the intelligence programme. Implement explicit confidence tiers for AI-generated intelligence: high confidence (multiple consistent sources, recent data, corroborated by analyst review), medium confidence (limited sources, moderate recency, not independently verified), low confidence (single source, stale data, or model extrapolation) — and require the tier to be visible in any workflow where the intelligence drives a decision. PM-16 (Threat Awareness Program) — include systematic evaluation of AI threat intelligence quality: review a sample of AI-generated assessments against ground truth, track cases where AI attribution or assessment was later revised, and feed findings into the threat intelligence programme governance. RA-03 (Risk Assessment) — assess the organisation's dependency on specific AI threat intelligence providers and the risk of provider knowledge currency gaps for your specific threat landscape.
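The three confidence tiers described above can be made operational with a small mapping function. The numeric cut-offs (source counts, recency windows) are illustrative placeholders for the intelligence programme to set.

```python
def confidence_tier(source_count, age_days, analyst_verified):
    """Map evidence quality to the tiers defined in the text. The
    thresholds here are illustrative, not prescriptive."""
    if source_count >= 2 and age_days <= 30 and analyst_verified:
        return "high"    # multiple consistent sources, recent, corroborated
    if source_count >= 2 and age_days <= 90:
        return "medium"  # limited sources, moderate recency, unverified
    return "low"         # single source, stale data, or model extrapolation
```

The tier travels with the intelligence item so that any downstream workflow can require, for example, at least "medium" before an enrichment is allowed to influence a response decision.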
- Over-Reliance, Skill Atrophy, and Analyst Capability Maintenance (AT-02, AT-03, PS-06, PM-14): Security AI that handles routine triage, alert classification, and investigation summary generation allows analysts to focus on higher-order work — but it also removes the routine practice that builds and maintains the underlying analytical skills. A SOC that has relied on AI for alert triage for eighteen months may, in an AI outage or an adversarial attack on the AI system itself, find that its analysts have lost the manual workflow competence that the AI replaced. This risk is slow to accumulate and difficult to measure before it becomes visible in a crisis. AT-02 and AT-03 (Awareness and Role-Based Training) must include mandatory AI-free capability maintenance exercises: periodic tabletop exercises and live simulations where analysts perform triage, investigation, and response without AI assistance, to validate that baseline skills are intact and identify capability gaps before they matter. Include AI-specific failure scenarios in tabletops: what does the team do when the SIEM ML engine is unavailable, when the AI triage tool is suspected of being compromised, or when AI-generated assessments are systematically incorrect? PS-06 (Access Agreements) — establish explicit expectations in role descriptions and agreements: AI augmentation tools are assistants, analysts retain professional accountability for security decisions made with AI assistance. Analysts should be able to articulate the basis for their decisions independent of AI output. PM-14 (Testing, Training, and Monitoring) — establish analyst capability metrics alongside AI performance metrics in the security operations programme: analyst assessment capability should not be treated as solved by AI deployment.
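The PM-14 capability metric above needs a concrete trigger. One hedged sketch: score the periodic AI-free exercises and flag atrophy when scores fall below an agreed floor or decline steadily. The floor and window values are assumptions.

```python
def atrophy_signal(exercise_scores, floor=0.75, window=3):
    """PM-14 sketch: flag potential skill atrophy from AI-free exercise
    scores (0.0 to 1.0). Fires when any recent score drops below the
    floor, or when scores decline strictly across the last `window`
    exercises, prompting targeted retraining before a crisis exposes
    the gap."""
    recent = exercise_scores[-window:]
    below_floor = any(score < floor for score in recent)
    declining = (len(recent) == window and
                 all(recent[i] > recent[i + 1] for i in range(window - 1)))
    return below_floor or declining
```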
- Security AI Governance and Model Lifecycle (CM-03, CM-04, SA-11, RA-03, CM-02): Security AI systems require a governance structure equivalent to other critical security infrastructure, but most organisations have not yet established one. Define ownership: who is accountable for the performance of each detection model? Who can change its thresholds? Who approves a model update before it reaches production? CM-03 (Configuration Change Control) — all changes to detection model parameters, thresholds, feature sets, training data sources, and ML engine configurations require formal change control with security review, testing in a staging environment, and a rollback plan. A threshold adjustment that reduces false positives may simultaneously reduce the true positive detection rate — this is a security-relevant change that deserves the same rigour as a firewall rule change. CM-04 (Impact Analysis) — assess the downstream impact of any model change on the analyst workflow, alert volume, and detection coverage before deployment. SA-11 (Developer Security Testing) — test each model version against the detection coverage requirements, including performance on specific attack scenarios relevant to the organisation's threat profile. RA-03 (Risk Assessment) — formal risk assessment for new security AI deployments should address: adversarial robustness, dependency on external data sources, failure modes and blast radius, and coverage gaps relative to the current threat landscape. Define a model retirement process: when a detection model is replaced or decommissioned, ensure that its detection coverage is transferred to the replacement before the old model is disabled.
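The CM-03/CM-04 gate above can be sketched as a validator over a model change request. The required fields and the true-positive regression tolerance are assumptions; the point is that a threshold change is blocked without change-control evidence and without staged proof that detection did not regress.

```python
REQUIRED_FIELDS = {"ticket_id", "approver", "rollback_plan",
                   "staging_tp_rate", "production_tp_rate"}

def change_request_findings(request, max_tp_regression=0.02):
    """CM-03/CM-04 sketch: block a detection-model change that lacks
    change-control evidence, or whose staged true-positive rate regresses
    beyond tolerance (a false-positive 'improvement' often costs true
    positives at the same time)."""
    findings = sorted("missing field: " + f
                      for f in REQUIRED_FIELDS - request.keys())
    if not findings:
        regression = request["production_tp_rate"] - request["staging_tp_rate"]
        if regression > max_tp_regression:
            findings.append("staged true positive rate regresses beyond tolerance")
    return findings
```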
- Security AI Supply Chain and Third-Party Risk (SA-09, SR-02, SA-04, CM-08): Most enterprise security AI is procured from vendors — SIEM platforms with embedded ML detection engines, XDR products with AI correlation, threat intelligence platforms with AI enrichment services. The security properties of these AI components (what they detect, how they can be evaded, what data they expose) depend on vendor decisions that are typically opaque to the customer: model training data, update cadence, detection logic, and data processing arrangements. SA-09 (External System Services) — vendor assessment for security AI products must go beyond standard SaaS security evaluation to include: what training data underpins the detection model, how frequently and through what process models are updated, what the disclosure process is when a model update changes detection coverage, and what the data residency and processing arrangements are for security telemetry sent to the vendor. SR-02 (Supply Chain Risk Management) — treat AI detection models embedded in security products as software supply chain components: subscribe to the vendor's model update notifications, test updates in a staging environment before production rollout, and maintain rollback capability if an update degrades detection performance. SA-04 (Acquisition Process) — security AI product evaluations should explicitly assess adversarial robustness (has the model been tested against adversarial evasion?), model governance documentation, and the vendor's process for handling zero-day detection gaps. CM-08 (System Component Inventory) — include AI components in the security technology inventory, with model version, data residency, and update history recorded.
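The SR-02 staged-rollout requirement above reduces to a promotion gate, and the CM-08 requirement to a minimal inventory record. Both sketches below are illustrative: the attack-corpus metric, tolerance, and record fields are organisation-specific assumptions.

```python
def approve_vendor_update(production_detection_rate, staged_detection_rate,
                          max_regression=0.02):
    """SR-02 sketch: promote a vendor model update out of staging only if
    detection performance on the organisation's replayed attack corpus
    does not regress beyond tolerance; otherwise hold and keep the
    rollback path available."""
    return staged_detection_rate >= production_detection_rate - max_regression

# CM-08 sketch: minimal inventory entry for an embedded vendor model.
inventory_entry = {
    "component": "xdr-correlation-model",
    "vendor_model_version": "2025.03",
    "data_residency": "eu-west",
    "update_history": ["2024.11", "2025.01", "2025.03"],
}
```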
When to Use
- Organisation uses SIEM with ML-based anomaly detection or AI-based alert prioritisation.
- XDR or EDR products with AI detection are in production.
- AI threat intelligence enrichment is used in analyst workflows.
- AI-assisted triage, investigation summaries, or response playbook automation are deployed.
- Organisation is evaluating AI security copilots (Microsoft Security Copilot, Google SecOps AI, or custom LLM-based security tools).
- SOC is experiencing alert fatigue and evaluating AI as a solution to analyst capacity constraints.
When NOT to Use
- Organisation has no security operations function and no security monitoring tooling.
- All security monitoring is entirely outsourced with no visibility into or control over the MSSP's tooling.
- Organisation uses only traditional signature-based security tools with no ML or AI components.

Note: even organisations that do not explicitly deploy AI security tools may be running AI components embedded in commercial security products — verify before applying this contra-indication.
Typical Challenges
Detection model governance is the most commonly missing control — organisations that would never deploy a firewall rule change without a change ticket routinely allow SIEM ML thresholds to be adjusted informally. Vendor opacity is pervasive: SIEM and XDR vendors treat their detection models as proprietary IP and provide minimal documentation about model architecture, training data, or update logic, making independent assessment of adversarial robustness or coverage gaps nearly impossible. Hallucination risk is poorly understood in security operations contexts — analysts who are trained to trust outputs from deterministic security tools apply the same trust to probabilistic AI outputs, increasing the risk that AI-generated threat assessments drive response without appropriate verification. Skill atrophy is a slow-moving risk that organisations typically discover only during a crisis (AI outage, adversarial attack on security AI). Adversarial evasion of ML detection requires significant attacker sophistication today but this capability is diffusing as attack tooling incorporates AI-evasion features. The boundary between security AI governance (this pattern) and enterprise AI governance (SP-045) requires explicit organisational decision — avoid split accountability where neither the CISO nor the AI governance function clearly owns security AI model risk.
Threat Resistance
Detection model evasion is addressed through adversarial robustness testing in the model development and acceptance process, continuous monitoring of detection model performance for anomalous drops in alert volume or detection rate, and analyst review requirements that ensure human eyes on the telemetry independent of model output. Training data poisoning is mitigated through data provenance validation, integrity monitoring of training pipelines, and separation of analyst feedback loops from automatic retraining. Hallucination in security decision support is mitigated through output verification requirements scaled to decision stakes, source citation requirements for AI threat intelligence, and analyst override friction reduction so that challenging AI output is easy and expected rather than exceptional. Analyst skill atrophy is addressed through mandatory capability maintenance exercises without AI assistance and explicit role definitions that preserve analyst accountability. Over-automation and disproportionate AI-triggered response actions are controlled through tiered autonomy limits, blast radius constraints on automated actions, and human approval requirements for high-impact response. Third-party security AI supply chain risk is addressed through vendor assessment requirements, staged model update deployment, and rollback capability for detection model changes.
Assumptions
The organisation operates a security operations capability (internal SOC, managed SOC, or hybrid) that uses one or more AI-augmented security tools: SIEM with ML detection, XDR, AI-based threat intelligence enrichment, or AI-assisted triage. AI components in security tooling may be embedded (invisible in commercial products) or explicit (deployed and configured as AI services). Some security AI is provided as a managed service by vendors who operate the model infrastructure; others run on-premises or in customer-controlled cloud. The pattern applies to both scenarios with emphasis varying by deployment model. Analysts interact with AI outputs in triage, investigation, and response workflows — the human-AI handoff points are the primary design decisions.
Developing Areas
- Adversarial evasion of ML security detection is maturing as an attacker capability. Research demonstrates that adversarial perturbations can defeat ML-based malware classifiers, network anomaly detectors, and behavioural detection models. These techniques are beginning to appear in offensive tooling. Security teams deploying ML-based detection should assume that sophisticated adversaries targeting them specifically may have invested in understanding the detection model's feature set and limitations — a threat model calibration that most security AI deployments currently do not include.
- Security AI regulatory scope is unclear. The EU AI Act's risk classification includes AI systems used in critical infrastructure security — whether enterprise security AI falls under high-risk classification is being clarified through the Commission's guidance on AI in cybersecurity. DORA (EU Digital Operational Resilience Act) applies to ICT risk management including AI components in financial sector security operations. Organisations in regulated sectors should assess whether their security AI deployments require conformity assessment or regulatory notification.
- AI security copilot evaluation maturity: Microsoft Security Copilot, Google SecOps AI, and emerging competitors offer LLM-based interfaces to security telemetry and threat intelligence. Evaluating these products requires assessing hallucination behaviour in security-specific contexts (not just general LLM benchmarks), data residency for the security queries and telemetry sent to the provider, and the accountability model when a copilot recommendation drives an incorrect response action. Vendor evaluation frameworks for AI security copilots are not yet standardised.
- Detection-as-code and AI-generated rules: AI tools that generate SIEM detection rules from natural language descriptions or threat intelligence reports are emerging. These tools reduce the time to develop new detections but introduce quality and adversarial robustness questions — AI-generated rules may have subtle logic errors, over-fit to specific attack patterns in training data, or be susceptible to evasion by adversaries who know the rule generation approach. Testing and review requirements for AI-generated detection rules are not yet standardised.