Author: Denis Avetisyan
A new framework leverages the power of large language models to automate key security operations, from threat identification to incident resolution.

This paper details an end-to-end LLM framework for threat detection, query generation, and resolution, significantly improving Security Operations Center efficiency and accuracy.
Modern Security Operations Centers (SOCs) struggle to keep pace with escalating threats and complex, fragmented security information. This challenge is addressed in ‘Toward Autonomous SOC Operations: End-to-End LLM Framework for Threat Detection, Query Generation, and Resolution in Security Operations’ which introduces a novel framework leveraging large language models to automate critical workflows. By combining ensemble-based detection, syntax-constrained query generation, and retrieval-augmented resolution, the framework achieves significant improvements in accuracy and reduces incident triage times from hours to under ten minutes. Could this approach represent a viable path toward truly autonomous SOC operations at scale, and what further refinements are needed to ensure robust and reliable performance in real-world deployments?
The Erosion of Signal in a Noisy System
Contemporary Security Operations Centers (SOCs) face an escalating crisis of information overload. The sheer volume of security alerts generated by modern networks, often exceeding tens of thousands daily, routinely swamps analyst capacity. This alert fatigue is compounded by the increasing sophistication of threats, which employ techniques designed to evade traditional signature-based detection. Attackers leverage polymorphism, obfuscation, and multi-stage payloads, creating a landscape where discerning genuine threats from benign activity becomes exceptionally difficult. Consequently, SOCs are forced to prioritize alerts based on limited information, increasing the risk of overlooking critical incidents and leaving organizations vulnerable to persistent and evolving cyberattacks. The challenge isn’t simply more alerts, but a marked increase in alerts requiring nuanced investigation and contextual understanding.
Contemporary threat detection, reliant on signature-based systems and rule-driven alerts, increasingly falters against the ingenuity of modern cyberattacks. These attacks frequently employ techniques like polymorphic malware, fileless threats, and living-off-the-land tactics to evade established defenses. Consequently, security analysts are inundated with a high volume of alerts, many of which are false positives or represent minor anomalies. This relentless stream contributes to significant analyst fatigue, hindering their ability to effectively prioritize and investigate genuine security incidents. The resulting delays and oversights create vulnerabilities, allowing sophisticated attacks to progress undetected and potentially causing substantial damage before a response can be mounted. This mismatch between the evolving threat landscape and static detection methods necessitates a shift towards more adaptive and intelligent security solutions.
The accelerating pace of cyber threats necessitates incident response capabilities that surpass current human limitations and technological infrastructure. Investigations are increasingly hampered not simply by the volume of alerts, but by their complexity and the speed with which attackers adapt. Analysts face a constant struggle to correlate data, identify genuine threats amidst false positives, and contain breaches before significant damage occurs. Consequently, organizations are actively exploring automation, artificial intelligence, and machine learning to augment human expertise, enabling faster triage, more accurate threat assessment, and ultimately, a more resilient security posture. These advancements aim to shift the focus from reactive containment to proactive threat hunting, allowing security teams to anticipate and neutralize attacks before they fully materialize – a critical evolution in the face of increasingly sophisticated adversaries.

Querying the Void: A New Form of Intelligence
Automated query generation utilizes Large Language Models (LLMs) to translate natural language security inquiries into executable search queries, thereby reducing the time and expertise required for initial investigation phases. This approach addresses the bottleneck created by the need for skilled security analysts to manually construct queries for various data sources. By automating this process, security teams can rapidly gather relevant information, triage alerts more efficiently, and accelerate incident response timelines. The implementation of LLMs in query generation enables broader access to security data insights, even for analysts lacking deep query language proficiency, and facilitates scalability in handling increasing volumes of security events.
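The translation step described above can be sketched as prompt construction: the analyst's natural-language question is combined with the target platform and its schema before being handed to a model. The prompt wording, field names, and the idea of inlining schema metadata here are illustrative assumptions, not the paper's actual prompts, and the model call itself is omitted.

```python
# Hypothetical prompt builder for NL-to-query translation. Grounding the
# model in the target platform and the fields that actually exist is the
# key idea; everything concrete below is an assumed example.

def build_query_prompt(question: str, platform: str, schema_fields: list[str]) -> str:
    """Assemble a generation prompt constrained to known schema fields."""
    field_list = ", ".join(schema_fields)
    return (
        f"You are a SOC query assistant for {platform}.\n"
        f"Only use these fields: {field_list}.\n"
        f"Translate the analyst question into a single executable query.\n"
        f"Question: {question}\n"
        f"Query:"
    )

prompt = build_query_prompt(
    "Show failed logins for user alice in the last 24 hours",
    "IBM QRadar (AQL)",
    ["username", "eventname", "starttime"],
)
```

In a full pipeline, the returned string would be sent to an LLM and the completion validated before execution.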
The Syntax Query Metadata (SQM) architecture facilitates automated query generation by structuring the process around defined constraints and associated metadata. This approach moves beyond simple natural language processing by explicitly defining the permissible syntax for generated queries, reducing errors and increasing the likelihood of successful execution. Metadata, encompassing data types, field names, and acceptable values, is used to constrain the query generation process, ensuring outputs conform to the target system’s requirements. By grounding query construction in both syntactic rules and semantic metadata, SQM aims to improve query precision and reduce the need for manual intervention or correction.
The Syntax Query Metadata (SQM) architecture achieves an 88% executable query rate by incorporating three core principles. Allow Lists restrict query generation to pre-approved syntax and functions, minimizing the creation of invalid or malicious queries. Metadata Driven Retrieval utilizes contextual information about data sources to tailor queries to specific schemas and fields, increasing relevance. Finally, Documentation Grounding ensures queries are aligned with official documentation for each data source, further improving validity and reducing errors. This combined approach significantly increases the proportion of generated queries that can be successfully executed against target systems.
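The Allow List principle can be illustrated with a toy validator that rejects any generated query calling a function outside a pre-approved set. The token pattern and the approved names below are hypothetical; the paper's allow lists and syntax constraints are considerably richer.

```python
# Toy illustration of the Allow List idea from the SQM architecture:
# every function-style call in a generated query must appear in a
# pre-approved set, or the query is rejected before execution.
import re

ALLOWED_FUNCTIONS = {"count", "min", "max", "sum"}  # assumed allow list

def uses_only_allowed_functions(query: str) -> bool:
    """Return True if every function-style call in the query is approved."""
    called = set(re.findall(r"(\w+)\s*\(", query))
    return called <= ALLOWED_FUNCTIONS

print(uses_only_allowed_functions("SELECT count(*) FROM events"))     # True
print(uses_only_allowed_functions("SELECT exec('rm -rf /') FROM t"))  # False
```

A production validator would also check field names against the metadata catalogue and parse the full grammar rather than pattern-match, but the gatekeeping shape is the same.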

Validating the Signal: Measuring Query Fidelity
Rigorous evaluation of generated queries is crucial for verifying their alignment with intended security objectives and their effectiveness in retrieving pertinent evidence. This process involves confirming that the query accurately translates the security intent – such as identifying specific threat actors, malicious behaviors, or system vulnerabilities – into a technically sound search request. Evaluation also necessitates assessing whether the query returns a relevant evidence set, minimizing false positives and ensuring that critical security indicators are not overlooked. Failure to perform this evaluation can lead to inaccurate threat detection, ineffective incident response, and potentially compromised security posture.
Quantitative assessment of generated security queries utilizes metrics like ROUGE-L and BLEU scores to determine the degree of overlap with established, expected queries. ROUGE-L, which focuses on the longest common subsequence, achieved a score of 0.731 in testing, indicating a substantial degree of overlap in sequence matching. The BLEU score, evaluating n-gram precision, registered at 0.384, providing a measure of the similarity of generated queries to reference queries based on word choice and order. These scores offer objective data points for evaluating query quality and identifying areas for improvement in the query generation process, allowing for iterative refinement to enhance the accuracy and relevance of retrieved security evidence.
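Since ROUGE-L is built on the longest common subsequence between candidate and reference token streams, the scoring described above can be sketched directly. This is a minimal F-measure variant, not the paper's exact evaluation harness, and the example queries are invented.

```python
# Minimal ROUGE-L sketch: score a generated query against a reference
# query via the longest common subsequence (LCS) of their tokens.

def lcs_length(a: list[str], b: list[str]) -> int:
    """Classic dynamic-programming LCS over token lists."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[len(a)][len(b)]

def rouge_l(candidate: str, reference: str) -> float:
    """LCS-based F1 over whitespace tokens (beta = 1)."""
    c, r = candidate.split(), reference.split()
    lcs = lcs_length(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)

score = rouge_l(
    "SELECT username FROM events",
    "SELECT username FROM events WHERE severity > 5",
)
```

An incomplete but syntactically faithful query thus still earns partial credit, which is why ROUGE-L is paired with BLEU's stricter n-gram precision in the reported evaluation.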
The Syntax Query Metadata (SQM) architecture facilitates compatibility with both Google SecOps and IBM QRadar security information and event management systems. Query generation for Google SecOps is specifically dependent on the YARA-L 2.0 rule syntax, while integration with IBM QRadar relies on the Ariel Query Language (AQL). This dual-language support allows the SQM to translate security intent into queries understood by these distinct platforms, enabling consistent threat detection and incident response across heterogeneous security environments.
Orchestrating Response: Automation and the Accelerated Timeline
Automated query generation is transforming how security analysts approach incident investigation, enabling a swift and remarkably accurate determination of root causes. This technology bypasses the traditionally lengthy process of manual data searching and correlation, instead constructing targeted queries based on alert characteristics. Studies demonstrate this automation achieves a 90.0% accuracy rate in incident resolution, significantly reducing the potential for false positives and accelerating containment efforts. By intelligently parsing alerts and translating them into actionable search parameters, analysts can rapidly pinpoint the origin of threats, minimizing damage and bolstering overall security posture. This proactive approach shifts incident response from a reactive scramble to a data-driven, efficient process.
Retrieval-Augmented Generation (RAG) dramatically improves incident response efficiency by equipping security analysts with precisely the information needed, when needed. Rather than relying solely on pre-programmed knowledge or manual searches, RAG systems dynamically access and synthesize data from various sources – including internal knowledge bases, threat intelligence feeds, and past incident reports – to provide contextually relevant insights. This capability allows analysts to quickly understand the scope and severity of an incident, identify potential attack vectors, and formulate effective remediation strategies. By augmenting the analyst’s expertise with readily available, verified information, RAG not only accelerates resolution times but also reduces the risk of misdiagnosis and incomplete responses, fostering a more proactive and resilient security posture.
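The retrieval half of RAG can be sketched as ranking a knowledge base by relevance to the incident description and passing the top snippets to the generator as context. The paper's system presumably uses dense embeddings over real threat-intelligence and incident-history sources; the term-overlap scoring and the corpus below are invented stand-ins.

```python
# Toy stand-in for RAG retrieval: rank an in-memory knowledge base by
# shared lowercase terms with the incident text and return the top-k
# snippets to be injected into the LLM's context window.

def retrieve(incident: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank corpus snippets by term overlap with the incident description."""
    terms = set(incident.lower().split())
    scored = sorted(
        corpus,
        key=lambda doc: len(terms & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

kb = [
    "Playbook: reset credentials after repeated failed logins",
    "Guide: patching workflow for critical CVEs",
    "Past incident: brute-force logins from a single source IP",
]
context = retrieve("alert: repeated failed logins for admin account", kb)
```

Swapping the overlap score for embedding similarity changes the ranking quality, not the pipeline shape: retrieve, then generate with the retrieved context.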
Significant gains in security operations are realized when incident response is streamlined and integrated with existing service management platforms like ServiceNow. This integration dramatically accelerates remediation timelines by automating workflows and eliminating manual data transfer, which historically consumed considerable analyst time. Recent implementations demonstrate a substantial reduction in analyst triage efforts, compressing the initial investigation phase from an average of four hours to just ten minutes. This accelerated response not only minimizes the impact of security incidents but also allows security teams to address a greater volume of alerts, enhancing overall security posture and demonstrably lowering the mean time to resolution (MTTR) for critical issues.
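The ServiceNow hand-off described above could, for instance, go through ServiceNow's public Table API by posting the resolved finding to the `incident` table. The instance name, field choices, and category value below are assumptions for illustration; the actual HTTP call is left out so the sketch stays offline.

```python
# Sketch of handing a resolved finding to ServiceNow. Builds the URL and
# JSON payload for the Table API's `incident` table; the POST itself is
# intentionally omitted. All concrete values are hypothetical.

def servicenow_incident_request(instance: str, summary: str, details: str,
                                urgency: str = "2"):
    """Build the URL and JSON payload for a ServiceNow Table API insert."""
    url = f"https://{instance}.service-now.com/api/now/table/incident"
    payload = {
        "short_description": summary,
        "description": details,
        "urgency": urgency,
        "category": "security",  # assumed categorization scheme
    }
    # A real integration would now POST `payload` to `url`, e.g. with
    # requests.post(url, json=payload, auth=(user, password)).
    return url, payload

url, payload = servicenow_incident_request(
    "dev12345", "Brute-force login detected", "RAG-resolved incident summary"
)
```

Automating this final step is what removes the manual data transfer the section attributes much of the triage-time savings to.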
The pursuit of autonomous Security Operations, as detailed in this framework, echoes a fundamental truth about all complex systems. This work acknowledges that even with advanced tooling like Large Language Models and Retrieval-Augmented Generation, perfect stability remains elusive; latency – the delay in response – is an inherent cost. As Bertrand Russell observed, “The only thing that you can be absolutely sure of is that nothing is certain.” This aligns with the article’s implicit acceptance of ongoing refinement; the LLM framework doesn’t eliminate the need for human oversight, but rather reshapes it, accepting that constant vigilance is the price of a resilient system. The framework strives not for an impossible permanence, but for graceful adaptation in the face of inevitable change.
What Lies Ahead?
The pursuit of autonomous Security Operations Centers, as outlined in this work, is not a sprint toward perfect automation, but rather the slow accrual of managed decay. Each successful query generated, each threat accurately identified, merely postpones the inevitable – the emergence of novel attack vectors and the entropy inherent in any complex system. The framework presented represents a snapshot in time, a momentary stabilization against the tide. The true measure of its longevity will not be its initial accuracy, but its capacity to degrade gracefully.
Future iterations will inevitably confront the limitations of retrieval-augmented generation. The very knowledge bases used to inform these systems are themselves imperfect, biased, and subject to obsolescence. Technical debt, in this context, isn’t merely a coding shortfall; it’s the past’s assumptions baked into the present’s defenses. A critical area for exploration lies in developing mechanisms for self-correction, allowing the system to identify and remediate its own knowledge gaps, to learn its own obsolescence.
Ultimately, the question isn’t whether these systems can eliminate human intervention entirely, but whether they can shift it. The ideal outcome may not be a fully automated SOC, but one where analysts are freed from repetitive tasks to focus on the genuinely novel: the anomalies that represent not merely bugs, but evolutionary leaps in the adversarial timeline. Every bug, after all, is a moment of truth, revealing the system’s current limitations and charting the course for its future adaptation.
Original article: https://arxiv.org/pdf/2604.27321.pdf
Contact the author: https://www.linkedin.com/in/avetisyan/
2026-05-02 23:00