HIDBench: Benchmarking Large Language Models for Host-Based Intrusion Detection

arXiv:2605.21773v1 Announce Type: cross Recent benchmark efforts have advanced the evaluation of large language models (LLMs) in cybersecurity, including tasks such as penetration testing and vulnerability identification. However, a critical cybersecurity task, namely intrusion detection from system logs, remains unexplored. In this work, we present a new benchmark to assess LLMs' capabilities in host-based intrusion detection systems