AI RESEARCH

Benchmarking Security Risk Detection and Verification in Open Agentic Skill Ecosystems

arXiv CS.AI

arXiv:2606.00925v1 Announce Type: cross Open agent platforms allow community contributors to publish reusable skills that agents can invoke at runtime. This extensibility also creates a supply-chain risk: malicious contributors can hide harmful behavior inside skills that appear benign under superficial inspection. However, existing defenses are hard to evaluate because there is no benchmark that measures both malicious-skill detection and runtime verification. We present SkillVetBench, a two-stage security vetting benchmark for open agentic skill ecosystems.