AI RESEARCH

Building an Adversarial Malware Dataset by Family and Type: Generation, Evasion, and Poisoning Evaluation

arXiv cs.LG

arXiv:2605.25937v1 Announce Type: cross

We present a dataset of adversarial malware samples derived from the public RawMal-TF collection of real-world malware binaries. Using a suite of adversarial malware generators, we construct two sets of adversarial PE files: 44,347 family-labelled samples and 33,596 type-labelled samples, which achieve evasion rates of 98.35% and 92.20%, respectively, against the EMBER classifier. Each adversarial binary is accompanied by detailed metadata, including EMBER scores and VirusTotal classifications.
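As an illustration of how an evasion rate like those reported above could be computed from per-sample metadata, here is a minimal Python sketch. It assumes the evasion rate is the share of adversarial samples whose classifier score falls below a detection threshold. The `SampleMeta` record, its field names, and the 0.5 threshold are hypothetical choices for this example and are not taken from the dataset or from EMBER.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SampleMeta:
    """Hypothetical per-sample metadata record (field names are illustrative)."""
    sha256: str
    ember_score: float  # classifier maliciousness score, assumed to lie in [0, 1]


def evasion_rate(samples: Iterable[SampleMeta], threshold: float = 0.5) -> float:
    """Return the percentage of samples scored below the detection threshold.

    The default threshold of 0.5 is an assumption made for this sketch.
    """
    samples = list(samples)
    if not samples:
        raise ValueError("no samples given")
    # A sample counts as evasive when the classifier scores it below the threshold.
    evaded = sum(1 for s in samples if s.ember_score < threshold)
    return 100.0 * evaded / len(samples)


# Toy records with made-up scores, used only to exercise the function.
demo = [
    SampleMeta("a" * 64, 0.02),
    SampleMeta("b" * 64, 0.31),
    SampleMeta("c" * 64, 0.97),
    SampleMeta("d" * 64, 0.10),
]
print(f"{evasion_rate(demo):.2f}%")
```

The threshold is kept as a parameter because the resulting rate depends directly on the operating point chosen for the classifier.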