Short Story Creative Writing Benchmark. Baidu Ernie 5.1: -0.35, Qwen 3.7 Max: -2.01, Mistral Medium 3.5: -2.13, Grok 4.3: -3.81.
r/singularity
This benchmark uses head-to-head comparisons of stories written in response to the same constrained creative briefs. The target length is 600-800 words. More info: github.com/lechmazur/writing/
Submitted by /u/zero0_one1
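For anyone wondering how head-to-head results can turn into a single score per model: the post does not say which method the benchmark uses, so the following is only a rough sketch of one common option, a Bradley-Terry fit. The function name `fit_bradley_terry`, the model names, and the win/loss data are all made up for illustration and are not taken from the benchmark.

```python
import math
from collections import defaultdict


def fit_bradley_terry(results, iterations=200):
    """Fit Bradley-Terry strengths from (winner, loser) pairs.

    Returns log-strengths shifted so they average to zero. This is an
    illustrative assumption, not the benchmark's documented method.
    """
    wins = defaultdict(int)
    pair_counts = defaultdict(int)
    models = set()
    for winner, loser in results:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        models.update((winner, loser))

    # A model with no wins has no finite estimate, so refuse to fit it.
    for m in models:
        if wins[m] == 0:
            raise ValueError(f"{m} has no wins; estimate is not finite")

    strength = {m: 1.0 for m in models}
    for _ in range(iterations):
        updated = {}
        for m in models:
            # Sum over opponents: games played / (own + opponent strength).
            denom = 0.0
            for other in models:
                if other == m:
                    continue
                n = pair_counts.get(frozenset((m, other)), 0)
                if n:
                    denom += n / (strength[m] + strength[other])
            updated[m] = wins[m] / denom
        # Rescale so strengths stay in a stable numeric range.
        total = sum(updated.values())
        strength = {m: s * len(models) / total for m, s in updated.items()}

    logs = {m: math.log(s) for m, s in strength.items()}
    mean = sum(logs.values()) / len(logs)
    return {m: value - mean for m, value in logs.items()}


# Hypothetical comparison outcomes, not real benchmark data.
example_results = (
    [("model_a", "model_b")] * 3 + [("model_b", "model_a")]
    + [("model_b", "model_c")] * 3 + [("model_c", "model_b")]
    + [("model_a", "model_c")] * 3 + [("model_c", "model_a")]
)
example_scores = fit_bradley_terry(example_results)
for name, score in sorted(example_scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:+.2f}")
```

Scores from a fit like this are relative: only the gaps between models mean anything, and they are centered on zero, so some models end up with negative values.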