SkillOpt treats markdown skill files as trainable parameters with proper optimization machinery

r/LocalLLaMA
AI Research

Paper came out recently that formalizes something a lot of agent builders have been doing ad hoc. The authors use a frontier model to propose bounded edits (add/delete/replace) to markdown skill files, then gate every edit against a held-out validation set. Only strict improvements are accepted, ties are rejected, and rejected edits become negative signal for the next round.

A few things worth noting:

- The best skills converge with 1 to 4 accepted edits out of many proposals.
- An edit budget of 4 to 8 per step works best; remove the cap and performance collapses.
- Median final skill is ~920 tokens.
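For anyone who wants to see the shape of the loop, here is a rough Python sketch of one gated step as I read the description. It is not the paper's code. The names `Edit`, `apply_edit`, and `optimize_step` are made up for illustration, `toy_propose` stands in for the frontier model call, and `toy_score` stands in for the held-out validation set. Evaluating proposals one at a time against the current skill is also my assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Edit:
    """One bounded edit to a skill file (hypothetical representation)."""
    op: str          # "add", "delete", or "replace"
    index: int       # line position the edit targets
    text: str = ""   # new line content for add/replace


def apply_edit(lines: List[str], edit: Edit) -> List[str]:
    """Return a new list of lines with the edit applied; input is untouched."""
    out = list(lines)
    if edit.op == "add":
        out.insert(edit.index, edit.text)
    elif edit.op == "delete":
        del out[edit.index]
    elif edit.op == "replace":
        out[edit.index] = edit.text
    else:
        raise ValueError(f"unknown op: {edit.op}")
    return out


def optimize_step(
    skill: List[str],
    propose: Callable[[List[str], List[Edit]], List[Edit]],
    score: Callable[[List[str]], float],
    budget: int = 4,
    rejected: Optional[List[Edit]] = None,
) -> Tuple[List[str], float, List[Edit], List[Edit]]:
    """Run one step: propose edits, keep only strict validation improvements."""
    rejected = [] if rejected is None else list(rejected)
    best = score(skill)
    accepted: List[Edit] = []
    # Cap the number of proposals evaluated in this step (the edit budget).
    for edit in propose(skill, rejected)[:budget]:
        candidate = apply_edit(skill, edit)
        candidate_score = score(candidate)
        if candidate_score > best:       # strict improvement only
            skill, best = candidate, candidate_score
            accepted.append(edit)
        else:                            # ties and regressions are rejected
            rejected.append(edit)        # fed back as negative signal
    return skill, best, accepted, rejected


# --- toy stand-ins so the sketch runs end to end (assumptions, not the paper) ---
START = ["# Skill: CSV cleanup", "Use pandas."]
KEYWORDS = ["validate", "encoding"]


def toy_score(lines: List[str]) -> float:
    """Pretend validation set: one point per keyword the skill mentions."""
    text = "\n".join(lines).lower()
    return float(sum(1 for kw in KEYWORDS if kw in text))


def toy_propose(lines: List[str], rejected: List[Edit]) -> List[Edit]:
    """Pretend frontier model: a fixed proposal list, ignoring feedback."""
    return [
        Edit("add", 2, "Always validate column names first."),
        Edit("replace", 1, "Use pandas 2.x."),
        Edit("add", 3, "Check file encoding before parsing."),
        Edit("delete", 0),
        Edit("add", 0, "Never evaluated: over budget."),
    ]


if __name__ == "__main__":
    final, best, accepted, rejected = optimize_step(START, toy_propose, toy_score)
    print("\n".join(final))
    print(f"score={best} accepted={len(accepted)} rejected={len(rejected)}")
```

In the toy run the two edits that add a missing keyword are accepted, the two that leave the score unchanged are rejected as ties, and the fifth proposal is never evaluated because of the budget. A real proposer would actually read the `rejected` list when generating the next batch.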