AI RESEARCH
Lean-GAP: A Dataset of Formalized Graduate Algebra Problems
arXiv CS.AI
•
ArXi:2606.02588v1 Announce Type: cross We present Lean-GAP (Lean-Graduate Agebra Problems), 430 formalized graduate-level algebra problems from the textbook Abstract Algebra by Dummit and Foote. We develop a scalable pipeline consisting of PDF-to-LaTeX preprocessing, autoformalization into Lean 4, and verification of informal-formal correspondence. While the preprocessing and autoformalization stages can be largely automated, we find that verification remains the most subtle and labor-intensive component, requiring careful human oversight.