AI RESEARCH
Go-UT-Bench: A Fine-Tuning Dataset for LLM-Based Unit Test Generation in Go
arXiv CS.LG
Training data imbalance poses a major challenge for code LLMs. Most available data heavily overrepresents raw open-source code while underrepresenting broader software engineering tasks, especially in low-resource languages like Golang. As a result, models excel at code autocompletion but struggle with real-world developer workflows such as unit test generation.