Hot-Start Chinese Language Modeling: Visual Glyphs Accelerate Sample-Efficient Learning

arXiv:2601.09566v4 Announce Type: replace-cross In this work, we study whether rendering Chinese characters as visual glyph images, rather than as discrete token IDs as mainstream LLMs do, provides an inductive bias for character-level language modeling. Our central finding is double-edged: visual inputs produce a pronounced hot-start effect, more than doubling early-stage accuracy within the first epoch (at 0.4% of total