AI RESEARCH
Bounded Hyperbolic Tangent: A Stable and Efficient Alternative to Pre-Layer Normalization in Large Language Models
arXiv cs.AI
arXiv:2601.09719v3 Announce Type: replace-cross Pre-Layer Normalization (Pre-LN) is the de facto choice for large language models (LLMs) and is crucial for stable pre