vLLM and PyTorch Work Together to Improve the Developer Experience on aarch64

About This Tutorial

PyTorch 2.11 allows CUDA-enabled wheels for aarch64 Linux to be installed directly from PyPI, which simplifies deployment on systems such as NVIDIA GH200, GB200, and GB300. The change resolves a two-year-old issue that previously forced vLLM users to rely on custom package indexes and other workarounds. The fix came out of collaboration between vLLM and PyTorch through the PyTorch Foundation. Users no longer need to pass explicit pip flags, and the transitive dependency problems that came with the old setup are gone, so installing and running vLLM on these systems is considerably easier.
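One way to confirm that a plain PyPI install picked up a CUDA-enabled build is to inspect the installed PyTorch from Python. The snippet below is a minimal sketch, not part of vLLM or PyTorch: the helper name `describe_torch_install` is hypothetical, and it assumes only that `torch`, if present, exposes `torch.version.cuda` and `torch.cuda.is_available()`. It does not check the PyTorch version or where the wheel was downloaded from.

```python
import importlib.util
import platform


def describe_torch_install():
    """Summarize the local PyTorch install (hypothetical helper for illustration)."""
    info = {
        "machine": platform.machine(),  # reports "aarch64" on Arm64 Linux
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_version": None,
        "cuda_build": None,
        "cuda_available": False,
    }
    if info["torch_installed"]:
        import torch

        info["torch_version"] = torch.__version__
        # torch.version.cuda is None for CPU-only builds.
        info["cuda_build"] = torch.version.cuda
        # True only if a usable GPU and driver are visible at runtime.
        info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_torch_install().items():
        print(f"{key}: {value}")
```

On an aarch64 machine, a `cuda_build` of `None` would suggest that a CPU-only wheel was installed, while a CUDA version string with `cuda_available` set to `False` points instead at a driver or device visibility problem.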