EDUCATION & TRAINING

FlexAttention + FlashAttention-4: Fast and Flexible

PyTorch Blog

About This Tutorial

TL;DR: On Hopper and Blackwell GPUs, FlexAttention now has a FlashAttention-4 backend. We added support in PyTorch to automatically generate CuTeDSL score/mask modification functions and to JIT-instantiate FlashAttention-4 for custom attention variants.
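To make "score/mask modification functions" concrete, here is a minimal pure-Python sketch of the semantics such functions have in FlexAttention: a `score_mod` takes `(score, b, h, q_idx, kv_idx)` and returns a new score, and a `mask_mod` takes `(b, h, q_idx, kv_idx)` and returns whether that position may be attended to. The helper `reference_attention` and the two example callbacks are hypothetical names used only for illustration. This is a slow reference loop for a single batch element and head, not the generated CuTeDSL code or the FlashAttention-4 kernel.

```python
import math

def causal_mask(b, h, q_idx, kv_idx):
    # mask_mod: a query may only attend to keys at or before its own position.
    return q_idx >= kv_idx

def distance_penalty(score, b, h, q_idx, kv_idx):
    # score_mod: subtract a penalty that grows with query/key distance.
    # The 0.5 slope is an arbitrary value chosen for this example.
    return score - 0.5 * abs(q_idx - kv_idx)

def reference_attention(q, k, v, score_mod=None, mask_mod=None, b=0, h=0):
    """Hypothetical reference: softmax(mod(q @ k^T / sqrt(d))) @ v for one head.

    q, k, v are lists of equal-length float vectors. Assumes every query row
    keeps at least one unmasked key (true for the causal mask above).
    """
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for q_idx, q_vec in enumerate(q):
        scores = []
        for kv_idx, k_vec in enumerate(k):
            s = scale * sum(a * c for a, c in zip(q_vec, k_vec))
            if score_mod is not None:
                s = score_mod(s, b, h, q_idx, kv_idx)
            if mask_mod is not None and not mask_mod(b, h, q_idx, kv_idx):
                s = -math.inf  # masked positions get zero softmax weight
            scores.append(s)
        row_max = max(scores)
        weights = [math.exp(s - row_max) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * v_vec[d] for w, v_vec in zip(weights, v)) / total
            for d in range(len(v[0]))
        ])
    return out

q = k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
causal_out = reference_attention(q, k, v, mask_mod=causal_mask)
biased_out = reference_attention(q, k, v, score_mod=distance_penalty)
```

With the causal mask, the first query can only see the first key, so its output row equals the first value vector. In the real API the same two callbacks are written against tensors and handed to FlexAttention, which is what lets a backend compile them into the kernel instead of materializing the full score matrix.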