FlexAttention + FlashAttention-4: Fast and Flexible
PyTorch Blog
About This Tutorial
TL;DR: On Hopper and Blackwell GPUs, FlexAttention now has a FlashAttention-4 backend. We added support in PyTorch to automatically generate CuTeDSL score/mask modification functions and to JIT-instantiate FlashAttention-4 for custom attention variants.
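For readers new to the terminology, here is a minimal sketch of what score and mask modification functions compute. It is a pure-Python reference for a single batch and head, not how the FlashAttention-4 backend executes them. The callback signatures follow FlexAttention's `score_mod(score, b, h, q_idx, kv_idx)` and `mask_mod(b, h, q_idx, kv_idx)` conventions; `causal_mask`, `relative_bias`, and `reference_attention` are hypothetical names chosen for illustration.

```python
import math


def causal_mask(b, h, q_idx, kv_idx):
    """mask_mod: return True where the query may attend to the key."""
    return q_idx >= kv_idx


def relative_bias(score, b, h, q_idx, kv_idx):
    """score_mod: penalize the pre-softmax score by query/key distance."""
    return score - 0.5 * abs(q_idx - kv_idx)


def reference_attention(q, k, v, score_mod=None, mask_mod=None, b=0, h=0):
    """Naive attention over lists of vectors for one (batch, head).

    Illustrative only: assumes every query keeps at least one unmasked key.
    """
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for i, q_vec in enumerate(q):
        scores = []
        for j, k_vec in enumerate(k):
            if mask_mod is not None and not mask_mod(b, h, i, j):
                scores.append(-math.inf)  # masked positions get zero weight
                continue
            s = scale * sum(x * y for x, y in zip(q_vec, k_vec))
            if score_mod is not None:
                s = score_mod(s, b, h, i, j)
            scores.append(s)
        # Numerically stable softmax over the key positions.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * v_vec[d] for w, v_vec in zip(weights, v)) / total
            for d in range(len(v[0]))
        ])
    return out


q = k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
out = reference_attention(q, k, v, score_mod=relative_bias, mask_mod=causal_mask)
```

Under the causal mask, the first query can only see the first key, so its output row equals the first value vector; the second row is a weighted average of both value vectors.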