vllm.v1.attention.backends.mla.flashinfer_fa2_mla ¶
FlashInfer FA2 MLA Backend.
Uses FlashInfer's BatchMLAPagedAttentionWrapper with the FA2 backend for bf16 MLA decode. The wrapper exposes a plan()/run() API that takes page indices in CSR format, and it supports returning the log-sum-exp (LSE), which allows this backend to be used with decode context parallelism (DCP).
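The CSR-format page indices mentioned above can be illustrated with a small standalone sketch. This is not vLLM or FlashInfer code; the helper name and inputs are hypothetical, and the point is only the layout: a flat `indices` array of KV-cache page IDs plus an `indptr` array whose slice `indptr[i]:indptr[i+1]` delimits request `i`'s pages.

```python
# Hypothetical helper (not the vLLM implementation): flatten per-request
# paged-KV block tables into the CSR-style (indptr, indices) pair that a
# plan()/run()-style wrapper consumes.

def build_csr_page_indices(block_tables):
    """Flatten variable-length per-request page lists into CSR form.

    indices holds all page IDs back to back; indptr[i]:indptr[i+1]
    delimits request i's pages inside indices.
    """
    indptr = [0]
    indices = []
    for pages in block_tables:
        indices.extend(pages)
        indptr.append(len(indices))
    return indptr, indices

# Three decode requests holding 3, 1, and 2 KV-cache pages respectively.
indptr, indices = build_csr_page_indices([[7, 2, 9], [4], [0, 5]])
# indptr  -> [0, 3, 4, 6]
# indices -> [7, 2, 9, 4, 0, 5]
```

In a plan()/run() workflow, such arrays would be built once per batch during plan() so that run() can execute the paged attention kernel without re-deriving the layout.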