[LLVMCPU] Add support for dynamic quantization + reassociation of grouped qmm MegaPR

[LLVMCPU] Allow parallel tiling in LLVMCPUSplitReduction, tile reduction by 2
This commit enables tiling of parallel dimensions in LLVMCPUSplitReduction
and changes the tile size of the resulting reduction to 2. The latter
change is an x86-specific optimization that allows targeting specific
instructions through VectorContractCustomKernels.
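
As a rough illustration of what tiling the reduction by 2 means, the scalar model below splits the K dimension of a vecmat into an outer loop and an inner reduction of size 2. This is only a sketch in Python, not the pass itself: the function names are hypothetical, and the connection to pairwise multiply-accumulate instructions (such as the AVX512VNNI kernel mentioned later in this message) is an assumption about why a tile size of 2 is convenient.

```python
def vecmat_reference(vec, mat):
    """Plain vecmat: out[n] = sum over k of vec[k] * mat[k][n]."""
    k_size, n_size = len(mat), len(mat[0])
    return [sum(vec[k] * mat[k][n] for k in range(k_size)) for n in range(n_size)]


def vecmat_split_by_2(vec, mat):
    """Same result, with the reduction split into K/2 outer steps of size 2.

    Each outer step consumes two adjacent k values per output element,
    which is the shape a pairwise multiply-accumulate instruction expects
    (assumption for illustration).
    """
    k_size, n_size = len(mat), len(mat[0])
    assert k_size % 2 == 0, "this sketch assumes an even reduction size"
    acc = [0] * n_size
    for k_outer in range(k_size // 2):
        k0 = 2 * k_outer
        for n in range(n_size):
            # Inner reduction of size 2: one pair of products per accumulate.
            acc[n] += vec[k0] * mat[k0][n] + vec[k0 + 1] * mat[k0 + 1][n]
    return acc
```

Both functions return the same values for integer inputs; only the loop structure differs.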

[LLVMCPU] Add support for vecmat cases in VectorContractCustomKernel
This commit introduces some new functionality to VectorContractCustomKernels:
  1. Matching for vecmat kernels that have 1D vector shapes
  2. Support for `vector.contract` ops with split reduction dimensions
  3. Promotion of smaller-bitwidth inputs with `arith.extui` or
     `arith.extsi` before they are passed into the `llvm.inline_asm` op
  4. Explicit constraint strings per register input in a
     VectorContractCustomKernel
  5. Support for `i4` and `i8` input types
  6. New x86 AVX512VNNI i16xi16->i32 vecmat kernel with split reduction

This commit also adds `vector.transfer_read` flattening patterns and
VectorContractCustomKernel lowering patterns to LLVMCPUVectorLowering.
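
For readers unfamiliar with the difference between the two extension ops named in item 3, the sketch below models zero extension (`arith.extui`) and sign extension (`arith.extsi`) on plain Python integers. The helper names are hypothetical and this is not how the kernel code performs the promotion; it only shows why the choice of op matters for inputs such as `i4`.

```python
def zero_extend(value, from_bits):
    """Model of arith.extui: keep the low bits, treat them as unsigned."""
    return value & ((1 << from_bits) - 1)


def sign_extend(value, from_bits):
    """Model of arith.extsi: interpret the low bits as two's complement."""
    value &= (1 << from_bits) - 1
    sign_bit = 1 << (from_bits - 1)
    # If the sign bit is set, the value is negative in two's complement.
    return value - (1 << from_bits) if value & sign_bit else value
```

For example, the 4-bit pattern `0b1111` becomes 15 under zero extension and -1 under sign extension, so a kernel has to pick the op that matches the signedness of its inputs.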

[LLVMCPU] Add pass to break down subbyte `arith.extui`
This pass breaks down `arith.extui` ops that have `i4` inputs into a
sequence of `vector.shuffle->arith.andi->arith.shrui`. This avoids bad
lowering of subbyte extends in the x86 backend. The pass is currently
somewhat specific to the vecmat VectorContractCustomKernels work and
has some matchings unique to that use case.

The pass also attempts to make use of AVX512 registers, so the vector
size for the resulting IR is hardcoded as 512 bits. The hardcoded size
needs to change, and the pass in general needs some refactoring, before
landing.
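
The shuffle, mask, shift sequence can be modeled on plain lists as follows. This is a sketch under stated assumptions, not the pass: it assumes two `i4` values are packed per byte with the first element in the low nibble, and the function name, mask layout, and shift amounts are illustrative choices rather than values taken from the implementation.

```python
def extend_packed_i4(packed_bytes):
    """Zero-extend packed i4 values using shuffle -> and -> shift right."""
    # "vector.shuffle": duplicate each byte so every output lane holds the
    # byte that contains its nibble.
    shuffled = [b for b in packed_bytes for _ in range(2)]
    # "arith.andi": even lanes keep the low nibble, odd lanes the high nibble.
    masks = [0x0F, 0xF0] * len(packed_bytes)
    masked = [v & m for v, m in zip(shuffled, masks)]
    # "arith.shrui": move the high nibble down; low nibbles shift by 0.
    shifts = [0, 4] * len(packed_bytes)
    return [v >> s for v, s in zip(masked, shifts)]
```

With this packing assumption, the bytes `0x21, 0xF3` expand to the values 1, 2, 3, 15.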

[LLVMCPU] Add pass to fold away unit dimensions on `vector.contract` ops
This pass folds away unit dimensions on `vector.contract` ops to get these
ops into a form that is recognizable by the VectorContractCustomKernels
patterns.

This pass also hoists `vector.shape_cast` ops out of containing
`scf.for` ops if possible when the shape cast operates on the accumulator
of a `vector.contract` op. This pattern may be better off somewhere else,
but for now it is here because the unit dim folding pattern can produce
a hoistable `vector.shape_cast` op in cases with split reduction.
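
The two ideas in this commit, dropping unit dimensions and hoisting the accumulator shape cast out of a loop, can be illustrated with nested Python lists standing in for vectors. All names here are hypothetical and the code is only a model of the semantics, assuming a 1xK by KxN contraction; it says nothing about how the MLIR patterns are written.

```python
def drop_unit_dims(shape):
    """Shape after folding away every dimension of size 1."""
    return [d for d in shape if d != 1]


def contract_2d(lhs, rhs):
    """MxK by KxN contraction on nested lists."""
    return [[sum(row[k] * rhs[k][n] for k in range(len(rhs)))
             for n in range(len(rhs[0]))] for row in lhs]


def contract_vecmat(vec, rhs):
    """K by KxN contraction: the 1D form left after unit-dim folding."""
    return [sum(vec[k] * rhs[k][n] for k in range(len(rhs)))
            for n in range(len(rhs[0]))]


def accumulate_cast_inside(vecs, rhs):
    """Accumulator kept as 1xN; cast to N and back on every iteration."""
    acc_2d = [[0] * len(rhs[0])]
    for vec in vecs:
        acc_1d = acc_2d[0]  # shape cast 1xN -> N inside the loop
        partial = contract_vecmat(vec, rhs)
        acc_1d = [a + p for a, p in zip(acc_1d, partial)]
        acc_2d = [acc_1d]  # shape cast N -> 1xN inside the loop
    return acc_2d


def accumulate_cast_hoisted(vecs, rhs):
    """Same computation with the casts moved outside the loop."""
    acc_1d = [0] * len(rhs[0])  # cast once before the loop
    for vec in vecs:
        partial = contract_vecmat(vec, rhs)
        acc_1d = [a + p for a, p in zip(acc_1d, partial)]
    return [acc_1d]  # cast once after the loop
```

The 2D contraction with a single row and the 1D vecmat compute the same numbers, and the hoisted loop produces the same accumulator as the version that casts on every iteration.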

[LLVMCPU] Add flag to restrict reassociated quantized matmul optimizations

[LLVMCPU] Add additional Memref alias foldings

[LLVMCPU] Simplify VectorContractCustomKernels x86 constraint codes, add new AVX512 kernel
Max191 authored and qedawkins committed Oct 12, 2023
1 parent c5ac55e commit f2b185c
Showing 11 changed files with 1,502 additions and 88 deletions.
3 changes: 3 additions & 0 deletions compiler/src/iree/compiler/Codegen/LLVMCPU/BUILD.bazel
@@ -53,8 +53,11 @@ iree_compiler_cc_library(
 "KernelDispatch.cpp",
 "LLVMCPUAssignConstantOrdinals.cpp",
 "LLVMCPUAssignImportOrdinals.cpp",
+"LLVMCPUBreakDownSubbyteExtend.cpp",
 "LLVMCPUCheckIRBeforeLLVMConversion.cpp",
 "LLVMCPUEmitVectorizationRemarks.cpp",
+"LLVMCPUFoldMemRefAliasOps.cpp",
+"LLVMCPUFoldVectorContractUnitDims.cpp",
 "LLVMCPULinkExecutables.cpp",
 "LLVMCPULowerExecutableTarget.cpp",
 "LLVMCPULowerToUKernels.cpp",
3 changes: 3 additions & 0 deletions compiler/src/iree/compiler/Codegen/LLVMCPU/CMakeLists.txt
@@ -54,8 +54,11 @@ iree_cc_library(
 "KernelDispatch.cpp"
 "LLVMCPUAssignConstantOrdinals.cpp"
 "LLVMCPUAssignImportOrdinals.cpp"
+"LLVMCPUBreakDownSubbyteExtend.cpp"
 "LLVMCPUCheckIRBeforeLLVMConversion.cpp"
 "LLVMCPUEmitVectorizationRemarks.cpp"
+"LLVMCPUFoldMemRefAliasOps.cpp"
+"LLVMCPUFoldVectorContractUnitDims.cpp"
 "LLVMCPULinkExecutables.cpp"
 "LLVMCPULowerExecutableTarget.cpp"
 "LLVMCPULowerToUKernels.cpp"