Refactor the control tree and other infrastructure (#710)
Details:
1. A "plugin" architecture.
- Users are now able to register new kernels, kernel preferences, and
  blocksizes at runtime, directly from user applications.
- Plugins can be created, configured, and built using only an installed
  version of BLIS -- no source or source code changes required.
- Plugins support both reference and optimized kernels, as well as
  custom configuration-to-kernel-set mappings.
- Building plugins (including reference and relevant optimized kernels)
  for enabled architectures or architecture families is automated, as is
  linking into the final library.
- The configure script is now installed as 'configure-plugin'. In this
  mode, it can be used to initialize a plugin from a template including
  optional example code, and prepare a build system for compiling the
  plugin into a shared or static library.
- Additional configuration files, templates, and build system components
  are also installed to '%prefix%/share/blis'.
- The cntx_t struct now has extensible data structures for holding
  kernels, preferences, and blocksizes. These are based on a "stack"
  structure which contains a list of fixed-size data blocks. Adding a
  new entry (which may require allocating a new block or reallocating
  the block pointer array) requires locking, but looking up entries is
  lock-free and takes O(1) time.
- Kernels can depend on either 1 or 2 type parameters (e.g.
  mixed-precision packing requires 2). The func2_t struct supports
  the latter, but can be implicitly cast to func_t if only "diagonal"
  entries are needed. The number of type parameters can be inferred from
  the kernel ID for type safety.
- Functions have been added to register new kernels, preferences, and
  blocksizes with the global kernel structure (gks). This creates
  corresponding entries in each allocated context and returns the next
  available ID. Plugins use this API to register user kernels, although
  the user is responsible for tracking the returned IDs for later
  lookup. Setting newly-registered reference kernels, as well as
  overriding these with optimized kernels is done in exactly the same
  manner as in bli_cntx_init_ref() and bli_cntx_init_<subconfig>().
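
The "stack" structure described above can be pictured with the following minimal sketch. It is not the BLIS implementation: every name (`sketch_stack_t`, `sketch_stack_push`, and so on) is hypothetical, the block pointer array has a fixed capacity so that it never needs reallocating, and a real multithreaded version would also need suitable memory ordering when publishing new entries. It only illustrates why appending takes a lock while lookup is a constant-time index computation.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_BLOCK_LEN  4   /* entries per fixed-size block (assumed value) */
#define SKETCH_MAX_BLOCKS 16  /* fixed pointer array: a simplification        */

/* Hypothetical stand-in for an extensible table of kernels/blocksizes. */
typedef struct {
    void**          blocks[SKETCH_MAX_BLOCKS]; /* each block holds BLOCK_LEN entries */
    size_t          size;                      /* number of registered entries       */
    pthread_mutex_t lock;                      /* serializes registration only       */
} sketch_stack_t;

int sketch_stack_init(sketch_stack_t* s)
{
    for (size_t i = 0; i < SKETCH_MAX_BLOCKS; ++i) s->blocks[i] = NULL;
    s->size = 0;
    return pthread_mutex_init(&s->lock, NULL);
}

/* Append an entry under the lock; returns the new ID, or -1 on failure. */
long sketch_stack_push(sketch_stack_t* s, void* entry)
{
    long id = -1;
    pthread_mutex_lock(&s->lock);
    size_t blk = s->size / SKETCH_BLOCK_LEN;
    size_t off = s->size % SKETCH_BLOCK_LEN;
    if (blk < SKETCH_MAX_BLOCKS) {
        /* Allocate a new block only when the previous one is full. */
        if (s->blocks[blk] == NULL)
            s->blocks[blk] = calloc(SKETCH_BLOCK_LEN, sizeof(void*));
        if (s->blocks[blk] != NULL) {
            s->blocks[blk][off] = entry;
            id = (long)s->size;
            s->size++;
        }
    }
    pthread_mutex_unlock(&s->lock);
    return id;
}

/* O(1) lookup without locking. Existing blocks never move, so an ID
   returned by a completed push stays valid. The caller must pass only
   such IDs. */
void* sketch_stack_get(const sketch_stack_t* s, size_t id)
{
    return s->blocks[id / SKETCH_BLOCK_LEN][id % SKETCH_BLOCK_LEN];
}
```

As with the registration API described above, the caller keeps the ID returned by `sketch_stack_push()` and passes it to `sketch_stack_get()` later.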

2. Restructuring of the control and thread control trees.
- The control tree has been substantially restructured to support more
  flexibility.
- The "default" control trees for gemm (also used for
  hemm/symm/herk/her2k/syrk/syr2k/trmm/trmm3) and trsm are now
  represented as a single structure containing all necessary control
  tree nodes and parameters.
- An API has been added to modify the default gemm/trsm control trees.
- This same API is used by the framework and packm/gemm/trsm variants
  to access specific control tree nodes.
- Users can alternatively create a custom control tree from scratch.
- The blocksizes are now encoded directly in the control tree, rather
  than via loop IDs. The logic for adjusting blocksizes for certain
  operations has been moved to the control tree initialization.
- Type information is encoded in the control tree to drive proper
  selection of packing and computational kernels provided by the user.
- The packing microkernel now receives an opaque "params" struct which
  is user-definable and can be used to pass additional information
  through the call stack.
- The auxinfo_t struct has been updated with a .params field for
  opaque user data as well as the global offsets of the current
  microtile.
- The packm and gemm variants can be overridden by the user, and also
  receive an opaque params struct via the associated control tree
  node.
- The structure-aware packing kernel bli_packm_struc_cxk() is no longer
  hard-coded to be called from the default packm variant, but can be
  overridden by the user. It also supports mixed-precision/mixed-domain
  natively now.
- The thread control tree (thrinfo_t) is now created entirely up-front
  by inspecting the control tree. The required number of threads at each
  level is encoded in the control tree via loop IDs (actually a bitfield
  of loop IDs), although the ordering and number of such IDs is
  arbitrary. The logic for adjusting the number of threads at each level
  based on operation type (e.g. trmm) is now in the control tree
  initialization and expressed by combining loop IDs from multiple
  levels into a single level.
- The mem_t object containing the pack buffer pointer has been moved
  from the control tree to the thread control tree. NOTE: **The control
  tree is now strictly const throughout the operation, and only a
  single copy is shared by all threads.**
- The thread control tree node for packing has been changed so that
  there is no longer a "fake" node indicating a team of single threads.
  Instead, the number of threads and thread IDs in the "normal" thread
  control tree node are used. This change has also been made to the
  gemmsup thread control tree and packing variants, as well as to the
  gemmlike sandbox.
- Parameters controlling packing (e.g. inversion of the diagonal,
  direction, schema) are not stored directly in the control tree but in
  the opaque params struct. The packing control tree node and its
  default params struct are stored together in the "combined"
  gemm/trsm control tree structure and initialized as a unit. Users can
  update these parameters individually or substitute a custom packm
  variant and params struct.
- The "target" and "execution" datatypes have been removed from the obj_t
  struct and replaced by type information in the control tree.
- The "sub-node" and "sub-prenode" of a control tree node have been
  replaced by an arbitrary number of sub-nodes accessed by index. There
  is a hard cap on the number of sub-nodes (currently 2). Sub-nodes are
  added during control tree initialization, *after*
  creation/initialization of the parent node through an updated API.
- The level-3 thread decorator has been significantly simplified and
  directly calls bli_l3_int(). The control tree is created externally,
  and it is no longer necessary to alias matrices or set object pack
  schemas. Also, the rntm_t passed in may be NULL. Finally, family
  and scalar information is no longer needed here.
- bli_l3_int() is now a simple inline function which extracts the next
  control tree node and variant and calls it.
- bli_*_front() have been removed and inlined into the expert object
  API with significant simplification.
- 1m (or other induced method) no longer uses an alternative cntx_t.
- The .pack_fn/.ker_fn pointers and associated params fields on the
  obj_t were removed in favor of the present solution.
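
As a rough illustration of the node layout described above (indexed sub-nodes with a hard cap, an opaque params pointer, and a tree that stays const while it is being executed), here is a minimal sketch. All types and functions are hypothetical and are not the BLIS API; the mutable state that each variant touches is kept outside the tree, loosely mirroring how the pack buffer moved to the thread control tree.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_SUB_NODES 2  /* hard cap, as in "currently 2" above */
#define SKETCH_MAX_TRACE     8

typedef struct sketch_cntl_s sketch_cntl_t;

/* Mutable per-call state lives outside the (const) control tree. */
typedef struct {
    int    visited[SKETCH_MAX_TRACE];
    size_t n_visited;
} sketch_state_t;

typedef void (*sketch_var_fn)(const sketch_cntl_t* cntl, sketch_state_t* state);

struct sketch_cntl_s {
    sketch_var_fn        var;     /* variant to run at this node     */
    const void*          params;  /* opaque, user-definable params   */
    size_t               n_sub;   /* number of attached sub-nodes    */
    const sketch_cntl_t* sub[SKETCH_MAX_SUB_NODES];
};

void sketch_cntl_init_node(sketch_cntl_t* node, sketch_var_fn var, const void* params)
{
    node->var    = var;
    node->params = params;
    node->n_sub  = 0;
    for (size_t i = 0; i < SKETCH_MAX_SUB_NODES; ++i) node->sub[i] = NULL;
}

/* Sub-nodes are attached after the parent is initialized; -1 if the cap is hit. */
int sketch_cntl_attach_sub_node(sketch_cntl_t* parent, const sketch_cntl_t* child)
{
    if (parent->n_sub == SKETCH_MAX_SUB_NODES) return -1;
    parent->sub[parent->n_sub++] = child;
    return 0;
}

/* Simplified analogue of "take the node's variant and call it". */
void sketch_l3_int(const sketch_cntl_t* cntl, sketch_state_t* state)
{
    cntl->var(cntl, state);
}

/* Example variant: record this node's id (read from params), then descend. */
void sketch_var_record(const sketch_cntl_t* cntl, sketch_state_t* state)
{
    if (state->n_visited < SKETCH_MAX_TRACE)
        state->visited[state->n_visited++] = *(const int*)cntl->params;
    for (size_t i = 0; i < cntl->n_sub; ++i)
        sketch_l3_int(cntl->sub[i], state);
}
```

Because every function that walks the tree takes `const sketch_cntl_t*`, a single tree could in principle be shared by all threads, which is the property the NOTE above emphasizes.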

3. Overhaul of variable substitution in configure script.
- The configure script has been somewhat re-written to use a
  centralized mechanism for substituting variables into build system and
  other configuration files.
- All substitution variables go through the same pathway now, which
  necessitated some variable naming changes for variables which were
  named the same in e.g. Makefile and bli_config.h but with
  different definitions.
- CC and CXX variables can now contain spaces, e.g. 'g++ -std=c++17'.
  This provides better support for integration with build tooling such
  as autotools.

4. Overhaul of packing kernels.
- Previously there were two packing kernels referenced in the cntx_t
  structure for MRxk and NRxk shaped micropanels, respectively. These
  have now been merged into one kernel which is responsible for packing
  any dense rectangular portion of either A or B.
- The packing kernel now receives information about the register
  blocksize (cdim_max) and duplication factor (the "broadcast-B"
  format, although this can also apply to the A matrix).
- The structure-aware packing kernel (bli_packm_struc_cxk(), which is
  now user-overridable) also receives global offsets of the current
  micropanel within A or B.
- Explicit kernels for packing the diagonal blocks of
  triangular/symmetric/Hermitian matrices have been added to the
  cntx_t. This means that the bli_packm_struc_cxk() "kernel" no longer
  needs to directly touch data (except to zero out some regions).
- bli_packm_struc_cxk() has also been updated to work only in terms of
  fundamental elements (i.e., real datatypes) when computing offsets and
  when zeroing data, which greatly simplifies mixed-domain/1m packing.
- bli_packm_scalar() has been updated to better support complex scalars
  in mixed-domain operations.
- Pack schemas for PACKED_ROW_PANELS* and PACKED_COL_PANELS* have
  been merged into simply PACKED_PANELS*. This reflects the merging of
  the packing kernels into a single generic kernel. There were only a
  very few places which needed the row/column information and this is
  now supplied by alternative means.
- Packing variants always behave "as if" the A matrix were being packed
  (i.e. the code assumes packing column-stored row panels). Packing of B
  is handled by applying an implicit or explicit transpose before
  packing. This change also applies to gemmsup.
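
To make the roles of the register blocksize (cdim_max) and the duplication factor more concrete, the following is a minimal sketch of a single generic packing routine for a dense cdim x k region. The function name, argument order, and exact packed layout are assumptions for illustration only, not the BLIS kernel interface: each source element is written `bcast` times in a row, and rows from `cdim` up to `cdim_max` are zero-filled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic micropanel packing routine (double precision).
   a:    source, element (i,j) at a[i*inca + j*lda]
   p:    packed panel, column j starts at p[j*ldp]
   ldp:  must be at least cdim_max * bcast                              */
void sketch_packm_cxk(size_t cdim, size_t cdim_max, size_t bcast, size_t k,
                      const double* a, size_t inca, size_t lda,
                      double* p, size_t ldp)
{
    for (size_t j = 0; j < k; ++j) {
        for (size_t i = 0; i < cdim_max; ++i) {
            /* Zero-pad the edge case where cdim < cdim_max. */
            double v = (i < cdim) ? a[i * inca + j * lda] : 0.0;
            /* Duplicate each element bcast times ("broadcast" format). */
            for (size_t d = 0; d < bcast; ++d)
                p[j * ldp + i * bcast + d] = v;
        }
    }
}
```

Under this sketch, packing B "as if" it were A amounts to swapping the two strides passed in for the source matrix, which is one way to read the implicit transpose mentioned above.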

5. Improved MD/MP support.
- All level-3 operations (except trsm) now support full
  mixed-domain/mixed-precision operation.
- Explicit 1m packing kernels have been added in the cntx_t.
- An explicit 1m microkernel wrapper has been added to the cntx_t.
- An extra packing kernel for the "ro" format has been added, along with
  the pack_t enumeration value. This supports the packing for
  real*complex -> real, including potential scaling by a complex alpha,
  support for structured matrices, etc.
- Extra microkernel wrappers for mixed-domain operations have been added
  to support the 'ccr' (and by extension, 'crc'), 'rcc', and 'crr'
  cases. Notably this includes full support for general stride storage
  and complex alpha/beta.
- Packing kernels and gemm microkernels are now "templated" based on two
  type parameters rather than one. For packing this allows direct
  optimization of mixed-precision kernels, and for gemm microkernels
  this allows direct optimization of mixed-precision without writing to
  a temporary buffer. Reference packing kernels are directly
  instantiated for all mixes of precisions, while by default
  mixed-precision gemm microkernels are supported via a microkernel
  wrapper. The "old" way of specifying optimized kernels using a single
  type parameter works unchanged.
- alpha and beta are typecast appropriately to the computational or
  output datatype, respectively, and **always** to the complex domain.
  Scalar typecasting has also been added to gemmsup for safety.
- The gemm macrokernel doesn't have to do any typecasting anymore, as a
  microkernel wrapper or optimized mixed-precision/mixed-domain kernel
  now handles this.
- 1m and mixed-domain operations now always use a microkernel wrapper,
  rather than adjusting parameters in the gemm macrokernel.
- The gemmt macrokernel **does** still have to handle explicit
  write-back of microtiles which intersect the diagonal, although
  typecasting has already been performed.
- The gemmt_x_ker_var2(), trmm_xx_ker_var2(), and trsm_xx_ker_var2()
  functions have been removed. The appropriate macrokernel pointer is
  selected during control tree initialization.
- Real domain MR/NR are checked for even-ness based on the gemm
  microkernel's row preference in order to guarantee proper 1m and
  mixed-domain operation.
- Full range of mixed-domain/mixed-precision functionality tested in the
  testsuite ('input.*.mixed').
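
The default mixed-precision path via a microkernel wrapper might look roughly like the sketch below. This is an assumption-laden illustration, not the BLIS wrapper: the names, the fixed MR = NR = 2 tile, and the use of a temporary tile are all hypothetical. A double-precision microkernel computes the product into a temporary tile, and the wrapper then applies alpha and beta in the computational precision and typecasts on write-back to a single-precision C.

```c
#include <assert.h>
#include <stddef.h>

enum { SKETCH_MR = 2, SKETCH_NR = 2 };

/* Hypothetical double-precision microkernel: ct := A * B, where A is a
   packed MR x k micropanel, B a packed k x NR micropanel, and ct is a
   column-major MR x NR tile. */
void sketch_dgemm_ukr(size_t k, const double* a, const double* b, double* ct)
{
    for (size_t i = 0; i < SKETCH_MR * SKETCH_NR; ++i) ct[i] = 0.0;
    for (size_t p = 0; p < k; ++p)
        for (size_t j = 0; j < SKETCH_NR; ++j)
            for (size_t i = 0; i < SKETCH_MR; ++i)
                ct[i + j * SKETCH_MR] += a[i + p * SKETCH_MR] * b[j + p * SKETCH_NR];
}

/* Hypothetical wrapper: C (float) := beta * C + alpha * (A * B), computed
   in double and typecast only at write-back. The macrokernel calling this
   wrapper never needs to typecast anything itself. */
void sketch_gemm_ukr_ds(size_t k, double alpha, const double* a, const double* b,
                        double beta, float* c, size_t rs_c, size_t cs_c)
{
    double ct[SKETCH_MR * SKETCH_NR];
    sketch_dgemm_ukr(k, a, b, ct);
    for (size_t j = 0; j < SKETCH_NR; ++j)
        for (size_t i = 0; i < SKETCH_MR; ++i) {
            float* cij = &c[i * rs_c + j * cs_c];
            *cij = (float)(beta * (double)*cij + alpha * ct[i + j * SKETCH_MR]);
        }
}
```

An optimized kernel templated on two type parameters could skip the temporary tile and accumulate directly into C, which is the optimization opportunity the bullet on templated kernels describes.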

6. Other changes:
- The build system has been updated to support C++ source files
  throughout the framework. While the intent is not to add such files to
  BLIS itself, this supports plugins written in C++.
- Many instances of configuration-specific code have been simplified by
  introducing an INSERT_GENTCONF macro which instantiates a block of
  code for each enabled sub-configuration. The ConfigurationHowTo.md
  document has been updated accordingly.
- PASTEMAC?/PASTECH?/PASTEF77? have been removed in favor of
  variadic macros which accept any number of arguments (up to a
  reasonable limit).
- The INSERT_GENTFUNC* macros have been updated to clean up
  mixed-precision and mixed-domain instantiations.
- bli_align_dim_to_mult() has been updated to support rounding either up
  or down based on a flag.
- Checking for empty matrices and other early exits (level-3 only) has
  been consolidated into a single utility function.
- The auxinfo_t struct is always passed as const.
- The new function bli_obj_alias_submatrix() aliases a matrix while also
  resetting the root to NULL, offsets to zero (while adjusting the
  buffer), and applying any implicit transpose.
- Level-3 pruning functions now only check matrix structure to see what
  to do, not the operation family.
- gemmsup packing has been updated to use the "normal" pack buffer
  allocation routines.
- Removed duplicate checks for early return from gemmsup handler.
- bli_determine_blocksize() has been significantly simplified.
- Partitioning packed panels is no longer allowed.
- Added bli_xxsame macros.
- Automated the calculation of info bit shifts and masks based on
  predefined bit sizes for various flags. This greatly simplifies
  reordering, adding, or removing flags from the info/info2 bitfields.
- Moved more BLIS_NUM_* macros into the corresponding enums as the
  last entry so that the value is automatically computed.
- Better const-correctness in some level0 scalar macros.
- Better mixed-precision support in some level0 scalar macros.
- Added a bli_axpbys_mxn() macro.
- bli_thread_range_sub() takes explicit thread ID and number of threads
  rather than a thrinfo_t node.
- "De-templated" BLIS gemmlike sandbox (specifically, bls_gemm_bp_var1()
  and bls_packm_var1()).
- Combined bls_l3_packm_[ab]() into one function with thin wrappers.
- Deleted bls_packm_var[23]().
- Added a "termination tag" to the testsuite output so that
  'make check-blis' can accurately check for successful completion.
- Added a new function to centrally compute FLOPs for level-3 operations
  in the testsuite.
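
As an example of the kind of helper change listed above, rounding a dimension to a multiple either up or down based on a flag could be sketched as follows. The name and signature here are hypothetical and should not be read as the actual prototype of bli_align_dim_to_mult().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: align dim to a multiple of mult, rounding up or
   down according to round_up. A zero multiple leaves dim unchanged. */
size_t sketch_align_dim_to_mult(size_t dim, size_t mult, bool round_up)
{
    if (mult == 0) return dim;
    size_t down = (dim / mult) * mult;       /* largest multiple <= dim */
    if (!round_up || down == dim) return down;
    return down + mult;                      /* smallest multiple > dim */
}
```

For instance, with a multiple of 4, a dimension of 10 rounds up to 12 and down to 8, while a dimension of 8 is returned unchanged in both modes.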
devinamatthews authored Apr 24, 2024
1 parent a316d2c commit a49238e
Showing 571 changed files with 19,427 additions and 23,297 deletions.
72 changes: 61 additions & 11 deletions Makefile
@@ -321,7 +321,31 @@ endif

 # Define a list of makefile fragments to install.
 FRAGS_TO_INSTALL := $(CONFIG_MK_FILE) \
-$(COMMON_MK_FILE)
+$(COMMON_MK_FILE) \
+$(DIST_PATH)/build/gen-make-frags/gen-make-frag.sh \
+$(DIST_PATH)/build/gen-make-frags/fragment.mk \
+$(DIST_PATH)/build/gen-make-frags/ignore_list \
+$(DIST_PATH)/build/gen-make-frags/special_list \
+$(DIST_PATH)/build/gen-make-frags/suffix_list \
+$(DIST_PATH)/build/flatten-headers.py \
+$(DIST_PATH)/build/mirror-tree.sh \
+$(DIST_PATH)/config_registry \
+$(DIST_PATH)/build/detect/iset/avx.s \
+$(DIST_PATH)/build/detect/iset/avx512dq.s \
+$(DIST_PATH)/build/detect/iset/avx512f.s \
+$(DIST_PATH)/build/detect/iset/fma3.s \
+$(DIST_PATH)/build/detect/iset/fma4.s
+
+# Define a list of plugin makefile fragments to install.
+PLUGIN_FRAGS_TO_INSTALL := $(DIST_PATH)/build/plugin/bli_plugin_init_ref.c \
+$(DIST_PATH)/build/plugin/bli_plugin_init_zen3.c \
+$(DIST_PATH)/build/plugin/bli_plugin_register.c \
+$(DIST_PATH)/build/plugin/my_kernel_1_ref.c \
+$(DIST_PATH)/build/plugin/my_kernel_2_ref.c \
+$(DIST_PATH)/build/plugin/my_kernel_1_zen3.c \
+$(DIST_PATH)/build/plugin/bli_plugin.h.in \
+$(DIST_PATH)/build/plugin/config.mk.in \
+$(DIST_PATH)/build/plugin/Makefile

 PC_IN_FILE := blis.pc.in
 PC_OUT_FILE := blis.pc
@@ -1085,21 +1109,47 @@ $(foreach h, $(HELP_HEADERS_TO_INSTALL), $(eval $(call make-helper-header-rule,$

 install-share: check-env $(MK_SHARE_DIR_INST) $(PC_SHARE_DIR_INST)

-$(MK_SHARE_DIR_INST): $(FRAGS_TO_INSTALL) $(CONFIG_MK_FILE)
+$(MK_SHARE_DIR_INST): $(CONFIGURE_FILE) $(FRAGS_TO_INSTALL) $(PLUGIN_FRAGS_TO_INSTALL) $(CONFIG_DIR)/$(CONFIG_NAME)/$(MAKE_DEFS_FILE)
 ifeq ($(ENABLE_VERBOSE),yes)
 $(MKDIR) $(@)
-$(INSTALL) -m 0644 $(FRAGS_TO_INSTALL) $(@)
-$(MKDIR) -p $(@)/$(CONFIG_DIR)/$(CONFIG_NAME)
-$(INSTALL) -m 0644 $(CONFIG_DIR)/$(CONFIG_NAME)/$(MAKE_DEFS_FILE) \
-$(@)/$(CONFIG_DIR)/$(CONFIG_NAME)
+$(MKDIR) $(@)/plugin
+$(INSTALL) -m 0755 $(filter %.sh,$(FRAGS_TO_INSTALL)) $(@)
+$(INSTALL) -m 0644 $(filter-out %.sh,$(FRAGS_TO_INSTALL)) $(@)
+$(INSTALL) -m 0644 $(PLUGIN_FRAGS_TO_INSTALL) $(@)/plugin
+$(INSTALL) -m 0755 $(CONFIGURE_FILE) $(@)/configure-plugin
+# $(MKDIR) -p $(@)/$(CONFIG_DIR)/$(CONFIG_NAME)
+# $(INSTALL) -m 0644 $(CONFIG_DIR)/$(CONFIG_NAME)/$(MAKE_DEFS_FILE) \
+# $(@)/$(CONFIG_DIR)/$(CONFIG_NAME)
+for THIS_CONFIG in $(FULL_CONFIG_LIST); do \
+$(MKDIR) -p $(@)/$(CONFIG_DIR)/$$THIS_CONFIG; \
+$(INSTALL) -m 0644 $(CONFIG_DIR)/$$THIS_CONFIG/$(MAKE_DEFS_FILE) \
+$(@)/$(CONFIG_DIR)/$$THIS_CONFIG; \
+$(INSTALL) -m 0644 $(CONFIG_DIR)/$$THIS_CONFIG/bli_kernel_defs_$$THIS_CONFIG.h \
+$(@)/$(CONFIG_DIR)/$$THIS_CONFIG; \
+done
 else
 @$(MKDIR) $(@)
+@$(MKDIR) $(@)/plugin
 @echo "Installing $(notdir $(FRAGS_TO_INSTALL)) into $(@)/"
-@$(INSTALL) -m 0644 $(FRAGS_TO_INSTALL) $(@)
-@$(MKDIR) -p $(@)/$(CONFIG_DIR)/$(CONFIG_NAME)
-@echo "Installing $(CONFIG_DIR)/$(CONFIG_NAME)/$(MAKE_DEFS_FILE) into $(@)/$(CONFIG_DIR)/$(CONFIG_NAME)"
-@$(INSTALL) -m 0644 $(CONFIG_DIR)/$(CONFIG_NAME)/$(MAKE_DEFS_FILE) \
-$(@)/$(CONFIG_DIR)/$(CONFIG_NAME)/
+@$(INSTALL) -m 0755 $(filter %.sh,$(FRAGS_TO_INSTALL)) $(@)
+@$(INSTALL) -m 0644 $(filter-out %.sh,$(FRAGS_TO_INSTALL)) $(@)
+@echo "Installing $(notdir $(PLUGIN_FRAGS_TO_INSTALL)) into $(@)/plugin/"
+@$(INSTALL) -m 0644 $(PLUGIN_FRAGS_TO_INSTALL) $(@)/plugin
+@echo "Installing $(CONFIGURE_FILE) into $(@)/configure-plugin"
+@$(INSTALL) -m 0755 $(CONFIGURE_FILE) $(@)/configure-plugin
+# @$(MKDIR) -p $(@)/$(CONFIG_DIR)/$(CONFIG_NAME)#\
+# @echo "Installing $(CONFIG_DIR)/$(CONFIG_NAME)/$(MAKE_DEFS_FILE) into $(@)/$(CONFIG_DIR)/$(CONFIG_NAME)"
+# @$(INSTALL) -m 0644 $(CONFIG_DIR)/$(CONFIG_NAME)/$(MAKE_DEFS_FILE) \
+# $(@)/$(CONFIG_DIR)/$(CONFIG_NAME)/
+@for THIS_CONFIG in $(FULL_CONFIG_LIST); do \
+$(MKDIR) -p $(@)/$(CONFIG_DIR)/$$THIS_CONFIG; \
+echo "Installing $(CONFIG_DIR)/$$THIS_CONFIG/$(MAKE_DEFS_FILE) into $(@)/$(CONFIG_DIR)/$$THIS_CONFIG"; \
+$(INSTALL) -m 0644 $(CONFIG_DIR)/$$THIS_CONFIG/$(MAKE_DEFS_FILE) \
+$(@)/$(CONFIG_DIR)/$$THIS_CONFIG; \
+echo "Installing $(CONFIG_DIR)/$$THIS_CONFIG/bli_kernel_defs_$$THIS_CONFIG.h into $(@)/$(CONFIG_DIR)/$$THIS_CONFIG"; \
+$(INSTALL) -m 0644 $(CONFIG_DIR)/$$THIS_CONFIG/bli_kernel_defs_$$THIS_CONFIG.h \
+$(@)/$(CONFIG_DIR)/$$THIS_CONFIG; \
+done
 endif

 $(PC_SHARE_DIR_INST): $(PC_IN_FILE)
20 changes: 0 additions & 20 deletions build/bli_config.h.in
@@ -146,26 +146,6 @@
 #endif
 #endif

-#ifndef BLIS_ENABLE_MIXED_DT
-#ifndef BLIS_DISABLE_MIXED_DT
-#if @enable_mixed_dt@
-#define BLIS_ENABLE_MIXED_DT
-#else
-#define BLIS_DISABLE_MIXED_DT
-#endif
-#endif
-#endif
-
-#ifndef BLIS_ENABLE_MIXED_DT_EXTRA_MEM
-#ifndef BLIS_DISABLE_MIXED_DT_EXTRA_MEM
-#if @enable_mixed_dt_extra_mem@
-#define BLIS_ENABLE_MIXED_DT_EXTRA_MEM
-#else
-#define BLIS_DISABLE_MIXED_DT_EXTRA_MEM
-#endif
-#endif
-#endif
-
 #if @enable_sup_handling@
 #define BLIS_ENABLE_SUP_HANDLING
 #else
29 changes: 18 additions & 11 deletions build/config.mk.in
@@ -53,7 +53,9 @@ CONFIG_NAME := @config_name@
 # sub-configuration in CONFIG_LIST corresponds to a configuration
 # sub-directory in the 'config' directory. See the 'config_registry'
 # file for the full list of registered configurations.
-CONFIG_LIST := @config_list@
+CONFIG_LIST := @config_list@
+FULL_CONFIG_LIST := @full_config_list@
+FULL_SUBCONFIG_LIST := @full_subconfig_list@

 # This list of kernels needed for the configurations in CONFIG_LIST.
 # Each item in this list corresponds to a sub-directory in the top-level
@@ -62,6 +64,7 @@ CONFIG_LIST := @config_list@
 # kernel set X, and configuration W uses kernel set Q, and the CONFIG_LIST
 # might contained "X Y Z W", then the KERNEL_LIST would contain "X Z Q".
 KERNEL_LIST := @kernel_list@
+FULL_KERNEL_LIST := @full_kernel_list@

 # This list contains some number of "kernel:config" pairs, where "config"
 # specifies which configuration's compilation flags (CFLAGS) should be
@@ -101,9 +104,12 @@ CLANG_OT_12_0_0 := @clang_older_than_12_0_0@
 AOCC_OT_2_0_0 := @aocc_older_than_2_0_0@
 AOCC_OT_3_0_0 := @aocc_older_than_3_0_0@

-# The C++ compiler. NOTE: A C++ is typically not needed.
+# The C++ compiler. NOTE: A C++ compiler is typically not needed.
 CXX := @CXX@

+# The Fortran compiler. NOTE: A Fortran compiler is typically not needed.
+FC := @FC@
+
 # Static library indexer.
 RANLIB := @RANLIB@

@@ -113,12 +119,13 @@ AR := @AR@
 # Python Interpreter
 PYTHON := @PYTHON@

-# Preset (required) CFLAGS and LDFLAGS. These variables capture the value
-# of the CFLAGS and LDFLAGS environment variables at configure-time (and/or
-# the value of CFLAGS/LDFLAGS if either was specified on the command line).
+# Preset (required) CFLAGS, CXXFLAGS, and LDFLAGS. These variables capture the value
+# of the CFLAGS, CXXFLAGS, and LDFLAGS environment variables at configure-time (and/or
+# the value of CFLAGS/CXXFLAGS/LDFLAGS if any was specified on the command line).
 # These flags are used in addition to the flags automatically determined
 # by the build system.
 CFLAGS_PRESET := @cflags_preset@
+CXXFLAGS_PRESET := @cxxflags_preset@
 LDFLAGS_PRESET := @ldflags_preset@

 # The level of debugging info to generate.
@@ -129,7 +136,7 @@ ENABLE_DEBUG := @enable_debug@
 MK_ENABLE_ASAN := @enable_asan@

 # Whether operating system support was requested via --enable-system.
-ENABLE_SYSTEM := @enable_system@
+ENABLE_SYSTEM := @mk_enable_system@

 # The requested threading model(s).
 THREADING_MODEL := @threading_model@
@@ -179,8 +186,8 @@ ARG_MAX_HACK := @enable_arg_max_hack@
 # Whether to build the static and shared libraries.
 # NOTE: The "MK_" prefix, which helps differentiate these variables from
 # their corresonding cpp macros that use the BLIS_ prefix.
-MK_ENABLE_STATIC := @enable_static@
-MK_ENABLE_SHARED := @enable_shared@
+MK_ENABLE_STATIC := @mk_enable_static@
+MK_ENABLE_SHARED := @mk_enable_shared@

 # Whether to use an install_name based on @rpath.
 MK_ENABLE_RPATH := @enable_rpath@
@@ -190,11 +197,11 @@ MK_ENABLE_RPATH := @enable_rpath@
 EXPORT_SHARED := @export_shared@

 # Whether to enable either the BLAS or CBLAS compatibility layers.
-MK_ENABLE_BLAS := @enable_blas@
-MK_ENABLE_CBLAS := @enable_cblas@
+MK_ENABLE_BLAS := @mk_enable_blas@
+MK_ENABLE_CBLAS := @mk_enable_cblas@

 # Whether libblis will depend on libmemkind for certain memory allocations.
-MK_ENABLE_MEMKIND := @enable_memkind@
+MK_ENABLE_MEMKIND := @mk_enable_memkind@

 # The names of the addons to include when building BLIS. If empty, no addons
 # will be included.