At H2O.ai, we believe that every company can and should be an AI company.
An explainable AI platform needs to be open and extensible. This allows data scientists to control the automatic machine learning optimization process and to ensure fairness, transparency and interpretability. Data scientists can add their insights, customizations and domain expertise as custom explainers to build models responsibly.
The MLI module of Driverless AI uses the concept of recipes so that users can develop and add custom explainers.
Table of Contents
- Introduction to MLI Bring Your Own Recipes
- Explainable Models and Explainers
- Custom Explainer
- Best Practices
- Explainer Examples
- Appendices
- Resources
H2O Driverless AI is an artificial intelligence (AI) platform for automatic machine learning.
Driverless AI provides robust interpretability of machine learning models to explain modeling results in a human-readable format. In the Machine Learning Interpretability (MLI) view, Driverless AI employs a host of different techniques and methodologies for interpreting and explaining the results of its models.
The set of techniques and methodologies can be extended with recipes. Driverless AI has support for BYOR (Bring Your Own Recipe). These recipes are Python code snippets. With BYOR, you can use your own recipes in combination with or instead of all built-in recipes. This allows you to further extend MLI explainers in addition to out-of-the-box techniques.
Custom explainer recipes can be uploaded into Driverless AI at runtime without having to restart the platform, just like a plugin.
When an MLI user starts an interpretation, model-compatible explainers (from the available set of out-of-the-box and custom explainers) are selected and executed. Explainers create model explanations which are visualized in the Driverless AI UI and/or can be downloaded:
- explainer execution
- explanation creation
- optional explanation normalization
- explanation visualization in UI and/or download
BYOR allows Data Scientists to bring their own recipes or leverage the existing, open-source recipes to explain models. In this way, the expertise of those creating and using the recipes is leveraged to focus on domain-specific functions to build customizations.
MLI BYORs in Driverless AI are of two main types:
- explainable models
- Driverless AI interpretable / glass box model recipes like XNN.
- model explainers
- MLI explainer recipes used for post hoc model analysis.
This guide covers model explainers.
Say hello to custom explainers with your first explainer:
from h2oaicore.mli.oss.byor.core.explainers import (
    CustomExplainer,
)
from h2oaicore.mli.oss.byor.core.explanations import WorkDirArchiveExplanation


class ExampleHelloWorldExplainer(CustomExplainer):
    _display_name = "Hello, World!"
    _description = "This is 'Hello, World!' explainer example."
    _regression = True
    _explanation_types = [WorkDirArchiveExplanation]

    def __init__(self):
        CustomExplainer.__init__(self)

    def explain(self, X, y=None, explanations_types=None, **kwargs) -> list:
        explanation = self.create_explanation_workdir_archive(
            display_name=self.display_name, display_category="Demo"
        )
        return [explanation]
If you want to try the Hello, World! explainer now, please refer to the Hello world! section.
Find more examples of simple explainers in Explainer Examples section:
- Logging Example
- EDA Example
- Score Example
- Parameters Example
- Compatibility Example
- Morris SA example
- Explainer Templates
- ...
This section describes in detail how to create, deploy, run and debug custom explainers, and how to get their results. It is structured according to the custom explainer life-cycle.
You may also want to check the Creating Custom Explainer with MLI BYORs tutorial if you want to get started with custom explainers quickly.
A custom explainer recipe is a Python class whose parent class is CustomExplainer:

class MyExplainer(CustomExplainer):
    ...
MLI BYORs interfaces anticipate that instances of classes implementing custom explainer interfaces can run in different explainer container runtimes.
- Driverless AI is the first available MLI BYORs container runtime.
- Local (standalone) or cloud MLI BYORs container runtime might be provided in the future.
Custom explainers can be defined/implemented as:
- runtime independent
- such explainers are based on CustomExplainer and will run in any MLI BYORs container runtime
- runtime aware
- such explainers will run in a specific MLI BYORs container runtime only
For example, Driverless AI MLI BYOR container explainers are based on CustomDaiExplainer (in addition to CustomExplainer) to get access to Driverless AI specific APIs, artifacts and data structures.
MLI BYORs define three important concepts:
- explainer
- executable code which implements custom explainer interface
- explanation
- model explanation created by explainer (like feature importance)
- format
- a representation of model explanation in a normalized format (like JSon file with model feature importances)
Explainer creates explanations which are persisted in various normalized formats.
The diagram below shows explainers, explanations and (normalized) formats which can be used to render representations in Driverless AI using the Grammar of MLI UI components.
Explainer must create at least one explanation. Explanation must have at least one format.
Python base classes for explainers, explanations and formats are defined as follows:
- Explainer: CustomExplainer
  - Explanation type: CustomExplanation+ (one or more per explainer)
    - Representation: CustomExplanationFormat+ (one or more per explanation)
      - MIME: application/json, text/csv, application/zip, ...
Custom explainers must inherit from the CustomExplainer class which declares:
- explainer capabilities and attributes as Metadata
- methods which are invoked by the custom explainer Runtime
CustomExplainer defines the following instance attributes:
self.model
self.persistence
self.params
self.explainer_params
self.logger
self.config
These instance attributes are set by the setup() method and can be subsequently accessed using self to create explanations.
- Check setup() section for instance attributes documentation.
- Check Run section to determine order of method invocations on RPC API procedures dispatch.
The following methods are invoked through the explainer lifecycle:
__init__() ... MUST be implemented
- Explainer class constructor which takes no parameters and calls parent constructor(s).

check_compatibility() -> bool ... OPTIONAL
- Compatibility check which can be used to indicate that the explainer is not compatible with the given model, dataset, parameters, etc.

setup() ... MUST be implemented
- Explainer initialization which gets various arguments allowing it to get ready for the compatibility check and the actual calculation.

fit() ... MUST be implemented
- Method which can pre-compute explainer artifacts like (surrogate) models to be subsequently used by explain methods. This method is invoked only once in the lifecycle of the explainer.

explain() -> list ... MUST be implemented
- Method which creates and persists global and local explanations and which can use artifacts prepared by fit().

explain_global() -> list ... OPTIONAL
- Method which can (re)create global explanations - it can be calculated on demand or use artifacts prepared by fit() and explain().

explain_local() -> list ... OPTIONAL
- Method which creates local explanations - it can be calculated on demand or use artifacts prepared by fit() and explain().

destroy() ... OPTIONAL
- Post explainer explain method clean up.
These methods are invoked by the recipe runtime through the explainer lifecycle.
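To make the lifecycle concrete, here is a minimal sketch of an explainer implementing the mandatory methods (it reuses the WorkDirArchiveExplanation explanation type from the Hello, World! example above):

class SkeletonExplainer(CustomExplainer):
    _display_name = "Skeleton explainer"
    _regression = True
    _explanation_types = [WorkDirArchiveExplanation]

    def __init__(self):
        CustomExplainer.__init__(self)

    def setup(self, model, persistence, key=None, params=None, **e_params):
        CustomExplainer.setup(self, model, persistence, key, params, **e_params)

    def fit(self, X, y=None, **kwargs):
        # pre-compute artifacts (surrogate models, samples, ...) here
        return self

    def explain(self, X, y=None, explanations_types=None, **kwargs) -> list:
        # create, persist and return explanations here
        return [
            self.create_explanation_workdir_archive(
                display_name=self.display_name, display_category="Demo"
            )
        ]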
Examples: check Explainer Examples section for examples of custom explainers methods and attributes use.
Custom explainers which need to use the Driverless AI specific runtime container are based on the CustomDaiExplainer class (in addition to the CustomExplainer class) to get access to Driverless AI APIs, artifacts and data structures (see Runtimes for more details).
Such a custom explainer is typically defined as follows:
class MyExplainer(CustomExplainer, CustomDaiExplainer):
    ...

    def __init__(self):
        CustomExplainer.__init__(self)
        CustomDaiExplainer.__init__(self)
        ...

    def setup(self, model, persistence, key=None, params=None, **e_params):
        CustomExplainer.setup(self, model, persistence, key, params, **e_params)
        CustomDaiExplainer.setup(self, **e_params)
        ...
    ...
A Driverless AI custom explainer invokes the CustomDaiExplainer parent class constructor and setup() method to properly initialize. CustomDaiExplainer defines the following instance attributes:
self.mli_key
self.dai_params
self.dai_version
self.dai_username
self.explainer_deps
self.model_entity
self.dataset_entity
self.validset_entity
self.testset_entity
self.sanitization_map
self.labels
self.num_labels
self.used_features
self.enable_mojo
These instance attributes are set by the setup() method and can be subsequently accessed using self to create explanations.
- Examples: check DAI explainer metadata example and DAI explainer example simple custom explainer.
- Check setup() section for instance attributes documentation.
- Refer to Explainer Python API reference for the documentation.
A custom explainer creates explanations. Explanations represent what was computed to explain the model.
Explanations are instances of classes which inherit from the CustomExplanation abstract class. MLI BYORs bring a pre-defined set of classes for the most common explanations, like:
WorkDirArchiveExplanation
- Working directory archive explanation can be used to provide a .zip, .tgz or other type of archive with artifacts created by the explainer in its working directory.

GlobalFeatureImportanceExplanation
- Global feature importance explanation can be used for explanations describing the global importance of model features.

GlobalDtExplanation
- Global DT explanation can be used for explanations providing a "glass box" decision tree associated with the given model.

PartialDependenceExplanation
- Partial dependence explanation can be used for explanations clarifying the interaction of model features and predictions.

- ...

LocalFeatureImportanceExplanation
- Local feature importance explanation can be used for explanations describing the importance of model features for a particular dataset row.

LocalDtExplanation
- Local DT explanation can be used for explanations providing the "glass box" decision tree path in the tree for a particular dataset row.

IndividualConditionalExplanation
- ICE explanation can be used for explanations clarifying the interaction of model features and predictions for a particular dataset row.

- ...
An explanation has:
- scope - either global (model/whole dataset) or local (particular dataset row)
- at least one format (representation) instance
- tile and tab display names which are used in UI
To understand how explanations are stored, please refer to Explanations Introspection before reading the rest of this section.
Explanation instantiation example:
# create global feature importance explanation
global_featimp_explanation = GlobalFeatImpExplanation(
    explainer=self,
    # display name used in UI as tile name
    display_name=self.display_name,
    # category name used in UI as tab name (tiles pane)
    display_category=GlobalFeatImpExplanation.DISPLAY_CAT_NLP,
)

# add JSon-datatable format (creation of json_dt_representation elided)
...
global_featimp_explanation.add_format(
    explanation_format=json_dt_representation
)
# add CSV format ... feature importance can be downloaded as CSV file
global_featimp_explanation.add_format(
    explanation_format=GlobalFeatImpJSonCsvFormat.from_json_datatable(
        json_dt_representation
    )
)
# add JSon format ... feature importance can be downloaded as JSon file
global_featimp_explanation.add_format(
    explanation_format=GlobalFeatImpJSonFormat.from_json_datatable(
        json_dt_representation
    )
)
The initial set of explanation types is extensible - new explanations can be easily added just by creating a new class which inherits from CustomExplanation:
class MyCustomExplanation(CustomExplanation):
    """Example of a user defined explanation type."""

    _explanation_type = "user-guide-explanation-example"
    _is_global = True

    def __init__(
        self, explainer, display_name: str = None, display_category: str = None
    ) -> None:
        CustomExplanation.__init__(
            self,
            explainer=explainer,
            display_name=display_name,
            display_category=display_category,
        )

    def validate(self) -> bool:
        return self._formats is not None
Such custom explanations might be deployed along with explainers which use them.
Example: check Custom Explanation Example explainer.
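Within an explainer's explain() method, such a custom explanation might then be instantiated and returned like this (a sketch; the TextCustomExplanationFormat constructor arguments mirror the CustomExplanationFormat ones shown later in this guide and are an assumption here):

def explain(self, X, y=None, explanations_types=None, **kwargs) -> list:
    my_explanation = MyCustomExplanation(
        explainer=self, display_name="My Explanation", display_category="Demo"
    )
    # attach at least one representation so that validate() passes
    # (TextCustomExplanationFormat arguments are assumed, see above)
    my_explanation.add_format(
        TextCustomExplanationFormat(
            explanation=my_explanation,
            format_data="explanation payload",
            format_file=None,
        )
    )
    return [my_explanation]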
Explanation representations are actual (downloadable) artifacts (typically files) created by explainers as explanations. Explanation representations can be stored in various formats whose structure is identified by MIME types.
Explanation representations are instances of classes which inherit from the CustomExplanationFormat abstract class.
MLI BYORs bring a pre-defined set of classes for the most common formats allowing explanations to be persisted, like:
WorkDirArchiveZipFormat
- Zip archive representation of WorkDirArchiveExplanation.

GlobalFeatImpJSonFormat
- JSon representation of the global feature importance explanation GlobalFeatureImportanceExplanation.

GlobalFeatImpDatatableFormat
- datatable frame representation of the global feature importance explanation GlobalFeatureImportanceExplanation.

PartialDependenceJSonFormat
- JSon representation of the partial dependence explanation PartialDependenceExplanation.

- ...

LocalDtJSonFormat
- JSon representation of the local decision tree explanation, i.e. the path in the tree for a particular dataset row.

- ...
Representations which can be rendered by Driverless AI Grammar of MLI UI components can be easily recognized as they inherit from GrammarOfMliFormat:

class PartialDependenceJSonFormat(
    TextCustomExplanationFormat, GrammarOfMliFormat
):
    mime = MimeType.MIME_JSON
    ...
A representation...

- has a format specification using a MIME type
- is formed by
  - a required main or index file
  - optional data file(s)
- the main/index file and data files can be normalized to the format specified by Grammar of MLI so that they can be shown in the Driverless AI UI
- the expected representation format is documented by the Explainer Python API

Representations are formed by either one file or multiple files, depending on the explanation structure and/or experiment type. For example, in case of a multinomial explanation there is typically a per-class data file, and all data files are referenced from the index file.
Filesystem example of a simple text representation formed by one file (explanation.txt):
explainer_..._Example...Explainer_<UUID>
.
├── global_user_guide_explanation_example
│ ├── text_plain
│ │ └── explanation.txt
│ └── text_plain.meta
├── log
│ └── ...
└── work
└── ...
Filesystem example of three representations formed by JSon index files which reference per-class data files in different formats (JSon, CSV, datatable):
.
├── global_feature_importance
│ ├── application_json
│ │ ├── explanation.json
│ │ ├── feature_importance_class_0.json
│ │ └── feature_importance_class_1.json
│ ├── application_vnd_h2oai_json_csv
│ │ ├── explanation.json
│ │ ├── feature_importance_class_0.csv
│ │ └── feature_importance_class_1.csv
│ ├── application_vnd_h2oai_json_datatable_jay
│ │ ├── explanation.json
│ │ ├── feature_importance_class_0.jay
│ │ └── feature_importance_class_1.jay
│ └── ...
├── log
│ └── ...
└── work
└── ...
The file explanation.json is the index file which references the data files, e.g. feature_importance_class_0.csv and feature_importance_class_1.csv. In case of the JSon/CSV representation, the index file looks like:
{
    "files": {
        "0": "feature_importance_class_0.csv",
        "1": "feature_importance_class_1.csv"
    },
    "metrics": [],
    "documentation": "NLP LOCO plot applies ...",
    "total_rows": 20
}
See also Explanations Introspection section for more details on representations persistence.
Representation instantiation example:
# index file
(
    index_dict,
    index_str,
) = PartialDependenceJSonFormat.serialize_index_file(
    features=self.features,
    classes=["class_A", "class_B", "class_C"],
    features_meta={"categorical": [self.features[0]]},
    metrics=[{"RMSE": 0.029}, {"SD": 3.1}],
    doc=TemplatePartialDependenceExplainer._description,
)
# representation
json_representation = PartialDependenceJSonFormat(
    explanation=global_explanation, json_data=index_str
)
# data files: per-feature, per-class (saved as added to format)
for fi, feature in enumerate(self.features):
    for ci, clazz in enumerate(
        TemplatePartialDependenceExplainer.MOCK_CLASSES
    ):
        json_representation.add_data(
            # IMPROVE: tweak values for every class (1 data for simplicity)
            format_data=json.dumps(
                TemplatePartialDependenceExplainer.JSON_FORMAT_DATA
            ),
            # filename must fit the name from index file ^
            file_name=f"pd_feature_{fi}_class_{ci}.json",
        )
...
The initial set of representation types is extensible - new representation formats can be easily added just by creating a new class which inherits from CustomExplanationFormat:
class GitHubMarkdownFlavorFormat(CustomExplanationFormat, GrammarOfMliFormat):
    """GitHub Markdown representation with text and images."""

    mime = MimeType.MIME_MARKDOWN

    def __init__(
        self,
        explanation,
        format_file: str,
        extra_format_files: Optional[List] = None,
    ):
        CustomExplanationFormat.__init__(
            self,
            explanation=explanation,
            format_data=None,
            format_file=format_file,
            extra_format_files=extra_format_files,
            file_extension=MimeType.ext_for_mime(self.mime),
        )

    @staticmethod
    def validate_data(dt_data: dt.Frame):
        return dt_data
Such custom representations might be deployed along with the explainers which use them. If their MIME type is supported by Grammar of MLI, they will also be rendered in the Driverless AI UI.
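For instance, such a format might be attached to an explanation as follows (a sketch; report.md and chart.png are hypothetical files the explainer wrote to its working directory):

md_format = GitHubMarkdownFlavorFormat(
    explanation=global_explanation,
    format_file=self.persistence.get_explainer_working_file("report.md"),
    extra_format_files=[
        self.persistence.get_explainer_working_file("chart.png")
    ],
)
global_explanation.add_format(explanation_format=md_format)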
A custom explainer declares its capabilities and attributes in its metadata section as class attributes:
class MyExplainer(CustomExplainer):
    _display_name = "My Explainer"
    _regression = True
    _binary = True
    _global_explanation = True
    _explanation_types = [GlobalFeatImpExplanation]
    ...
The most important metadata class attributes:

- basic
  - _display_name: str - recipe display name (used in UI, listings and instances)
  - _description: str - recipe description (used in UI, listings and instances)
  - _keywords: List[str] - list of recipe keywords used for recipe filtering and categorization
- data:
  - _iid: bool - specifies whether the recipe can explain IID models
  - _time_series - specifies whether the recipe can explain time series models
  - _image - specifies whether the recipe can explain image models
- problem type:
  - _regression: bool - recipe can explain regression problem types (y is of numeric type)
  - _binary: bool - recipe can explain binomial classification problem types (y can be of numeric or string type, cardinality 2)
  - _multiclass: bool - recipe can explain multinomial classification problem types (y can be of numeric or string type, cardinality 3 or more)
- scope:
  - _global_explanation: bool - recipe can provide global explanations (like PD)
  - _local_explanation: bool - recipe can provide local explanations (like ICE)
- explanations
  - _explanation_types: List[Type[CustomExplanation]] - recipe always creates (must) these explanation types (at least one, for example [GlobalFeatImpExplanation, PartialDependenceExplanation])
  - _optional_explanation_types: List[Type[CustomExplanation]] - recipe may also create these explanation types (0 or more)
- parameters
  - _parameters: List[CustomExplainerParam] - list of (0 or more) recipe parameters
- standalone
  - _requires_predict_method: bool - recipe explains Driverless AI models (False) or standalone (3rd party) models - standalone explanation requires a dataset column with 3rd party model predictions
- dependencies
  - _modules_needed_by_name: List[str] - recipe requires Python package dependencies (which can be installed using pip), for example ["mypackage==1.3.37"]
  - _depends_on: List[Type["CustomExplainer"]] - recipe depends on other recipes - recipe dependencies are automatically added to the interpretation execution plan and executed before the recipe so that their artifacts can be used
  - _priority: float - recipe priority in execution (higher priority executed first)
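Putting several of these attributes together, a metadata block might look as follows (a sketch; the values are illustrative):

class MyExplainer(CustomExplainer):
    _display_name = "My Explainer"
    _description = "Explains regression and binomial models."
    _keywords = ["example"]
    _iid = True
    _regression = True
    _binary = True
    _multiclass = False
    _global_explanation = True
    _local_explanation = False
    _explanation_types = [GlobalFeatImpExplanation]
    _modules_needed_by_name = ["mypackage==1.3.37"]
    _priority = 1.0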
Please refer to the Explainer Python API documentation for full reference.
A custom explainer can be parametrized, and its parameters can be easily resolved using MLI BYOR library functions.
Parameter (name, description, type and default value) declaration:
class ExampleParamsExplainer(...):
    PARAM_ROWS_TO_SCORE = "rows_to_score"

    _display_name = "Example Params Explainer"
    ...
    _parameters = [
        CustomExplainerParam(
            param_name=PARAM_ROWS_TO_SCORE,
            description="The number of dataset rows to be scored by explainer.",
            param_type=ExplainerParamType.int,
            default_value=1,
            src=CustomExplainerParam.SRC_EXPLAINER_PARAMS,
        ),
    ]
Parameter types:
class ExplainerParamType(Enum):
    bool = auto()
    int = auto()
    float = auto()
    str = auto()
    list = auto()        # selection from predefined list of items
    multilist = auto()   # multiselection from predefined list of items
    customlist = auto()  # list of user strings, without predefined values
    dict = auto()
Argument values of declared parameters can be specified in the UI when you Run explainer(s) from the selection listing, as well as when running the explainer using the Python Client API:
explainer_id = "...ExampleParamsExplainer"
explainer_params = {"rows_to_score": 3}
explainers_params = h2oai_client.build_common_dai_explainer_params(
    target_col="target_column",
    model_key="...",
    dataset_key="...",
)
job = h2oai_client.run_explainers(
    explainers=[Explainer(
        explainer_id=explainer_id,
        explainer_params=str(explainer_params),
    )],
    params=explainers_params,
)
The Python Client API can be used to determine the explainer's parameters - check the List and Filter section:
explainers = [explainer.dump() for explainer in h2oai_client.list_explainers(
    experiment_types=None,
    explanation_scopes=None,
    dai_model_key=None,
    keywords=None,
    explainer_filter=[]
)]
...
Found 12 explainers
h2oaicore.mli.byor.recipes.sa_explainer.SaExplainer
...
parameters []
h2oaicore.mli.byor.recipes.dai_pd_ice_explainer.DaiPdIceExplainer
...
parameters [
{'name': 'features',
'description': 'List of features for which to compute PD/ICE.',
'comment': '',
'type': 'multilist',
'val': None,
'predefined': [],
'tags': ['SOURCE_DATASET_COLUMN_NAMES'],
'min_': 0.0,
'max_': 0.0,
'category': ''
},
...
...
Argument resolution and use at runtime:
...
def setup(self, model, persistence, **e_params):
    ...
    # resolve explainer parameters to instance attributes
    self.args = CustomExplainerArgs(
        ExampleParamsExplainer._parameters
    )
    self.args.resolve_params(
        explainer_params=CustomExplainerArgs.json_str_to_dict(
            self.explainer_params_as_str
        )
    )

def explain(self, X, y=None, explanations_types=None, **kwargs) -> list:
    # use parameter
    rows = self.args.get(self.PARAM_ROWS_TO_SCORE)
    ...
Example: if you want to see full explainer listing and try explainer parameters, check Parameters Example explainer.
A custom explainer must implement a default constructor which must not have required parameters:

def __init__(self):
    CustomExplainer.__init__(self)
If the explainer also inherits from the CustomDaiExplainer base class, then it must also initialize it:

def __init__(self):
    CustomExplainer.__init__(self)
    CustomDaiExplainer.__init__(self)
A custom explainer can implement (override) the compatibility check method (defined by CustomExplainer and CustomDaiExplainer) which is used to perform a runtime check to determine whether the explainer can explain the given model and dataset.
The compatibility check...

- its purpose is to avoid failures which would inevitably occur later
- it returns True if the explainer is compatible, False otherwise
- it is invoked before the interpretation run
- it is invoked on an explainer instantiated using the constructor - the setup() method is not called before check_compatibility(), i.e. instance attributes are not initialized - the most important instance attributes might be set by calling the parent classes' check_compatibility()
class MyCustomExplainer(...):
    ...

    def check_compatibility(
        self,
        params: Optional[messages.CommonExplainerParameters] = None,
        **explainer_params,
    ) -> bool:
        CustomExplainer.check_compatibility(self, params, **explainer_params)
        CustomDaiExplainer.check_compatibility(self, params, **explainer_params)
        ...
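For instance, a concrete check might reject configurations the explainer cannot handle (a sketch; it assumes the parent calls above have initialized Driverless AI attributes such as self.num_labels):

def check_compatibility(
    self,
    params: Optional[messages.CommonExplainerParameters] = None,
    **explainer_params,
) -> bool:
    CustomExplainer.check_compatibility(self, params, **explainer_params)
    CustomDaiExplainer.check_compatibility(self, params, **explainer_params)
    # illustrative rule: this explainer handles at most 10 classes
    if self.num_labels and self.num_labels > 10:
        return False
    return True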
The MLI BYORs runtime provides the following explainer_params:

explainer_params.get('explainer_params_as_str')
- Explainer parameters as string.

explainer_params.get('params')
- Common explainers parameters.

explainer_params.get('dai_params')
- Driverless AI specific explainers parameters.

explainer_params.get('dai_username')
- Driverless AI user name.

explainer_params.get('model_entity')
- Driverless AI Model entity with model details.

explainer_params.get('dataset_entity')
- Driverless AI Dataset entity with train dataset details.

explainer_params.get('validset_entity')
- Driverless AI Dataset entity with validation dataset details.

explainer_params.get('testset_entity')
- Driverless AI Dataset entity with test dataset details.

explainer_params.get('features_meta')
- Features metadata like type (numerical/categorical).

explainer_params.get('persistence')
- Instance of the CustomExplainerPersistence class which provides the custom explainer a way to persist data, e.g. to its working directory.

explainer_params.get('logger')
- Logger which can be used to print info, debug, warning or error messages to the explainer's log - to be used e.g. for debugging.

explainer_params.get('cfg')
- Driverless AI configuration.
Example: try Compatibility Example explainer.
A custom explainer should implement the setup() method with the following signature:
def setup(self, model, persistence, key=None, params=None, **e_params):
    """Set all the parameters needed to execute `fit()` and `explain()`.

    Parameters
    ----------
    model: Optional[ExplainerModel]
        DAI explainer model with (fit and) score methods (or `None` if 3rd party
        explanation).
    persistence: CustomExplainerPersistence
        Persistence API allowing (controlled) saving and loading of explanations.
    key: str
        Optional (given) explanation run key (generated otherwise).
    params: CommonExplainerParameters
        Common explainer parameters specified on explainer run.
    explainer_params:
        Explainer parameters, options and configuration.
    """
    CustomExplainer.setup(self, model, persistence, key, params, **e_params)
The implementation should invoke the parent class setup() method:

def setup(self, model, persistence, **e_params):
    CustomExplainer.setup(self, model, persistence, **e_params)
CustomExplainer's setup() method sets the following class instance attributes:
self.model
- Instance of the ExplainerModel class which has the predict and fit functions of the model to be explained. These methods can be used to create predictions using the model/scorer.

self.persistence
- Instance of the CustomExplainerPersistence class which provides the custom explainer a way to persist data, e.g. to its working directory.

self.params
- Common explainers parameters specified on explainer run, like the target column or columns to drop.

self.explainer_params
- This custom explainer's specific parameters specified on explainer run.

self.logger
- Logger which can be used to print info, debug, warning or error messages to the explainer's log - to be used e.g. for debugging.

self.config
- Driverless AI configuration.
If the explainer also inherits from the CustomDaiExplainer base class, then it must also invoke its setup() method:

def setup(self, model, persistence, **e_params):
    CustomExplainer.setup(self, model, persistence, **e_params)
    CustomDaiExplainer.setup(self, **e_params)
CustomDaiExplainer's setup() method sets the following class instance attributes:
self.mli_key
- MLI key (UUID or simple name) that can be used to access the interpretation filesystem and DB.

self.dai_params
- Driverless AI specific explainers parameters specified on explainer run, like config overrides, validation/test dataset keys, etc.

self.dai_version
- Driverless AI version.

self.dai_username
- Current Driverless AI user name.

self.explainer_deps
- An explainer can declare that it depends on other explainers, e.g. to reuse an artifact they pre-computed. The explainer dependencies field is a dictionary of explainer runtime dependencies details. Keys in this dictionary are explainer IDs (class names or deployment IDs as declared in the explainer metadata), values are lists of explainer run keys (UUIDs or simple names). This is how the explainer can determine the status and location of explainer jobs for explainer (types) it depends on.

self.model_entity
- Driverless AI Model entity with model details.

self.dataset_entity
- Driverless AI Dataset entity with train dataset details.

self.validset_entity
- Driverless AI Dataset entity with validation dataset details.

self.testset_entity
- Driverless AI Dataset entity with test dataset details.

self.sanitization_map
- Sanitization map is a class which allows mapping dataset columns, feature names, ... between the sanitized and non-sanitized space. For example dataset column 'O\'Neil.tm\t {non-escaped} "raw" feature[0]' must be escaped like 'O\'Neil_tm_ {non-escaped} "raw" feature_0_'. Make sure to use the sanitization map to ensure correct and clash-less mapping between sanitized and non-sanitized names.

self.labels
- List of labels (classes) used by the model in case of binomial or multinomial classification. Undefined in case of regression models.

self.num_labels
- Number of labels (classes) used by the model.

self.used_features
- A model typically uses a subset of dataset columns (features) - this field provides the list of features actually used by the model.

self.enable_mojo
- Boolean indicating whether the MOJO scorer should be enabled or disabled.
The instance attributes listed above can be subsequently used in fit() and explain*() methods, as in the sketch below.
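For illustration, a sketch that logs a few of these attributes from within explain() (attribute names as documented above):

def explain(self, X, y=None, explanations_types=None, **kwargs) -> list:
    # attributes initialized by setup() are available on self
    self.logger.info(f"DAI version  : {self.dai_version}")
    self.logger.info(f"used features: {self.used_features}")
    self.logger.info(f"labels       : {self.labels}")
    return [
        self.create_explanation_workdir_archive(
            display_name=self.display_name, display_category="Demo"
        )
    ]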
A custom explainer can implement the fit() method, which the custom explainer execution engine invokes via the run_fit() wrapper with the following signature:
def run_fit(self, X, y=None, **kwargs):
    """Build explainer and explanation prerequisites.

    This is the method invoked by the custom explainer execution engine (it can
    add code to be executed before/after `fit()` overridden by child classes).

    Parameters
    ----------
    X: Union[datatable.Frame, Any]
        Data frame.
    y: Optional[Union[datatable.Frame, Any]]
        Labels.
    """
    return self
The fit() method can be used to pre-compute artifacts to be subsequently used by the explain*() method(s), as sketched below.
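A minimal sketch of such pre-computation, assuming setup() has already initialized self.persistence (fit_sample.csv is a hypothetical artifact name):

def fit(self, X, y=None, **kwargs):
    # persist a dataset sample once; explain*() methods can reuse it later
    sample = X[:100, :]
    sample.to_csv(self.persistence.get_explainer_working_file("fit_sample.csv"))
    return self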
A custom explainer must implement the explain() method, which is supposed to create and persist global and/or local explanations:

def explain(
    self, X, y=None, explanations_types: list = None, **kwargs,
) -> list:
    ...
explain() method parameters:

X
- dataset handle (datatable frame)

y
- optional labels

explanations_types
- optional list of explanation types to be calculated (recall the _optional_explanation_types explainer Metadata declaration)
The explain() method is invoked when an interpretation is run, and it typically performs the following steps:
- dataset preparation
- predict method use or customization
- explanation calculation
- explanation persistence
- optional explanation normalization
The subsequent sections elaborate each of the aforementioned steps.
A custom explainer can use the dataset to explain the model within the explain() method:

- the dataset handle (datatable frame) is injected as the X parameter of the explain() method
- the self.dataset_entity instance attribute (set by the setup() method in case of CustomDaiExplainer) provides dataset details
Thus the explain() method can be used to prepare the dataset (sample, filter, transform or leave it as is) and make it ready for subsequent processing, as in the sketch below.
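For example, a hedged sketch of sampling inside explain() before an expensive computation (the 10,000 row limit is illustrative):

def explain(self, X, y=None, explanations_types=None, **kwargs) -> list:
    # sample large datasets to keep the computation tractable
    max_rows = 10000
    frame = X[:max_rows, :] if X.nrows > max_rows else X
    self.logger.info(f"explaining {frame.nrows} of {X.nrows} rows")
    # ... compute explanations on `frame` ...
    return [
        self.create_explanation_workdir_archive(
            display_name=self.display_name, display_category="Demo"
        )
    ]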
Example: check EDA Example explainer.
A custom explainer can use the model, or a dataset with predictions (standalone mode), to explain the model within the explain() method. This section elaborates the case when a (Driverless AI) model is explained.
class ExplainerModel:
    def __init__(self, predict_method, fit_method):
        self.predict_method = predict_method
        self.fit_method = fit_method

    def fit(self, X, y=None, **kwargs):
        self.fit_method(X, y, **kwargs)

    def predict(self, X, y=None, **kwargs):
        return self.predict_method(X, y, **kwargs)
The model (an ExplainerModel class instance) is injected into the explainer by the setup() method:

- the model can be accessed using self.model as it is an instance attribute
- self.model provides the predict method which can be used and/or customized
- model injection into the explainer can be disabled with _requires_preloaded_predictor = False to improve performance when the ExplainerModel instance is not needed
Model metadata:

- the self.model_entity instance attribute (set by the setup() method in case of CustomDaiExplainer) provides model details
Thus the explain() method can use and/or customize the predict and fit methods prepared by the MLI BYORs runtime:
class ExampleScoreExplainer(CustomExplainer, CustomDaiExplainer):
    ...

    def explain(self, X, y=None, explanations_types=None, **kwargs) -> list:
        ...
        prediction = self.model.predict_method(dataset)
        self.logger.info(f"Prediction : {prediction}")
Examples:
- check Score Example explainer for how to use predict method
- check Morris SA example explainer for how to customize predict method and encode categorical features as numerical
A custom explainer can use the persistence instance attribute (set by setup()) to persist explanation (intermediary) results to its working directory and subsequently to (optionally) persist normalized explanations.
Custom explainer persistence provides access/paths to explainer directories:

self.persistence.get_explainer_working_dir()
- explainer working directory - this is where the explainer is allowed to persist its files

self.persistence.get_explainer_working_file(file_name)
- path to the file_name file in the explainer's working directory

self.persistence.get_explainer_log_dir()
- explainer logs directory

self.persistence.get_explainer_dir()
- explainer directory path

self.persistence.base_dir
- MLI directory path
... and many more. Persistence is used like this:
class ExamplePersistenceExplainer(CustomExplainer):
    ...

    def explain(self, X, y=None, explanations_types=None, **kwargs) -> list:
        ...
        # use self.persistence object to get file system paths
        self.logger.info(f"Explainer MLI dir: {self.persistence.base_dir}")
        self.logger.info(f"Explainer dir: {self.persistence.get_explainer_dir()}")
        # save 1st row of dataset to work directory and prepare work directory archive
        df_head = X[:1, :]
        df_head.to_csv(
            self.persistence.get_explainer_working_file("dataset_head.csv")
        )
        ...
Example: check Persistence Example explainer.
As mentioned previously, the explain() method creates and persists global and/or local explanations. A custom explainer can persist final/intermediary results to its working directory.
If there is no need to visualize the result, the explainer can use, for instance, the pre-defined WorkDirArchiveExplanation to create a Zip archive of the working directory with the created artifacts. Such an archive can subsequently be downloaded either from the UI or using the Python Client.
class ExampleScoreExplainer(CustomExplainer, CustomDaiExplainer):
    ...

    def explain(self, X, y=None, explanations_types=None, **kwargs) -> list:
        ...
        return [
            self.create_explanation_workdir_archive(
                display_name=self.display_name, display_category="Demo"
            )
        ]
Example: check Morris SA example explainer for explanation calculation and persistence use.
If explanations created by a custom explainer should provide UI representations as Driverless AI charts, then the explanation data, which are typically stored as files in the explainer's working directory, must be normalized.
Normalized files are files whose format and structure are specified by the Grammar of MLI for the given chart type.
The MLI BYORs runtime provides helpers, based on the CustomExplanation and CustomExplanationFormat classes, to accomplish this task. These helper classes can be used to easily create normalized representations in various formats:
- explanations: check CustomExplanation section for how to create normalized explanations
- check Custom Explanation Example for how to create custom explanation
- formats: check CustomExplanationFormat section for how to create normalized representations (formats)
- check Morris SA example for 3rd party library output normalization example
Explainer templates for all types of Grammar of MLI charts can be used to make the creation of UI-ready explainers easier - just replace the "foo" data with the output of your explainer/library/calculation and get a GoM compatible explainer:
- decision tree
- feature importance
- Markdown report with...
- PD/ICE
- scatter plot
- ... and other templates
The benefits of normalized explanations and formats can be summarized as follows:
- explainers can be searched and filtered by explanation types
- Python Client API provides explainer explanations and formats introspection allowing to determine which explanations are available and in which formats
- Grammar of MLI explanations and formats are rendered in Driverless AI UI as interactive charts
Find more details on normalization in Grammar of MLI section.
A custom explainer can implement the explain_local() method in order to provide local explanations. Local explanations are explanations created for a particular dataset row.
A custom explainer must declare the ability to provide local explanations:

class TemplateDecisionTreeExplainer(CustomExplainer):
    _display_name = "Template DecisionTree explainer"
    ...
    _local_explanation = True
    ...
This declaration enables local explanations, which means that the Driverless AI RPC API method can be invoked and the row search is shown atop the chart in the Driverless AI UI (it is hidden otherwise).
Custom explainers can use the following options to provide local explanations:

- load cached explanation
  - Explanations are pre-computed, normalized and persisted by the explain() method. When a local explanation is requested by the Driverless AI RPC API, its runtime loads the explanation and returns it. Typically this is the fastest way to provide local explanations.
- on-demand explanation calculation (async/sync)
  - If local explanations cannot be cached (for instance because the dataset is huge and the local explanations computation would take too much time and/or would not fit on disk), then they must be computed on demand. In this case the Driverless AI explainer runtime instantiates the explainer, invokes the setup() method and (a)synchronously invokes the explain_local() method. The decision whether to invoke the explain_local() method synchronously or asynchronously is up to the explainer - it must be stored as a hint in the normalized persisted local explanation descriptor.
In order to invoke the explain_local() custom explainer method, the Python Client API get_explainer_local_result() procedure is called.
For example, the client can be invoked as follows to get ICE by row number for a given feature and class (multinomial model explanation):
ice_by_row = h2o_client.get_explainer_local_result(
    mli_key=mli_key,
    explainer_job_key=explainer_partial_dependence_job_key,
    explanation_type=IndividualConditionalExplanation.explanation_type(),
    explanation_format=MimeType.MIME_JSON_DATATABLE,
    id_column_name=None,
    id_column_value="10",  # 0..10 ~ 11th row
    page_offset=0,
    page_size=1,
    result_format=MimeType.MIME_JSON,
    explanation_filter=[
        FilterEntry(IceJsonDatatableFormat.FILTER_FEATURE, local_feature),
        FilterEntry(IceJsonDatatableFormat.FILTER_CLASS, local_class),
    ],
)
Both aforementioned local explanation creation options are described in subsequent sections.
Cached local explanations are dispatched by the representations themselves - the custom explainer is neither instantiated nor is its explain_local() method invoked.
The decision whether to dispatch a local explanation request using the cached representation's data or an on-demand explainer invocation is made automatically by the MLI BYORs runtime, which calls the representation's is_on_demand() method as a part of the h2o_client.get_explainer_local_result() dispatch:

- a False return value means that the representation is able to return cached local explanations
- a True return value means that the MLI BYOR runtime must use On-demand Local Explanation dispatch
Example: the explainer declares that it provides a local explanation as well as its type (LocalFeatImpExplanation):
class TemplateFeatureImportanceExplainer(CustomExplainer):
    ...
    _local_explanation = True
    _explanation_types = [GlobalFeatImpExplanation, LocalFeatImpExplanation]
    ...
The explain() method executed on interpretation run creates the local explanation and also binds it to the corresponding global explanation. Note that it uses the LocalFeatImpDatatableFormat representation:
...
def explain(self, X, y=None, explanations_types: list = None, **kwargs):
    local_explanation = LocalFeatImpExplanation(explainer=self)
    ...
    dt_format = LocalFeatImpDatatableFormat(
        explanation=local_explanation, frame=dt.Frame(data_dict)
    )
    local_explanation.add_format(dt_format)
    # associate local explanation with global explanation
    global_explanation.has_local = local_explanation.explanation_type()
    ...
In the LocalFeatImpDatatableFormat representation class:

- the is_on_demand() method returns False, as LocalFeatImpDatatableFormat supports cached dispatch
- the get_local_explanation() method, which can load cached local explanations, is implemented by LocalFeatImpDatatableFormat
Check template_featimp_explainer.py from Explainer Templates for an example of cached local explanation dispatch.
There are two types of on-demand local explanation calculation:

- synchronous
  - the explain_local() method is invoked on an explainer instance created using the constructor to get the local explanation - it can be used if the load/transformation is fast
- asynchronous
  - the explain_local() method is invoked on a fully checked and initialized explainer - it is used when a calculation/scorer/transformation is needed

The decision whether to perform synchronous or asynchronous execution is made based on a local explanation index file flag which was created on explainer run by the explain() method.
For example:
# create local explanation
local_explanation = LocalDtExplanation(
    explainer=self,
    display_name="Local Explanation",
    display_category="Example",
)
# create index file
json_local_idx, _ = LocalDtJSonFormat.serialize_index_file(
    classes=["class_A", "class_B", "class_C"],
    doc=TemplateDecisionTreeExplainer._description,
)
# specify that local explanation is on-demand
json_local_idx[LocalDtJSonFormat.KEY_ON_DEMAND] = True
# specify that it will be dispatched SYNCHRONOUSLY
on_demand_params: dict = dict()
on_demand_params[LocalDtJSonFormat.KEY_SYNC_ON_DEMAND] = True
json_local_idx[
    LocalDtJSonFormat.KEY_ON_DEMAND_PARAMS
] = on_demand_params
# add local explanation to explanations returned by the explainer
# (it will be persisted by MLI BYORs runtime)
local_explanation.add_format(
    explanation_format=LocalDtJSonFormat(
        explanation=local_explanation,
        json_data=json.dumps(json_local_idx, indent=4),
    )
)
The explanation's index file created by the code above looks like:
{
    "files": {
        "class_A": "dt_class_0.json",
        "class_B": "dt_class_1.json",
        "class_C": "dt_class_2.json"
    },
    "metrics": [],
    "documentation": "...",
    "on_demand": true,
    "on_demand_params": {
        "synchronous_on_demand_exec": true
    }
}
The decision whether to invoke the explainer in synchronous or asynchronous mode is made by the MLI BYOR runtime automatically - it reads the explanation index file specified by the h2o_client.get_explainer_local_result() parameters (explanation type and MIME), and invokes explainer methods as described below.
Synchronous local explanation dispatch invokes explainer methods as follows:

- __init__() (constructor)
- explain_local()
Asynchronous local explanation dispatch invokes explainer methods on h2o_client.get_explainer_local_result() procedure invocation as follows:

- __init__() (constructor)
- check_compatibility()
- setup()
- explain_local()
- destroy()
In both the synchronous and asynchronous cases it is expected that the local explanation will be returned as a string.
Check template_dt_explainer.py from Explainer Templates for an example of synchronous on-demand dispatch.
Since: Driverless AI 1.9.2
A custom explainer can implement the explain_global() method to update explanation(s) on demand. Such recalculation and/or extension of existing explanations can be initiated using the Driverless AI RPC API, for example as follows:
global_run_job_key = h2oai_client.update_explainer_global_result(
    mli_key="570a95c4-66f1-11eb-a6ff-e86a64888647",
    explainer_job_key="670a95c4-66f1-11eb-a6ff-e86a64888647",
    params=Client.build_common_dai_explainer_params(),
    explainer_params=json.dumps(
        {'features': ["AGE"]}
    ),
    explanation_type=(
        explanations.PartialDependenceExplanation.explanation_type()
    ),
    explanation_format=(
        representations.PartialDependenceJSonFormat.mime
    ),
    update_params=json.dumps(
        {
            UpdateGlobalExplanation.UPDATE_MODE: (
                UpdateGlobalExplanation.OPT_MERGE
            ),
            UpdateGlobalExplanation.PARAMS_SOURCE: (
                UpdateGlobalExplanation.OPT_INHERIT
            ),
        }
    ),
)
update_explainer_global_result() parameters description:

mli_key
- Key of the target MLI where to update explanation(s).

explainer_job_key
- Key of the target explainer (job) in which to update explanation(s).

params
- Optional CommonDaiExplainerParams to parametrize the explainer run. The params argument content can be overridden by the previous (original) explainer run parameters stored on the server side by using the "inherit" option in update_params.

explainer_params
- Explainer specific parameters to be used in the "update" run. For instance, they can be used to specify for which dataset features to add/update explanations, with which resolution, etc.

explanation_type
- Optional specification of the explanation type to be updated. Use None to request all explanations and/or when the explainer knows what to update.

explanation_format
- Optional specification of the explanation format to be updated. Use None to request all explanation formats and/or when the explainer knows what to update.

update_params
- Control how to update explanations with the update_params dictionary, like merge vs. overwrite explanations, or inherit interpretation parameters vs. use the params interpretation parameters. See the UpdateGlobalExplanation class for more details.
The CustomExplainer interface anticipates that instances of classes implementing this interface can run in different explainer runtimes/containers, like Driverless AI or standalone (locally).
When implementing a custom explainer for the Driverless AI explainer runtime, this method doesn't have to be overridden, as explanations are typically computed, normalized and persisted. The Driverless AI RPC API (which can be accessed using the Python Client API) then looks up persisted global explanations automatically.
A custom explainer can get the parameters passed to the explain_global() method by the runtime as in the example below:
class MyExplainer(CustomExplainer):
    ...

    def explain_global(self, X, y=None, **e_params) -> list:
        """Update Partial Dependence explanation.

        Parameters
        ----------
        X: dt.Frame
            Dataset (whole) as datatable frame (handle).
        y: dt.Frame
            Optional predictions as datatable frame (handle).

        Returns
        -------
        List[OnDemandExplanation]:
            Updated on-demand explanations.
        """
        ...
        systemutils.loggerdebug(
            self.logger,
            f"\nexplain_global(): "
            f"\n target MLI key: {e_params.get(OnDemandExplainKey.MLI_KEY)}"
            f"\n target job key: {e_params.get(OnDemandExplainKey.EXPLAINER_JOB_KEY)}"
            f"\n MLI key : {self.mli_key}"
            f"\n job key : {self.key}"
            f"\n all params : {e_params}",
        )
        ...
A custom explainer can optionally implement and override the destroy() method to perform post explainer run clean-up.
As already mentioned, the CustomExplainer interface anticipates that instances of classes which implement it can run in different explainer runtimes/containers, like Driverless AI or standalone. When implementing a custom explainer for the Driverless AI runtime, this method doesn't have to be overridden unless specific resources must be released or purged after the explainer run.
The destroy() method is invoked by the MLI BYORs runtime when the explainer run finishes. The destroy() method:

- can be seen as the finally section of try/catch in programming languages
- typically purges some content of the explainer's working directory
- is not invoked on removal of the interpretation using the RPC API (the filesystem as well as internal data structures, like DB entities, are purged by the MLI BYORs runtime automatically)
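A minimal sketch of such a clean-up (fit_sample.csv is a hypothetical temporary artifact; the signature beyond self is an assumption):

import os

def destroy(self, **kwargs) -> None:
    # purge a temporary artifact from the explainer's working directory
    tmp_path = self.persistence.get_explainer_working_file("fit_sample.csv")
    if os.path.exists(tmp_path):
        os.remove(tmp_path)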
Please refer to Explainer Python API for more details.
To deploy (upload from the local machine or download from a URL) a custom explainer recipe using the UI, open the MLI homepage by clicking the MLI tab.
Click the NEW INTERPRETATION button and choose UPLOAD MLI RECIPE to upload a recipe from your computer.
The custom recipe will be uploaded and installed along with its dependencies.
Alternatively, a recipe can be downloaded from a URL specified using the MLI RECIPE URL option.
A recipe can also be deployed using the Driverless AI Python Client API:
recipe: CustomRecipe = h2oai_client.upload_custom_recipe_sync(recipe_file_path)
Python Client API can be used to:
- list and filter explainers by:
- experiment type (regression, binomial, multinomial)
- scope (local, global)
- model (DAI model)
- keywords (any string)
- filter (generic filter: IID/TS/image, requires model, etc.)
- list and filter models by:
- applicable explainer
- list and filter datasets by:
- explainable explainer
Filtering is also used when running explainers from Driverless AI MLI UI to offer compatible explainers only.
Example: list explainer descriptors with their name as well as properties and capabilities:
explainers = [explainer.dump() for explainer in h2oai_client.list_explainers(
    experiment_types=None,
    explanation_scopes=None,
    dai_model_key=None,
    keywords=None,
    explainer_filter=[]
)]
print(f"Found {len(explainers)} explainers")
for explainer in explainers:
print(f" {explainer['id']}")
for key in explainer:
print(f" {key} {explainer[key]}")
The result contains explainer details which can be used to determine whether and how to run the explainer:
Found 12 explainers
h2oaicore.mli.byor.recipes.sa_explainer.SaExplainer
id h2oaicore.mli.byor.recipes.sa_explainer.SaExplainer
name SA explainer
model_types ['iid', 'time_series']
can_explain ['regression', 'binomial']
explanation_scopes ['global_scope']
explanations [
{'explanation_type': 'global-sensitivity-analysis',
'name': 'SaExplanation',
'category': None,
'scope': 'global',
'has_local': None,
'formats': []
}
]
parameters []
keywords ['run-by-default']
h2oaicore.mli.byor.recipes.dai_pd_ice_explainer.DaiPdIceExplainer
id h2oaicore.mli.byor.recipes.dai_pd_ice_explainer.DaiPdIceExplainer
name DAI PD/ICE explainer
model_types ['iid', 'time_series']
can_explain ['regression', 'binomial', 'multinomial']
explanation_scopes ['global_scope', 'local_scope']
explanations [
{'explanation_type': 'global-partial-dependence',
'name': 'PartialDependenceExplanation',
'category': None,
'scope': 'global',
'has_local': None,
'formats': []
}, {
'explanation_type': 'local-individual-conditional-explanation',
'name': 'IndividualConditionalExplanation',
'category': None,
'scope': 'local',
'has_local': None, 'formats': []
}
]
parameters [
{'name': 'features',
'description': 'List of features for which to compute PD/ICE.',
'comment': '',
'type': 'multilist',
'val': None,
'predefined': [],
'tags': ['SOURCE_DATASET_COLUMN_NAMES'],
'min_': 0.0,
'max_': 0.0,
'category': ''
},
...
...
Example: custom explainers filtering:
explainers = [explainer.dump() for explainer in h2oai_client.list_explainers(
    experiment_types=['multinomial'],
    explanation_scopes=["local_scope"],
    dai_model_key="4be68f15-5997-11eb-979d-e86a64888647",
    keywords=["run-by-default"],
    explainer_filter=[FilterEntry("iid", True)]
)]
Valid FilterEntry values can be determined from:
class ExplainerFilter:
    # explainers which support IID models
    IID: str = ModelTypeExplanation.IID
    # explainers which support TS models
    TIME_SERIES: str = ModelTypeExplanation.TIME_SERIES
    # explainers which support image models
    IMAGE: str = ModelTypeExplanation.IMAGE
    # explainers which require predict method (model)
    REQUIRES_PREDICT: str = "requires_predict_method"
    # explainer ID to get particular explainer descriptor
    EXPLAINER_ID = "explainer_id"
Check Python Client API Jupyter Notebook for more examples.
To run a custom explainer, click the NEW INTERPRETATION button on the MLI homepage.
After the new interpretation dialog opens, choose the model and dataset, and select the explainer(s) you want to run.
To run a particular explainer only, uncheck all the others, choose the explainer and click DONE.
When ready, click LAUNCH MLI to run the interpretation.
The MLI BYORs runtime invokes explainer methods on interpretation run as follows:

- Preparation: explainers check and execution plan creation:
  - __init__() (constructor)
  - check_compatibility()
- Sequential explainers execution:
  - __init__() (constructor)
  - setup()
  - explain()
  - destroy()
A custom explainer recipe can also be run using the Driverless AI Python Client API:
explainer_id = "h2oaicore.mli.byor.recipes.sa_explainer.SaExplainer"
explainer_params = {"rows_to_score": 3}

explainers_params: CommonDaiExplainerParameters = h2oai_client.build_common_dai_explainer_params(
    target_col="target_column",
    model_key="5be68f15-5997-11eb-979d-e86a64888647",
    dataset_key="6be68f15-5997-11eb-979d-e86a64888647",
)

job: ExplainersRunJob = h2oai_client.run_explainers(
    explainers=[
        Explainer(
            explainer_id=explainer_id,
            explainer_params=str(explainer_params),
        )
    ],
    params=explainers_params,
)
Hints:

- valid explainer IDs can be determined using the List and Filter procedures
- the explainers argument is a list - any number of explainers (not just one) can be run
- per-explainer parameters are passed using the string explainer_params argument, which does not have a specified format (however, typically it is JSon or TOML); it is up to the custom recipe author to define its structure and content
- explainers parameters (like target column or columns to skip) - which are passed to/shared by all explainers (and can be created using build_common_dai_explainer_params() which provides default values) - are defined as follows:
CommonExplainerParameters
    target_col str
    weight_col str
    prediction_col str  # no model explanation
    drop_cols str[]
    sample_num_rows int  # >0 to sample, -1 to skip sampling

CommonDaiExplainerParameters
    common_params CommonExplainerParameters
    model ModelReference
    dataset DatasetReference
    validset DatasetReference
    testset DatasetReference
    use_raw_features bool
    config_overrides str
    sequential_execution bool
    debug_model_errors bool
    debug_model_errors_class str
ExplainersRunJob contains explainer job keys as well as the interpretation status:
ExplainersRunJob
    explainer_job_keys str[]
    mli_key str
    created float
    duration int
    status int
    progress float
To determine the interpretation or explainer job status, use:
get_explainer_job_status(mli_key: str, explainer_job_key: str) -> ExplainerJobStatus
get_explainer_job_statuses(mli_key: str, explainer_job_keys: List[str]) -> List[ExplainerJobStatus]
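For example, a simple polling loop built on the ExplainersRunJob fields documented above (a sketch; the meaning of the status values is an assumption, check ExplainerJobStatus in the API reference):

import time

while True:
    statuses = h2oai_client.get_explainer_job_statuses(
        job.mli_key, job.explainer_job_keys
    )
    # assumption: a non-running status value marks a finished explainer job
    if all(s.status != 0 for s in statuses):
        break
    time.sleep(5)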
Check the Python Client API Jupyter Notebook for more examples and an end-to-end explainer run scenario.
Custom explainer Python API:
- The recipe author can use the self.logger instance attribute to log debugging messages. These messages are stored in the explainer's log.
- The explainer runtime logs explainer related events/errors using the same logger, which ensures that the log contains the full explainer run trace.
To get the explainer log - with your own, explainer runtime, and explainer log items - from the UI, click the task manager RUNNING | FAILED | DONE button in the upper right corner of the (running) interpretation, and hover over the explainer's entry in the list of tasks. Buttons allowing you to abort the explainer and get its logs will appear.
Use the log to determine the root cause of the failure, fix it, and simply re-deploy the custom explainer in the same way as it was deployed.
Every explainer run has its own log which can be downloaded from the server in order to determine what failed/succeeded:
url: str = h2oai_client.get_explainer_run_log_url_path(
    mli_key=mli_key,
    explainer_job_key=explainer_job_key,
)
...
h2oai_client.download(url, target_directory)
Explanations can be viewed in the Driverless AI UI as interactive charts. Any custom explainer recipe which creates normalized (Grammar of MLI compatible) explanations can show these explanations in the UI.
Explanations which are not normalized can be downloaded - either using Snapshots or as a (working directory) archive when such a representation is created.
Custom explainer results can also be retrieved using the Driverless AI Python Client API. The following sections explain how to:
- perform Explanations Introspection to find out which explanations and in which formats are available
- download Explanations representations
- download explainer Snapshot with all its working directory data, normalized explanations and logs
Check the Python Client API Jupyter Notebook for examples of how to look up and download explanations.
The Python Client API can be used to perform explanations introspection to find out which explanations are available and in which formats.
To understand the API more easily, check how explanations are stored on the server side:
mli_experiment_0b83998c-565d-11eb-b860-ac1f6b46eab4/
explainer_h2oaicore_mli_byor_recipes_sa_explainer_SaExplainer_0b83998e-565d-11eb-b860-ac1f6b46eab4
...
explainer_..._TemplatePartialDependenceExplainer_0e3fc89b-565d-11eb-b860-ac1f6b46eab4
.
├── global_partial_dependence
│ ├── application_json
│ │ ├── explanation.json
│ │ ├── pd_feature_0_class_0.json
│ │ ├── pd_feature_0_class_1.json
│ │ ├── pd_feature_0_class_2.json
│ │ ├── pd_feature_1_class_0.json
│ │ ├── pd_feature_1_class_1.json
│ │ └── pd_feature_1_class_2.json
│ └── application_json.meta
├── global_work_dir_archive
│ ├── application_zip
│ │ └── explanation.zip
│ └── application_zip.meta
├── local_individual_conditional_explanation
│ ├── application_vnd_h2oai_json_datatable_jay
│ │ ├── explanation.json
│ │ ├── ice_feature_0_class_0.jay
│ │ ├── ice_feature_0_class_1.jay
│ │ ├── ice_feature_0_class_2.jay
│ │ ├── ice_feature_1_class_0.jay
│ │ ├── ice_feature_1_class_1.jay
│ │ ├── ice_feature_1_class_2.jay
│ │ └── y_hat.jay
│ └── application_vnd_h2oai_json_datatable_jay.meta
├── log
│ ├── explainer_run_0e3fc89b-565d-11eb-b860-ac1f6b46eab4_anonymized.log
│ ├── explainer_run_0e3fc89b-565d-11eb-b860-ac1f6b46eab4.log
│ └── logger.lock
├── result_descriptor.json
└── work
├── raw_data_frame.jay
└── EXPLAINER_DONE
The directory listing above shows:
- the interpretation directory mli_experiment_<key>, which includes a directory for every explainer which was run as a part of the interpretation
- the per-explainer directory explainer_<explainer id>_<key>, which contains:
  - the explainer working directory work
  - the explainer logs directory log
  - per-explanation directories (see below)
- the per-explanation directory global_<explanation type> or local_<explanation type>:
  - explanation directories are prefixed with the scope followed by the explanation type (CustomExplanation), like global_partial_dependence
  - each contains per-explanation representation directories (see below)
- the per-explanation representation directory, identified by the (escaped) MIME type (format) like application_json, which contains:
  - an index file explanation.<MIME extension>, whose name is always explanation. with the extension driven by the MIME type
  - optional data file(s) which contain (typically per-feature and/or per-class) explanation data; the format is defined by the Grammar of MLI
The Python Client API can be used to determine (for a particular interpretation):
- list of executed explainers
- list of explanations created by the explainer
- list of formats available for the explanation
- download URL for given representation
Example: getting representations of the interpretation shown in the directory listing above.
List the explainers which were run within the interpretation (mli_key) using the get_explainer_job_statuses() procedure:
import pprint

mli_key = "0b83998c-565d-11eb-b860-ac1f6b46eab4"
explainer_job_statuses = h2oai_client.get_explainer_job_statuses(
    mli_key=mli_key,
    explainer_job_keys=None,  # get all keys
)
print(f"Explainers run in {mli_key} interpretation:")
for explainer_job_status in explainer_job_statuses:
    pprint.pprint(explainer_job_status.dump())
It returns explainer result descriptors listing explanations and formats. The entry for the directory listing above looks like:
Explainers run in 0b83998c-565d-11eb-b860-ac1f6b46eab4 interpretation:
...
{'explainer_job': {'child_explainers_job_keys': [],
                   'created': 1610624324.2524223,
                   'duration': 489.0938754081726,
                   'entity': {'can_explain': ['regression',
                                              'binomial',
                                              'multinomial'],
                              'explanation_scopes': ['global_scope',
                                                     'local_scope'],
                              'explanations': [{'category': 'EXAMPLE',
                                                'explanation_type': 'global-partial-dependence',
                                                'formats': ['application/json'],
                                                'has_local': 'local-individual-conditional-explanation',
                                                'name': 'Template PD/ICE',
                                                'scope': 'global'},
                                               {'category': 'EXAMPLE',
                                                'explanation_type': 'local-individual-conditional-explanation',
                                                'formats': ['application/vnd.h2oai.json+datatable.jay'],
                                                'has_local': None,
                                                'name': 'Template ICE',
                                                'scope': 'local'},
                                               {'category': 'EXAMPLE',
                                                'explanation_type': 'global-work-dir-archive',
                                                'formats': ['application/zip'],
                                                'has_local': None,
                                                'name': 'Template PD/ICE ZIP',
                                                'scope': 'global'}],
                              'id': 'False_template_pd_explainer_2dc07fea_contentexplainer.TemplatePartialDependenceExplainer',
                              'keywords': ['template'],
                              'model_types': ['iid'],
                              'name': 'Template PD/ICE explainer',
                              'parameters': []},
                   'error': '',
                   'message': 'Explainer 0e3fc89b-565d-11eb-b860-ac1f6b46eab4 '
                              'run successfully finished',
                   'progress': 1.0,
                   'status': 0},
 'explainer_job_key': '0e3fc89b-565d-11eb-b860-ac1f6b46eab4',
 'mli_key': '0b83998c-565d-11eb-b860-ac1f6b46eab4'}
...
Descriptors can be used to filter and/or look up the explanations a user needs, for example simply by iterating over and testing the types/scopes/MIME formats of the created explanations.
The descriptor for a particular explainer job can be retrieved as follows:
explainer_descriptor = h2oai_client.list_explainer_results(
    explainer_job_key="0e3fc89b-565d-11eb-b860-ac1f6b46eab4"
)
Response:
{
    'id': 'False_template_pd_explainer_2dc07fea_contentexplainer.TemplatePartialDependenceExplainer',
    'name': 'Template PD/ICE explainer',
    'model_types': ['iid'],
    'can_explain': ['regression', 'binomial', 'multinomial'],
    'explanation_scopes': ['global_scope', 'local_scope'],
    'explanations': [{'explanation_type': 'global-partial-dependence',
                      'name': 'Template PD/ICE',
                      'category': 'EXAMPLE',
                      'scope': 'global',
                      'has_local': 'local-individual-conditional-explanation',
                      'formats': ['application/json']},
                     {'explanation_type': 'local-individual-conditional-explanation',
                      'name': 'Template ICE',
                      'category': 'EXAMPLE',
                      'scope': 'local',
                      'has_local': None,
                      'formats': ['application/vnd.h2oai.json+datatable.jay']},
                     {'explanation_type': 'global-work-dir-archive',
                      'name': 'Template PD/ICE ZIP',
                      'category': 'EXAMPLE',
                      'scope': 'global',
                      'has_local': None,
                      'formats': ['application/zip']}],
    'parameters': [],
    'keywords': ['template']
}
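For instance, a small sketch (find_explanation is a hypothetical helper) that scans such a descriptor for a wanted explanation type and format, using the explanation_type and formats fields shown above:
def find_explanation(explainer_descriptor, wanted_type, wanted_format):
    """Return (explanation_type, format) if the explainer created the wanted
    explanation in the wanted format, otherwise None (hypothetical helper)."""
    for explanation in explainer_descriptor.explanations:
        if explanation.explanation_type != wanted_type:
            continue
        for explanation_format in explanation.formats:
            if explanation_format == wanted_format:
                return explanation.explanation_type, explanation_format
    return None


# usage:
match = find_explanation(
    explainer_descriptor,
    wanted_type="global-partial-dependence",
    wanted_format="application/json",
)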
The Python Client API get_explainer_job_statuses() and list_explainer_results() procedures described in the Explanations Introspection section can be used to get enough information to determine a particular explanation representation's URL and download it:
explanation_url = h2oai_client.get_explainer_result_url_path(
    mli_key=mli_key,
    explainer_job_key="0e3fc89b-565d-11eb-b860-ac1f6b46eab4",
    explanation_type="global-work-dir-archive",
    explanation_format="application/zip",
)
h2oai_client.download(explanation_url, "/home/user/Downloads")
To download all explanations of a given explainer, you can use descriptor introspection and simply iterate over the available explanations and formats:
explainer_job_key = "0e3fc89b-565d-11eb-b860-ac1f6b46eab4"

# get explainer result descriptor
explainer_descriptor = h2oai_client.list_explainer_results(
    explainer_job_key=explainer_job_key,
)

# download all explanations in all formats
for explanation in explainer_descriptor.explanations:
    for explanation_format in explanation.formats:
        explanation_url = h2oai_client.get_explainer_result_url_path(
            mli_key=mli_key,
            explainer_job_key=explainer_job_key,
            explanation_type=explanation.explanation_type,
            explanation_format=explanation_format,
        )
        print(
            f"\nDownloading {explanation.explanation_type} as {explanation_format} from:"
            f"\n  {explanation_url} ..."
        )
        h2oai_client.download(explanation_url, "/home/user/Downloads")
Result:
Downloading global-partial-dependence as application/json from:
h2oai/mli_experiment_0b83998c-565d-11eb-b860-ac1f6b46eab4/explainer_False_template_pd_explainer_2dc07fea_contentexplainer_TemplatePartialDependenceExplainer_0e3fc89b-565d-11eb-b860-ac1f6b46eab4/global_partial_dependence/application_json/explanation.json ...
Downloading local-individual-conditional-explanation as application/vnd.h2oai.json+datatable.jay from:
h2oai/mli_experiment_0b83998c-565d-11eb-b860-ac1f6b46eab4/explainer_False_template_pd_explainer_2dc07fea_contentexplainer_TemplatePartialDependenceExplainer_0e3fc89b-565d-11eb-b860-ac1f6b46eab4/local_individual_conditional_explanation/application_vnd_h2oai_json_datatable_jay/explanation.json ...
Downloading global-work-dir-archive as application/zip from:
h2oai/mli_experiment_0b83998c-565d-11eb-b860-ac1f6b46eab4/explainer_False_template_pd_explainer_2dc07fea_contentexplainer_TemplatePartialDependenceExplainer_0e3fc89b-565d-11eb-b860-ac1f6b46eab4/global_work_dir_archive/application_zip/explanation.zip ...
Check Python Client API Jupyter Notebook for more examples.
Explanations can be downloaded using the Driverless AI UI as snapshots. A snapshot is a ZIP archive of the explainer directory as shown in the Explanations Introspection directory listing.
Custom explainer run snapshots can also be downloaded using the Python Client API:
snapshot_url = h2oai_client.get_explainer_snapshot_url_path(
    mli_key=mli_key,
    explainer_job_key="0e3fc89b-565d-11eb-b860-ac1f6b46eab4",
)
h2oai_client.download(snapshot_url, "/home/user/Downloads")
Custom explainer results can be visualized using Grammar of MLI UI components in Driverless AI.
The Grammar of MLI is a set of interactive charts which can render normalized results (explanation representations) created by custom explainers. The Custom Explainer Python API provides helpers which aim to make explanation normalization easy.
The sub-sections of this chapter provide an overview of the available components and the expected data format specifications.
Please read the Explanations Introspection section and check the directory listing there. Make sure that you understand the following concepts:
- index files
- data files
- explanation types
- explanation formats
Template explainer: Custom Explainers API
- explanations: GlobalFeatImpExplanation, LocalFeatImpExplanation
- representations/formats: GlobalFeatImpJSonFormat, LocalFeatImpJSonFormat
Example of the server-side filesystem directory structure:
.
├── global_feature_importance
│ ├── application_json
│ │ ├── explanation.json
│ │ ├── featimp_class_A.json
│ │ ├── featimp_class_B.json
│ │ └── featimp_class_C.json
│ └── ...
├── log
│ └── ...
└── work
└── ...
Index file explanation.json:
{
    "files": {
        "class_A": "featimp_class_A.json",
        "class_B": "featimp_class_B.json",
        "class_C": "featimp_class_C.json"
    },
    "total_rows": 20,
    "metrics": [{
        "R2": 0.96
    }, {
        "RMSE": 0.03
    }, {
        "Bias": 0.34
    }],
    "documentation": "Feature importance explainer..."
}
- files: dictionary where the key is the class name and the value is the file name
- metrics: list of key/value pairs to be rendered atop the chart
- total_rows: integer value used for paging
- documentation: shown as help on clicking ?
Data file(s) per-class:
{
    bias?: num,
    data: [
        {
            label: str,
            value: num,
            scope: 'global' | 'local',
        }
    ]
}
Contains feature importance data for a particular class:
- bias: bias value (optional)
- data: feature importances (list)
  - label: the name of the feature
  - value: the feature's importance value
  - scope: global for global or local for local explanations
For example:
{
    "bias": 0.15,
    "data": [{
        "label": "PAY_0",
        "value": 1.0,
        "scope": "global"
    }, {
        "label": "PAY_2",
        "value": 0.519,
        "scope": "global"
    }, {
...
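To make the file layout concrete, the following sketch writes such a global feature importance explanation by hand with the standard json module; the directory and file names follow the listing above. In a real recipe, the representation helpers of the Custom Explainer API would create these files for you:
import json
import os

base = "global_feature_importance/application_json"
os.makedirs(base, exist_ok=True)
classes = ["class_A", "class_B", "class_C"]

# index file: explanation.json (keys as described above)
index = {
    "files": {c: f"featimp_{c}.json" for c in classes},
    "total_rows": 2,
    "metrics": [{"R2": 0.96}],
    "documentation": "Feature importance explainer...",
}
with open(os.path.join(base, "explanation.json"), "w") as f:
    json.dump(index, f, indent=2)

# one data file per class with the per-feature importances
for c in classes:
    data = {
        "bias": 0.15,
        "data": [
            {"label": "PAY_0", "value": 1.0, "scope": "global"},
            {"label": "PAY_2", "value": 0.519, "scope": "global"},
        ],
    }
    with open(os.path.join(base, f"featimp_{c}.json"), "w") as f:
        json.dump(data, f, indent=2)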
Local explanation:
{
    "files": {},
    "metrics": [],
    "documentation": "Shapley explanations are ...",
    "on_demand": true,
    "on_demand_params": {
        "synchronous_on_demand_exec": true,
        "is_multinomial": false,
        "raw_features": true
    }
}
- files: same as global explanation
- metrics: same as global explanation
- documentation: same as global explanation
- on_demand: indicates whether the local explanation should be dispatched as an On-demand Local Explanation (true) or as a Cached Local Explanation (false)
  - in case of on-demand dispatch, files and metrics don't have to be set
- on_demand_params: can contain any values which are needed in case of on-demand dispatch
  - synchronous_on_demand_exec: true in case of a synchronous On-demand Local Explanation, false in case of an asynchronous On-demand Local Explanation
  - ... any additional parameters
Template explainer: Custom Explainers API
- explanations: PartialDependenceExplanation, IndividualConditionalExplanation
- representations/formats: PartialDependenceJSonFormat, IceJsonDatatableFormat
Example of the server-side filesystem directory structure:
.
├── global_partial_dependence
│ ├── application_json
│ │ ├── explanation.json
│ │ ├── pd_feature_0_class_0.json
│ │ ├── pd_feature_0_class_1.json
│ │ ├── pd_feature_0_class_2.json
│ │ ├── pd_feature_1_class_0.json
│ │ ├── pd_feature_1_class_1.json
│ │ └── pd_feature_1_class_2.json
│ └── ...
├── log
│ └── ...
└── work
└── ...
Index file explanation.json:
{
    "features": {
        "feature_1": {
            "order": 0,
            "feature_type": [
                "categorical"
            ],
            "files": {
                "class_A": "pd_feature_0_class_0.json",
                "class_B": "pd_feature_0_class_1.json",
                "class_C": "pd_feature_0_class_2.json"
            }
        },
        "feature_2": {
            "order": 1,
            "feature_type": [
                "numeric"
            ],
            "files": {
                "class_A": "pd_feature_1_class_0.json",
                "class_B": "pd_feature_1_class_1.json",
                "class_C": "pd_feature_1_class_2.json"
            }
        }
    },
    "metrics": [{
        "RMSE": 0.029
    }, {
        "SD": 3.1
    }],
    "documentation": "PD and ICE explainer ..."
}
- features: dictionary where the key is the feature name
  - feature_type: controls whether PD should be rendered as categorical or numeric
  - files: dictionary where the key is the class name and the value is the file name
- metrics: list of key/value pairs to be rendered atop the chart
- total_rows: integer value used for paging
- documentation: shown as help on clicking ?
Data file(s) per feature and per-class:
{
    prediction?: num,
    data: [
        {
            bin: num,
            pd: num,
            sd: num,
            residual-pd?: num,
            residual-sd?: num,
            ice?: num,
            oor?: bool
        }
    ],
    data_histogram_numerical?: [
        {
            x: any,
            frequency: num,
        }
    ],
    data_histogram_categorical?: [
        {
            x: any,
            frequency: num,
        }
    ]
}
Contains PD for a particular feature and class:
- prediction: original DAI model prediction (optional)
- data: per-bin values (list)
  - bin: bin value
  - pd: partial dependence value
  - sd: standard deviation
  - residual-pd: residual partial dependence value (optional)
  - residual-sd: residual standard deviation value (optional)
  - ice: local explanation ~ ICE value (optional)
  - oor: out of range indicator (bool)
- data_histogram_numerical: optional histogram data for continuous features (list)
  - x: value for which the histogram is computed
  - frequency: number of occurrences within the bin
- data_histogram_categorical: optional histogram data for discrete features (list)
  - x: value for which the histogram is computed
  - frequency: number of occurrences of the value
For example:
{
    "data": [{
        "bin": "UNSEEN",
        "pd": 0.18315742909908295,
        "sd": 0.1297120749950409,
        "oor": true
    }, {
        "bin": "married",
        "pd": 0.18658745288848877,
        "sd": 0.13090574741363525,
        "oor": false
    }, {
    ...
    ...
"data_histogram_categorical": [{
"x": "divorce",
"histogram": 69
}, {
"x": "married",
"histogram": 2577
}, {
"x": "other",
"histogram": 3
}, {
"x": "single",
"histogram": 2720
}],
}
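As a worked illustration of how the per-bin records can be produced, the sketch below computes a naive one-feature partial dependence with a generic predict function (predict_fn, the bin grid, and partial_dependence_records are assumptions for this example; a real explainer would score via self.model.predict_method):
import numpy as np


def partial_dependence_records(predict_fn, X, feature, bins):
    """For each bin value, substitute it into the feature column for all rows,
    score the modified frame, and record the mean (pd) and standard deviation
    (sd) of the predictions (sketch)."""
    records = []
    for bin_value in bins:
        X_bin = X.copy()
        X_bin[feature] = bin_value  # one-value-at-a-time substitution
        preds = np.asarray(predict_fn(X_bin), dtype=float).flatten()
        records.append({
            "bin": bin_value,
            "pd": float(preds.mean()),
            "sd": float(preds.std()),
            "oor": False,  # set True for grid values outside the training range
        })
    return {"data": records}

# usage (df is a pandas DataFrame; predict_fn e.g. the model's predict method):
# pd_json = partial_dependence_records(predict_fn, df, "LIMIT_BAL", bins=[10000, 50000, 100000])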
Local explanation:
{
    "features": {
        "feature_1": {
            "order": 0,
            "feature_type": [
                "numeric"
            ],
            "files": {
                "class_A": "ice_feature_0_class_0.jay",
                "class_B": "ice_feature_0_class_1.jay",
                "class_C": "ice_feature_0_class_2.jay"
            }
        },
        "feature_2": {
            "order": 1,
            "feature_type": [
                "numeric"
            ],
            "files": {
                "class_A": "ice_feature_1_class_0.jay",
                "class_B": "ice_feature_1_class_1.jay",
                "class_C": "ice_feature_1_class_2.jay"
            }
        }
    },
    "metrics": [],
    "y_file": "y_hat.jay"
}
The same structure as in the case of the global explanation, except the optional fields:
- on_demand: indicates whether the local explanation should be dispatched as an On-demand Local Explanation (true) or as a Cached Local Explanation (false)
  - in case of on-demand dispatch, files and metrics don't have to be set
- on_demand_params: can contain any values which are needed in case of on-demand dispatch
  - synchronous_on_demand_exec: true in case of a synchronous On-demand Local Explanation, false in case of an asynchronous On-demand Local Explanation
  - ... any additional parameters
Template explainers:
- Markdown with Pandas images
- Markdown with Vega diagrams
- Markdown with feature importance summary chart
Example of the server-side filesystem directory structure:
.
├── global_report
│ ├── text_markdown
│ │ ├── explanation.md
│ │ └── image.png
│ └── ...
├── log
│ └── ...
└── work
└── ...
Index file - the Markdown report itself - explanation.md:
# Example Report
This is an example of **Markdown report** which can be created by explainer.
![image](./image.png)
Data file(s):
- the directory may also contain images (like image.png) or any other artifacts referenced from the report
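A minimal sketch of producing such a report directory by hand follows (matplotlib is used here only to create the referenced image; in a recipe, these files would be added as the text/markdown representation of the report explanation):
import os

import matplotlib
matplotlib.use("Agg")  # headless rendering
import matplotlib.pyplot as plt

base = "global_report/text_markdown"
os.makedirs(base, exist_ok=True)

# artifact referenced from the report
plt.plot([0, 1, 2], [1, 4, 9])
plt.title("Example chart")
plt.savefig(os.path.join(base, "image.png"))

# index file: the Markdown report itself
with open(os.path.join(base, "explanation.md"), "w") as f:
    f.write(
        "# Example Report\n\n"
        "This is an example of **Markdown report** which can be created by explainer.\n\n"
        "![image](./image.png)\n"
    )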
Local explanation is not supported.
Template explainer: Custom Explainers API
- explanations: GlobalDtExplanation, LocalDtExplanation
- representations/formats: GlobalDtJSonFormat, LocalDtJSonFormat
Example of the server-side filesystem directory structure:
.
├── global_decision_tree
│   ├── application_json
│   │   ├── dt_class_A.json
│   │   ├── dt_class_B.json
│   │   ├── dt_class_C.json
│   │   └── explanation.json
│   └── ...
├── log
│   ├── explainer_run_0f22e430-565d-11eb-b860-ac1f6b46eab4_anonymized.log
│   ├── explainer_run_0f22e430-565d-11eb-b860-ac1f6b46eab4.log
│   └── ...
└── work
    └── ...
Index file explanation.json:
{
    "files": {
        "class_A": "dt_class_A.json",
        "class_B": "dt_class_B.json",
        "class_C": "dt_class_C.json"
    },
    "metrics": [{
        "cvRmse": 0.029
    }, {
        "trainRmse": 3.1
    }, {
        "r2": 3.1
    }, {
        "klimeNfold": 3.1
    }]
}
files
dictionary key is class name, value is file name
Data file(s) per-class:
{
    data: [
        {
            key: str,
            name: str,
            parent: str,
            edge_in: str,
            edge_weight: num,
            leaf_path: bool,
        }
    ]
}
Contains tree data for a particular class:
- data: tree node values (list)
  - key: unique node ID
  - name: human-readable node name
  - parent: unique parent node ID
  - edge_in: human-readable edge name between the node and its parent node
  - edge_weight: edge weight value
  - leaf_path: flag for the selected row path
For example:
{
    "data": [{
        "key": "0",
        "name": "LIMIT_BAL",
        "parent": null,
        "edge_in": null,
        "edge_weight": null,
        "leaf_path": false
    }, {
        "key": "0.0",
        "name": "LIMIT_BAL",
        "parent": "0",
        "edge_in": "< 144868.000 , NA",
        "edge_weight": 0.517,
        "leaf_path": false
    }, {
...
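To illustrate the node format, the sketch below flattens a scikit-learn decision tree into the key/name/parent/edge_in records described above (scikit-learn and the synthetic data are assumptions for this example; a surrogate DT explainer may build its tree differently):
import json

from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

# fit a small surrogate-like tree on synthetic data
X, y = make_regression(n_samples=200, n_features=4, random_state=42)
feature_names = [f"f{i}" for i in range(X.shape[1])]
tree = DecisionTreeRegressor(max_depth=2, random_state=42).fit(X, y)

t = tree.tree_
nodes = []


def walk(node_id, key, parent, edge_in):
    # emit one record per node; child keys extend the parent key ("0" -> "0.0", "0.1")
    is_leaf = t.children_left[node_id] == -1
    nodes.append({
        "key": key,
        "name": "leaf" if is_leaf else feature_names[t.feature[node_id]],
        "parent": parent,
        "edge_in": edge_in,
        # fraction of training rows reaching this node
        "edge_weight": None if parent is None
        else round(float(t.n_node_samples[node_id]) / float(t.n_node_samples[0]), 3),
        "leaf_path": False,
    })
    if not is_leaf:
        threshold = float(t.threshold[node_id])
        walk(t.children_left[node_id], f"{key}.0", key, f"< {threshold:.3f}")
        walk(t.children_right[node_id], f"{key}.1", key, f">= {threshold:.3f}")


walk(0, "0", None, None)
print(json.dumps({"data": nodes}, indent=2))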
Local explanation:
{
    "files": {
        "class_A": "dt_class_0.json",
        "class_B": "dt_class_1.json",
        "class_C": "dt_class_2.json"
    },
    "metrics": [],
    "documentation": "Template DecisionTree explainer...",
    "on_demand": true,
    "on_demand_params": {
        "synchronous_on_demand_exec": true
    }
}
The same structure as in the case of the global explanation, except the optional fields:
- on_demand: indicates whether the local explanation should be dispatched as an On-demand Local Explanation (true) or as a Cached Local Explanation (false)
  - in case of on-demand dispatch, files and metrics don't have to be set
- on_demand_params: can contain any values which are needed in case of on-demand dispatch
  - synchronous_on_demand_exec: true in case of a synchronous On-demand Local Explanation, false in case of an asynchronous On-demand Local Explanation
  - ... any additional parameters
Since: Driverless AI 1.9.2
Template explainer: Custom Explainers API
- explanations: GlobalSummaryFeatImpExplanation, LocalSummaryFeatImpExplanation
- representations/formats: GlobalSummaryFeatImpJSonFormat, GlobalSummaryFeatImpJsonDatatableFormat, LocalSummaryFeatImpJSonFormat
Example of the server-side filesystem directory structure:
.
├── global_summary_feature_importance
│ ├── application_json
│ │ ├── explanation.json
│ │ ├── summary_feature_importance_class_0_offset_0.json
│ │ ├── ...
│ │ └── summary_feature_importance_class_6_offset_1.json
│ ├── application_vnd_h2oai_json_datatable_jay
│ │ ├── explanation.json
│ │ ├── summary_feature_importance_class_0.jay
│ │ ├── ...
│ │ └── summary_feature_importance_class_6.jay
│ └── ...
├── log
│ └── ...
└── work
└── ...
Index file explanation.json
:
{
    "documentation": "Summary Shapley feature importance explainer...",
    "total_rows": 12,
    "rows_per_page": 10,
    "files": {
        "asics_ds_trainer": {
            "0": "summary_feature_importance_class_0_offset_0.json",
            "10": "summary_feature_importance_class_0_offset_1.json"
        },
        ...
        "specialized_29er": {
            "0": "summary_feature_importance_class_6_offset_0.json",
            "10": "summary_feature_importance_class_6_offset_1.json"
        }
    },
    "metrics": [
        {
            "Global bias": -4.180208683013916
        }
    ]
}
- files: dictionary where the key is the class name and the value is a page dictionary
  - the page dictionary key is the page offset, and the value is the file name
- metrics: list of key/value pairs to be rendered atop the chart
- total_rows: integer value used for paging
- rows_per_page: integer with the number of features per page
- documentation: shown as help on clicking ?
Data file(s) per-class:
{
    data: [
        {
            feature: str,
            shapley_value: num,
            count: num,
            avg_high_value: num,
            scope: 'global' | 'local',
        }
    ]
}
Contains Shapley values data for a particular class:
- data: Shapley values (list)
  - feature: the name of the feature (y-axis value)
  - shapley_value: x-axis value
  - count: the number of rows in the x-axis (Shapley values) bin (given by the x-axis chart resolution)
  - avg_high_value: the average of the feature values of the rows within the bin (where binning is driven by the Shapley values of the rows)
  - scope: global for global or local for local explanations
For example:
{
    "data": [
        {
            "feature": "avg_speed",
            "shapley_value": -0.9384549282109822,
            "count": 0,
            "avg_high_value": 0.0
        },
        {
            "feature": "avg_speed",
            "shapley_value": -0.9318626367885361,
            "count": 0,
            "avg_high_value": 0.0
        },
...
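A sketch of how such records can be binned from raw per-row Shapley contributions follows (summary_records is a hypothetical helper; the binning resolution and field semantics follow the description above):
import numpy as np


def summary_records(feature, shapley_values, feature_values, resolution=20):
    """Bin per-row Shapley values of one feature into `resolution` bins and emit
    one record per bin with the row count and the average feature value (sketch)."""
    shapley_values = np.asarray(shapley_values, dtype=float)
    feature_values = np.asarray(feature_values, dtype=float)
    edges = np.linspace(shapley_values.min(), shapley_values.max(), resolution + 1)
    records = []
    for i in range(resolution):
        # rows whose Shapley value falls into the i-th bin (last bin right-closed)
        upper_ok = (
            shapley_values <= edges[i + 1]
            if i == resolution - 1
            else shapley_values < edges[i + 1]
        )
        in_bin = (shapley_values >= edges[i]) & upper_ok
        count = int(in_bin.sum())
        records.append({
            "feature": feature,
            "shapley_value": float((edges[i] + edges[i + 1]) / 2),  # bin center
            "count": count,
            "avg_high_value": float(feature_values[in_bin].mean()) if count else 0.0,
        })
    return records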
Local explanation:
{
    "documentation": "Summary feature importance explainer...",
    "files": {
        "class_A": {
            "0": "dt_class_0_offset_0.json"
        },
        ...
        "class_C": {
            "0": "dt_class_2_offset_0.json"
        }
    },
    "metrics": [{"Bias": 0.65}],
    "on_demand": true,
    "on_demand_params": {
        "synchronous_on_demand_exec": false,
        "features_per_page": {
            "class_A": {
                "0": [
                    "BILL_AMT1",
                    "LIMIT_BAL",
                    "PAY_0",
                    "PAY_2",
                    "PAY_3",
                    "PAY_4",
                    "PAY_5",
                    "PAY_6",
                    "PAY_AMT1",
                    "PAY_AMT4"
                ],
                ...
                "20": [
                    "EDUCATION"
                ]
            },
            ...
        }
    }
}
- files: same as global explanation
- metrics: same as global explanation
- documentation: same as global explanation
- on_demand: indicates whether the local explanation should be dispatched as an On-demand Local Explanation (true) or as a Cached Local Explanation (false)
  - in case of on-demand dispatch, files and metrics don't have to be set
- on_demand_params: can contain any values which are needed in case of on-demand dispatch
  - synchronous_on_demand_exec: true in case of a synchronous On-demand Local Explanation, false in case of an asynchronous On-demand Local Explanation
  - ... any additional parameters
Local explanations are expected to return rows (features) only for one page.
Template explainer: Custom Explainers API
- explanations: GlobalScatterPlotExplanation
- representations/formats: GlobalScatterPlotJSonFormat
Example of the server-side filesystem directory structure:
.
├── global_scatter_plot
│ ├── application_json
│ │ ├── explanation.json
│ │ ├── scatter_class_A.json
│ │ ├── scatter_class_B.json
│ │ └── scatter_class_C.json
│ └── ...
├── log
│ └── ...
└── work
└── ...
Index file explanation.json:
{
    "files": {
        "class_A": "scatter_class_A.json",
        "class_B": "scatter_class_B.json",
        "class_C": "scatter_class_C.json"
    },
    "metrics": [{
        "bias": 0.23
    }, {
        "clusterName": "cluster 1"
    }, {
        "R2": 0.56
    }, {
        "RMSE": 0.289
    }],
    "documentation": "Scatter plot explainer..."
}
files
dictionary key is class name, value is file name
Data file(s) per-class/cluster:
{
    data: [
        {
            rowId: num,
            responseVariable: num,
            limePred: num,
            modelPred: num,
            actual: num,
            reasonCode: [
                {
                    label: str,
                    value: num,
                }
            ]
        }
    ],
    bias: str
}
Contains scatter plot data per class (per cluster):
- data: values (list)
  - rowId: unique row ID
  - responseVariable: response variable value
  - limePred: LIME prediction value
  - modelPred: model prediction value
  - actual: actual value
  - reasonCode: reason codes (list)
    - label: the feature's name
    - value: the feature's value
- bias: bias value
For example:
{
    "bias": 0.15,
    "data": [{
        "rowId": 1,
        "responseVariable": 25,
        "limePred": 20,
        "modelPred": 30,
        "actual": 40
    }, {
        "rowId": 2,
        "responseVariable": 33,
        "limePred": 15,
        "modelPred": 35,
        "actual": 25
    }, {
...
Local explanation is not supported.
Performance best practices:
- Use fast and efficient data manipulation tools like datatable, sklearn, numpy or pandas instead of Python lists, for-loops etc.
- Use disk sparingly; delete temporary files as soon as possible.
- Use memory sparingly; delete objects when no longer needed.
Safety best practices:
- Driverless AI automatically performs basic acceptance tests for all custom recipes unless disabled.
- More information in the FAQ.
Security best practices:
- Recipes are meant to be built by people you trust, and each recipe should be code-reviewed before going to production.
- Assume that a user with access to Driverless AI has access to the data inside that instance.
  - Apart from securing access to the instance via private networks, various methods of authentication are possible. Local authentication provides the most control over which users have access to Driverless AI.
  - Unless the config.toml setting enable_dataset_downloading=false is set, an authenticated user can download all imported datasets as .csv via direct APIs.
- When recipes are enabled (enable_custom_recipes=true, the default), be aware that:
  - The code for the recipes runs as the same native Linux user that runs the Driverless AI application.
  - Recipes have explicit access to all data passing through the transformer/model/scorer API.
  - Recipes have implicit access to system resources such as disk, memory, CPUs, GPUs, network, etc.
  - An H2O-3 Java process is started in the background, for use by all recipes using H2O-3. Anyone with access to the Driverless AI instance can browse the file system, see models and data through the H2O-3 interface.
- Best ways to control access to Driverless AI and custom recipes:
  - Control access to the Driverless AI instance.
  - Use local authentication to specify exactly which users are allowed to access Driverless AI.
  - Run Driverless AI in a Docker container, as a certain user, with only certain ports exposed, and only certain mount points mapped.
  - To disable all recipes: set enable_custom_recipes=false in the config.toml, or add the environment variable DRIVERLESS_AI_ENABLE_CUSTOM_RECIPES=0 at startup of Driverless AI. This will disable all custom transformers, models and scorers.
  - To disable new recipes: to keep all previously uploaded recipes enabled and disable the upload of any new recipes, set enable_custom_recipes_upload=false or DRIVERLESS_AI_ENABLE_CUSTOM_RECIPES_UPLOAD=0 at startup of Driverless AI.
There is no Driverless AI BYOR recipe versioning API.
To deploy a new recipe version (while keeping the previous one), change:
- the recipe class name
- the recipe display name (optional)
For example, from:
class ExampleVersionExplainer_v2_0(CustomExplainer):
    _display_name = "Explainer v2.0"
    ...
... to:
class ExampleVersionExplainer_v2_1(CustomExplainer):
    _display_name = "Explainer v2.1"
    ...
This will create a new explainer with a new name, and both explainer versions may coexist.
Examples of simple explainers which demonstrate explainer features follow.
Deploy and run the example explainers as described in the Deploy and Run sections or in the Hello world! example.
The Hello, World! explainer is an example of the simplest explainer.
from h2oaicore.mli.oss.byor.core.explainers import (
    CustomExplainer,
)
from h2oaicore.mli.oss.byor.core.explanations import WorkDirArchiveExplanation


class ExampleHelloWorldExplainer(CustomExplainer):
    _display_name = "Hello, World!"
    _description = "This is 'Hello, World!' explainer example."
    _regression = True
    _explanation_types = [WorkDirArchiveExplanation]

    def __init__(self):
        CustomExplainer.__init__(self)

    def explain(self, X, y=None, explanations_types=None, **kwargs) -> list:
        explanation = self.create_explanation_workdir_archive(
            display_name=self.display_name, display_category="Demo"
        )
        return [explanation]
To try the Hello world example:
- store the source code of the explainer to a hello_world_explainer.py file
- upload the explainer file as described in the Deploy section
- run the explainer as described in the Run section:
  - choose a regression model (note that the explainer declares that it explains regression models with _regression = True)
  - choose the Hello, World! explainer only in the selected recipes listing (uncheck all others)
  - click LAUNCH MLI
- once the explainer run finishes, you can get its result - a zip archive - as described in the Get section
  - note that the display_category parameter is used to name the tab in the UI
  - note that the display_name parameter is used to name the tile in the UI
The archive representation created by the explainer contains its working directory content (which is empty in this case). This is the simplest way to get any explainer artifacts, computation results, or representations created by the explainer when their visualization is not needed.
from h2oaicore.mli.oss.byor.core.explainers import (
    CustomExplainer,
)
from h2oaicore.mli.oss.byor.core.explanations import WorkDirArchiveExplanation


class ExampleLoggingExplainer(CustomExplainer):
    _display_name = "Example Logging Explainer"
    _description = "This is logging explainer example."
    _regression = True
    _explanation_types = [WorkDirArchiveExplanation]

    def __init__(self):
        CustomExplainer.__init__(self)

    def setup(self, model, persistence, **kwargs):
        CustomExplainer.setup(self, model, persistence, **kwargs)
        self.logger.info(f"{self.display_name} explainer initialized")

    def explain(self, X, y=None, explanations_types=None, **kwargs) -> list:
        self.logger.debug(f"explain() method invoked with args: {kwargs}")
        if not explanations_types:
            self.logger.warning(
                f"Explanation types to be returned by {self.display_name} not specified"
            )
        try:
            return [
                self.create_explanation_workdir_archive(
                    display_name=self.display_name, display_category="Demo"
                )
            ]
        except Exception as ex:
            self.logger.error(
                f"Explainer '{ExampleLoggingExplainer.__name__}' failed with: {ex}"
            )
            raise ex
from h2oaicore.mli.oss.byor.core.explainers import (
    CustomExplainer,
)
from h2oaicore.mli.oss.byor.core.explanations import WorkDirArchiveExplanation


class ExampleEdaExplainer(CustomExplainer):
    _display_name = "Example Dataset Explainer"
    _description = "This is Exploratory Data Analysis explainer example."
    _regression = True
    _explanation_types = [WorkDirArchiveExplanation]

    def __init__(self):
        CustomExplainer.__init__(self)

    def setup(self, model, persistence, **kwargs):
        CustomExplainer.setup(self, model, persistence, **kwargs)

    def explain(self, X, y=None, explanations_types=None, **kwargs) -> list:
        self.logger.debug("explain() method invoked with dataset:")
        self.logger.debug(f"  type: {type(X)}")
        self.logger.debug(f"  shape: {X.shape}")
        self.logger.debug(f"  columns: {X.names}")
        self.logger.debug(f"  types: {X.stypes}")
        self.logger.debug(f"  unique: {X.nunique()}")
        self.logger.debug(f"  max: {X.max()}")
        self.logger.debug(f"  min: {X.min()}")
        return [
            self.create_explanation_workdir_archive(
                display_name=self.display_name, display_category="Demo"
            )
        ]
from h2oaicore.mli.oss.byor.core.explainers import (
    CustomDaiExplainer,
    CustomExplainer,
)
from h2oaicore.mli.oss.byor.core.explanations import WorkDirArchiveExplanation


class ExampleScoreExplainer(CustomExplainer, CustomDaiExplainer):
    _display_name = "Example Score Explainer"
    _description = (
        "This is explainer example which demonstrates how to get model predict "
        "method and use it to score dataset."
    )
    _regression = True
    _explanation_types = [WorkDirArchiveExplanation]

    def __init__(self):
        CustomExplainer.__init__(self)
        CustomDaiExplainer.__init__(self)

    def setup(self, model, persistence, **e_params):
        CustomExplainer.setup(self, model, persistence, **e_params)
        CustomDaiExplainer.setup(self, **e_params)

    def explain(self, X, y=None, explanations_types=None, **kwargs) -> list:
        # prepare 1st row of the dataset with features used by the model
        df = X[:1, self.used_features]
        self.logger.info(f"Dataset to score: {df}")
        # model predict method
        prediction = self.model.predict_method(df)
        self.logger.info(f"Prediction: {prediction}")
        return [
            self.create_explanation_workdir_archive(
                display_name=self.display_name, display_category="Demo"
            )
        ]
from h2oaicore.mli.oss.byor.core.explainers import (
    CustomDaiExplainer,
    CustomExplainer,
    CustomExplainerParam,
)
from h2oaicore.mli.oss.byor.core.explanations import WorkDirArchiveExplanation
from h2oaicore.mli.oss.byor.explainer_utils import CustomExplainerArgs
from h2oaicore.mli.oss.commons import ExplainerParamType


class ExampleParamsExplainer(CustomExplainer, CustomDaiExplainer):
    PARAM_ROWS_TO_SCORE = "rows_to_score"

    _display_name = "Example Params Explainer"
    _description = (
        "This explainer example shows how to define explainer parameters."
    )
    _regression = True
    _parameters = [
        CustomExplainerParam(
            param_name=PARAM_ROWS_TO_SCORE,
            description="The number of dataset rows to be scored by explainer.",
            param_type=ExplainerParamType.int,
            default_value=1,
            src=CustomExplainerParam.SRC_EXPLAINER_PARAMS,
        ),
    ]
    _explanation_types = [WorkDirArchiveExplanation]

    def __init__(self):
        CustomExplainer.__init__(self)
        CustomDaiExplainer.__init__(self)
        self.args = None

    def setup(self, model, persistence, **e_params):
        CustomExplainer.setup(self, model, persistence, **e_params)
        CustomDaiExplainer.setup(self, **e_params)
        # resolve explainer parameters to instance attributes
        self.args = CustomExplainerArgs(ExampleParamsExplainer._parameters)
        self.args.resolve_params(
            explainer_params=CustomExplainerArgs.json_str_to_dict(
                self.explainer_params_as_str
            )
        )

    def explain(self, X, y=None, explanations_types=None, **kwargs) -> list:
        # use parameter
        rows = self.args.get(self.PARAM_ROWS_TO_SCORE)
        df = X[:rows, self.used_features]
        prediction = self.model.predict_method(df)
        self.logger.info(
            f"Predictions of dataset with shape {df.shape}: {prediction}"
        )
        return [
            self.create_explanation_workdir_archive(
                display_name=self.display_name, display_category="Demo"
            )
        ]
from typing import Optional

from h2oaicore.messages import CommonExplainerParameters
from h2oaicore.mli.oss.byor.core.explainers import (
    CustomDaiExplainer,
    CustomExplainer,
)
from h2oaicore.mli.oss.byor.core.explanations import WorkDirArchiveExplanation


class ExampleCompatibilityCheckExplainer(CustomExplainer, CustomDaiExplainer):
    _display_name = "Example Compatibility Check Explainer"
    _description = "This is explainer with compatibility check example."
    _regression = True
    _explanation_types = [WorkDirArchiveExplanation]

    def __init__(self):
        CustomExplainer.__init__(self)
        CustomDaiExplainer.__init__(self)

    def check_compatibility(
        self,
        params: Optional[CommonExplainerParameters] = None,
        **explainer_params,
    ) -> bool:
        CustomExplainer.check_compatibility(self, params, **explainer_params)
        CustomDaiExplainer.check_compatibility(self, params, **explainer_params)
        # explainer can explain only datasets with less than 1M rows (without sampling)
        if self.dataset_entity.row_count > 1_000_000:
            # not supported
            return False
        return True

    def explain(self, X, y=None, explanations_types=None, **kwargs) -> list:
        return [
            self.create_explanation_workdir_archive(
                display_name=self.display_name, display_category="Demo"
            )
        ]
from h2oaicore.mli.oss.byor.core.explainers import (
    CustomExplainer,
)
from h2oaicore.mli.oss.byor.core.explanations import WorkDirArchiveExplanation


class ExamplePersistenceExplainer(CustomExplainer):
    _display_name = "Example Persistence Explainer"
    _description = (
        "This is explainer example which demonstrates how to use persistence object "
        "in order to access explainer file system (sandbox) - working, explanations "
        "and MLI directories."
    )
    _regression = True
    _explanation_types = [WorkDirArchiveExplanation]

    def __init__(self):
        CustomExplainer.__init__(self)

    def setup(self, model, persistence, **kwargs):
        CustomExplainer.setup(self, model, persistence, **kwargs)

    def explain(self, X, y=None, explanations_types=None, **kwargs) -> list:
        # use self.persistence object to get file system paths
        self.logger.info(f"Explainer MLI dir: {self.persistence.base_dir}")
        self.logger.info(
            f"Explainer dir: {self.persistence.get_explainer_dir()}"
        )
        # save 1st row of dataset to work directory and prepare work directory archive
        df_head = X[:1, :]
        df_head.to_csv(
            self.persistence.get_explainer_working_file("dataset_head.csv")
        )
        return [
            self.create_explanation_workdir_archive(
                display_name=self.display_name, display_category="Demo"
            )
        ]
from h2oaicore.mli.oss.byor.core.explainers import (
    CustomDaiExplainer,
    CustomExplainer,
)
from h2oaicore.mli.oss.byor.core.explanations import CustomExplanation
from h2oaicore.mli.oss.byor.core.representations import (
    TextCustomExplanationFormat,
)


class MyCustomExplanation(CustomExplanation):
    """Example of a user defined explanation type."""

    _explanation_type = "user-guide-explanation-example"
    _is_global = True

    def __init__(
        self, explainer, display_name: str = None, display_category: str = None
    ) -> None:
        CustomExplanation.__init__(
            self,
            explainer=explainer,
            display_name=display_name,
            display_category=display_category,
        )

    def validate(self) -> bool:
        return self._formats is not None


class ExampleCustomExplanationExplainer(CustomExplainer, CustomDaiExplainer):
    _display_name = "Example Custom Explanation Explainer"
    _description = (
        "Explainer example which shows how to define custom explanation."
    )
    _regression = True
    _explanation_types = [MyCustomExplanation]

    def __init__(self):
        CustomExplainer.__init__(self)
        CustomDaiExplainer.__init__(self)

    def setup(self, model, persistence, **e_params):
        CustomExplainer.setup(self, model, persistence, **e_params)
        CustomDaiExplainer.setup(self, **e_params)

    def explain(self, X, y=None, explanations_types=None, **kwargs) -> list:
        df = X[:1, self.used_features]
        prediction = self.model.predict_method(df)
        # create CUSTOM explanation
        explanation = MyCustomExplanation(
            explainer=self,
            display_name="Custom Explanation Example",
            display_category="Example",
        )
        # add a text format to CUSTOM explanation
        explanation.add_format(
            TextCustomExplanationFormat(
                explanation=explanation,
                format_data=f"Prediction is: {prediction}",
                format_file=None,
            )
        )
        return [explanation]
The ExampleCustomExplanationExplainer explainer demonstrates how to create a new explanation type.
To try the custom explanation explainer:
- store the source code of the explainer to a file
- upload the explainer file as described in the Deploy section
- run the explainer as described in the Run section:
  - choose a regression model
  - choose the Example Custom Explanation Explainer explainer only in the selected recipes listing (uncheck all others)
  - click LAUNCH MLI
- once the explainer run finishes, you can get its result as follows:
  - click the 0 RUNNING | 0 FAILED | 1 DONE button
  - hover over the Example Custom Explanation Explainer row and click the SNAPSHOT button to download the explainer data snapshot
The content of the snapshot archive is shown below - note how the paths and names are created based on the explanation and format classes:
explainer_..._ExampleCustomExplanationExplainer_<UUID>
.
├── global_user_guide_explanation_example
│ ├── text_plain
│ │ └── explanation.txt
│ └── text_plain.meta
├── log
│ ├── explainer_run_1f16a4ce-5a62-11eb-979d-e86a64888647_anonymized.log
│ ├── explainer_run_1f16a4ce-5a62-11eb-979d-e86a64888647.log
│ └── logger.lock
├── result_descriptor.json
└── work
└── EXPLAINER_DONE
from h2oaicore.mli.oss.byor.core.explainers import (
    CustomDaiExplainer,
    CustomExplainer,
)
from h2oaicore.mli.oss.byor.core.explanations import WorkDirArchiveExplanation


class ExampleMetaAndAttrsExplainer(CustomExplainer, CustomDaiExplainer):
    _display_name = "Example DAI Explainer Metadata and Attributes"
    _description = (
        "This explainer example prints explainer metadata, instance attributes and "
        "setup() method parameters."
    )
    _regression = True
    _explanation_types = [WorkDirArchiveExplanation]

    def __init__(self):
        CustomExplainer.__init__(self)
        CustomDaiExplainer.__init__(self)

    def setup(self, model, persistence, **e_params):
        CustomExplainer.setup(self, model, persistence, **e_params)
        CustomDaiExplainer.setup(self, **e_params)
        self.logger.info("setup() method parameters:")
        self.logger.info(f"  {e_params}")
        self.logger.info("explainer metadata:")
        self.logger.info(f"  display name: {self._display_name}")
        self.logger.info(f"  description: {self._description}")
        self.logger.info(f"  keywords: {self._keywords}")
        self.logger.info(f"  IID: {self._iid}")
        self.logger.info(f"  TS: {self._time_series}")
        self.logger.info(f"  image: {self._image}")
        self.logger.info(f"  regression: {self._regression}")
        self.logger.info(f"  binomial: {self._binary}")
        self.logger.info(f"  multinomial: {self._multiclass}")
        self.logger.info(f"  global: {self._global_explanation}")
        self.logger.info(f"  local: {self._local_explanation}")
        self.logger.info(f"  explanation types: {self._explanation_types}")
        self.logger.info(
            f"  optional e. types: {self._optional_explanation_types}"
        )
        self.logger.info(f"  parameters: {self._parameters}")
        self.logger.info(f"  not standalone: {self._requires_predict_method}")
        self.logger.info(f"  Python deps: {self._modules_needed_by_name}")
        self.logger.info(f"  explainer deps: {self._depends_on}")
        self.logger.info(f"  priority: {self._priority}")
        self.logger.info("explainer instance attributes:")
        self.logger.info(f"  explainer params: {self.explainer_params}")
        self.logger.info(f"  common params: {self.params}")
        self.logger.info(f"  DAI params: {self.dai_params}")
        self.logger.info(f"  explainer deps: {self.explainer_deps}")
        self.logger.info(f"  model with predict method: {self.model}")
        self.logger.info(f"  features used by model: {self.used_features}")
        self.logger.info(f"  target labels: {self.labels}")
        self.logger.info(f"  number of target labels: {self.num_labels}")
        self.logger.info(f"  persistence: {self.persistence}")
        self.logger.info(f"  MLI key: {self.mli_key}")
        self.logger.info(f"  DAI username: {self.dai_username}")
        self.logger.info(f"  model entity: {self.model_entity}")
        self.logger.info(f"  dataset entity: {self.dataset_entity}")
        self.logger.info(
            f"  validation dataset entity: {self.validset_entity}"
        )
        self.logger.info(f"  test dataset entity: {self.testset_entity}")
        self.logger.info(f"  sanitization map: {self.sanitization_map}")
        self.logger.info(f"  enable MOJO: {self.enable_mojo}")
        self.logger.info(f"  Driverless AI configuration: {self.config}")

    def explain(self, X, y=None, explanations_types=None, **kwargs) -> list:
        return [
            self.create_explanation_workdir_archive(
                display_name=self.display_name, display_category="Demo"
            )
        ]
Putting the MLI BYOR examples together: the Morris Sensitivity Analysis explainer demonstrates how to use a 3rd party library as an MLI BYOR recipe to explain Driverless AI models.
from functools import partial

import datatable as dt
import numpy as np
import pandas as pd
from h2oaicore.mli.oss.byor.core.explainers import (
    CustomDaiExplainer,
    CustomExplainer,
)
from h2oaicore.mli.oss.byor.core.explanations import GlobalFeatImpExplanation
from h2oaicore.mli.oss.byor.core.representations import (
    GlobalFeatImpJSonDatatableFormat,
    GlobalFeatImpJSonFormat,
)
from h2oaicore.mli.oss.byor.explainer_utils import clean_dataset


# Explainer MUST extend abstract CustomExplainer class to be discovered and
# deployed. In addition it inherits common metadata and (default) functionality. The
# explainer must implement fit() and explain() methods.
#
# Explainer CAN extend CustomDaiExplainer class if it will run on Driverless AI server
# and use experiments. CustomDaiExplainer class provides easy access/handle to the
# dataset and model (metadata and artifacts), filesystem, ... and common logic.
class MorrisSensitivityLeExplainer(CustomExplainer, CustomDaiExplainer):
    """InterpretML: Morris sensitivity (https://github.com/interpretml/interpret)"""

    # explainer display name (used e.g. in UI explainer listing)
    _display_name = "Morris Sensitivity Analysis"
    # explainer description (used e.g. in UI explanations help)
    _description = (
        "Morris sensitivity analysis explainer provides Morris SA based feature "
        "importance which is a measure of the contribution of an input variable "
        "to the overall predictions of the Driverless AI model. In applied "
        "statistics, the Morris method for global sensitivity analysis is a so-called "
        "one-step-at-a-time method (OAT), meaning that in each run only one "
        "input parameter is given a new value. "
        "This Morris sensitivity analysis explainer is based on the InterpretML "
        "library (http://interpret.ml)."
    )
    # declaration of supported experiments: regression / binary / multiclass
    _regression = True
    _binary = True
    # declaration of provided explanations: global, local or both
    _global_explanation = True
    # declaration of explanation types this explainer creates e.g. feature importance
    _explanation_types = [GlobalFeatImpExplanation]
    # Python package dependencies (can be installed using pip)
    _modules_needed_by_name = ["gevent==1.5.0", "interpret==0.1.20"]

    # explainer constructor must not have any required parameters
    def __init__(self):
        CustomExplainer.__init__(self)
        CustomDaiExplainer.__init__(self)
        self.cat_variables = None
        self.mcle = None

    # setup() method is used to initialize the explainer based on provided parameters
    # which are passed from client/UI. See parent classes setup() methods docstrings
    # and source to check the list of instance fields which are initialized for the
    # explainer
    def setup(self, model, persistence, key=None, params=None, **e_params):
        CustomExplainer.setup(self, model, persistence, key, params, **e_params)
        CustomDaiExplainer.setup(self, **e_params)

    # abstract fit() method must be implemented - its purpose is to pre-compute
    # any artifacts e.g. surrogate models, to be used by explain() method
    def fit(self, X: dt.Frame, y: dt.Frame = None, **kwargs):
        # nothing to pre-compute
        return self

    # explain() method is responsible for the creation of the explanations
    def explain(
        self, X, y=None, explanations_types: list = None, **kwargs
    ) -> list:
        # 3rd party Morris SA library import
        from interpret.blackbox import MorrisSensitivity

        # DATASET: categorical features encoding (for 3rd party libraries which
        # support numeric features only), rows w/ missing values filtering, ...
        X = X[:, self.used_features] if self.used_features else X
        x, self.cat_variables, self.mcle, _ = clean_dataset(
            frame=X.to_pandas(),
            le_map_file=self.persistence.get_explainer_working_file("mcle"),
            logger=self.logger,
        )

        # PREDICT FUNCTION: Driverless AI scorer -> library compliant predict function
        def predict_function(
            pred_fn, col_names, cat_variables, label_encoder, X
        ):
            X = pd.DataFrame(X.tolist(), columns=col_names)
            # categorical features inverse label encoding used in case of 3rd party
            # libraries which support numeric only
            if label_encoder:
                X[cat_variables] = X[cat_variables].astype(np.int64)
                label_encoder.inverse_transform(X)
            # score
            preds = pred_fn(X)
            # scoring output conversion to the format expected by 3rd party library
            if isinstance(preds, pd.core.frame.DataFrame):
                preds = preds.to_numpy()
            if preds.ndim == 2:
                preds = preds.flatten()
            return preds

        predict_fn = partial(
            predict_function,
            self.model.predict_method,
            self.used_features,
            self.cat_variables,
            self.mcle,
        )

        # CALCULATION of the Morris SA explanation
        sensitivity: MorrisSensitivity = MorrisSensitivity(
            predict_fn=predict_fn, data=x, feature_names=list(x.columns)
        )
        morris_explanation = sensitivity.explain_global(name=self.display_name)

        # NORMALIZATION of proprietary Morris SA library data to explanation w/
        # Grammar of MLI format for the visualization in Driverless AI UI
        explanations = [self._normalize_to_gom(morris_explanation)]

        # explainer MUST return declared explanation(s) (_explanation_types)
        return explanations

    #
    # optional NORMALIZATION to Grammar of MLI
    #
    """
    explainer_morris_sensitivity_explainer_..._MorrisSensitivityExplainer_<UUID>
    ├── global_feature_importance
    │   ├── application_json
    │   │   ├── explanation.json
    │   │   └── feature_importance_class_0.json
    │   └── application_vnd_h2oai_json_datatable_jay
    │       ├── explanation.json
    │       └── feature_importance_class_0.jay
    ├── log
    │   ├── explainer_job.log
    │   └── logger.lock
    └── work
    """

    # Normalization of the data to the Grammar of MLI defined format. Normalized data
    # can be visualized using Grammar of MLI UI components in Driverless AI web UI.
    #
    # This method creates explanation (data) and its representations (JSon, datatable)
    def _normalize_to_gom(self, morris_explanation) -> GlobalFeatImpExplanation:
        # EXPLANATION
        explanation = GlobalFeatImpExplanation(
            explainer=self,
            # display name of explanation's tile in UI
            display_name=self.display_name,
            # tab name where to put explanation's tile in UI
            display_category=GlobalFeatImpExplanation.DISPLAY_CAT_CUSTOM,
        )

        # FORMAT: explanation representation as JSon+datatable (JSon index file which
        # references datatable frame for each class)
        jdf = GlobalFeatImpJSonDatatableFormat
        # data normalization: 3rd party frame to Grammar of MLI defined frame
        # conversion - see GlobalFeatImpJSonDatatableFormat docstring for format
        # documentation and source for helpers to create the representation easily
        explanation_frame = dt.Frame(
            {
                jdf.COL_NAME: morris_explanation.data()["names"],
                jdf.COL_IMPORTANCE: list(morris_explanation.data()["scores"]),
                jdf.COL_GLOBAL_SCOPE: [True]
                * len(morris_explanation.data()["scores"]),
            }
        ).sort(-dt.f[jdf.COL_IMPORTANCE])
        # index file (of per-class data files)
        (
            idx_dict,
            idx_str,
        ) = GlobalFeatImpJSonDatatableFormat.serialize_index_file(
            ["global"],
            doc=MorrisSensitivityLeExplainer._description,
        )
        json_dt_format = GlobalFeatImpJSonDatatableFormat(explanation, idx_str)
        json_dt_format.update_index_file(
            idx_dict, total_rows=explanation_frame.shape[0]
        )
        # data file
        json_dt_format.add_data_frame(
            format_data=explanation_frame,
            file_name=idx_dict[jdf.KEY_FILES]["global"],
        )
        # JSon+datatable format can be added as explanation's representation
        explanation.add_format(json_dt_format)

        # FORMAT: explanation representation as JSon
        #
        # Having JSon+datatable formats it's easy to get other formats like CSV,
        # datatable, ZIP, ... using helpers - adding JSon representation:
        explanation.add_format(
            explanation_format=GlobalFeatImpJSonFormat.from_json_datatable(
                json_dt_format
            )
        )
        return explanation
See https://github.com/h2oai/driverlessai-recipes/tree/master/explainers for more Driverless AI explainer recipe examples.
If you want to create a new explainer, you can use the templates which were prepared for every Grammar of MLI explanation type:
- decision tree
- feature importance
- Markdown report with...
- PD/ICE
- scatter plot
- ...
Check the templates/ folder in the Driverless AI recipes GitHub repository and download the source code from there.
[...Explainer recipe Python API (apidoc)...]
The Python Client API Jupyter Notebook contains an end-to-end scenario which demonstrates how to use the Driverless AI RPC API client to upload, filter, run, debug and get explainer results.
Driverless AI provides an RPC API which can be accessed using the generated Python Client. This section is an RPC API procedure reference.
ExplanationDescriptor
    explanation_type str
    name str
    category str
    scope str
    has_local str
    formats str[]

ExplainerDescriptor
    id str
    name str
    model_types str[]
    can_explain str[]
    explanation_scopes str[]
    explanations ExplanationDescriptor[]
    parameters ConfigItem[]
    keywords str[]

ExplainerRunJob
    progress float
    status int
    error str
    message str
    entity ExplainerDescriptor
    created float
    duration int
    child_explainers_job_keys str[]

ExplainersRunJob
    explainer_job_keys str[]
    mli_key str
    created float
    duration int
    status int
    progress float

CommonExplainerParameters
    target_col str
    weight_col str
    prediction_col str    # no model explanation
    drop_cols str[]
    sample_num_rows int   # >0 to sample, -1 to skip sampling

CommonDaiExplainerParameters
    common_params CommonExplainerParameters
    model ModelReference
    dataset DatasetReference
    validset DatasetReference
    testset DatasetReference
    use_raw_features bool
    config_overrides str
    sequential_execution bool
    debug_model_errors bool
    debug_model_errors_class str

Explainer
    explainer_id str      # explainer ID
    explainer_params str  # declared explainer parameters as JSon string

# complete explainers runs descriptor (RPC API)
ExplainersRunSummary
    common_params CommonExplainerParameters
    explainers Explainer[]
    explainer_run_jobs ExplainerRunJob[]

ExplainerJobStatus
    mli_key str
    explainer_job_key str
    explainer_job ExplainerRunJob

create_custom_recipe_from_url str
    url str

upload_custom_recipe_sync CustomRecipe
    file_path str

list_explainers ExplainerDescriptor[]
    experiment_types str[]
    explanation_scopes str[]
    dai_model_key str
    keywords str[]
    explainer_filter FilterEntry[]

list_explainable_models ListModelQueryResponse
    explainer_id str
    offset int
    size int

get_explainer ExplainerDescriptor
    explainer_id str

run_explainers ExplainersRunJob*
    explainers Explainer[]               # explainers to run
    params CommonDaiExplainerParameters  # common DAI explainer run parameters

run_interpretation_with_explainers ExplainersRunJob*
    explainers Explainer[]               # explainers to run
    params CommonDaiExplainerParameters  # common DAI explainer run parameters
    interpret_params InterpretParameters

get_explainer_run_job ExplainerRunJob
    explainer_job_key str

abort_explainer_run_jobs void
    mli_key str
    explainer_job_keys str[]

get_explainer_job_status ExplainerJobStatus
    mli_key str
    explainer_job_key str

get_explainer_job_statuses ExplainerJobStatus[]
    mli_key str
    explainer_job_keys str[]

get_explainer_job_keys_by_id str[]
    mli_key str
    explainer_id str

get_explainer_run_log_url_path str
    mli_key str
    explainer_job_key str

list_explainer_results ExplainerDescriptor
    explainer_job_key str

get_explainer_result_url_path str
    mli_key str
    explainer_job_key str
    explanation_type str
    explanation_format str

get_explainer_snapshot_url_path str*
    mli_key str
    explainer_job_key str

FilterEntry
    filter_by str
    value str

get_explainer_result str
    mli_key str
    explainer_job_key str
    explanation_type str
    explanation_format str
    page_offset int
    page_size int
    result_format str
    explanation_filter FilterEntry[]

get_explainer_local_result str
    mli_key str
    explainer_job_key str
    explanation_type str
    explanation_format str
    id_column_name str
    id_column_value str
    page_offset int
    page_size int
    result_format str
    explanation_filter FilterEntry[]
Driverless AI administrators can configure the server using the config.toml file located in the Driverless AI home directory. Users can use the same configuration keys in config overrides (API or UI) to change the default server behavior.
MLI BYOR related Driverless AI configuration items:
- excluded_mli_explainers: list[str]
  - Excludes (problematic) MLI explainers (for all users).
  - To disable an explainer, use its ID.
Example:
- To disable the Sensitivity Analysis explainer, use the h2oaicore.mli.byor.recipes.sa_explainer.SaExplainer ID.
- Add the following entry to the config overrides in expert settings: excluded_mli_explainers=['h2oaicore.mli.byor.recipes.sa_explainer.SaExplainer']
- Or add the following row to config.toml: excluded_mli_explainers=['h2oaicore.mli.byor.recipes.sa_explainer.SaExplainer']
- Alternatively, export the following shell environment variable: DRIVERLESS_AI_EXCLUDED_MLI_EXPLAINERS=['h2oaicore.mli.byor.recipes.sa_explainer.SaExplainer']
MLI BYOR documentation:
- Creating Custom Explainer with MLI BYORs (getting started)
MLI BYOR examples (source):
- Explainers section of Driverless AI Recipes repository
- Python Client API Jupyter Notebook
Libraries and products: