shap_enhanced.explainers.MBSHAP¶
MB-SHAP: Multi-Baseline SHAP Explainer¶
Theoretical Explanation¶
Multi-Baseline SHAP (MB-SHAP) enhances the robustness of SHAP-based feature attribution by computing SHAP values with respect to multiple baselines rather than a single reference. This addresses a key limitation in standard SHAP explainers: their sensitivity to baseline selection.
By averaging attributions from diverse or locally-relevant baselines (e.g., nearest neighbors, mean, k-means centroids), MB-SHAP produces more stable, reliable, and representative explanations—particularly useful in domains with heterogeneous data distributions or models that exhibit local nonlinearity.
Key Concepts¶
- Multiple Baselines:
Each input is explained with respect to a set of baselines instead of just one (a small selection sketch follows this list). Baseline options include:
  - Random background samples.
  - Mean or centroid-based references.
  - K nearest neighbors (local context).
  - User-specified selections.
- Explainer Flexibility:
MB-SHAP is compatible with any SHAP-style explainer, including DeepExplainer, GradientExplainer, and KernelExplainer. It wraps the base explainer and runs it separately for each baseline.
- Attribution Averaging:
For each input sample:
  - SHAP values are computed with respect to each baseline.
  - The resulting attribution vectors are averaged to yield a final, smoothed explanation.
- Local Fidelity:
Using per-input nearest neighbors as baselines helps improve explanation fidelity for local model behavior.
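A minimal sketch of these baseline options, assuming a tabular NumPy background of shape (N, D); the select_baselines helper, its signature, and the use of scikit-learn's KMeans are illustrative only, not part of the package API.

    import numpy as np
    from sklearn.cluster import KMeans

    def select_baselines(x, background, strategy="nearest", n_baselines=5, seed=0):
        """Return an array of reference points for a single input sample x."""
        rng = np.random.default_rng(seed)
        if strategy == "random":    # random background samples
            idx = rng.choice(len(background), size=n_baselines, replace=False)
            return background[idx]
        if strategy == "mean":      # a single mean/centroid reference
            return background.mean(axis=0, keepdims=True)
        if strategy == "kmeans":    # k-means centroids summarizing the background
            return KMeans(n_clusters=n_baselines, n_init=10).fit(background).cluster_centers_
        if strategy == "nearest":   # K nearest neighbors of x (local context)
            dists = np.linalg.norm(background - x.reshape(1, -1), axis=1)
            return background[np.argsort(dists)[:n_baselines]]
        raise ValueError(f"unknown strategy: {strategy!r}")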
Algorithm¶
- Initialization:
Accepts a model, a background dataset, the number of baselines, a baseline selection strategy (‘random’, ‘nearest’, ‘mean’, ‘kmeans’, etc.), a SHAP explainer class (e.g., shap.DeepExplainer), and a device context.
- Baseline Selection:
For each input sample, select multiple baseline samples from the background using the chosen strategy.
- SHAP Value Computation:
For each selected baseline:
  - Instantiate the base SHAP explainer with that baseline.
  - Compute SHAP values for the input sample with respect to it.
Then average the SHAP results across all baselines.
- Output:
Return the final attributions as averaged SHAP values, preserving the shape and semantics of the model input. A condensed sketch of this loop follows.
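A condensed, self-contained sketch of the algorithm, assuming a SHAP-style explainer class constructed as ExplainerCls(model, baselines) with a .shap_values(X) method, a single-output model, 2-D inputs of shape (N, D), and nearest-neighbor baselines for concreteness; all names here are illustrative.

    import numpy as np

    def multi_baseline_shap(ExplainerCls, model, background, X, n_baselines=5,
                            **explainer_kwargs):
        attributions = []
        for x in X:
            # Baseline selection: the K nearest background samples (L2 distance).
            dists = np.linalg.norm(background - x.reshape(1, -1), axis=1)
            baselines = background[np.argsort(dists)[:n_baselines]]
            per_baseline = []
            for b in baselines:
                # Instantiate the base explainer with a single baseline ...
                explainer = ExplainerCls(model, b[None, ...], **explainer_kwargs)
                # ... and compute SHAP values for this sample against it.
                per_baseline.append(np.asarray(explainer.shap_values(x[None, ...]))[0])
            # Average the per-baseline attributions into one explanation.
            attributions.append(np.mean(per_baseline, axis=0))
        return np.stack(attributions)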
References
Lundberg & Lee (2017), “A Unified Approach to Interpreting Model Predictions” [SHAP foundation: the coalitional feature-attribution framework].
Chen et al. (2022), “Explaining a Series of Models by Propagating Shapley Values” (G-DeepSHAP) [uses multiple baselines and shows that averaging explanations across them improves consistency and fidelity].
Google Vertex AI documentation (2025) [allows multiple baseline specifications (e.g., min, max, random) to improve attribution context and stability].
Sundararajan & Najmi (2020), “The Many Shapley Values for Model Explanation” [discusses how baseline selection influences Shapley-value interpretations and the implications of multiple-baseline settings].
Kelodjou et al. (2023), “Shaping Up SHAP: Enhancing Stability through Layer-Wise Neighbor Selection” [highlights instability in KernelSHAP and proposes neighbor-selection strategies to stabilize results, underscoring the need for ensemble or multi-baseline approaches].
Classes
NearestNeighborMultiBaselineSHAP: Multi-Baseline SHAP Explainer
- class shap_enhanced.explainers.MBSHAP.NearestNeighborMultiBaselineSHAP(base_explainer_class, model, background, n_baselines=5, base_explainer_kwargs=None, device=None)[source]¶
Bases:
BaseExplainer
NearestNeighborMultiBaselineSHAP: Multi-Baseline SHAP Explainer
This explainer improves attribution robustness by selecting the K nearest neighbors from a background dataset as baselines for each input sample, computing SHAP values individually for each baseline, and then averaging the results.
It is compatible with various SHAP explainers (e.g., DeepExplainer, GradientExplainer, KernelExplainer) and automatically adapts input types and parameter formats accordingly.
Note
Baseline selection is input-dependent and done per sample using L2 distance in flattened input space.
- Parameters:
base_explainer_class – The SHAP explainer class to use (e.g., shap.DeepExplainer).
model (Any) – The predictive model to explain.
background (np.ndarray) – Background dataset (N, …) for nearest neighbor selection.
n_baselines (int) – Number of nearest neighbor baselines to use per sample.
base_explainer_kwargs (dict or None) – Additional keyword arguments passed to the SHAP explainer.
device (str) – Device context for torch-based explainers (‘cpu’ or ‘cuda’).
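A minimal construction-and-use sketch; the toy PyTorch model, the data shapes, and the choice of shap.GradientExplainer are illustrative assumptions rather than requirements of the class.

    import numpy as np
    import shap
    import torch.nn as nn

    from shap_enhanced.explainers.MBSHAP import NearestNeighborMultiBaselineSHAP

    model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))  # toy model
    background = np.random.randn(100, 8).astype(np.float32)  # (N, ...) background set
    X = np.random.randn(4, 8).astype(np.float32)             # samples to explain

    explainer = NearestNeighborMultiBaselineSHAP(
        base_explainer_class=shap.GradientExplainer,  # any SHAP-style explainer class
        model=model,
        background=background,
        n_baselines=5,       # nearest-neighbor baselines per sample
        device="cpu",
    )
    attributions = explainer.shap_values(X)  # averaged SHAP values, matching the input's shape and semantics
    # explainer.explain(X) is an equivalent alias for shap_values(X).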
- property expected_value¶
Optional property returning the expected model output on the background dataset.
- Returns:
Expected value if defined by the subclass, else None.
- Return type:
float or None
- explain(X, **kwargs)¶
Alias to shap_values for flexibility and API compatibility.
- Parameters:
X (Union[np.ndarray, torch.Tensor, list]) – Input samples to explain.
kwargs – Additional arguments.
- Returns:
SHAP values.
- Return type:
Union[np.ndarray, list]
- shap_values(X, **kwargs)[source]¶
Compute SHAP values using per-sample nearest neighbor baselines.
For each sample in X, this method:
1. Selects the n_baselines nearest neighbors from the background.
2. Instantiates the explainer with the selected baselines.
3. Computes SHAP values with respect to each baseline.
4. Averages the SHAP values across baselines to produce a robust explanation (a tiny numeric illustration follows this entry).
\[\phi(x) = \frac{1}{K} \sum_{k=1}^{K} \text{SHAP}(x \mid b_k)\]
- Parameters:
X (np.ndarray) – Input samples to explain, shape (N, …) or single sample (…).
kwargs – Additional keyword arguments forwarded to the SHAP explainer.
- Returns:
Averaged SHAP attributions, shape (N, …) or (…) for single input.
- Return type:
np.ndarray
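A tiny numeric illustration of the averaging rule above, using made-up per-baseline attributions for a single three-feature sample with K = 3 baselines:

    import numpy as np

    per_baseline_shap = np.array([
        [0.30, -0.10, 0.05],   # SHAP(x | b_1)
        [0.20, -0.20, 0.10],   # SHAP(x | b_2)
        [0.10,  0.00, 0.15],   # SHAP(x | b_3)
    ])
    phi = per_baseline_shap.mean(axis=0)   # phi(x) = (1/K) * sum_k SHAP(x | b_k)
    print(phi)                             # expected: [ 0.2 -0.1  0.1]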