shap_enhanced.explainers.ABSHAP¶
Adaptive Baseline SHAP (Sparse)¶
Theoretical Explanation¶
Adaptive Baseline SHAP (ABSHAP) is a feature attribution method built upon the SHAP framework. It is specifically designed to yield valid, interpretable explanations for both dense (e.g., continuous or tabular) and sparse (e.g., categorical or one-hot encoded) input data.
Unlike traditional SHAP methods that use static baselines (such as zeros or means), ABSHAP dynamically samples baselines for masked features from real observed background samples. This helps avoid out-of-distribution perturbations, which is particularly critical for sparse or categorical data where unrealistic combinations can easily arise.
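To make the out-of-distribution concern concrete, here is a minimal, self-contained illustration (synthetic data, not library code): mean-masking a one-hot feature group yields fractional values that no real sample ever takes, while drawing the group from an observed background row keeps the perturbation on-distribution.

```python
import numpy as np

rng = np.random.default_rng(0)
# 100 one-hot rows over 3 categories, shape (100, 3)
background = np.eye(3)[rng.integers(0, 3, size=100)]

mean_baseline = background.mean(axis=0)              # e.g. [0.35, 0.31, 0.34]: not a valid one-hot vector
sampled_baseline = background[rng.integers(0, 100)]  # a real observed row, e.g. [0., 1., 0.]
```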
Key Concepts¶
- Adaptive Masking: Each feature’s masking method is chosen based on its distribution:
  - Dense/continuous features are masked with the mean value from the background dataset.
  - Sparse/categorical features (e.g., those with >90% zeros) are masked with values drawn from real background examples.
- Strategy Selection: The masking approach can be assigned automatically per feature or specified manually by the user (a sketch of the automatic detection follows this list).
- Valid Perturbations: All masked samples are guaranteed to lie within the original data distribution, preventing unrealistic inputs.
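The automatic strategy selection could look like the following sketch, assuming the >90%-zeros rule quoted above; the function name and the strategy labels "mean" and "sample" are illustrative, not part of the library's API.

```python
import numpy as np

def detect_mask_strategy(background: np.ndarray, sparsity_threshold: float = 0.9) -> list[str]:
    """Assign a per-feature masking strategy from a (N, F) background matrix:
    'sample' for mostly-zero (sparse) features, 'mean' otherwise."""
    strategies = []
    for f in range(background.shape[1]):
        zero_fraction = np.mean(background[:, f] == 0)
        strategies.append("sample" if zero_fraction > sparsity_threshold else "mean")
    return strategies

rng = np.random.default_rng(0)
dense = rng.normal(size=(50, 1))   # continuous column, essentially no zeros
sparse = np.zeros((50, 1))
sparse[:3, 0] = 1.0                # 94% zeros, above the 90% threshold
background = np.hstack([dense, sparse])

print(detect_mask_strategy(background))  # ['mean', 'sample']
```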
Algorithm¶
- Initialization:
  - Accepts a model, background dataset, number of baselines, masking strategy, and device.
  - Automatically determines a masking strategy per feature, or uses a user-specified configuration.
  - Computes feature-wise means for dense-feature masking.
- Masking:
  - For each coalition (a selected subset of features to mask), masked values are replaced with:
    - the feature-wise mean (dense features), or
    - a value sampled from a real background example (sparse features).
- SHAP Value Estimation (a sketch of this loop follows the list):
  - For each feature:
    - Randomly sample subsets of the other features to mask.
    - For each sampled baseline, compute model outputs on the input with the selected features masked, and on the input with the selected features plus the current feature masked.
    - Average the output differences to estimate the feature's marginal contribution.
  - Normalize the resulting attributions so their sum equals the difference between the original and fully-masked model outputs.
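The loop below sketches this estimation procedure for a single flat (F,)-shaped input and a scalar-output model. It is a simplified reading of the steps above, with illustrative names and a uniform coalition-size sampler; it is not the library's implementation.

```python
import numpy as np

def abshap_values(model, x, background, strategies, n_baselines=10, nsamples=100, seed=42):
    """Estimate per-feature attributions from masked-output differences."""
    rng = np.random.default_rng(seed)
    F = x.shape[0]
    means = background.mean(axis=0)

    def masked(x, features, baseline_row):
        out = x.copy()
        for f in features:
            out[f] = means[f] if strategies[f] == "mean" else baseline_row[f]
        return out

    phi = np.zeros(F)
    for i in range(F):
        others = [f for f in range(F) if f != i]
        diffs = []
        for _ in range(nsamples):
            size = rng.integers(0, F)  # coalition size in [0, F-1]
            S = list(rng.choice(others, size=size, replace=False))
            for _ in range(n_baselines):
                row = background[rng.integers(len(background))]
                # Marginal contribution of feature i: masking i on top of S.
                diffs.append(model(masked(x, S, row)) - model(masked(x, S + [i], row)))
        phi[i] = np.mean(diffs)

    # Normalize so attributions sum to f(x) - f(fully-masked x), as described above.
    row = background[rng.integers(len(background))]
    total = model(x) - model(masked(x, list(range(F)), row))
    return phi * total / (phi.sum() + 1e-12)
```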
References
- Lundberg & Lee (2017), “A Unified Approach to Interpreting Model Predictions” [SHAP foundation]
- Merrick & Taly (2020), “Keep it Real: Towards Realistic and Efficient Shapley Value Explanations” [proposes adaptive masking based on feature type, using real data to avoid out-of-distribution perturbations]
- Molnar (2022), “Interpretable Machine Learning”, SHAP chapter [summarizes best practices and practical warnings about feature masking in SHAP for different data types]
Classes
AdaptiveBaselineSHAPExplainer | Adaptive Baseline SHAP (ABSHAP) Explainer for Dense and Sparse Features.
- class shap_enhanced.explainers.ABSHAP.AdaptiveBaselineSHAPExplainer(model, background, n_baselines=10, mask_strategy='auto', device=None)[source]¶
Bases:
BaseExplainer
Adaptive Baseline SHAP (ABSHAP) Explainer for Dense and Sparse Features.
Implements a SHAP explainer that adaptively masks features based on their data distribution: using mean-based masking for continuous features and sample-based masking for sparse or categorical features. This ensures valid perturbations and avoids out-of-distribution artifacts.
Note
Feature masking strategy can be determined automatically or manually specified.
Warning
Adaptive masking requires background data and introduces computational overhead.
- Parameters:
model (Callable) – Model to be explained. Should accept PyTorch tensors as input.
background (Union[np.ndarray, torch.Tensor]) – Background dataset for baseline sampling. Shape: (N, F) or (N, T, F).
n_baselines (int) – Number of baselines to sample per explanation. Default is 10.
mask_strategy (Union[str, Sequence[str]]) – Either “auto” for automatic per-feature detection, or a sequence specifying a masking strategy for each feature.
device (str) – PyTorch device identifier, e.g., “cpu” or “cuda”. Defaults to auto-detection.
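A construction sketch using the signature documented above; the model and data are toy placeholders, not a recommended setup.

```python
import torch
from shap_enhanced.explainers.ABSHAP import AdaptiveBaselineSHAPExplainer

# Toy sequence model on (B, T, F) inputs; any callable accepting tensors works.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(4 * 5, 1))
background = torch.randn(200, 4, 5)  # (N, T, F) background dataset

explainer = AdaptiveBaselineSHAPExplainer(
    model,
    background,
    n_baselines=10,
    mask_strategy="auto",  # or a per-feature sequence of strategies
    device="cpu",
)
```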
- property expected_value¶
Optional property returning the expected model output on the background dataset.
- Returns:
Expected value if defined by the subclass, else None.
- Return type:
float or None
- explain(X, **kwargs)¶
Alias for shap_values, provided for flexibility and API compatibility.
- Parameters:
X (Union[np.ndarray, torch.Tensor, list]) – Input samples to explain.
kwargs – Additional arguments.
- Returns:
SHAP values.
- Return type:
Union[np.ndarray, list]
- shap_values(X, nsamples=100, random_seed=42, **kwargs)[source]¶
Estimates SHAP values for the given input X using the ABSHAP algorithm.
For each feature (t, f), estimates its marginal contribution by comparing model outputs with and without the feature masked, averaging over sampled coalitions and baselines.
\[\phi_{i} = \mathbb{E}_{S \subseteq N \setminus \{i\}} \left[ f(x_{S \cup \{i\}}) - f(x_S) \right]\]
The attributions are then normalized so that their sum matches the output difference between the original and the fully-masked prediction.
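One consistent reading of this normalization step (an assumption, since the exact scheme is not spelled out here) is a multiplicative rescaling:
\[\phi_i' = \phi_i \cdot \frac{f(x) - f(x_{\emptyset})}{\sum_j \phi_j}\]
where \(x_{\emptyset}\) denotes the fully-masked input, so that \(\sum_i \phi_i' = f(x) - f(x_{\emptyset})\).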
- Parameters:
X (Union[np.ndarray, torch.Tensor]) – Input samples, shape (B, T, F) or (T, F).
nsamples (int) – Number of masking combinations per feature. Default is 100.
random_seed (int) – Seed for reproducibility. Default is 42.
- Returns:
SHAP values of shape (T, F) or (B, T, F), matching the input.
- Return type:
np.ndarray
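Continuing the construction sketch above, an end-to-end call; the input and return shapes follow the documented contract.

```python
# Uses the `explainer` built in the construction sketch above.
X = torch.randn(8, 4, 5)  # (B, T, F) batch to explain
phi = explainer.shap_values(X, nsamples=100, random_seed=42)
print(phi.shape)          # expected: (8, 4, 5), matching the input layout
```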