shap_enhanced.explainers

SHAP Explainers Collection

Overview

This subpackage contains a suite of SHAP-style explainers, each designed to handle different data structures, baseline strategies, and attribution mechanisms for interpretable machine learning. These explainers extend beyond standard SHAP to provide specialized techniques for:

  • Temporal and sequential data (e.g., TimeSHAP, LatentSHAP)

  • Sparse and discrete structures (e.g., SPSHAP, SCSHAP)

  • Reinforcement learning–based strategies (e.g., RLSHAP)

  • Surrogate model approximations (e.g., SurroSHAP)

  • Multi-baseline and enhanced SHAP variants

Each explainer implements the BaseExplainer interface to ensure interoperability and consistent SHAP output.

Modules

  • LatentSHAP: Attribution in the latent space of an autoencoder.

  • TimeSHAP: Pruned SHAP for long temporal sequences.

  • SurroSHAP: Fast SHAP via surrogate regression.

  • MBSHAP: Multi-baseline SHAP with per-sample reference sets.

  • RLSHAP: SHAP value estimation via policy gradient masking.

  • SPSHAP: Support-preserving masking for sparse inputs.

  • SCSHAP: Sparse coalition enumeration for binary or one-hot inputs.

  • … and more: Variants prefixed with a unique identifier (e.g., ESSHAP, ERSHAP) for experimentation.

Usage

Import individual explainers or use the package to programmatically register all variants:

from shap_enhanced.explainers import LatentSHAP, TimeSHAP, SurroSHAP
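
A minimal usage sketch, assuming a trained PyTorch sequence model and a background dataset are available; the toy model, random data, and shapes below are placeholders, and the arguments follow the signatures documented in this section.

import numpy as np
import torch.nn as nn

from shap_enhanced.explainers import AdaptiveBaselineSHAPExplainer

# Placeholder sequential model and background data; substitute a trained model and real data.
T, F = 5, 3
model = nn.Sequential(nn.Flatten(), nn.Linear(T * F, 1))
background = np.random.randn(100, T, F).astype(np.float32)   # (N, T, F)
x = np.random.randn(T, F).astype(np.float32)                 # single sample, (T, F)

explainer = AdaptiveBaselineSHAPExplainer(model, background, n_baselines=10)
phi = explainer.shap_values(x, nsamples=100)                  # SHAP values, shape (T, F)
print(phi.shape)
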
class shap_enhanced.explainers.AdaptiveBaselineSHAPExplainer(model, background, n_baselines=10, mask_strategy='auto', device=None)[source]

Bases: BaseExplainer

Adaptive Baseline SHAP (ABSHAP) Explainer for Dense and Sparse Features.

Implements a SHAP explainer that adaptively masks features based on their data distribution: mean-based masking for continuous features and sample-based masking for sparse or categorical features. This ensures valid perturbations and avoids out-of-distribution artifacts.

Note

Feature masking strategy can be determined automatically or manually specified.

Warning

Adaptive masking requires background data and introduces computational overhead.

Parameters:
  • model (Callable) – Model to be explained. Should accept PyTorch tensors as input.

  • background (Union[np.ndarray, torch.Tensor]) – Background dataset for baseline sampling. Shape: (N, F) or (N, T, F).

  • n_baselines (int) – Number of baselines to sample per explanation. Default is 10.

  • mask_strategy (Union[str, Sequence[str]]) – Either “auto” for automatic detection or a list of per-feature strategies.

  • device (str) – PyTorch device identifier, e.g., “cpu” or “cuda”. Defaults to auto-detection.

property expected_value

Optional property returning the expected model output on the background dataset.

Returns:

Expected value if defined by the subclass, else None.

Return type:

float or None

explain(X, **kwargs)

Alias to shap_values for flexibility and API compatibility.

Parameters:
  • X (Union[np.ndarray, torch.Tensor, list]) – Input samples to explain.

  • kwargs – Additional arguments.

Returns:

SHAP values.

Return type:

Union[np.ndarray, list]

shap_values(X, nsamples=100, random_seed=42, **kwargs)[source]

Estimates SHAP values for the given input X using the ABSHAP algorithm.

For each feature (t, f), estimates its marginal contribution by comparing model outputs with and without the feature masked, averaging over sampled coalitions and baselines.

\[\phi_{i} = \mathbb{E}_{S \subseteq N \setminus \{i\}} \left[ f(x_{S \cup \{i\}}) - f(x_S) \right]\]

The attributions are then normalized to match the output difference between the original and fully-masked prediction.

Parameters:
  • X (Union[np.ndarray, torch.Tensor]) – Input samples, shape (B, T, F) or (T, F).

  • nsamples (int) – Number of masking combinations per feature. Default is 100.

  • random_seed (int) – Seed for reproducibility. Default is 42.

Returns:

SHAP values of shape (T, F) or (B, T, F).

Return type:

np.ndarray

class shap_enhanced.explainers.AttnSHAPExplainer(model, background, use_attention=True, proxy_attention='gradient', device=None)[source]

Bases: BaseExplainer

Attention-Guided SHAP Explainer for structured/sequential data.

This class implements an extension to the SHAP framework that leverages attention mechanisms (either native to the model or via proxy strategies) to guide the coalition sampling process, focusing attribution on informative feature regions.

Parameters:
  • model – PyTorch model to be explained.

  • background – Background dataset used for SHAP estimation.

  • use_attention (bool) – If True, uses attention weights (or proxy) for guiding feature masking.

  • proxy_attention (str) – Strategy to approximate attention when model does not provide it. Options: “gradient”, “input”, “perturb”.

  • device – Computation device (‘cuda’ or ‘cpu’).

property expected_value

Optional property returning the expected model output on the background dataset.

Returns:

Expected value if defined by the subclass, else None.

Return type:

float or None

explain(X, **kwargs)

Alias to shap_values for flexibility and API compatibility.

Parameters:
  • X (Union[np.ndarray, torch.Tensor, list]) – Input samples to explain.

  • kwargs – Additional arguments.

Returns:

SHAP values.

Return type:

Union[np.ndarray, list]

shap_values(X, nsamples=100, coalition_size=3, check_additivity=True, random_seed=42, **kwargs)[source]

Compute SHAP values using attention-guided or proxy-guided coalition sampling.

For each feature at each time step, it estimates the marginal contribution by comparing model outputs when the feature is masked vs. when it is included in a masked coalition. Sampling is optionally biased using attention scores.

The final attributions are normalized to satisfy SHAP’s additivity constraint:

\[\sum_{t=1}^T \sum_{f=1}^F \phi_{t,f} \approx f(x) - f(x_{masked})\]
Parameters:
  • X (np.ndarray or torch.Tensor) – Input data of shape (B, T, F) or (T, F)

  • nsamples (int) – Number of coalitions sampled per feature.

  • coalition_size (int) – Number of features in each sampled coalition.

  • check_additivity (bool) – Whether to print additivity check results.

  • random_seed (int) – Seed for reproducible coalition sampling.

Returns:

SHAP values of shape (T, F) for single input or (B, T, F) for batch.

Return type:

np.ndarray

class shap_enhanced.explainers.BShapExplainer(model, input_range=None, n_samples=50, mask_strategy='random', device=None)[source]

Bases: BaseExplainer

BShap: Distribution-Free SHAP Explainer for Sequential Models

Implements a SHAP-based attribution method that avoids empirical data distribution assumptions by applying synthetic masking strategies (e.g., uniform noise, Gaussian noise, or zero). This is useful for evaluating model robustness or interpretability in data-agnostic contexts.

Parameters:
  • model – Sequence model to explain.

  • input_range (tuple or (np.ndarray, np.ndarray)) – Tuple of (min, max) or arrays defining per-feature value bounds. Used for random masking.

  • n_samples (int) – Number of coalitions sampled per feature.

  • mask_strategy (str) – Masking strategy: ‘random’, ‘noise’, or ‘zero’.

  • device (str) – Device identifier, e.g., ‘cpu’ or ‘cuda’.

property expected_value

Optional property returning the expected model output on the background dataset.

Returns:

Expected value if defined by the subclass, else None.

Return type:

float or None

explain(X, **kwargs)

Alias to shap_values for flexibility and API compatibility.

Parameters:
  • X (Union[np.ndarray, torch.Tensor, list]) – Input samples to explain.

  • kwargs – Additional arguments.

Returns:

SHAP values.

Return type:

Union[np.ndarray, list]

shap_values(X, nsamples=None, check_additivity=True, random_seed=42, **kwargs)[source]

Compute SHAP values using distribution-free perturbations.

Estimates marginal feature contributions by averaging differences between model outputs under masked coalitions. Uses synthetic masking based on the configured strategy without any reliance on background data statistics.

Final attributions are normalized to satisfy the SHAP additivity constraint:

\[\sum_{t=1}^T \sum_{f=1}^F \phi_{t,f} \approx f(x) - f(x_{\text{masked}})\]
Parameters:
  • X (np.ndarray or torch.Tensor) – Input of shape (T, F) or (B, T, F)

  • nsamples (int) – Number of coalition samples per feature (defaults to self.n_samples).

  • check_additivity (bool) – Print diagnostic message for SHAP sum vs. model delta.

  • random_seed (int) – Seed for reproducibility.

Returns:

SHAP values of shape (T, F) or (B, T, F)

Return type:

np.ndarray
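
A hedged sketch of supplying per-feature bounds for the ‘random’ masking strategy; the model is a placeholder and the bounds are derived from a hypothetical training array of shape (N, T, F).

import numpy as np
import torch.nn as nn
from shap_enhanced.explainers import BShapExplainer

T, F = 10, 4
model = nn.Sequential(nn.Flatten(), nn.Linear(T * F, 1))      # placeholder sequence model

# Hypothetical training data used only to derive per-feature value bounds.
X_train = np.random.randn(200, T, F).astype(np.float32)       # (N, T, F)
feat_min = X_train.min(axis=(0, 1))                           # per-feature minima, shape (F,)
feat_max = X_train.max(axis=(0, 1))                           # per-feature maxima, shape (F,)

explainer = BShapExplainer(model, input_range=(feat_min, feat_max),
                           n_samples=50, mask_strategy="random")
phi = explainer.shap_values(X_train[0])                       # SHAP values, shape (T, F)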

class shap_enhanced.explainers.CoalitionAwareSHAPExplainer(model, background=None, mask_strategy='zero', imputer=None, device=None)[source]

Bases: BaseExplainer

Coalition-Aware SHAP (CASHAP) Explainer

Estimates Shapley values for models processing structured inputs (e.g., time-series, sequences) by sampling coalitions of feature-time pairs and computing their marginal contributions using various imputation strategies.

Parameters:
  • model (Any) – Model to be explained.

  • background (Optional[np.ndarray or torch.Tensor]) – Background data used for mean imputation strategy.

  • mask_strategy (str) – Strategy for imputing/masking feature-time pairs. Options: ‘zero’, ‘mean’, or ‘custom’.

  • imputer (Optional[Callable]) – Custom callable for imputation. Required if mask_strategy is ‘custom’.

  • device (Optional[str]) – Device on which computation runs. Defaults to ‘cuda’ if available.

property expected_value

Optional property returning the expected model output on the background dataset.

Returns:

Expected value if defined by the subclass, else None.

Return type:

float or None

explain(X, **kwargs)

Alias to shap_values for flexibility and API compatibility.

Parameters:
  • X (Union[np.ndarray, torch.Tensor, list]) – Input samples to explain.

  • kwargs – Additional arguments.

Returns:

SHAP values.

Return type:

Union[np.ndarray, list]

shap_values(X, nsamples=100, coalition_size=None, mask_strategy=None, check_additivity=True, random_seed=42, **kwargs)[source]

Compute CASHAP Shapley values for structured inputs via coalition-aware sampling.

For each feature-time pair (t, f), randomly sample coalitions excluding (t, f), compute model outputs with and without the pair added, and average the marginal contributions. Attribution values are normalized so their total matches the model output difference between the original and fully-masked input.

\[\phi_{t,f} \approx \mathbb{E}_{C \subseteq (T \times F) \setminus \{(t,f)\}} \left[ f(C \cup \{(t,f)\}) - f(C) \right]\]

Note

Normalization ensures:

\[\sum_{t=1}^T \sum_{f=1}^F \phi_{t,f} \approx f(x) - f(x_{\text{masked}})\]

Return type:

np.ndarray

class shap_enhanced.explainers.ContextualMaskingSHAPExplainer(model, device=None)[source]

Bases: BaseExplainer

Contextual Masking SHAP (CM-SHAP) Explainer for Sequential Models

Estimates SHAP values for sequential inputs by replacing masked feature values with interpolated values from neighboring time steps. This context-aware masking strategy preserves temporal coherence and enables more realistic feature perturbation in time-series data.

Parameters:
  • model (Any) – Model to explain. Must accept NumPy arrays or PyTorch tensors.

  • device (Optional[str]) – Device to perform computations on (‘cpu’ or ‘cuda’). Defaults to ‘cuda’ if available.

property expected_value

Optional property returning the expected model output on the background dataset.

Returns:

Expected value if defined by the subclass, else None.

Return type:

float or None

explain(X, **kwargs)

Alias to shap_values for flexibility and API compatibility.

Parameters:
  • X (Union[np.ndarray, torch.Tensor, list]) – Input samples to explain.

  • kwargs – Additional arguments.

Returns:

SHAP values.

Return type:

Union[np.ndarray, list]

shap_values(X, nsamples=100, check_additivity=True, random_seed=42, **kwargs)[source]

Estimate SHAP values using contextual (interpolated) masking.

Each feature-time pair (t, f) is evaluated by sampling coalitions of other positions, applying context-aware masking, and averaging the difference in model outputs when (t, f) is added to the coalition.

Interpolation strategy ensures continuity in time series by replacing masked values with averages of adjacent time steps:

\[\begin{split}x_{t,f}^{masked} = \begin{cases} x_{t+1,f}, & \text{if } t = 0 \\ x_{t-1,f}, & \text{if } t = T-1 \\ \frac{x_{t-1,f} + x_{t+1,f}}{2}, & \text{otherwise} \end{cases}\end{split}\]

Final attributions are normalized such that:

\[\sum_{t=1}^T \sum_{f=1}^F \phi_{t,f} \approx f(x) - f(x_{masked})\]
Parameters:
  • X (np.ndarray or torch.Tensor) – Input array of shape (T, F) or (B, T, F)

  • nsamples (int) – Number of sampled coalitions per position.

  • check_additivity (bool) – Whether to normalize SHAP values to match output difference.

  • random_seed (int) – Random seed for reproducibility.

Returns:

SHAP values with same shape as input.

Return type:

np.ndarray
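
A standalone NumPy sketch of the interpolation rule above (illustrative, not the explainer's internal code): masked positions are replaced by the average of their temporal neighbors, with edge time steps copying their single neighbor.

import numpy as np

def contextual_mask(x, positions):
    """Replace each (t, f) in `positions` with an interpolation of neighboring time steps."""
    T = x.shape[0]
    out = x.copy()
    for t, f in positions:
        if t == 0:
            out[t, f] = x[t + 1, f]                        # first step: copy the next step
        elif t == T - 1:
            out[t, f] = x[t - 1, f]                        # last step: copy the previous step
        else:
            out[t, f] = 0.5 * (x[t - 1, f] + x[t + 1, f])  # interior: average of neighbors
    return out

x = np.arange(12, dtype=float).reshape(4, 3)               # toy input with T=4, F=3
print(contextual_mask(x, [(0, 1), (2, 0)]))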

class shap_enhanced.explainers.ERSHAPExplainer(model, background, n_coalitions=100, mask_strategy='mean', weighting='uniform', feature_importance=None, device=None)[source]

Bases: BaseExplainer

ER-SHAP: Ensemble of Random SHAP Explainer

An efficient approximation of Shapley values using random coalition sampling over time-feature positions. Supports uniform and weighted sampling strategies and flexible masking (zero or mean) to generate perturbed inputs.

Parameters:
  • model (Any) – Model to explain, compatible with PyTorch tensors.

  • background (np.ndarray or torch.Tensor) – Background dataset for mean imputation; shape (N, T, F).

  • n_coalitions (int) – Number of coalitions to sample per (t, f) position.

  • mask_strategy (str) – Masking method: ‘zero’ or ‘mean’.

  • weighting (str) – Sampling scheme: ‘uniform’, ‘frequency’, or ‘importance’.

  • feature_importance (Optional[np.ndarray]) – Prior feature importances for weighted sampling; shape (T, F).

  • device (str) – Device identifier, ‘cpu’ or ‘cuda’.

property expected_value

Optional property returning the expected model output on the background dataset.

Returns:

Expected value if defined by the subclass, else None.

Return type:

float or None

explain(X, **kwargs)

Alias to shap_values for flexibility and API compatibility.

Parameters:
  • X (Union[np.ndarray, torch.Tensor, list]) – Input samples to explain.

  • kwargs – Additional arguments.

Returns:

SHAP values.

Return type:

Union[np.ndarray, list]

shap_values(X, check_additivity=True, random_seed=42, **kwargs)[source]

Compute SHAP values via random coalition sampling.

For each position (t, f), sample coalitions of other positions, compute marginal contributions, and average over samples. Attributions are normalized to satisfy:

\[\sum_{t=1}^T \sum_{f=1}^F \phi_{t,f} \approx f(x) - f(x_{masked})\]
Parameters:
  • X (np.ndarray or torch.Tensor) – Input array or tensor of shape (T, F) or (B, T, F).

  • check_additivity (bool) – Whether to apply normalization for additivity.

  • random_seed (int) – Seed for reproducibility.

Returns:

SHAP values of shape (T, F) or (B, T, F).

Return type:

np.ndarray
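
A brief sketch of importance-weighted coalition sampling, assuming a prior importance map of shape (T, F); the model, background, and prior values below are placeholders.

import numpy as np
import torch.nn as nn
from shap_enhanced.explainers import ERSHAPExplainer

T, F = 8, 3
model = nn.Sequential(nn.Flatten(), nn.Linear(T * F, 1))       # placeholder model
background = np.random.randn(64, T, F).astype(np.float32)      # (N, T, F)

# Hypothetical prior importances, e.g., from gradients or domain knowledge; shape (T, F).
prior = np.abs(np.random.randn(T, F))
prior /= prior.sum()

explainer = ERSHAPExplainer(model, background, n_coalitions=200,
                            mask_strategy="mean", weighting="importance",
                            feature_importance=prior)
phi = explainer.shap_values(background[0])                     # SHAP values, shape (T, F)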

class shap_enhanced.explainers.EmpiricalConditionalSHAPExplainer(model, background, skip_unmatched=True, use_closest=False, device=None)[source]

Bases: BaseExplainer

Empirical Conditional SHAP (EC-SHAP) Explainer for Discrete Data

This explainer estimates Shapley values for discrete (e.g., categorical, binary, or one-hot) feature inputs by imputing masked features from a background dataset using conditional matching. It ensures perturbed samples remain within the data manifold, preserving interpretability.

Parameters:
  • model (Any) – Model to explain, must support PyTorch tensors as input.

  • background (np.ndarray or torch.Tensor) – Background dataset used for empirical conditional imputation.

  • skip_unmatched (bool) – If True, skip coalitions where no matching background sample exists.

  • use_closest (bool) – If True, use the closest (Hamming distance) background sample when no exact match is found.

  • device (Optional[str]) – Device on which to run the model (‘cpu’ or ‘cuda’).

property expected_value

Optional property returning the expected model output on the background dataset.

Returns:

Expected value if defined by the subclass, else None.

Return type:

float or None

explain(X, **kwargs)

Alias to shap_values for flexibility and API compatibility.

Parameters:
  • X (Union[np.ndarray, torch.Tensor, list]) – Input samples to explain.

  • kwargs – Additional arguments.

Returns:

SHAP values.

Return type:

Union[np.ndarray, list]

shap_values(X, nsamples=100, check_additivity=True, random_seed=42, **kwargs)[source]

Estimate SHAP values using empirical conditional imputation.

For each feature-time index (t, f), this method:

  • Samples coalitions of other features.

  • Finds background samples matching the unmasked portion of the input.

  • Imputes masked values with corresponding values from the matched sample.

  • Computes model output with and without the target feature masked.

  • Averages the differences over multiple coalitions.

Normalization ensures:

\[\sum_{t=1}^T \sum_{f=1}^F \phi_{t,f} \approx f(x) - f(x_{\text{masked}})\]

Note

If no exact match is found and use_closest is False, the coalition may be skipped. For continuous-looking data, the method falls back to mean imputation.

Parameters:
  • X (np.ndarray or torch.Tensor) – Input data of shape (T, F) or (B, T, F)

  • nsamples (int) – Number of coalitions to sample per feature.

  • check_additivity (bool) – Whether to rescale SHAP values to match model output difference.

  • random_seed (int) – Seed for reproducibility.

Returns:

SHAP values of shape (T, F) or (B, T, F)

Return type:

np.ndarray
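
A self-contained NumPy sketch of the closest-match fallback described above (illustrative only): among background rows, pick the one with the smallest Hamming distance to the input on the unmasked positions.

import numpy as np

def closest_by_hamming(x, background, unmasked):
    """Return the background row best matching x on the observed (unmasked) positions."""
    mismatches = (background[:, unmasked] != x[unmasked]).sum(axis=1)  # Hamming distance
    return background[int(np.argmin(mismatches))]

x = np.array([1, 0, 0, 1, 0])
bg = np.array([[1, 0, 1, 1, 0],
               [0, 1, 0, 0, 1],
               [1, 0, 0, 0, 0]])
unmasked = np.array([0, 1, 3])              # indices of features kept from x
print(closest_by_hamming(x, bg, unmasked))  # first row matches all kept positions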

class shap_enhanced.explainers.EnsembleSHAPWithNoise(model, background=None, explainer_class=None, explainer_kwargs=None, n_runs=5, noise_level=0.1, noise_target='input', aggregation='mean', device=None)[source]

Bases: BaseExplainer

EnsembleSHAPWithNoise: Robust Ensemble Wrapper for SHAP/Custom Explainers

This class enhances the stability of SHAP (SHapley Additive exPlanations) values by performing multiple runs with Gaussian noise applied to inputs and/or background data, and aggregating the results. It wraps around standard SHAP explainers or custom user-defined ones, making them more robust in the presence of sensitivity or instability.

Note

This class automatically handles input conversion between NumPy and PyTorch, depending on the explainer type.

Parameters:
  • model – The model to explain.

  • background – Background data used for SHAP attribution (can be None if not required).

  • explainer_class – The SHAP or custom explainer class to wrap. Defaults to shap.DeepExplainer.

  • explainer_kwargs – Dictionary of keyword arguments to pass to the explainer during instantiation.

  • n_runs (int) – Number of noisy runs to perform for ensemble aggregation.

  • noise_level (float) – Standard deviation of Gaussian noise to inject.

  • noise_target (str) – Target for noise injection: “input”, “background”, or “both”.

  • aggregation (str) – Aggregation method across runs: “mean” or “median”.

  • device – Device context (e.g., ‘cpu’, ‘cuda’) for tensor-based explainers. Defaults to available GPU or CPU.

property expected_value

Optional property returning the expected model output on the background dataset.

Returns:

Expected value if defined by the subclass, else None.

Return type:

float or None

explain(X, **kwargs)

Alias to shap_values for flexibility and API compatibility.

Parameters:
  • X (Union[np.ndarray, torch.Tensor, list]) – Input samples to explain.

  • kwargs – Additional arguments.

Returns:

SHAP values.

Return type:

Union[np.ndarray, list]

shap_values(X, **kwargs)[source]

Compute noise-robust SHAP values via ensemble averaging over multiple noisy runs.

For each run, Gaussian noise is added to the input and/or background (as configured), then the SHAP explainer is applied to compute attribution values. These are aggregated (mean or median) to produce a stable final output.

\[\begin{split}\text{Attribution}_{final}(i) = \begin{cases} \frac{1}{N} \sum_{j=1}^N \text{SHAP}_j(i) & \text{if aggregation = mean} \\ \text{median}\{\text{SHAP}_1(i), \ldots, \text{SHAP}_N(i)\} & \text{if aggregation = median} \end{cases}\end{split}\]
Parameters:
  • X (np.ndarray or torch.Tensor) – Input sample(s) to explain (NumPy array or torch.Tensor).

  • kwargs – Additional keyword arguments passed to the underlying explainer’s shap_values method.

Returns:

Aggregated attribution values across ensemble runs.

Return type:

np.ndarray
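
A hedged sketch of wrapping shap.DeepExplainer in the noise ensemble; the toy model and tensors stand in for a real model and background, and the arguments follow the parameter list above.

import torch
import torch.nn as nn
import shap
from shap_enhanced.explainers import EnsembleSHAPWithNoise

T, F = 6, 4
model = nn.Sequential(nn.Flatten(), nn.Linear(T * F, 1))   # placeholder model
background = torch.randn(50, T, F)                         # background for DeepExplainer
X = torch.randn(5, T, F)                                   # samples to explain

ensemble = EnsembleSHAPWithNoise(
    model,
    background=background,
    explainer_class=shap.DeepExplainer,  # the default, shown explicitly
    n_runs=5,
    noise_level=0.1,
    noise_target="both",                 # perturb both input and background
    aggregation="median",
)
phi = ensemble.shap_values(X)            # attributions aggregated across noisy runs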

class shap_enhanced.explainers.HShapExplainer(model, background, hierarchy, mask_strategy='mean', device=None)[source]

Bases: BaseExplainer

HShapExplainer: Hierarchical SHAP Explainer

Implements the h-SHAP algorithm, which recursively computes SHAP values over structured groups of features using hierarchical masking. Suitable for time-series or block-structured feature inputs where interpretability benefits from grouped attributions.

Note

Features can be masked using hard zero-masking or soft imputation via background means.

Parameters:
  • model – Model to explain.

  • background (np.ndarray or torch.Tensor) – Background dataset for mean imputation. Shape: (N, T, F).

  • hierarchy (list) – Nested list of feature index groups (e.g., [[(t1, f1), (t2, f2)], …]); see the sketch after this parameter list.

  • mask_strategy (str) – Either “mean” for imputation or “zero” for hard masking.

  • device (str) – Device context, e.g., “cuda” or “cpu”.
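
A hypothetical hierarchy for an input with T=4 time steps and F=2 features, grouping (t, f) positions into two temporal windows:

# Hypothetical two-level hierarchy over a (T=4, F=2) input:
# each top-level group holds the (t, f) positions of one temporal window.
hierarchy = [
    [(0, 0), (0, 1), (1, 0), (1, 1)],   # early window (t = 0, 1)
    [(2, 0), (2, 1), (3, 0), (3, 1)],   # late window  (t = 2, 3)
]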

property expected_value

Optional property returning the expected model output on the background dataset.

Returns:

Expected value if defined by the subclass, else None.

Return type:

float or None

explain(X, **kwargs)

Alias to shap_values for flexibility and API compatibility.

Parameters:
  • X (Union[np.ndarray, torch.Tensor, list]) – Input samples to explain.

  • kwargs – Additional arguments.

Returns:

SHAP values.

Return type:

Union[np.ndarray, list]

shap_values(X, nsamples=50, check_additivity=True, random_seed=42, **kwargs)[source]

Compute hierarchical SHAP values for a batch of inputs.

The method recursively attributes model output to hierarchical feature groups. It also ensures additivity via normalization of final attributions.

\[\sum_{i=1}^{TF} \phi_i = f(x) - f(x_{\text{masked}})\]
Parameters:
  • X (np.ndarray or torch.Tensor) – Input batch, shape (B, T, F) or single instance (T, F).

  • nsamples (int) – Number of Monte Carlo samples per group.

  • check_additivity (bool) – If True, prints additivity check summary.

  • random_seed (int) – Seed for reproducible sampling.

Returns:

SHAP values, same shape as X.

Return type:

np.ndarray

class shap_enhanced.explainers.LatentSHAPExplainer(model, encoder, decoder, base_explainer_class, background, device=None, base_explainer_kwargs=None)[source]

Bases: BaseExplainer

LatentSHAPExplainer: SHAP Attribution in Autoencoded Latent Space

This class applies SHAP to the latent space of an autoencoder and projects the resulting attributions back into input space using the decoder’s Jacobian. It is especially useful for high-dimensional, structured inputs (e.g., time series) where direct SHAP attribution is noisy or expensive.

\[\phi_{\text{input}} = J_{\text{decoder}}(z) \cdot \phi_{\text{latent}}\]
Parameters:
  • model (torch.nn.Module) – Downstream predictive model, operating in input space.

  • encoder (torch.nn.Module) – Encoder network mapping input to latent space.

  • decoder (torch.nn.Module) – Decoder network mapping latent to input space.

  • base_explainer_class – SHAP explainer class (e.g., GradientExplainer).

  • background (np.ndarray or torch.Tensor) – Background dataset (N, T, F) used for SHAP estimation.

  • device – Device context (e.g., ‘cpu’ or ‘cuda’).

  • base_explainer_kwargs (dict) – Optional dictionary of kwargs passed to SHAP explainer.

property expected_value

Optional property returning the expected model output on the background dataset.

Returns:

Expected value if defined by the subclass, else None.

Return type:

float or None

explain(X, **kwargs)

Alias to shap_values for flexibility and API compatibility.

Parameters:
  • X (Union[np.ndarray, torch.Tensor, list]) – Input samples to explain.

  • kwargs – Additional arguments.

Returns:

SHAP values.

Return type:

Union[np.ndarray, list]

shap_values(X, **kwargs)[source]

Compute SHAP values in latent space and project them to input space.

Steps:

  1. Encode input and background into latent space.

  2. Run SHAP (e.g., GradientExplainer) on the latent input.

  3. Compute the Jacobian from latent to input space.

  4. Project latent SHAP values using the Jacobian.

  5. Return attributions with the original input shape.

\[\phi_{\text{input}} = J(z) \cdot \phi_{\text{latent}}\]
Parameters:

X (np.ndarray or torch.Tensor) – Input sample(s), shape (T, F) or (B, T, F)

Returns:

SHAP attributions in input space.

Return type:

np.ndarray
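
A minimal PyTorch sketch of steps 3 and 4 (Jacobian projection) with a toy decoder; it restates the projection formula above and is not the explainer's actual implementation.

import torch
import torch.nn as nn

latent_dim, T, F = 4, 5, 3
decoder = nn.Linear(latent_dim, T * F)       # toy decoder: latent -> flattened input

z = torch.randn(latent_dim)                  # latent code of one sample
phi_latent = torch.randn(latent_dim)         # SHAP values computed in latent space

# Jacobian of the decoder output w.r.t. the latent code: shape (T*F, latent_dim).
J = torch.autograd.functional.jacobian(decoder, z)

# Project latent attributions into input space and restore the (T, F) shape.
phi_input = (J @ phi_latent).reshape(T, F)
print(phi_input.shape)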

class shap_enhanced.explainers.NearestNeighborMultiBaselineSHAP(base_explainer_class, model, background, n_baselines=5, base_explainer_kwargs=None, device=None)[source]

Bases: BaseExplainer

NearestNeighborMultiBaselineSHAP: Multi-Baseline SHAP Explainer

This explainer improves attribution robustness by selecting the K nearest neighbors from a background dataset as baselines for each input sample, computing SHAP values individually for each baseline, and then averaging the results.

It is compatible with various SHAP explainers (e.g., DeepExplainer, GradientExplainer, KernelExplainer) and automatically adapts input types and parameter formats accordingly.

Note

Baseline selection is input-dependent and done per sample using L2 distance in flattened input space.

Parameters:
  • base_explainer_class – The SHAP explainer class to use (e.g., shap.DeepExplainer).

  • model (Any) – The predictive model to explain.

  • background (np.ndarray) – Background dataset (N, …) for nearest neighbor selection.

  • n_baselines (int) – Number of nearest neighbor baselines to use per sample.

  • base_explainer_kwargs (dict or None) – Additional keyword arguments passed to the SHAP explainer.

  • device (str) – Device context for torch-based explainers (‘cpu’ or ‘cuda’).

property expected_value

Optional property returning the expected model output on the background dataset.

Returns:

Expected value if defined by the subclass, else None.

Return type:

float or None

explain(X, **kwargs)

Alias to shap_values for flexibility and API compatibility.

Parameters:
  • X (Union[np.ndarray, torch.Tensor, list]) – Input samples to explain.

  • kwargs – Additional arguments.

Returns:

SHAP values.

Return type:

Union[np.ndarray, list]

shap_values(X, **kwargs)[source]

Compute SHAP values using per-sample nearest neighbor baselines.

For each sample in X, this method:

  1. Selects the n_baselines nearest neighbors from the background.

  2. Instantiates the explainer with the selected baselines.

  3. Computes SHAP values with respect to each baseline.

  4. Averages SHAP values across baselines to produce a robust explanation.

\[\phi(x) = \frac{1}{K} \sum_{k=1}^{K} \text{SHAP}(x | b_k)\]
Parameters:
  • X (np.ndarray) – Input samples to explain, shape (N, …) or single sample (…).

  • kwargs – Additional keyword arguments forwarded to the SHAP explainer.

Returns:

Averaged SHAP attributions, shape (N, …) or (…) for single input.

Return type:

np.ndarray
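
A hedged instantiation sketch using shap.GradientExplainer as the wrapped explainer; the model and background are placeholders.

import numpy as np
import shap
import torch.nn as nn
from shap_enhanced.explainers import NearestNeighborMultiBaselineSHAP

T, F = 6, 2
model = nn.Sequential(nn.Flatten(), nn.Linear(T * F, 1))        # placeholder model
background = np.random.randn(100, T, F).astype(np.float32)      # pool of candidate baselines

explainer = NearestNeighborMultiBaselineSHAP(
    base_explainer_class=shap.GradientExplainer,
    model=model,
    background=background,
    n_baselines=5,                                              # K nearest neighbors per sample
)
X = np.random.randn(3, T, F).astype(np.float32)
phi = explainer.shap_values(X)                                  # averaged over the 5 baselines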

class shap_enhanced.explainers.RLShapExplainer(model, background, device=None, policy_hidden=64)[source]

Bases: BaseExplainer

RLShapExplainer: Reinforcement Learning–based SHAP Explainer

This explainer uses a policy network trained via reinforcement learning to learn feature–time masking strategies that optimize attribution signal strength. Instead of enumerating coalitions randomly, it learns where to mask for maximal model impact and uses those masks to approximate SHAP values.

\[\text{SHAP}(i) \approx \mathbb{E}_{\pi} \left[ f(x \setminus i) - f(x) \right]\]
Parameters:
  • model (torch.nn.Module) – The predictive model to be explained.

  • background (np.ndarray or torch.Tensor) – Background dataset used for mean imputation.

  • device (str) – Torch device, either ‘cpu’ or ‘cuda’.

  • policy_hidden (int) – Hidden layer size for the masking policy network.

property expected_value

Optional property returning the expected model output on the background dataset.

Returns:

Expected value if defined by the subclass, else None.

Return type:

float or None

explain(X, **kwargs)

Alias to shap_values for flexibility and API compatibility.

Parameters:
  • X (Union[np.ndarray, torch.Tensor, list]) – Input samples to explain.

  • kwargs – Additional arguments.

Returns:

SHAP values.

Return type:

Union[np.ndarray, list]

gumbel_sample(logits, tau=0.5)[source]

Perform Gumbel-Softmax sampling over logits to generate differentiable binary-like masks.

Adds Gumbel noise to logits and applies a sigmoid activation with temperature scaling to approximate binary sampling in a differentiable way.

\[y = \sigma\left(\frac{\text{logits} + G}{\tau}\right), \quad G \sim \text{Gumbel}(0,1)\]
Parameters:
  • logits (torch.Tensor) – Raw logits over the (T, F) feature mask space.

  • tau (float) – Temperature parameter controlling sharpness of output (lower = harder mask).

Returns:

Differentiable soft mask tensor in [0, 1], same shape as logits.

Return type:

torch.Tensor
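
An illustrative PyTorch restatement of the sampling equation above (not the class's internal code): Gumbel noise is added to the logits and squashed by a temperature-scaled sigmoid.

import torch

def gumbel_sigmoid(logits, tau=0.5):
    """Differentiable soft mask in [0, 1]: sigmoid((logits + G) / tau), G ~ Gumbel(0, 1)."""
    u = torch.rand_like(logits).clamp_(1e-6, 1 - 1e-6)   # uniform noise kept away from 0 and 1
    g = -torch.log(-torch.log(u))                        # Gumbel(0, 1) samples
    return torch.sigmoid((logits + g) / tau)

mask = gumbel_sigmoid(torch.zeros(4, 3), tau=0.5)        # soft mask over a (T=4, F=3) grid
print(mask)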

shap_values(X, nsamples=50, mask_frac=0.3, tau=0.5, **kwargs)[source]

Estimate SHAP values for input X using the trained masking policy.

For each feature (t, f), multiple masks are sampled with the feature masked and unmasked. The expected difference in model outputs estimates the marginal contribution of the feature.

\[\phi_{t,f} = \mathbb{E}_{m \sim \pi} \left[ f(x_{m \cup \{(t,f)\}}) - f(x_m) \right]\]
Parameters:
  • X (np.ndarray or torch.Tensor) – Input to explain, shape (T, F) or (B, T, F).

  • nsamples (int) – Number of mask samples to average over.

  • mask_frac (float) – Fraction of features masked per sample.

  • tau (float) – Gumbel-Softmax temperature.

  • kwargs – Additional keyword arguments (not used).

Returns:

Estimated SHAP values with same shape as input.

Return type:

np.ndarray

train_policy(n_steps=500, batch_size=16, mask_frac=0.3)[source]

Train the masking policy network using policy gradient optimization.

The network is optimized to generate masks that maximize the absolute change in the model’s prediction when masking certain input features.

Note

Gumbel-Softmax is used to approximate discrete mask sampling for differentiability.

Parameters:
  • n_steps (int) – Number of training iterations.

  • batch_size (int) – Batch size sampled from the background at each step.

  • mask_frac (float) – Fraction of features to mask in each sampled coalition.

class shap_enhanced.explainers.SparseCoalitionSHAPExplainer(model, background, onehot_groups=None, mask_strategy='zero', device=None)[source]

Bases: BaseExplainer

SparseCoalitionSHAPExplainer: Valid SHAP for Structured Sparse Inputs

This explainer approximates Shapley values by sampling valid sparse coalitions of features. It ensures that perturbed inputs remain syntactically valid, especially for inputs with structured sparsity such as one-hot encodings or binary indicator features.

Note

One-hot groups are masked as entire sets to simulate “no class selected”. General binary features are masked element-wise.

Parameters:
  • model (Any) – Predictive model to explain.

  • background (np.ndarray or torch.Tensor) – Background data (not directly used but required for base class).

  • onehot_groups (list[list[int]] or None) – List of one-hot index groups, e.g., [[0,1,2], [3,4]].

  • mask_strategy (str) – Currently supports only “zero” masking.

  • device (str) – Device context for evaluation (e.g., ‘cuda’ or ‘cpu’).

property expected_value

Optional property returning the expected model output on the background dataset.

Returns:

Expected value if defined by the subclass, else None.

Return type:

float or None

explain(X, **kwargs)

Alias to shap_values for flexibility and API compatibility.

Parameters:
  • X (Union[np.ndarray, torch.Tensor, list]) – Input samples to explain.

  • kwargs – Additional arguments.

Returns:

SHAP values.

Return type:

Union[np.ndarray, list]

shap_values(X, nsamples=100, check_additivity=True, random_seed=42, **kwargs)[source]

Estimate SHAP values using sparse-valid coalitions.

For each input sample:

  • Iterates over all features (or one-hot groups).

  • Randomly samples subsets of other features/groups to form coalitions.

  • Computes model output difference when adding the current feature/group to the coalition.

  • Averages these differences to estimate the Shapley value.

\[\phi_i = \mathbb{E}_{S \subseteq N \setminus \{i\}} \left[ f(S \cup \{i\}) - f(S) \right]\]

Final attributions are normalized such that:

\[\sum_i \phi_i = f(x) - f(x_{\text{masked}})\]
Parameters:
  • X (np.ndarray or torch.Tensor) – Input instance(s), shape (T, F) or (B, T, F).

  • nsamples (int) – Number of coalition samples per feature/group.

  • check_additivity (bool) – If True, prints the additivity check.

  • random_seed (int) – Seed for reproducible sampling.

Returns:

SHAP attribution values, same shape as input.

Return type:

np.ndarray

class shap_enhanced.explainers.SupportPreservingSHAPExplainer(model, background, skip_unmatched=True, device=None)[source]

Bases: BaseExplainer

SupportPreservingSHAPExplainer: Real-Pattern-Constrained SHAP Estimator

This explainer approximates SHAP values by generating only masked inputs that match real examples in the dataset—preserving the discrete or sparse structure of the input space. It avoids out-of-distribution perturbations by requiring coalitions (masked variants) to have binary support patterns that exist in the original data.

If the data is not sparse (e.g., continuous), the method falls back to mean-masking, akin to standard SHAP explainers.

Parameters:
  • model (Any) – Predictive model to explain.

  • background (np.ndarray or torch.Tensor) – Dataset used to match support patterns (shape: (N, T, F) or (N, F)).

  • skip_unmatched (bool) – If True, coalitions without support-matching background samples are skipped.

  • device (str) – Device to evaluate model on (‘cpu’ or ‘cuda’).

property expected_value

Optional property returning the expected model output on the background dataset.

Returns:

Expected value if defined by the subclass, else None.

Return type:

float or None

explain(X, **kwargs)

Alias to shap_values for flexibility and API compatibility.

Parameters:
  • X (Union[np.ndarray, torch.Tensor, list]) – Input samples to explain.

  • kwargs – Additional arguments.

Returns:

SHAP values.

Return type:

Union[np.ndarray, list]

shap_values(X, nsamples=100, check_additivity=True, random_seed=42, **kwargs)[source]

Compute SHAP values by evaluating only valid support-preserving perturbations.

For sparse inputs (e.g., one-hot or binary):

  • For each feature, sample coalitions of other features.

  • Construct masked inputs and locate matching background samples with same nonzero support.

  • Evaluate model differences with and without the feature of interest.

  • Average differences to estimate SHAP values.

For dense inputs, the method falls back to standard mean-based masking for each feature individually.

\[\phi_i = \mathbb{E}_{S \subseteq N \setminus \{i\}} \left[ f(x_{S \cup \{i\}}) - f(x_S) \right]\]

Final attributions are normalized such that:

\[\sum_i \phi_i = f(x) - f(x_{\text{masked}})\]
Parameters:
  • X (np.ndarray or torch.Tensor) – Input sample or batch of shape (T, F) or (B, T, F).

  • nsamples (int) – Number of coalition samples per feature.

  • check_additivity (bool) – If True, prints sum of SHAP vs model output difference.

  • random_seed (int) – Seed for reproducibility.

Returns:

SHAP attributions with same shape as input.

Return type:

np.ndarray

class shap_enhanced.explainers.SurrogateSHAPExplainer(model, background, base_explainer, regressor_class=sklearn.ensemble.RandomForestRegressor, regressor_kwargs=None, nsamples_base=100, scale_inputs=True, scale_outputs=False, device=None)[source]

Bases: BaseExplainer

SurrogateSHAPExplainer: Fast SHAP Approximation via Supervised Regression

SurroSHAP accelerates SHAP attribution by training a surrogate regression model that maps input features to SHAP attributions. This is useful when repeated SHAP computation is too costly or when near-instant explanations are needed for deployment.

The surrogate model is trained on a background dataset where “true” SHAP values are first computed using a base explainer (e.g., DeepExplainer, KernelExplainer), and then used as regression targets.

Note

Any sklearn-style regressor can be used (e.g., RandomForestRegressor, KernelRidge, etc.).

Parameters:
  • model (Any) – Predictive model to be explained.

  • background (np.ndarray) – Background dataset for training surrogate and computing SHAP targets. Shape: (N, T, F).

  • base_explainer (Any) – A SHAP-style explainer instance (already constructed).

  • regressor_class (type) – Regressor class implementing fit/predict API. Defaults to RandomForestRegressor.

  • regressor_kwargs (dict) – Optional keyword arguments for the regressor.

  • nsamples_base (int) – Number of SHAP samples used for each background point.

  • scale_inputs (bool) – Whether to standardize input features during training.

  • scale_outputs (bool) – Whether to standardize SHAP values during training.

  • device (str or torch.device) – Torch device (e.g., ‘cpu’ or ‘cuda’).

property expected_value

Optional property returning the expected model output on the background dataset.

Returns:

Expected value if defined by the subclass, else None.

Return type:

float or None

explain(X, **kwargs)

Alias to shap_values for flexibility and API compatibility.

Parameters:
  • X (Union[np.ndarray, torch.Tensor, list]) – Input samples to explain.

  • kwargs – Additional arguments.

Returns:

SHAP values.

Return type:

Union[np.ndarray, list]

shap_values(X, **kwargs)[source]

Predict SHAP attributions for new inputs using the trained surrogate model.

The input is reshaped and (optionally) standardized to match the format used during surrogate training, and the predicted SHAP values are inverse-transformed (if scaling was applied).

Note

This bypasses SHAP computation entirely and relies on the surrogate regressor.

Parameters:

X (np.ndarray or torch.Tensor) – Input instance or batch, shape (T, F) or (B, T, F).

Returns:

Approximated SHAP attributions, same shape as input.

Return type:

np.ndarray
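
A hedged construction sketch: a base explainer from this package supplies the SHAP training targets, and a KernelRidge surrogate replaces the default RandomForestRegressor; the model and data are placeholders.

import numpy as np
import torch.nn as nn
from sklearn.kernel_ridge import KernelRidge
from shap_enhanced.explainers import AdaptiveBaselineSHAPExplainer, SurrogateSHAPExplainer

T, F = 6, 3
model = nn.Sequential(nn.Flatten(), nn.Linear(T * F, 1))        # placeholder model
background = np.random.randn(80, T, F).astype(np.float32)       # (N, T, F)

# Base explainer that produces the "true" SHAP targets for surrogate training.
base = AdaptiveBaselineSHAPExplainer(model, background)

surro = SurrogateSHAPExplainer(
    model, background,
    base_explainer=base,
    regressor_class=KernelRidge,        # any sklearn-style fit/predict regressor
    regressor_kwargs={"alpha": 1.0},
    nsamples_base=100,
)
phi = surro.shap_values(background[:5])  # near-instant surrogate predictions, shape (5, T, F)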

class shap_enhanced.explainers.TimeSHAPExplainer(model, background, mask_strategy='mean', event_window=None, prune_topk=None, device=None)[source]

Bases: BaseExplainer

TimeSHAPExplainer: Pruned SHAP Attribution for Sequential Models

Implements a SHAP-style explainer for time-series and sequential data using pruning to efficiently estimate per-(t, f) or event-level attributions.

Combines:
  • Masking strategy (zero or mean-based).

  • Optional event windowing (for segment-level attribution).

  • Top-k pruning to reduce the coalition space before final SHAP estimation.

Parameters:
  • model (Any) – The model to be explained.

  • background (np.ndarray or torch.Tensor) – Background dataset for imputation and mean estimation.

  • mask_strategy (str) – Masking method, either ‘zero’ or ‘mean’.

  • event_window (int or None) – Optional window size for event-based attribution.

  • prune_topk (int or None) – If specified, retain only top-k units (based on rough attribution) for refinement.

  • device (str) – Computation device (‘cpu’ or ‘cuda’).

property expected_value

Optional property returning the expected model output on the background dataset.

Returns:

Expected value if defined by the subclass, else None.

Return type:

float or None

explain(X, **kwargs)

Alias to shap_values for flexibility and API compatibility.

Parameters:
  • X (Union[np.ndarray, torch.Tensor, list]) – Input samples to explain.

  • kwargs – Additional arguments.

Returns:

SHAP values.

Return type:

Union[np.ndarray, list]

shap_values(X, nsamples=100, level='timestep', check_additivity=True, random_seed=42, **kwargs)[source]

Compute SHAP values for sequential input with optional pruning and window-based attribution.

Note

Pruned estimation uses an initial coarse pass to identify important units (features, timesteps, or windows), followed by refined SHAP estimation over that subset.

Parameters:
  • X (Union[np.ndarray, torch.Tensor]) – Input tensor or array of shape (T, F) or (B, T, F).

  • nsamples (int) – Number of coalitions to sample per unit.

  • level (str) – Attribution level: ‘timestep’, ‘feature’, or ‘event’.

  • check_additivity (bool) – If True, print additivity diagnostics.

  • random_seed (int) – Random seed for reproducibility.

Returns:

SHAP values with shape (T, F) or (B, T, F).

Return type:

np.ndarray
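
A hedged sketch of event-level attribution with pruning; the model and background are placeholders, and the windowing and pruning arguments follow the constructor documentation above.

import numpy as np
import torch.nn as nn
from shap_enhanced.explainers import TimeSHAPExplainer

T, F = 20, 4
model = nn.Sequential(nn.Flatten(), nn.Linear(T * F, 1))        # placeholder sequence model
background = np.random.randn(100, T, F).astype(np.float32)      # (N, T, F)

explainer = TimeSHAPExplainer(
    model, background,
    mask_strategy="mean",
    event_window=5,      # group time steps into events of 5 steps
    prune_topk=8,        # refine only the 8 most promising units
)
phi = explainer.shap_values(background[0], nsamples=100, level="event")  # shape (T, F)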

Modules

  • ABSHAP: Adaptive Baseline SHAP (Sparse)

  • AttnSHAP: Attention-Guided SHAP with General Proxy Attention

  • BSHAP: Distribution-Free SHAP for Sequential Models

  • CASHAP: Coalition-Aware SHAP Explainer

  • CMSHAP: Contextual Masking SHAP (CM-SHAP) Explainer for Sequential Models

  • ECSHAP: Empirical Conditional SHAP for Discrete Data

  • ERSHAP: Ensemble of Random SHAP Explainer

  • ESSHAP: EnsembleSHAPWithNoise, a Robust Ensemble Wrapper for SHAP/Custom Explainers

  • LatentSHAP: Latent SHAP with Autoencoding for Structured Time Series

  • MBSHAP: Multi-Baseline SHAP Explainer

  • RLSHAP: Reinforcement Learning SHAP Explainer

  • SCSHAP: Sparse Coalition SHAP Explainer

  • SPSHAP: Support-Preserving SHAP Explainer

  • SurroSHAP: Surrogate Model SHAP Explainer

  • TimeSHAP: Pruning-Enhanced SHAP for Sequential Models

  • hSHAP: Hierarchical SHAP Explainer