API reference

This page documents the public Python API exposed by hugiml-core. The manual sections in the user guide explain how these APIs fit together in a modeling workflow; the reference below is generated from the source docstrings.

Core estimator

HUGIMLClassifier — C++ accelerated, scikit-learn compatible classifier.

HUGIMLClassifier is the primary public class name. HUGIMLClassifierNative remains as a backward-compatible alias.

Implements the High Utility Gain Interpretable Machine Learning (HUG-IML) algorithm from:

Krishnamoorthy, S. (2024). Interpretable Classifier Models for Decision Support Using High Utility Gain Patterns. IEEE Access, 12, 126088–126107. DOI: 10.1109/ACCESS.2024.3455563

Computationally intensive stages (discretisation, transaction construction, pattern mining, matrix assembly) run at native speed via a compiled C++ extension with optional OpenMP parallelism. The Python layer handles DataFrame ingestion, column-type detection, downstream estimation, explanation methods, monitoring, and drift detection.

Architecture

C++ extension (_hugiml_core):

Discretisation, transaction construction, top-K HUI pattern mining with information-gain filtering, bitmap-accelerated matrix assembly, OpenMP parallel pattern matching.

Python layer:

Column-type detection (prepareXy), NaN/Inf imputation, downstream sklearn estimator (LogisticRegression default), explanation methods (get_hug_features, get_pattern_info, feature_importances), versioned model serialisation, prediction monitoring, multi-method drift detection, latency SLA enforcement, and graceful degradation under memory pressure.

Quick start

Two usage paths are supported:

Path A — prepareXy (recommended when the full dataset is available upfront):

from sklearn.model_selection import train_test_split

from hugiml import HUGIMLClassifier

clf = HUGIMLClassifier()
X, y = clf.prepareXy(X_df, y_series)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y)
clf.fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)

print(clf.model_summary())
print(clf.feature_importances())

Path B — allCols + origColumns (cross-validation loops):

clf = HUGIMLClassifier(
    allCols=[int_cols, float_cols, cat_cols],
    origColumns=X_df.columns.tolist(),
)
clf.fit(X_train, y_train)

Monitoring and drift detection:

clf.enable_monitoring()
clf.predict_proba(X_new)
print(clf.monitor.report())

drift = clf.detect_drift(X_new)
print(drift)

Versioned serialisation:

clf.save_model("model.hugiml")
clf2 = HUGIMLClassifier.load_model("model.hugiml")

hugiml.classifier.HUGIMLClassifier

alias of HUGIMLClassifierNative

class hugiml.classifier.HUGIMLClassifierNative(allCols=None, origColumns=None, B=8, L=1, G=0.001, topK=30, base_estimator=None, n_jobs=1, max_predict_ms=None, max_fit_seconds=None, verbose=False, adaptive_binning=False, b_candidates=None, min_marginal_gain_ratio=0.02, feature_mode='patterns_only', use_hotpath=True, augmented_pair_transforms=True, augmented_pair_max_features=10, topk_budget_strict=False, dense_downstream_max_width=200, execution_mode='audit')[source]

Bases: TransformerMixin, ClassifierMixin, BaseEstimator

HUG-IML interpretable classifier — C++ accelerated, scikit-learn compatible.

Extracts High Utility Gain (HUG) patterns from labelled tabular data, transforms the input into a binary pattern-presence matrix, and fits an interpretable downstream classifier. The mined patterns are human-readable and serve as the primary source of model explanations.

Parameters:
  • allCols (list of 3 lists, optional) – [int_col_names, float_col_names, cat_col_names]. Must be paired with origColumns.

  • origColumns (list of str, optional) – Ordered column names matching the columns of X passed to fit/predict.

  • B (int, default 8) – Number of quantile bins per numerical feature. Use -1 for supervised auto-selection (maximises IG over [2, 20]).

  • L (int, default 1) – Maximum HUG pattern length. 1 = singletons; 2 = pairs; -1 = unlimited.

  • G (float, default 1e-3) – Minimum information-gain threshold.

  • topK (int, default 30) – Maximum number of patterns to retain. -1 computes automatically.

  • base_estimator (sklearn estimator, optional) – Downstream classifier trained on the binary pattern matrix. Defaults to LogisticRegression.

  • n_jobs (int, default 1) – Number of OpenMP threads. -1 uses all available cores.

  • max_predict_ms (float or None) – Prediction latency budget in milliseconds.

  • max_fit_seconds (float or None) – Wall-clock budget for the pattern-mining stage of fit(). Transaction preparation and downstream model fitting (e.g. LogisticRegression) are not bounded — total fit() time may exceed this value. When the budget is exhausted mid-mine, graceful degradation produces a smaller pattern set; if even the minimal fallback cannot finish in time, HUGIMLTimeoutError is raised.

  • verbose (bool, default False) – Emit INFO-level log messages during fit.

  • adaptive_binning (bool)

  • b_candidates (list | None)

  • min_marginal_gain_ratio (float)

  • feature_mode (str)

  • use_hotpath (bool)

  • augmented_pair_transforms (bool)

  • augmented_pair_max_features (int)

  • topk_budget_strict (bool)

  • dense_downstream_max_width (int)

  • execution_mode (str)

Attributes (available after fit):
  • classes (ndarray) – Unique class labels.

  • n_features_in (int) – Number of input features.

  • feature_names_in_ (list or None) – Column names from training data.

  • cat_cols_mask (ndarray[bool]) – True for categorical columns.

  • is_int_mask (ndarray[bool]) – True for integer columns.

  • td (_TransactionDataWrapper) – Discretisation artefacts.

  • patterns (list) – Mined HUG patterns.

  • x_train_hup (csr_matrix) – Binary training pattern matrix.

  • model (Pipeline) – Fitted downstream estimator.

  • fit_metadata (FitMetadata) – Timings, memory, pattern stats.

  • monitor (PredictionMonitor or None) – Prediction statistics.

classmethod from_preset(name, **overrides)[source]

Create a classifier from a named configuration preset.

Parameters:
  • name ({'quick', 'balanced', 'thorough'}) – Preset name:
      quick: B=5, L=1, G=1e-2, topK=50
      balanced: B=7, L=1, G=5e-3, topK=-1
      thorough: B=-1, L=2, G=1e-4, topK=-1

  • overrides (Any)

Return type:

HUGIMLClassifierNative
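The preset mechanism can be pictured as a lookup table merged with keyword overrides. The sketch below is not the library's implementation: PRESETS and resolve_preset are hypothetical names, the values are copied from the parameter description above, and the assumption that overrides win over preset values is inferred from the **overrides signature.

```python
# Hypothetical preset table; values copied from the from_preset() description.
PRESETS = {
    "quick": {"B": 5, "L": 1, "G": 1e-2, "topK": 50},
    "balanced": {"B": 7, "L": 1, "G": 5e-3, "topK": -1},
    "thorough": {"B": -1, "L": 2, "G": 1e-4, "topK": -1},
}


def resolve_preset(name, **overrides):
    """Return constructor keyword arguments for a preset, with overrides applied last."""
    if name not in PRESETS:
        raise ValueError(f"unknown preset {name!r}; expected one of {sorted(PRESETS)}")
    params = dict(PRESETS[name])  # copy so the table itself is never mutated
    params.update(overrides)      # assumption: explicit overrides take precedence
    return params


params = resolve_preset("balanced", topK=100)
print(params)
```

Under these assumptions, HUGIMLClassifier.from_preset("balanced", topK=100) would correspond to constructing the classifier with the merged dictionary.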

classmethod default_param_grid()[source]

Return the default validation grid for compact HUGIML tuning.

The grid uses adaptive binning (B=-1), searches L in {1, 2}, searches feature_mode in {'patterns_only', 'original_plus_patterns'}, keeps G fixed at 1e-3, and searches topK in {30, 50, 100}. For L > 1 and augmented_pair_transforms=True, native augmented-pair transforms are created internally from the top-10 native-IG numeric features and capped to the same topK budget by transform IG.

Return type:

dict[str, list]

get_params(deep=True)[source]

Return constructor parameters as a dict (sklearn protocol).

Parameters:

deep (bool)

Return type:

dict

set_params(**params)[source]

Set constructor parameters in-place and return self (sklearn protocol).

Parameters:

params (Any)

Return type:

HUGIMLClassifierNative

save_model(path)[source]

Persist the fitted model to a binary file with schema versioning.

Parameters:

path (str or Path)

Raises:

HUGIMLSerializationError

Return type:

None

classmethod load_model(path)[source]

Load a model previously saved with save_model().

Parameters:

path (str or Path)

Return type:

HUGIMLClassifierNative

Raises:

HUGIMLVersionError, HUGIMLSerializationError

prepareXy(X, y)[source]

Detect column types and encode the target variable.

Call on the full dataset before any train/test split. Records which columns are integer, float, or categorical, and performs basic label validation.

Parameters:
  • X (pd.DataFrame)

  • y (pd.Series or array-like)

Returns:

  • X (pd.DataFrame (copy with string column names))

  • y (np.ndarray of int64)

Return type:

tuple[DataFrame, ndarray]

fit(X, y)[source]

Fit the HUG-IML model on training data.

Parameters:
  • X (pd.DataFrame or ndarray, shape (n_samples, n_features))

  • y (array-like of int, shape (n_samples,))

Thread safety

fit() acquires an exclusive lock, so concurrent fit() calls on the same instance are serialized. predict, predict_proba, and transform only read fitted state and are safe for concurrent use after fit() returns.

Returns:

self

Return type:

HUGIMLClassifierNative

predict_proba(X_test)[source]

Predict class probabilities for X_test.

When max_predict_ms is set, large batches are processed in chunks. Rows that exceed the time budget receive uniform probabilities, and a warning is emitted.

Parameters:

X_test (array-like or DataFrame)

Return type:

np.ndarray, shape (n_samples, n_classes)
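The chunked latency budget described above can be illustrated with a small stand-alone sketch. It is an assumption-laden illustration, not the library's code: predict_proba_with_budget, chunk_size, and the injectable clock are hypothetical, and the real chunking policy is not documented here.

```python
import time
import warnings


def predict_proba_with_budget(rows, predict_chunk, n_classes, max_predict_ms,
                              chunk_size=2, clock=time.perf_counter):
    """Score rows chunk by chunk; chunks started after the deadline get uniform rows."""
    uniform = [1.0 / n_classes] * n_classes
    deadline = clock() + max_predict_ms / 1000.0
    out, skipped = [], 0
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        if clock() >= deadline:
            out.extend(list(uniform) for _ in chunk)  # fallback: no class preferred
            skipped += len(chunk)
        else:
            out.extend(predict_chunk(chunk))
    if skipped:
        warnings.warn(f"{skipped} rows exceeded the latency budget")
    return out


# Deterministic demo: the fake clock reads 0 ms at the first chunk, then 10 ms and 20 ms.
ticks = iter([0.0, 0.0, 0.010, 0.020])
rows = [[0.1], [0.9], [0.4], [0.7], [0.2]]
proba = predict_proba_with_budget(
    rows,
    predict_chunk=lambda chunk: [[1.0 - r[0], r[0]] for r in chunk],  # toy scorer
    n_classes=2,
    max_predict_ms=5,
    clock=lambda: next(ticks),
)
print(proba)
```

With a 5 ms budget only the first chunk is scored by the toy model; the remaining rows fall back to [0.5, 0.5].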

predict(X_test)[source]

Predict class labels for X_test.

Parameters:

X_test (array-like or DataFrame)

Return type:

np.ndarray, shape (n_samples,)

get_augmented_pair_transforms()[source]

Return augmented pair transforms used by the downstream estimator.

Each catalog entry includes the raw pair formula, source-feature IG provenance, candidate coverage, unavailable-pair policy, and the standardization parameters used before the downstream estimator sees the feature. Candidate IG is scored on rows where both source values are observed. For selected features, rows where the pair value cannot be computed receive the pair feature’s training reference value before standardization, yielding a neutral standardized value.

Return type:

list[dict[str, Any]]

get_augmented_pair_standardization()[source]

Return standardization metadata for augmented pair features.

The returned columns are aligned to get_augmented_pair_transforms() and make the raw-to-estimator transformation explicit.

Return type:

DataFrame

explain_augmented_pair_effects()[source]

Explain augmented-pair effects in standardized and raw units.

The downstream estimator is fit on standardized augmented-pair values. This method converts each standardized coefficient back to the raw pair scale and states that the reference value is the training-cohort mean of the observed pair term, not a domain-specific baseline. Candidate scoring uses rows where both source values are observed. For selected features, rows where the pair cannot be computed receive the pair feature’s training reference raw value before standardization, yielding a neutral standardized value for that pair term. HUGIML pattern features keep their native missing-value handling.

For logistic-regression downstream models, coefficient columns are log-odds effects. Product-term effects are expressed on the product scale; changing one individual input does not have a fixed marginal effect because it depends on the current value of the other input.

Return type:

DataFrame
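The conversion between standardized and raw coefficients follows from z = (x - mean) / std: a standardized coefficient beta contributes beta * z, which equals (beta / std) * (x - mean) on the raw scale. The sketch below works through that arithmetic. The helper names are hypothetical, None stands in for an unavailable pair value, and the use of the population standard deviation is an assumption.

```python
import statistics


def standardization_params(values):
    """Mean and standard deviation over the observed (non-None) pair values."""
    observed = [v for v in values if v is not None]
    return statistics.fmean(observed), statistics.pstdev(observed)


def to_standardized(value, mean, std):
    """Unavailable values are replaced by the training mean, giving a neutral z of 0."""
    raw = mean if value is None else value
    return (raw - mean) / std


def raw_scale_coefficient(beta_std, std):
    """Log-odds change per one raw unit of the pair term."""
    return beta_std / std


pair_values = [2.0, 4.0, None, 6.0]   # toy product terms; one row could not be computed
mean, std = standardization_params(pair_values)
beta_std = 0.8                         # toy standardized log-odds coefficient
beta_raw = raw_scale_coefficient(beta_std, std)

print(to_standardized(None, mean, std))             # neutral contribution
print(beta_std * to_standardized(6.0, mean, std))   # effect in standardized units
print(beta_raw * (6.0 - mean))                      # same effect from the raw scale
```

The last two printed values agree up to floating-point rounding, which is the sense in which the raw-scale coefficient is measured relative to the training mean rather than a domain-specific baseline.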

transform(X)[source]

Return the binary HUG pattern matrix for X.

Each column corresponds to one mined pattern. Entry (i, j) is 1 when all items of pattern j appear in row i.

Parameters:

X (array-like or DataFrame)

Return type:

csr_matrix, shape (n_samples, n_patterns)
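The pattern-presence rule can be shown with plain sets. This is a dense, pure-Python sketch of the subset test only; pattern_matrix is a hypothetical helper, and the library itself returns a sparse csr_matrix built by the C++ extension.

```python
def pattern_matrix(transactions, patterns):
    """Entry (i, j) is 1 when every item of pattern j appears in row i's item set."""
    return [[1 if pattern <= items else 0 for pattern in patterns]
            for items in transactions]


# Each row is the set of items produced by discretisation (toy values).
transactions = [
    frozenset({"age=[35,50)", "gender=F"}),
    frozenset({"age=[20,35)", "gender=F"}),
    frozenset({"age=[35,50)", "gender=M"}),
]
patterns = [
    frozenset({"age=[35,50)"}),               # singleton
    frozenset({"gender=F"}),                  # singleton
    frozenset({"age=[35,50)", "gender=F"}),   # compound pattern (length 2)
]

matrix = pattern_matrix(transactions, patterns)
for row in matrix:
    print(row)
```

Only the first row activates the compound pattern, because it is the only row containing both items.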

enable_monitoring(window_size=1000)[source]

Enable prediction monitoring. Access via self.monitor.

Parameters:

window_size (int)

Return type:

HUGIMLClassifierNative

disable_monitoring()[source]

Disable prediction monitoring.

Return type:

HUGIMLClassifierNative

detect_drift(X_test, y_test=None, threshold=0.1)[source]

Run multi-method drift detection and return a human-readable report.

Uses PSI + KL divergence. When y_test is provided, also checks label distribution drift.

Notes

Drift metrics are computed on the numeric array retained by the mining path. Fixed-B numeric columns that contained NaN/Inf during training are converted to the categorical bin-label path so missingness is handled consistently at fit/predict time; those columns are therefore not represented as continuous numeric drift baselines. PSI/KL alerts for such columns should be interpreted through pattern/feature-importance diagnostics rather than through detect_drift().

Parameters:
  • X_test (array-like or DataFrame)

  • y_test (array-like, optional)

  • threshold (float)

Return type:

str
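For readers unfamiliar with the two statistics, the sketch below computes PSI and KL divergence for one numeric feature using the standard textbook formulas. The binning scheme, the eps floor, and the function names are assumptions for illustration; the library's internal binning is not specified in this reference.

```python
import math
from bisect import bisect_right


def bin_fractions(values, edges, eps=1e-6):
    """Fraction of values per bin (interior edges), floored at eps to avoid log(0)."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    return [max(c / len(values), eps) for c in counts]


def psi(expected, actual):
    """Population Stability Index: sum of (a - e) * ln(a / e) over bins."""
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))


def kl_divergence(expected, actual):
    """KL(actual || expected): sum of a * ln(a / e) over bins."""
    return sum(a * math.log(a / e) for e, a in zip(expected, actual))


train = list(range(100))             # toy training feature
shifted = [v + 20 for v in train]    # same feature after a mean shift
edges = [25, 50, 75]                 # toy bin edges taken from the training range

expected = bin_fractions(train, edges)
actual = bin_fractions(shifted, edges)
score = psi(expected, actual)
print(round(score, 4), score > 0.1)  # compare against the default threshold of 0.1
```

PSI is the symmetrised form of KL divergence, so psi(e, a) equals kl_divergence(e, a) + kl_divergence(a, e).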

get_drift_psi(X_test)[source]

Return per-feature PSI values as a dict.

See detect_drift() for the fixed-B missing-numeric limitation: columns that were routed to categorical bin labels because they contained NaN/Inf during training do not have meaningful continuous PSI baselines.

Parameters:

X_test (Any)

Return type:

dict

cross_validate_monitored(X, y, cv=None, scoring='roc_auc')[source]

Cross-validation with per-fold monitoring and drift detection.

Parameters:
  • X (pd.DataFrame or ndarray)

  • y (array-like)

  • cv (int or CV splitter (default: StratifiedKFold(5)))

  • scoring (str)

Returns:

dict with keys test_scores, fit_times_ms, fold_monitors, fold_drift, and fold_metadata.

Return type:

dict

get_hug_features()[source]

Return a human-readable label for each mined HUG pattern.

Singleton patterns use the format feature=[lo,hi) for adaptive numerical columns (e.g. age=[35,50)) and feature=value for categorical columns (e.g. gender=F). Compound patterns (L > 1) are comma-separated, e.g. age=[35,50), gender=F.

When adaptive_binning=True and the integer-code path was used, C++ stores bin labels as feature=[k,k+1] (integer range). These are transparently remapped to the original-scale [lo,hi) labels via _adaptive_code_label_map_ so that the output is identical in appearance to the string-path output.

Production mode

This method remains available in execution_mode='production' because it only needs retained pattern labels. get_pattern_info() is intentionally audit-only because it additionally needs the retained training pattern matrix to compute support.

Return type:

list[str]

get_transformed_shape()[source]

Return (n_samples, n_patterns) for the training pattern matrix.

In production mode the matrix itself is not retained, but its shape is persisted as lightweight diagnostic metadata.

Return type:

tuple[int, int]

get_pattern_info()[source]

Summary DataFrame with one row per mined HUG pattern.

Columns: pattern, utility, information_gain, support.

This is an audit/governance table. Unlike get_hug_features(), it requires the retained training pattern matrix to compute support and therefore raises a clear error in execution_mode='production'.

Return type:

DataFrame

get_downstream_features()[source]

Return names aligned with the downstream estimator input columns.

The returned names include a namespace prefix so feature provenance is explicit: orig: for original features, pattern: for mined HUG patterns, and augmented_pair: for augmented pair transforms. When topk_budget_strict=True, the returned list is already filtered to the columns retained by the fitted strict TopK mask.

Return type:

list[str]

get_model_composition()[source]

Return downstream feature composition and relevant fitted configuration.

The composition describes the actual feature families entering the downstream estimator after feature-mode construction and optional strict TopK filtering.

Return type:

dict[str, Any]

feature_importances()[source]

Map downstream estimator coefficients to final feature names.

Returns a DataFrame sorted by absolute coefficient magnitude. Feature names are aligned to the downstream estimator after feature-mode and optional strict TopK filtering have been applied. The feature_type column distinguishes original features, mined HUG patterns, and augmented pair transforms. pattern_support is populated only for mined HUG patterns; original and augmented-pair features use support_type='not_applicable' and pattern_support=NaN.

Raises:

AttributeError – When the downstream estimator does not expose coef_ (e.g. non-linear models).

Return type:

DataFrame

plot_bin_profiles(figsize=None)[source]

Bar chart of the chosen B per numerical feature (adaptive binning only).

Colour encodes position in the candidate range: blue = coarse end, green = mid, amber/red = fine end.

Return type:

(fig, ax)

Raises:
  • RuntimeError – When called on a non-adaptive or unfitted model.

  • ImportError – When matplotlib is not installed.

Parameters:

figsize (tuple | None)

ig_heatmap(figsize=None)[source]

Heatmap of IG score at every (feature, B) grid point (adaptive binning only).

The chosen B per feature is highlighted with a bounding box.

Return type:

(fig, ax)

Raises:
  • RuntimeError – When called on a non-adaptive or unfitted model, or when ig_scores_ is empty.

  • ImportError – When matplotlib is not installed.

Parameters:

figsize (tuple | None)

model_summary()[source]

Human-readable model summary including top patterns.

Return type:

str

classmethod fast_grid_tune(X_train, y_train, X_val, y_val, param_grid=None, *, base_params=None, scoring='roc_auc', refit_full=False, return_results=True)

Exact cached tuner for the compact adaptive HUGIML grid.

Requirements

  • adaptive_binning=True for every candidate.

  • G may vary; the tuner partitions candidates into fixed-G cache groups.

  • Only G, L, topK, and feature_mode vary. B may appear in the grid but is ignored for cache partitioning because adaptive binning chooses per-feature bins and fit() passes sentinel B=2 to the native transaction builder.

  • max_fit_seconds must be None to guarantee equivalence to the ordinary grid loop; timeout/degradation can make cached mining fits differ from standalone candidates.

Returns a dict with best_model, best_params, best_score, cv_results, and cache timings. Uses the same scorer as the ordinary grid path for all supported scoring values. During tuning it skips drift-baseline and rich final metadata; set refit_full=True to refit the selected model with normal fit().

Parameters:
  • X_train (Any)

  • y_train (Any)

  • X_val (Any)

  • y_val (Any)

  • param_grid (dict[str, list] | None)

  • base_params (dict[str, Any] | None)

  • scoring (str)

  • refit_full (bool)

  • return_results (bool)

Return type:

dict[str, Any]
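The partitioning requirement can be illustrated by expanding a grid and grouping the candidates by G. This sketch shows only the grouping idea: expand_grid and partition_by_g are hypothetical helpers, and what the tuner actually caches per group is not documented here.

```python
from collections import defaultdict
from itertools import product


def expand_grid(param_grid):
    """Expand an sklearn-style grid into a list of candidate parameter dicts."""
    keys = sorted(param_grid)
    return [dict(zip(keys, combo))
            for combo in product(*(param_grid[k] for k in keys))]


def partition_by_g(candidates):
    """Group candidates that share the same G so mining work could be reused per group."""
    groups = defaultdict(list)
    for candidate in candidates:
        groups[candidate["G"]].append(candidate)
    return dict(groups)


grid = {
    "G": [1e-3, 5e-3],
    "L": [1, 2],
    "topK": [30, 50, 100],
    "feature_mode": ["patterns_only", "original_plus_patterns"],
}
candidates = expand_grid(grid)
groups = partition_by_g(candidates)
print(len(candidates), {g: len(members) for g, members in groups.items()})
```

Each group holds candidates that differ only in L, topK, and feature_mode, which are the parameters the requirements above allow to vary within a fixed-G group.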

set_predict_proba_request(*, X_test='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the predict_proba method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict_proba if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict_proba.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in scikit-learn version 1.3.

Parameters:
  • X_test (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for X_test parameter in predict_proba.

  • self (HUGIMLClassifierNative)

Returns:

self – The updated object.

Return type:

object

classmethod tune(X, y, *, cv=5, scoring='roc_auc', param_grid=None, refit=True, base_params=None, random_state=42, shuffle=True, cv_splits=None, use_fast_path=True, return_dataframe=True)

Tune HUGIML on full X, y using stratified CV and optional fast-grid caching.

This is the main public convenience API for quick HUGIML model selection. The regular constructor remains a single-configuration estimator; this method owns grid search, cross-validation, aggregation, and optional refit.

Parameters:
  • X (array-like or DataFrame/Series) – Full training data.

  • y (array-like or DataFrame/Series) – Target labels aligned with the rows of X.

  • cv (int or splitter, default=5) – Number of stratified folds, or any sklearn-compatible splitter with split(X, y). Integer cv uses StratifiedKFold.

  • scoring ({'roc_auc', 'accuracy', 'balanced_accuracy', 'f1', 'f1_macro', 'f1_weighted'}) – Validation metric. ‘roc_auc’ supports binary and multiclass OVR macro AUC.

  • param_grid (dict or None) – sklearn-style grid. None uses HUGIMLClassifier.default_param_grid().

  • refit (bool, default=True) – If True, refit the best configuration on the full X, y with normal fit().

  • base_params (dict or None) – Constructor parameters shared by every candidate.

  • random_state (int or None, default=42) – Random seed for StratifiedKFold when cv is an integer.

  • shuffle (bool, default=True) – Whether StratifiedKFold shuffles before splitting.

  • cv_splits (list of (train_idx, val_idx) or None, default=None) – Exact fold indices to use. When supplied, cv, shuffle, and random_state are ignored for split generation, and the same indices are returned in result.cv_splits_ for reuse by other models.

  • use_fast_path (bool, default=True) – Use exact cached fast-grid evaluation when the grid qualifies; otherwise fall back to ordinary per-candidate evaluation.

  • return_dataframe (bool, default=True) – Return results_ as a pandas DataFrame when pandas is available.

Returns:

GridSearchCV-like result object with best_estimator_, best_params_, best_score_, results_, fast_path_used_, elapsed_seconds_, and n_splits_.

Return type:

HUGIMLTuneResult

class hugiml.classifier.FitMetadata(n_samples, n_features, n_classes, n_items, n_patterns, n_compound, topK_used, stage_times_ms, total_fit_ms, matrix_density, config, n_augmented_pairs=0, n_downstream_features=0, downstream_feature_counts=<factory>, memory_peak_mb=0.0, memory_rss_mb=0.0, memory_cpp_mb=0.0, openmp_threads=1, degraded=False)[source]

Bases: object

Immutable record of everything that happened during fit().

Parameters:
  • n_samples (int)

  • n_features (int)

  • n_classes (int)

  • n_items (int)

  • n_patterns (int)

  • n_compound (int)

  • topK_used (int)

  • stage_times_ms (dict)

  • total_fit_ms (float)

  • matrix_density (float)

  • config (dict)

  • n_augmented_pairs (int)

  • n_downstream_features (int)

  • downstream_feature_counts (dict)

  • memory_peak_mb (float)

  • memory_rss_mb (float)

  • memory_cpp_mb (float)

  • openmp_threads (int)

  • degraded (bool)

n_samples, n_features

Training set dimensions.

Type:

int

n_classes

Number of distinct target classes.

Type:

int

n_items

Number of utility-annotated items (bins + categories).

Type:

int

n_patterns

Number of HUG patterns mined and retained.

Type:

int

n_compound

Compound patterns (length > 1).

Type:

int

n_augmented_pairs

Number of augmented pair features retained for the downstream estimator.

Type:

int

n_downstream_features

Number of columns used by the downstream estimator after feature-mode construction and optional strict TopK filtering.

Type:

int

downstream_feature_counts

Counts by downstream feature family, for example original, pattern, and augmented_pair.

Type:

dict

topK_used

Effective topK budget used during mining.

Type:

int

stage_times_ms

Wall-clock milliseconds per fit stage.

Type:

dict[str, float]

total_fit_ms

Total fit wall-clock milliseconds.

Type:

float

matrix_density

Fraction of non-zero entries in the training pattern matrix.

Type:

float

config

Snapshot of (B, L, G, topK) as used.

Type:

dict

memory_peak_mb

Python-traced peak memory during fit.

Type:

float

memory_rss_mb

RSS delta during fit (Unix only).

Type:

float

memory_cpp_mb

Estimated C++ extension memory usage.

Type:

float

openmp_threads

Number of OpenMP threads used.

Type:

int

degraded

True when fit fell back to reduced parameters.

Type:

bool

summary()[source]

Return a single-line human-readable summary of the fit outcome.

Return type:

str

class hugiml.classifier.HUGIMLTuneResult(best_estimator_, best_params_, best_score_, results_, fast_path_used_, elapsed_seconds_, n_splits_, scoring, cv_splits_, shuffle, random_state)[source]

Bases: object

Result object returned by HUGIMLClassifier.tune().

Attributes mirror the small subset of GridSearchCV-style fields users need for quick HUGIML tuning while keeping the API lightweight.

Parameters:
  • best_estimator_ (HUGIMLClassifierNative)

  • best_params_ (dict[str, Any])

  • best_score_ (float)

  • results_ (Any)

  • fast_path_used_ (bool)

  • elapsed_seconds_ (float)

  • n_splits_ (int)

  • scoring (str)

  • cv_splits_ (list[tuple[ndarray, ndarray]])

  • shuffle (bool)

  • random_state (int | None)

Adaptive binning

Per-feature adaptive binning for HUG-IML — HUGIMLAdaptive.

HUGIMLAdaptive is a thin, sklearn-compatible subclass of HUGIMLClassifierNative that hard-wires adaptive_binning=True and exposes a simplified constructor (no B, allCols, or origColumns parameters — those are managed internally).

All adaptive-binning mathematics live in hugiml._binning (single source of truth). Both this module and hugiml.classifier import from there; neither imports from the other at module level, so there is no circular dependency.

Adaptive-binning algorithm (three steps)

  1. Per-feature B selection — for each numerical feature, evaluate candidate B values by computing information gain against y and stop when the marginal gain from adding more bins drops below min_marginal_gain_ratio of the gain already achieved (elbow-stopping).

  2. Pre-discretisation — discretise each numerical feature to B_j equal-frequency quantile bins, computed on the training split only. Bin boundaries are stored in _bin_edges_ and reapplied at predict time. Each bin is encoded as a readable string label, e.g. "[12.0,24.0)".

  3. Categorical pass-through — pre-binned columns are treated as categorical by the C++ layer; the global B parameter is set to the sentinel value 2 (no effect on already-categorical columns).
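Step 1 can be sketched for a single feature as follows. This is a minimal pure-Python illustration under stated assumptions: information gain is taken as the entropy reduction of y after equal-frequency binning, the quantile rule is a simple order-statistic cut, and entropy, quantile_edges, information_gain, and choose_b are hypothetical names rather than functions from hugiml._binning.

```python
import math
from bisect import bisect_right
from collections import Counter


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def quantile_edges(values, b):
    """Interior edges of b equal-frequency bins (simple order-statistic quantiles)."""
    ordered = sorted(values)
    n = len(ordered)
    return sorted({ordered[(k * n) // b] for k in range(1, b)})


def information_gain(values, labels, b):
    """Entropy of y minus the bin-weighted entropy of y after binning into b bins."""
    edges = quantile_edges(values, b)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(bisect_right(edges, v), []).append(y)
    n = len(labels)
    return entropy(labels) - sum(len(g) / n * entropy(g) for g in groups.values())


def choose_b(values, labels, b_candidates, min_marginal_gain_ratio=0.02):
    """Walk the candidates in order; stop when the marginal IG gain is too small."""
    chosen = b_candidates[0]
    best_ig = information_gain(values, labels, chosen)
    scores = {chosen: best_ig}
    for b in b_candidates[1:]:
        ig = information_gain(values, labels, b)
        scores[b] = ig
        if ig - best_ig < min_marginal_gain_ratio * best_ig:
            break  # elbow reached: this candidate adds too little
        chosen, best_ig = b, ig
    return chosen, scores


values = list(range(12))
labels = [1 if 4 <= v < 8 else 0 for v in values]  # the signal sits in the middle third
chosen, scores = choose_b(values, labels, [2, 3, 4, 6])
print(chosen, scores)
```

On this toy feature two bins carry no information, three bins isolate the middle third exactly, and four bins blur it again, so the search stops before evaluating six bins.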

Non-finite value handling

Non-finite cells (NaN, ±Inf) in any pre-binned column receive np.nan in the label array. The C++ transaction builder skips those cells, generating no item for that (row, feature) pair — semantically “not observed”, with no imputation.
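The "not observed" semantics can be sketched as a transaction builder that simply emits no item for a non-finite cell. The helpers below are hypothetical, None stands in for the np.nan label mentioned above, and the open-ended labels used for the outermost bins are an assumption.

```python
import math
from bisect import bisect_right


def bin_label(value, edges):
    """Return a '[lo,hi)' label for a finite value, or None for NaN and infinities."""
    if value is None or not math.isfinite(value):
        return None
    bounds = [-math.inf, *edges, math.inf]
    k = bisect_right(edges, value)
    return f"[{bounds[k]},{bounds[k + 1]})"


def build_transaction(row, edges_by_feature):
    """Emit one item per observed cell; non-finite cells contribute nothing."""
    items = []
    for feature, value in row.items():
        label = bin_label(value, edges_by_feature[feature])
        if label is not None:
            items.append(f"{feature}={label}")
    return items


edges_by_feature = {"age": [35.0, 50.0], "income": [12.0, 24.0]}  # toy bin edges
row = {"age": 41.0, "income": float("nan")}
items = build_transaction(row, edges_by_feature)
print(items)
```

The income cell produces no item at all, so no pattern involving income can match this row, and no imputed value is introduced.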

Usage

Example:

from sklearn.model_selection import train_test_split

from hugiml.adaptive import HUGIMLAdaptive

clf = HUGIMLAdaptive(b_candidates=[3, 5, 7, 10, 15],
                     min_marginal_gain_ratio=0.02,
                     L=2, G=1e-4)
X_enc, y_enc = clf.prepareXy(X_df, y)
X_tr, X_te, y_tr, y_te = train_test_split(X_enc, y_enc, stratify=y_enc)
clf.fit(X_tr, y_tr)

print(clf.per_feature_b_)      # chosen B_j per feature
print(clf.model_summary())
clf.plot_bin_profiles()        # requires matplotlib
clf.ig_heatmap()               # requires matplotlib

Diagnostic plots (plot_bin_profiles, ig_heatmap) and fitted attributes (per_feature_b_, ig_scores_, _bin_edges_) are defined on HUGIMLClassifierNative and inherited here.

class hugiml.adaptive.HUGIMLAdaptive(b_candidates=None, min_marginal_gain_ratio=0.02, L=1, G=0.005, topK=-1, n_jobs=1, verbose=False, max_fit_seconds=None)[source]

Bases: HUGIMLClassifierNative

HUG-IML with per-feature adaptive binning via elbow-stopping IG search.

Thin subclass of HUGIMLClassifierNative with adaptive_binning=True hard-wired and a simplified constructor that omits parameters which are managed internally (B, allCols, origColumns).

All public methods, fitted attributes, serialisation, monitoring, drift detection, and explanation helpers are inherited from HUGIMLClassifierNative. No logic is duplicated.

Parameters:
  • b_candidates (list of int, optional) – Candidate bin counts to evaluate per feature. Default: [2, 3, 5, 7, 10, 15].

  • min_marginal_gain_ratio (float, default 0.02) – Stop adding bins when the incremental IG gain relative to the current level falls below this fraction. 0.02 means stop when a new candidate adds less than 2% more IG than the previous step. Lower values allow finer bins; higher values enforce coarser bins.

  • L (int, default 1) – Maximum HUG pattern length. 1 = singletons; 2 = pairs; -1 = unlimited.

  • G (float, default 5e-3) – Minimum information-gain threshold.

  • topK (int, default -1) – Maximum number of patterns to retain. -1 computes automatically.

  • n_jobs (int, default 1) – Number of OpenMP threads. -1 uses all available cores.

  • verbose (bool, default False) – Emit INFO-level log messages during fit.

  • max_fit_seconds (float or None) – Wall-clock budget for the pattern-mining stage of fit().

Attributes (after fit, inherited from HUGIMLClassifierNative):
  • per_feature_b_ (dict[str, int]) – Chosen bin count per numerical feature.

  • ig_scores_ (dict[str, dict[int, float]]) – Full IG score grid {feature: {B: ig_value}} for diagnostics.

  • _bin_edges_ (dict[str, np.ndarray]) – Quantile edges used during fit, reapplied at predict time.

  • patterns (list) – Mined HUG patterns.

  • classes (ndarray) – Unique class labels.

  • fit_metadata (FitMetadata) – Timings, memory, pattern count stats.

classmethod default_param_grid()[source]

Return the default compact tuning grid inherited from the native classifier.

Return type:

dict[str, list]

get_params(deep=True)[source]

Return the constructor parameters (sklearn protocol).

Only the parameters that HUGIMLAdaptive.__init__ accepts are returned, so sklearn.clone and cross-validation helpers reconstruct the correct subclass.

Parameters:

deep (bool)

Return type:

dict

fit(X_train, y_train)[source]

Fit with per-feature adaptive binning.

Delegates entirely to HUGIMLClassifierNative.fit with adaptive_binning=True. When X_train is a plain ndarray and prepareXy has been called previously, column names from feature_names_in_ are applied so that feature-name-aware operations (adaptive binning, bin-edge lookup, schema validation) work correctly.

Parameters:
  • X_train (pd.DataFrame or ndarray)

  • y_train (array-like of int)

Return type:

HUGIMLAdaptive

property clf_: HUGIMLAdaptive

Backward-compatibility alias.

Old code that accessed adaptive_clf.clf_ to reach the inner HUGIMLClassifierNative now gets self, because HUGIMLAdaptive is a HUGIMLClassifierNative. All methods and fitted attributes are directly on self.

set_predict_proba_request(*, X_test='$UNCHANGED$')

Configure whether metadata should be requested to be passed to the predict_proba method.

Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to predict_proba if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to predict_proba.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in scikit-learn version 1.3.

Parameters:
  • X_test (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for X_test parameter in predict_proba.

  • self (HUGIMLAdaptive)

Returns:

self – The updated object.

Return type:

object

Metrics

Interpretability-complexity metrics for a fitted HUGIMLClassifierNative.

All functions accept a fitted HUGIMLClassifierNative and (optionally) a data matrix X to compute sample-level statistics. They never re-train the model.

Quick reference

Example:

from hugiml.metrics import compute_all_metrics
m = compute_all_metrics(clf, X_test)
print(m)

Available metrics

  • n_patterns — total mined patterns.

  • avg_pattern_length — mean number of items per pattern.

  • coverage — fraction of samples matched by at least one pattern.

  • overlap_rate — mean number of patterns active per sample, normalised by n_patterns.

  • top_k_cumulative_contribution(k) — cumulative absolute-coefficient share of top-k patterns.

  • active_patterns_per_prediction — per-sample array.

  • explanation_sparsity — fraction of patterns never active on the supplied data.
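As a rough illustration of the sample-level metrics listed above, the sketch below derives coverage, overlap rate, and explanation sparsity from a small binary activation matrix. The function sample_level_metrics and the toy matrix are hypothetical; this is not the library's implementation, only the arithmetic implied by the definitions.

```python
def sample_level_metrics(active):
    """active[i][j] is 1 when pattern j is active on sample i, else 0."""
    n_samples = len(active)
    n_patterns = len(active[0])

    # Number of active patterns for each sample.
    per_sample = [sum(row) for row in active]

    # Fraction of samples matched by at least one pattern.
    coverage = sum(1 for count in per_sample if count > 0) / n_samples

    # Mean active patterns per sample, then normalised by pattern count.
    mean_active = sum(per_sample) / n_samples
    overlap_rate = mean_active / n_patterns

    # Fraction of patterns that never fire on the supplied data.
    never_active = sum(
        1 for j in range(n_patterns) if not any(row[j] for row in active)
    )
    explanation_sparsity = never_active / n_patterns

    return {
        "coverage": coverage,
        "mean_active_patterns": mean_active,
        "overlap_rate": overlap_rate,
        "explanation_sparsity": explanation_sparsity,
    }


toy_matrix = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 0],
]
toy_metrics = sample_level_metrics(toy_matrix)
print(toy_metrics)
```

In the toy matrix, one sample matches no pattern and the last pattern never fires, so coverage is 3/4 and explanation sparsity is 1/4.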

class hugiml.metrics.InterpretabilityMetrics(n_patterns=0, avg_pattern_length=0.0, max_pattern_length=0, coverage=0.0, mean_active_patterns=0.0, std_active_patterns=0.0, overlap_rate=0.0, explanation_sparsity=0.0, top_k_cumulative_contribution=<factory>, n_samples=0)[source]

Bases: object

All interpretability metrics for one fitted model + dataset.

Parameters:
  • n_patterns (int)

  • avg_pattern_length (float)

  • max_pattern_length (int)

  • coverage (float)

  • mean_active_patterns (float)

  • std_active_patterns (float)

  • overlap_rate (float)

  • explanation_sparsity (float)

  • top_k_cumulative_contribution (dict)

  • n_samples (int)

n_patterns

Total number of mined HUG patterns.

Type:

int

avg_pattern_length

Mean items (conditions) per pattern.

Type:

float

max_pattern_length

Length of the longest pattern.

Type:

int

coverage

Fraction of samples covered by at least one active pattern.

Type:

float

mean_active_patterns

Average number of patterns active per sample.

Type:

float

std_active_patterns

Standard deviation of active patterns per sample.

Type:

float

overlap_rate

Normalised overlap: mean_active_patterns / n_patterns.

Type:

float

explanation_sparsity

Fraction of patterns that are never active on X (“dead” patterns).

Type:

float

top_k_cumulative_contribution

Mapping from k to cumulative share of total absolute coefficient magnitude for the top-k patterns. Keys: [1, 5, 10, 20, 50].

Type:

dict[int, float]
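A minimal sketch of the cumulative-share calculation, assuming the coefficients are available as a flat list (the helper name cumulative_share and the toy values are illustrative, not taken from the library):

```python
def cumulative_share(coefficients, ks=(1, 5, 10, 20, 50)):
    """Share of total absolute coefficient mass held by the top-k patterns."""
    magnitudes = sorted((abs(c) for c in coefficients), reverse=True)
    total = sum(magnitudes)
    shares = {}
    for k in ks:
        # Slicing past the end simply takes every pattern, so the share is 1.0.
        shares[k] = sum(magnitudes[:k]) / total if total > 0 else 0.0
    return shares


toy_coefficients = [2.0, -1.0, 0.5, -0.5]
toy_shares = cumulative_share(toy_coefficients, ks=(1, 2, 5))
print(toy_shares)
```

A value close to 1.0 at small k suggests that a handful of patterns carries most of the model's weight, which usually makes explanations easier to read.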

n_samples

Number of rows in X used for sample-level metrics.

Type:

int

to_dict()[source]

Return a flat dict suitable for DataFrame construction.

Return type:

dict

hugiml.metrics.compute_all_metrics(clf, X)[source]

Compute all interpretability metrics in a single call.

Parameters:
  • clf (fitted HUGIMLClassifierNative)

  • X (array-like or DataFrame)

Return type:

InterpretabilityMetrics

hugiml.metrics.metrics_dataframe(results)[source]

Convert a mapping of {model_name: InterpretabilityMetrics} to a DataFrame.

Useful for side-by-side comparisons across models or configurations.

Parameters:

results (dict) – Keys are model labels; values are InterpretabilityMetrics instances.

Return type:

pd.DataFrame

Calibration

Calibration evaluation for HUGIMLClassifierNative.

Provides Expected Calibration Error (ECE), Brier score decomposition, reliability diagram data, and calibration curve computation consistent with best practices for interpretable classifiers.

class hugiml.calibration.CalibrationResult(ece, mce, brier_score, brier_reliability, brier_resolution, brier_uncertainty, n_bins, bin_confidences=<factory>, bin_accuracies=<factory>, bin_counts=<factory>)[source]

Bases: object

Calibration evaluation summary for a fitted classifier.

Parameters:
  • ece (float)

  • mce (float)

  • brier_score (float)

  • brier_reliability (float)

  • brier_resolution (float)

  • brier_uncertainty (float)

  • n_bins (int)

  • bin_confidences (list[float])

  • bin_accuracies (list[float])

  • bin_counts (list[int])

ece

Expected Calibration Error (lower is better; 0 = perfect).

Type:

float

mce

Maximum Calibration Error across all bins.

Type:

float

brier_score

Mean Brier score (lower is better; 0 = perfect).

Type:

float

brier_reliability

Brier reliability component (miscalibration contribution).

Type:

float

brier_resolution

Brier resolution component (sharpness contribution).

Type:

float

brier_uncertainty

Brier uncertainty component (base rate uncertainty).

Type:

float

n_bins

Number of calibration bins used.

Type:

int

bin_confidences

Mean predicted confidence per bin.

Type:

list of float

bin_accuracies

Empirical accuracy per bin.

Type:

list of float

bin_counts

Sample count per bin.

Type:

list of int

summary()[source]

Human-readable calibration summary.

Return type:

str

to_dict()[source]

Return metrics as a plain dictionary.

Return type:

dict

hugiml.calibration.evaluate_calibration(y_true, y_proba, n_bins=10, strategy='uniform')[source]

Compute ECE, MCE, and Brier score decomposition.

Parameters:
  • y_true (np.ndarray of int, shape (n_samples,)) – True class labels (0 or 1 for binary; multi-class uses one-vs-rest).

  • y_proba (np.ndarray of float, shape (n_samples,) or (n_samples, n_classes)) – Predicted probabilities. For multi-class, pass the probability of the positive class or use the column for the class of interest.

  • n_bins (int) – Number of calibration bins.

  • strategy ({'uniform', 'quantile'}) – Bin strategy: uniform width or equal-frequency.

Return type:

CalibrationResult
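To make the ECE and MCE definitions concrete, here is a small pure-Python sketch with uniform-width bins on the positive-class probability. It is an assumption that the library bins this way; some ECE variants bin on the maximum class probability instead. The helper name ece_mce is hypothetical.

```python
def ece_mce(y_true, y_prob, n_bins=10):
    """Expected and maximum calibration error with uniform-width bins."""
    bins = [[] for _ in range(n_bins)]
    for label, prob in zip(y_true, y_prob):
        # A probability of exactly 1.0 falls into the last bin.
        index = min(int(prob * n_bins), n_bins - 1)
        bins[index].append((label, prob))

    n_samples = len(y_true)
    ece, mce = 0.0, 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_confidence = sum(prob for _, prob in members) / len(members)
        observed_rate = sum(label for label, _ in members) / len(members)
        gap = abs(observed_rate - mean_confidence)
        ece += (len(members) / n_samples) * gap  # weighted by bin size
        mce = max(mce, gap)                      # worst bin
    return ece, mce


toy_labels = [0, 0, 1, 1]
toy_probs = [0.1, 0.3, 0.7, 0.9]
toy_ece, toy_mce = ece_mce(toy_labels, toy_probs, n_bins=2)
print(toy_ece, toy_mce)
```

With two bins, each bin in the toy data has a mean predicted probability 0.2 away from its observed positive rate, so both errors come out at roughly 0.2.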

hugiml.calibration.reliability_diagram_data(y_true, y_proba, n_bins=10)[source]

Return bin-level data for plotting a reliability diagram.

Parameters:
  • y_true (np.ndarray)

  • y_proba (np.ndarray)

  • n_bins (int)

Returns:

Three parallel lists, one entry per non-empty bin.

Return type:

(mean_predicted, fraction_positives, bin_counts)

hugiml.calibration.brier_decomposition(y_true, y_proba)[source]

Murphy decomposition of the Brier score.

Decomposes Brier = Reliability - Resolution + Uncertainty.

Parameters:
  • y_true (np.ndarray of {0, 1})

  • y_proba (np.ndarray of float in [0, 1])

Returns:

All three components as floats.

Return type:

(reliability, resolution, uncertainty)
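The identity Brier = Reliability - Resolution + Uncertainty can be checked numerically. The sketch below groups samples by their distinct forecast value, for which the identity holds exactly; whether the library groups by distinct values or by fixed bins is not stated here, so treat the grouping as an assumption. The function name murphy_decomposition is illustrative.

```python
from collections import defaultdict


def murphy_decomposition(y_true, y_prob):
    """Return (reliability, resolution, uncertainty) for binary labels."""
    groups = defaultdict(list)
    for label, prob in zip(y_true, y_prob):
        groups[prob].append(label)  # one group per distinct forecast value

    n_samples = len(y_true)
    base_rate = sum(y_true) / n_samples
    reliability = 0.0
    resolution = 0.0
    for forecast, labels in groups.items():
        observed = sum(labels) / len(labels)
        reliability += len(labels) * (forecast - observed) ** 2
        resolution += len(labels) * (observed - base_rate) ** 2
    uncertainty = base_rate * (1.0 - base_rate)
    return reliability / n_samples, resolution / n_samples, uncertainty


toy_y = [0, 0, 1, 1, 1, 0]
toy_p = [0.2, 0.2, 0.8, 0.8, 0.2, 0.8]
rel, res, unc = murphy_decomposition(toy_y, toy_p)

# Direct Brier score for comparison with the recombined components.
brier = sum((p - y) ** 2 for y, p in zip(toy_y, toy_p)) / len(toy_y)
print(brier, rel - res + unc)
```

Lower reliability and higher resolution are both desirable; uncertainty depends only on the class balance of the evaluation data.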

Plots

HUG-IML first-class visualizations using Plotly.

Public API

from hugiml.plots import HUGPlotter

plotter = HUGPlotter(clf)
fig = plotter.plot_marginal_bin_profile("age", X)    # EBM shape-function equivalent
fig = plotter.plot_feature_combinations("age")       # compound patterns for one feature
fig = plotter.plot_feature_importance(top_n=15)
fig = plotter.plot_utility_vs_ig()                   # scatter: utility × IG × support
fig = plotter.plot_top_patterns(top_n=20)
fig = plotter.plot_feature_coverage()
fig = plotter.plot_pattern_lengths()
fig = plotter.plot_support_distribution()
fig = plotter.plot_active_patterns(X, sample_idx=0)  # local explanation
html = plotter.plot_dashboard(X)                     # full multi-panel HTML, returned as a string

class hugiml.plots.HUGPlotter(clf, height_default=380)[source]

Bases: object

Unified Plotly-based visualization interface for a fitted HUGIMLClassifierNative.

Parameters:
  • clf (fitted HUGIMLClassifierNative)

  • height_default (int) – Default figure height.

plot_marginal_bin_profile(feature_name, X=None, height=None, title=None)[source]

1-D HUG profile — EBM shape function equivalent.

For a given feature, shows every singleton pattern bin as a bar (x = bin label, y = utility, colour = information gain). An orange dotted line overlays the training support fraction on the right y-axis, mirroring the dashboard’s “Marginal Bin Profile” card.

Parameters:
  • feature_name (str)

  • X (ignored) – Support uses training data stored in clf.x_train_hup_.

  • height (int, optional)

  • title (str, optional)

Return type:

plotly.graph_objects.Figure

plot_feature_combinations(feature_name, top_n=25, height=None, title=None)[source]

Compound patterns that include a specific feature.

Each bar = one compound pattern; bars coloured by the number of extra features (+1 = green, +2 = orange, +3 = red), matching the dashboard’s “Feature Combinations” card.

Parameters:
  • feature_name (str)

  • top_n (int)

  • height (int, optional)

  • title (str, optional)

Return type:

go.Figure

plot_feature_importance(top_n=15, height=None, title=None)[source]

Feature importance: mean utility per feature, coloured by mean IG.

Matches the “Feature Importance” card in the governance dashboard.

Parameters:
  • top_n (int)

  • height (int, optional)

  • title (str, optional)

Return type:

go.Figure

plot_utility_vs_ig(feature_filter=None, height=None, title=None)[source]

Scatter: utility (x) × information gain (y), coloured by support.

Matches the “Utility vs Info Gain” card in the governance dashboard. Optionally filter to patterns containing one feature.

Parameters:
  • feature_filter (str, optional) – If given, highlight only patterns for this feature.

  • height (int, optional)

  • title (str, optional)

Return type:

go.Figure

plot_top_patterns(top_n=20, height=None, title=None)[source]

Horizontal bar chart of top-N patterns by utility, coloured by IG.

Matches the “Top Patterns” card in the governance dashboard.

Parameters:
  • top_n (int)

  • height (int, optional)

  • title (str, optional)

Return type:

go.Figure

plot_feature_coverage(top_n=15, height=None, title=None)[source]

Horizontal bar: how many patterns reference each feature.

Matches the “Feature Coverage” card in the governance dashboard.

Parameters:
  • top_n (int)

  • height (int | None)

  • title (str | None)

Return type:

plotly.graph_objects.Figure

plot_pattern_lengths(height=None, title=None)[source]

Bar chart of pattern length distribution.

Matches the “Pattern Lengths” card in the governance dashboard.

Parameters:
  • height (int | None)

  • title (str | None)

Return type:

plotly.graph_objects.Figure

plot_support_distribution(height=None, title=None)[source]

Histogram of pattern support values.

Matches the “Support Distribution” card in the governance dashboard.

Parameters:
  • height (int | None)

  • title (str | None)

Return type:

plotly.graph_objects.Figure

plot_active_patterns(X, sample_idx=0, max_patterns=20, height=None, title=None)[source]

Local explanation: active HUG patterns for a single sample.

Shows active patterns sorted by absolute coefficient magnitude, coloured blue for positive coefficients and red for negative coefficients.

Parameters:
  • X (array-like or DataFrame)

  • sample_idx (int)

  • max_patterns (int)

  • height (int, optional)

  • title (str, optional)

Return type:

go.Figure

plot_performance_radar(metrics, dataset_name='Dataset', height=None)[source]

Radar / spider chart of classification performance metrics.

Matches the “Performance” card in the governance dashboard.

Parameters:
  • metrics (dict) – Keys: ‘accuracy’, ‘balanced_accuracy’, ‘roc_auc’, ‘f1’. Values: floats in [0, 1].

  • dataset_name (str)

  • height (int, optional)

Return type:

go.Figure

plot_2d_profile(feature_a, feature_b, height=None, title=None)[source]

2-D HUG profile heatmap for compound patterns involving two features.

Parameters:
  • feature_a (str)

  • feature_b (str)

  • height (int, optional)

  • title (str, optional)

Return type:

go.Figure

plot_dashboard(X, dataset_name='Dataset', feature_names_for_profile=None, output_path=None)[source]

Generate a self-contained multi-panel HTML dashboard.

Produces performance overview, feature importance, utility-vs-IG, top patterns, pattern lengths, support distribution, feature coverage, and per-feature marginal bin profiles.

Parameters:
  • X (array-like or DataFrame) – Used for active-pattern coverage check.

  • dataset_name (str)

  • feature_names_for_profile (list of str, optional) – Which features to include marginal bin profiles for. Defaults to all features that have singleton patterns.

  • output_path (str, optional) – If given, writes the HTML to this path.

Return type:

str (HTML string)

Governance

Governance artifacts for HUGIMLClassifierNative.

Provides model card generation, audit artifact packaging, and governance metadata consistent with responsible model deployment practices and the HUG-IML paper’s emphasis on interpretability.

class hugiml.governance.ModelCard(model_id, model_type='HUGIMLClassifierNative', paper_reference='Krishnamoorthy, S. (2024). Interpretable Classifier Models for Decision Support Using High Utility Gain Patterns. IEEE Access, 12, 126088-126107. DOI: 10.1109/ACCESS.2024.3455563', license='Apache-2.0', intended_use='', out_of_scope_use='', training_data_description='', evaluation_data_description='', hyperparameters=<factory>, performance_metrics=<factory>, n_patterns=0, n_compound=0, top_patterns=<factory>, limitations=<factory>, ethical_considerations='', created_at=<factory>, framework_version='')[source]

Bases: object

Structured model card for a fitted HUGIMLClassifierNative.

Follows the Google Model Cards framework adapted for rule-based interpretable classifiers.

Parameters:
  • model_id (str)

  • model_type (str)

  • paper_reference (str)

  • license (str)

  • intended_use (str)

  • out_of_scope_use (str)

  • training_data_description (str)

  • evaluation_data_description (str)

  • hyperparameters (dict[str, Any])

  • performance_metrics (dict[str, Any])

  • n_patterns (int)

  • n_compound (int)

  • top_patterns (list[str])

  • limitations (list[str])

  • ethical_considerations (str)

  • created_at (str)

  • framework_version (str)

model_id

Unique identifier for this model version.

Type:

str

model_type

Always ‘HUGIMLClassifierNative’.

Type:

str

paper_reference

Citation for the HUG-IML algorithm.

Type:

str

license

Software license.

Type:

str

intended_use

Describe the intended classification task.

Type:

str

out_of_scope_use

Describe uses not covered by this model.

Type:

str

training_data_description

Description of training data.

Type:

str

evaluation_data_description

Description of evaluation data.

Type:

str

hyperparameters

B, L, G, topK as used during training.

Type:

dict

performance_metrics

Accuracy, F1, AUC, ECE, Brier score, etc.

Type:

dict

n_patterns

Number of mined HUG patterns.

Type:

int

n_compound

Number of compound patterns.

Type:

int

top_patterns

Most important patterns.

Type:

list of str

limitations

Known limitations.

Type:

list of str

ethical_considerations

Fairness, bias, and ethical notes.

Type:

str

created_at

ISO 8601 timestamp of creation.

Type:

str

framework_version

hugiml-core version.

Type:

str

to_dict()[source]

Serialize to a plain dictionary.

Return type:

dict

to_json(indent=2)[source]

Serialize to a JSON string.

Parameters:

indent (int)

Return type:

str

to_markdown()[source]

Render the model card as a Markdown document.

Return type:

str

save(path, fmt='json')[source]

Save the model card to a file.

Parameters:
  • path (str) – Output file path.

  • fmt ({'json', 'markdown', 'md'}) – Output format.

Return type:

None
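The serialisation methods above follow a common dataclass pattern, sketched here with a deliberately trimmed, hypothetical MiniModelCard. The field subset, the Markdown layout, and the example hyperparameter values are assumptions for illustration and do not reproduce the library's output format.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class MiniModelCard:
    """Hypothetical, reduced stand-in for a structured model card."""

    model_id: str
    intended_use: str = ""
    hyperparameters: dict = field(default_factory=dict)
    limitations: list = field(default_factory=list)
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_dict(self):
        return asdict(self)

    def to_json(self, indent=2):
        return json.dumps(self.to_dict(), indent=indent)

    def to_markdown(self):
        lines = [
            f"# Model card: {self.model_id}",
            "",
            f"Intended use: {self.intended_use}",
            "",
            "## Limitations",
        ]
        lines += [f"- {item}" for item in self.limitations]
        return "\n".join(lines)


card = MiniModelCard(
    model_id="credit-v1",
    intended_use="Binary credit-risk screening",
    hyperparameters={"B": 5, "L": 2, "G": 0.01, "topK": 100},  # example values only
    limitations=["Trained on a single region"],
)
card_json = card.to_json()
card_markdown = card.to_markdown()
print(card_markdown)
```

Keeping every field JSON-serialisable means the same record can feed both a human-readable document and a machine-readable audit store.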

class hugiml.governance.AuditArtifact(model_id, created_at=<factory>, training_hash='', model_card=None, governance=None, fit_metadata=None, pattern_info=None, calibration=None, explainability=None, framework_version='')[source]

Bases: object

Audit record for a model training run.

Captures all information needed for regulatory review or internal audit.

Parameters:
  • model_id (str)

  • created_at (str)

  • training_hash (str)

  • model_card (dict[str, Any] | None)

  • governance (dict[str, Any] | None)

  • fit_metadata (dict[str, Any] | None)

  • pattern_info (list[dict[str, Any]] | None)

  • calibration (dict[str, Any] | None)

  • explainability (dict[str, Any] | None)

  • framework_version (str)

to_dict()[source]

Return audit artifact fields as a plain dictionary.

Return type:

dict

to_json(indent=2)[source]

Serialise the audit artifact to a JSON string.

Parameters:

indent (int)

Return type:

str

save(path)[source]

Write the audit artifact to a JSON file.

Parameters:

path (str)

Return type:

None

class hugiml.governance.GovernanceMetadata(model_id, owner='', purpose='', data_classification='unclassified', review_status='draft', approved_by=None, approved_at=None, tags=<factory>)[source]

Bases: object

Minimal governance metadata attached to a model instance.

Parameters:
  • model_id (str)

  • owner (str)

  • purpose (str)

  • data_classification (str)

  • review_status (str)

  • approved_by (str | None)

  • approved_at (str | None)

  • tags (list[str])

model_id
Type:

str

owner

Person or team responsible for this model.

Type:

str

purpose

Business or scientific purpose.

Type:

str

data_classification

Sensitivity of training data (e.g. ‘public’, ‘internal’, ‘confidential’).

Type:

str

review_status

One of ‘draft’, ‘reviewed’, ‘approved’, ‘deprecated’.

Type:

str

approved_by
Type:

str or None

approved_at
Type:

str or None

tags
Type:

list of str

to_dict()[source]

Return governance metadata as a plain dictionary.

Return type:

dict

to_json(indent=2)[source]

Serialise governance metadata to a JSON string.

Parameters:

indent (int)

Return type:

str

hugiml.governance.generate_model_card(classifier, model_id, *, intended_use='', out_of_scope_use='', training_data_description='', evaluation_data_description='', performance_metrics=None, limitations=None, ethical_considerations='')[source]

Populate a ModelCard from a fitted classifier.

Parameters:
  • classifier (HUGIMLClassifierNative) – A fitted classifier.

  • model_id (str) – Unique identifier.

  • intended_use (str)

  • out_of_scope_use (str)

  • training_data_description (str)

  • evaluation_data_description (str)

  • performance_metrics (dict[str, Any] | None)

  • limitations (list[str] | None)

  • ethical_considerations (str)

Return type:

ModelCard

hugiml.governance.package_audit_artifacts(classifier, model_id, output_dir, *, model_card=None, governance=None, calibration_result=None, explainability_report=None)[source]

Package all audit artifacts for a trained model.

Writes model card, governance metadata, fit metadata, pattern info, and optional calibration/explainability reports to output_dir.

Returns:

Path to the audit manifest JSON file.

Return type:

str

Parameters:
  • classifier (Any)

  • model_id (str)

  • output_dir (str)

  • model_card (ModelCard | None)

  • governance (GovernanceMetadata | None)

  • calibration_result (Any | None)

  • explainability_report (Any | None)
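One way to picture the packaging step is a function that writes each artifact as JSON and records it in a manifest. The sketch below is a hypothetical illustration: the file names, the manifest layout, and the SHA-256 checksums are assumptions and are not taken from the library.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def package_artifacts(model_id, output_dir, artifacts):
    """Write each artifact as JSON and return the manifest file path."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)

    entries = {}
    for name, payload in artifacts.items():
        text = json.dumps(payload, indent=2, sort_keys=True)
        path = out / f"{name}.json"
        path.write_text(text, encoding="utf-8")
        # A checksum lets a reviewer confirm the file was not altered later.
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        entries[name] = {"file": path.name, "sha256": digest}

    manifest_path = out / "manifest.json"
    manifest = {"model_id": model_id, "artifacts": entries}
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return str(manifest_path)


with tempfile.TemporaryDirectory() as tmp_dir:
    manifest_file = package_artifacts(
        "credit-v1",
        tmp_dir,
        {
            "model_card": {"model_id": "credit-v1"},
            "governance": {"owner": "risk-team", "review_status": "draft"},
        },
    )
    manifest_data = json.loads(Path(manifest_file).read_text(encoding="utf-8"))

print(sorted(manifest_data["artifacts"]))
```

Returning the manifest path, as the documented function does, gives callers a single entry point from which every other artifact can be located.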

Explainability

Enterprise explainability for HUGIMLClassifierNative.

Provides SHAP interoperability, feature lineage tracking, explanation stability metrics, and audit artifact generation. The core HUG patterns are human-readable by design; this module adds depth for downstream governance and audit workflows.

class hugiml.explainability.ExplainabilityReport(model_id, n_patterns, n_features, top_patterns=<factory>, feature_lineage=<factory>, model_composition=<factory>, augmented_pair_effects=<factory>, stability=None, shap_available=False)[source]

Bases: object

Full explainability report for a fitted classifier instance.

Contains pattern importances, feature lineage, and stability metrics. Serializable to JSON for audit workflows.

Parameters:
  • model_id (str)

  • n_patterns (int)

  • n_features (int)

  • top_patterns (list[dict[str, Any]])

  • feature_lineage (list[dict[str, Any]])

  • model_composition (dict[str, Any])

  • augmented_pair_effects (list[dict[str, Any]])

  • stability (dict[str, Any] | None)

  • shap_available (bool)

to_json(indent=2)[source]

Serialize the report to a JSON string.

Parameters:

indent (int)

Return type:

str

save(path)[source]

Write the report to a JSON file.

Parameters:

path (str)

Return type:

None

class hugiml.explainability.FeatureLineage(feature_name, feature_type, derived_patterns=<factory>, pattern_indices=<factory>, derived_augmented_pairs=<factory>, total_importance=0.0, pattern_importance=0.0, augmented_pair_importance=0.0, original_feature_importance=0.0)[source]

Bases: object

Provenance record linking an original feature to downstream features.

Parameters:
  • feature_name (str)

  • feature_type (str)

  • derived_patterns (list[str])

  • pattern_indices (list[int])

  • derived_augmented_pairs (list[str])

  • total_importance (float)

  • pattern_importance (float)

  • augmented_pair_importance (float)

  • original_feature_importance (float)

feature_name

Original feature name from the training DataFrame.

Type:

str

feature_type

One of ‘integer’, ‘float’, ‘categorical’.

Type:

str

derived_patterns

Human-readable HUG pattern labels that include this feature.

Type:

list of str

pattern_indices

Indices into the pattern list for each derived pattern.

Type:

list of int

derived_augmented_pairs

Augmented-pair feature names that use this source feature.

Type:

list of str

total_importance

Sum of absolute downstream coefficients for original, HUG pattern, and augmented-pair features linked to this source feature.

Type:

float

pattern_importance

Pattern-only contribution to total_importance.

Type:

float

augmented_pair_importance

Augmented-pair contribution to total_importance.

Type:

float

original_feature_importance

Direct original-feature contribution when original features are included in the downstream estimator.

Type:

float

class hugiml.explainability.ExplanationStabilityMetrics(jaccard_similarity=0.0, rank_correlation=0.0, pattern_overlap_count=0, n_patterns_a=0, n_patterns_b=0, by_feature_type=<factory>)[source]

Bases: object

Stability metrics for pattern-based explanations.

The top-level fields report stability for mined HUG patterns only. When original or augmented-pair downstream features are present, per-feature-type metrics are available in by_feature_type so derived feature stability is not conflated with human-readable pattern-rule stability.

Parameters:
  • jaccard_similarity (float)

  • rank_correlation (float)

  • pattern_overlap_count (int)

  • n_patterns_a (int)

  • n_patterns_b (int)

  • by_feature_type (dict[str, dict[str, float | int]])

class hugiml.explainability.HUGPatternExplainer(classifier)[source]

Bases: object

Enterprise explainability layer over a fitted HUGIMLClassifierNative.

Extracts feature lineage, computes explanation stability, and provides a SHAP-compatible interface where available. Designed to operate on the already-mined HUG patterns without re-running the algorithm.

Parameters:

classifier (HUGIMLClassifierNative) – A fitted classifier instance.

feature_lineage()[source]

Build feature lineage mapping each input feature to its patterns.

Returns:

One entry per original input feature.

Return type:

list of FeatureLineage
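The lineage idea can be sketched with plain dictionaries. The pattern records and the build_lineage helper below are hypothetical; in particular, crediting a compound pattern's full absolute coefficient to every feature it mentions is an assumption, not a documented rule.

```python
toy_patterns = [
    {"label": "age=[30,40)", "features": ["age"], "coef": 0.8},
    {"label": "age=[30,40) & income=high", "features": ["age", "income"], "coef": -0.5},
    {"label": "income=high", "features": ["income"], "coef": 0.3},
]


def build_lineage(patterns, feature_names):
    """Map each input feature to the patterns that reference it."""
    lineage = {}
    for name in feature_names:
        hits = [(i, p) for i, p in enumerate(patterns) if name in p["features"]]
        lineage[name] = {
            "derived_patterns": [p["label"] for _, p in hits],
            "pattern_indices": [i for i, _ in hits],
            # Sum of absolute coefficients over all linked patterns.
            "pattern_importance": sum(abs(p["coef"]) for _, p in hits),
        }
    return lineage


toy_lineage = build_lineage(toy_patterns, ["age", "income", "tenure"])
for feature, record in toy_lineage.items():
    print(feature, record["pattern_indices"], record["pattern_importance"])
```

A feature such as tenure that appears in no pattern still gets an entry, with empty lists and zero importance, so the result has one record per original input feature.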

explanation_stability(X_a, y_a, X_b, y_b, top_n=20)[source]

Measure explanation stability across two data splits.

Fits two copies of the classifier on split A and split B. The headline metrics compare only mined HUG patterns. Additional metrics are returned by feature type so original features, HUG patterns, and augmented-pair transforms are not mixed into a single stability score.

Parameters:
  • X_a (split A data)

  • y_a (split A data)

  • X_b (split B data)

  • y_b (split B data)

  • top_n (int) – How many top patterns to compare.

Return type:

ExplanationStabilityMetrics
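The two headline stability quantities can be illustrated on a pair of ranked pattern lists. This sketch assumes Jaccard similarity over the two top-n sets and a Spearman rank correlation computed on the patterns both lists share; how the library handles patterns that appear in only one list is not documented here, so that choice is an assumption. Pattern labels are invented.

```python
def jaccard(items_a, items_b):
    """Size of the intersection divided by the size of the union."""
    set_a, set_b = set(items_a), set(items_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


def spearman_on_shared(ranked_a, ranked_b):
    """Spearman correlation of ranks, restricted to shared patterns.

    Both inputs are assumed to be duplicate-free and ordered from most to
    least important.
    """
    in_b = set(ranked_b)
    shared_a = [p for p in ranked_a if p in in_b]
    in_shared = set(shared_a)
    shared_b = [p for p in ranked_b if p in in_shared]
    n = len(shared_a)
    if n < 2:
        return 0.0  # a correlation needs at least two shared items
    rank_b = {p: i for i, p in enumerate(shared_b)}
    d_squared = sum((i - rank_b[p]) ** 2 for i, p in enumerate(shared_a))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


top_a = ["age<30", "income>50k", "owns_home", "urban"]
top_b = ["income>50k", "age<30", "owns_home", "student"]

toy_jaccard = jaccard(top_a, top_b)
toy_rank_corr = spearman_on_shared(top_a, top_b)
toy_overlap = len(set(top_a) & set(top_b))
print(toy_jaccard, toy_rank_corr, toy_overlap)
```

Jaccard captures whether the same patterns are selected at all, while the rank correlation captures whether the shared patterns keep their relative order.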

generate_report(model_id='hugiml_model', top_n=20)[source]

Generate a complete explainability report.

Parameters:
  • model_id (str) – Identifier for this model instance.

  • top_n (int) – Number of top patterns to include.

Return type:

ExplainabilityReport

hugiml.explainability.shap_values_from_pattern_matrix(classifier, X, *, background_samples=100, check_additivity=False, allow_incomplete=False)[source]

Compute SHAP values over the HUG pattern feature space.

Applies SHAP’s LinearExplainer (or KernelExplainer as fallback) on the binary pattern-presence matrix produced by the classifier’s transform() method. The resulting SHAP values are in pattern-space; use aggregate_shap_to_features() to roll them back to original features.

When the fitted downstream estimator also uses original or augmented-pair features, pattern-space SHAP is incomplete relative to the fitted model. In that case this function warns and returns None unless allow_incomplete=True is passed explicitly.

Requires the optional shap package (pip install shap).

Parameters:
  • classifier (HUGIMLClassifierNative) – A fitted classifier.

  • X (array-like) – Input data to explain.

  • background_samples (int) – Number of background samples for KernelExplainer.

  • check_additivity (bool) – Passed through to SHAP’s explain call.

  • allow_incomplete (bool) – If False, return None when the fitted downstream estimator uses original or augmented-pair features in addition to HUG patterns.

Returns:

SHAP values in pattern space. Returns None when shap is not installed, or when pattern-space SHAP would be incomplete for the fitted model and allow_incomplete is False.

Return type:

np.ndarray of shape (n_samples, n_patterns) or None
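For a linear downstream model, and under the assumption that features are treated as independent, the SHAP value of feature j is its coefficient times the difference between the sample's value and the background mean. The pure-Python sketch below reproduces that formula on a toy binary pattern matrix without importing shap; the helper names and numbers are illustrative, and the values live in the model's linear score (log-odds) space.

```python
def linear_shap(coefs, background, x):
    """phi_j = coef_j * (x_j - mean of feature j over the background)."""
    n_features = len(coefs)
    means = [
        sum(row[j] for row in background) / len(background)
        for j in range(n_features)
    ]
    return [coefs[j] * (x[j] - means[j]) for j in range(n_features)]


def linear_score(coefs, intercept, row):
    """Linear score (log-odds for logistic regression) of one sample."""
    return intercept + sum(c * v for c, v in zip(coefs, row))


toy_coefs = [1.5, -2.0, 0.5]
toy_intercept = 0.1
toy_background = [[1, 0, 0], [0, 0, 1], [1, 1, 0], [0, 1, 1]]
toy_sample = [1, 0, 1]

phi = linear_shap(toy_coefs, toy_background, toy_sample)

# Additivity: the contributions sum to f(x) minus the mean background score.
expected_score = sum(
    linear_score(toy_coefs, toy_intercept, row) for row in toy_background
) / len(toy_background)
sample_score = linear_score(toy_coefs, toy_intercept, toy_sample)
print(phi, sample_score - expected_score)
```

Each entry of phi corresponds to one pattern column, which is why a further aggregation step is needed before the values can be read per original feature.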

Monitoring

Operational monitoring for HUGIMLClassifierNative.

Provides thread-safe prediction statistics tracking and multi-method distribution drift detection combining PSI, KL divergence, and label drift monitoring.

class hugiml.monitoring.PredictionMonitor(window_size=1000)[source]

Bases: object

Thread-safe prediction statistics tracker.

Attach to a fitted classifier via clf.enable_monitoring(). Access statistics via clf.monitor.report() or clf.monitor.stats.

Tracks prediction count, confidence distribution, per-class frequency, and latency percentiles over a rolling window.

Parameters:

window_size (int)

reset()[source]

Clear all accumulated statistics.

Return type:

None

record(proba, latency_ms)[source]

Record one batch of predictions.

Parameters:
  • proba (np.ndarray, shape (n_samples, n_classes)) – Predicted class probabilities.

  • latency_ms (float) – Wall-clock time for this batch in milliseconds.

Return type:

None

property stats: dict

Current monitoring statistics as a plain dict.

report()[source]

Human-readable monitoring report.

Return type:

str
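A rolling-window, thread-safe tracker of the kind described above can be sketched with a lock and bounded deques. RollingMonitor is a hypothetical stand-in: the statistic names, the cumulative (not windowed) class counts, and the simple median latency are assumptions made for the example.

```python
import threading
from collections import Counter, deque


class RollingMonitor:
    """Hypothetical sketch of a windowed prediction statistics tracker."""

    def __init__(self, window_size=1000):
        self._lock = threading.Lock()
        self._confidences = deque(maxlen=window_size)  # oldest entries drop out
        self._latencies = deque(maxlen=window_size)
        self._class_counts = Counter()                 # cumulative in this sketch
        self._n_predictions = 0

    def record(self, proba, latency_ms):
        with self._lock:
            for row in proba:
                best = max(range(len(row)), key=row.__getitem__)
                self._confidences.append(row[best])
                self._class_counts[best] += 1
                self._n_predictions += 1
            self._latencies.append(latency_ms)

    @property
    def stats(self):
        with self._lock:
            latencies = sorted(self._latencies)
            confidences = list(self._confidences)
            return {
                "n_predictions": self._n_predictions,
                "mean_confidence": (
                    sum(confidences) / len(confidences) if confidences else 0.0
                ),
                "class_counts": dict(self._class_counts),
                "p50_latency_ms": latencies[len(latencies) // 2] if latencies else 0.0,
            }


monitor = RollingMonitor(window_size=2)
monitor.record([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]], latency_ms=5.0)
monitor.record([[0.3, 0.7]], latency_ms=7.0)
monitor_stats = monitor.stats
print(monitor_stats)
```

Holding the lock for both writes and reads keeps the reported statistics consistent when several serving threads record predictions at once.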

class hugiml.monitoring.DriftDetector(n_bins=10)[source]

Bases: object

Multi-method distribution drift detector.

Combines Population Stability Index (PSI) and symmetric KL divergence for robust drift assessment. Optionally tracks label drift when ground truth is available.

PSI thresholds:

  • < 0.1 — stable

  • 0.1–0.25 — moderate shift

  • > 0.25 — significant drift

Parameters:

n_bins (int) – Number of histogram bins for numerical features.

fit_baseline(X, cat_mask, col_names=None, y=None)[source]

Store training distribution for later comparison.

Parameters:
  • X (np.ndarray, shape (n, p))

  • cat_mask (np.ndarray of bool, shape (p,))

  • col_names (list of str, optional)

  • y (np.ndarray of int, optional) – Training labels for label-drift baseline.

Return type:

None

compute_psi(X_test)[source]

Compute PSI per numerical feature between the training baseline and X_test.

Return type:

dict mapping column name to PSI value.

Parameters:

X_test (ndarray)
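For reference, PSI for one feature is the sum over bins of (actual - expected) * ln(actual / expected), where expected and actual are the bin fractions in the training and test data. The sketch below is a minimal pure-Python version; the fixed bin edges, the epsilon floor for empty bins, and the helper names are assumptions, and the bands reuse the thresholds documented for DriftDetector.

```python
import math


def histogram_fractions(values, edges):
    """Fraction of values per bin; out-of-range values go to the outer bins."""
    counts = [0] * (len(edges) - 1)
    for value in values:
        index = 0
        while index < len(counts) - 1 and value >= edges[index + 1]:
            index += 1
        counts[index] += 1
    return [count / len(values) for count in counts]


def psi(expected, actual, eps=1e-6):
    """Population Stability Index between two lists of bin fractions."""
    total = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)  # avoid log(0) for empty bins
        total += (a - e) * math.log(a / e)
    return total


def psi_band(value):
    """Map a PSI value to the documented stability bands."""
    if value < 0.1:
        return "stable"
    if value <= 0.25:
        return "moderate shift"
    return "significant drift"


edges = [0, 10, 20, 30]
train_values = [1, 2, 3, 4, 5, 11, 12, 13, 21, 22]
shifted_values = [1, 11, 12, 21, 22, 23, 24, 25, 26, 27]

train_fractions = histogram_fractions(train_values, edges)
psi_same = psi(train_fractions, histogram_fractions(train_values, edges))
psi_shifted = psi(train_fractions, histogram_fractions(shifted_values, edges))
print(psi_same, psi_band(psi_same))
print(psi_shifted, psi_band(psi_shifted))
```

Bin edges should come from the training baseline only, so that the same edges are reused for every later comparison.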

compute_kl(X_test)[source]

Compute symmetric KL divergence per feature.

Return type:

dict mapping column name to KL value.

Parameters:

X_test (ndarray)

compute_label_drift(y_test)[source]

Compute per-class proportion shift between training and test labels.

Returns None when no training label baseline is available.

Parameters:

y_test (ndarray)

Return type:

dict[str, float] | None
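A per-class proportion shift can be computed by comparing class frequencies in the two label sets. In this sketch the shift is signed (test proportion minus training proportion) and keyed by the class label as a string; both choices are assumptions, as is the helper name label_drift.

```python
from collections import Counter


def class_proportions(labels):
    """Relative frequency of each class label."""
    counts = Counter(labels)
    return {cls: n / len(labels) for cls, n in counts.items()}


def label_drift(y_train, y_test):
    """Signed per-class proportion change, or None without a baseline."""
    if y_train is None:
        return None
    baseline = class_proportions(y_train)
    current = class_proportions(y_test)
    classes = sorted(set(baseline) | set(current))
    return {
        str(cls): current.get(cls, 0.0) - baseline.get(cls, 0.0)
        for cls in classes
    }


toy_train_labels = [0] * 8 + [1] * 2
toy_test_labels = [0] * 5 + [1] * 5
toy_label_drift = label_drift(toy_train_labels, toy_test_labels)
print(toy_label_drift)
```

In the toy data the positive class grows from 20% to 50% of the labels, which shows up as a shift of about +0.3 for class 1 and about -0.3 for class 0.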

detect(X_test, y_test=None, threshold=0.1)[source]

Run full multi-method drift detection.

Parameters:
  • X_test (np.ndarray)

  • y_test (np.ndarray of int, optional)

  • threshold (float) – PSI threshold above which a feature is flagged.

Return type:

DriftReport

report(X_test, threshold=0.1)[source]

Return a human-readable drift report string (PSI only).

Parameters:
  • X_test (ndarray)

  • threshold (float)

Return type:

str

class hugiml.monitoring.DriftReport(psi, kl_divergence, label_drift, threshold)[source]

Bases: object

Structured result from a drift detection run.

Parameters:
  • psi (dict)

  • kl_divergence (dict)

  • label_drift (dict | None)

  • threshold (float)

psi

Population Stability Index per feature.

Type:

dict[str, float]

kl_divergence

Symmetric KL divergence per feature.

Type:

dict[str, float]

label_drift

Per-class label proportion shift (requires y_test).

Type:

dict[str, float] or None

overall_psi

Mean PSI across all numerical features.

Type:

float

overall_kl

Mean KL divergence across all numerical features.

Type:

float

drifted_features

Features exceeding the PSI threshold.

Type:

list[str]

severity

One of ‘none’, ‘moderate’, ‘significant’.

Type:

str

to_dict()[source]

Return all drift metrics as a plain dictionary.

Return type:

dict

Multiclass and imbalance

Helpers for three common HUG-IML deployment scenarios:

  1. Multiclass classification — HUGIMLClassifierNative supports multiclass natively via its base_estimator (LogisticRegression with solver='lbfgs' when n_classes > 2). This module provides a MulticlassHUGReport that extracts per-class pattern importances.

  2. Imbalanced data — wraps the classifier in a cost-sensitive or resampling pipeline via make_imbalanced_pipeline.

  3. High-cardinality categoricals — encode_high_cardinality replaces columns that have many unique values with a target-mean or frequency encoding before the data is passed to prepareXy.

class hugiml.multiclass.MulticlassHUGReport(clf)[source]

Bases: object

Per-class pattern importances for a multiclass HUG-IML model.

When the downstream estimator is LogisticRegression with > 2 classes, coef_ has shape (n_classes, n_patterns). This class exposes per-class top patterns.

Parameters:

clf (fitted HUGIMLClassifierNative)

importances_for_class(class_label, top_n=20)[source]

Return the top-N patterns for a specific class.

Parameters:
  • class_label (class value in clf.classes_)

  • top_n (int)

Returns:

DataFrame with columns pattern, coefficient, abs_coefficient, support.

Return type:

pd.DataFrame

summary(top_n=10)[source]

Human-readable summary of top patterns per class.

Parameters:

top_n (int)

Return type:

str
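When the coefficient matrix has one row per class, per-class importances reduce to sorting a single row by absolute value. The sketch below uses plain lists and a hypothetical top_patterns_for_class helper; the pattern labels and coefficients are made up, and the support column is omitted.

```python
def top_patterns_for_class(coef, classes, pattern_labels, class_label, top_n=20):
    """Rank patterns by absolute coefficient for one class."""
    row = coef[classes.index(class_label)]  # row of the (n_classes, n_patterns) matrix
    ranked = sorted(
        zip(pattern_labels, row),
        key=lambda item: abs(item[1]),
        reverse=True,
    )
    return [
        {"pattern": label, "coefficient": value, "abs_coefficient": abs(value)}
        for label, value in ranked[:top_n]
    ]


toy_classes = ["low", "mid", "high"]
toy_pattern_labels = ["age<30", "income=high", "tenure>5"]
toy_coef = [
    [0.9, -1.2, 0.1],   # coefficients for class "low"
    [-0.2, 0.3, 0.4],   # coefficients for class "mid"
    [-0.7, 1.5, -0.5],  # coefficients for class "high"
]

top_for_high = top_patterns_for_class(
    toy_coef, toy_classes, toy_pattern_labels, "high", top_n=2
)
for entry in top_for_high:
    print(entry)
```

A negative coefficient for a class means the pattern pushes predictions away from that class, so the sign is worth reporting alongside the magnitude.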

hugiml.multiclass.make_imbalanced_pipeline(clf, strategy='class_weight', sampling_ratio=1.0, random_state=42)[source]

Wrap a HUGIMLClassifierNative for use with imbalanced data.

Parameters:
  • clf (HUGIMLClassifierNative (unfitted))

  • strategy ({'class_weight', 'smote', 'random_oversample', 'random_undersample'}) –

    • class_weight — sets class_weight='balanced' on the downstream LR. Zero overhead; recommended first choice.

    • smote — SMOTE oversampling via imbalanced-learn.

    • random_oversample — random oversampling via imbalanced-learn.

    • random_undersample — random undersampling via imbalanced-learn.

  • sampling_ratio (float) – Target minority:majority ratio (only for imbalanced-learn strategies).

  • random_state (int)

Returns:

Fitted wrapper or HUGIMLClassifierNative (for ‘class_weight’). The returned object has fit(X, y), predict_proba(X), and predict(X) methods.

Notes

For ‘class_weight’: returns a copy of clf with base_estimator set to LogisticRegression(class_weight='balanced').

For SMOTE/resampling: returns an ImbalancedHUGPipeline that applies resampling to the pattern matrix (post-transform) inside fit(). This ensures the HUG patterns are mined on the original distribution (as intended) while the downstream classifier trains on the resampled binary matrix.
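To show what class_weight='balanced' means numerically, the sketch below applies the heuristic documented by scikit-learn, n_samples / (n_classes * count of the class), in pure Python. The helper name balanced_class_weights and the toy label counts are illustrative.

```python
from collections import Counter


def balanced_class_weights(y):
    """Weight each class inversely to its frequency."""
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


toy_imbalanced_labels = [0] * 90 + [1] * 10
toy_weights = balanced_class_weights(toy_imbalanced_labels)
print(toy_weights)
```

With a 90/10 split the minority class receives nine times the weight of the majority class, so both classes contribute equally to the loss in aggregate without any resampling.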

hugiml.multiclass.encode_high_cardinality(X, y=None, threshold=20, method='target_mean', min_samples_leaf=5, smoothing=1.0, random_state=42)[source]

Replace high-cardinality categorical columns with numerical encodings.

This should be called before prepareXy; the returned mapping can be applied to test data via apply_encoding.

Parameters:
  • X (pd.DataFrame)

  • y (array-like, optional) – Required when method='target_mean'.

  • threshold (int) – Columns with more than this many unique values are considered high-cardinality.

  • method ({'target_mean', 'frequency', 'ordinal'}) –

    • target_mean — replace each category with its mean target value (smoothed towards the global mean). Reduces categories to a single float — most informative for tree/rule-based models.

    • frequency — replace with the category’s relative frequency.

    • ordinal — assign arbitrary integer codes (fast, no leakage, but the codes carry no meaningful order).

  • min_samples_leaf (int) – Minimum observations per category before smoothing kicks in (target_mean only).

  • smoothing (float) – Smoothing strength (target_mean only).

  • random_state (int) – Used internally for any random operations.

Returns:

  • X_encoded (pd.DataFrame) – Encoded copy of X; the original is unchanged.

  • encoding_map (dict) – Mapping {column_name: dict_or_array} to apply to unseen data via apply_encoding(X_test, encoding_map).

Return type:

tuple[DataFrame, dict]

Notes

Data-leakage safety: call encode_high_cardinality on the training split only. Use apply_encoding on test/validation data with the map returned from training. Never fit the encoding on combined train+test data.
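
The smoothing behind method='target_mean' can be pictured with one common formulation, in which a sigmoid weight driven by min_samples_leaf and smoothing blends the category mean with the global mean. Whether hugiml uses exactly this formula is an assumption; the function name fit_target_mean is hypothetical and the sketch works on plain lists instead of DataFrames.

```python
import math
from collections import defaultdict


def fit_target_mean(categories, targets, min_samples_leaf=5, smoothing=1.0):
    """Learn {category: encoded value} from training rows only (assumed formula)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for cat, t in zip(categories, targets):
        sums[cat] += t
        counts[cat] += 1
    global_mean = sum(targets) / len(targets)
    mapping = {}
    for cat, n in counts.items():
        # Sigmoid weight: close to 0 for rare categories, close to 1 for frequent ones.
        w = 1.0 / (1.0 + math.exp(-(n - min_samples_leaf) / smoothing))
        mapping[cat] = (1.0 - w) * global_mean + w * (sums[cat] / n)
    return mapping, global_mean


# Category "a" is frequent (20 rows), category "b" is rare (2 rows).
train_cats = ["a"] * 20 + ["b"] * 2
train_y = [1] * 15 + [0] * 5 + [1, 1]
mapping, prior = fit_target_mean(train_cats, train_y)
print(mapping, prior)
```

In this sketch the frequent category keeps a value near its own mean, while the rare category is pulled towards the global mean, which is why fitting the map on the training split alone matters.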

hugiml.multiclass.apply_encoding(X, encoding_map, fill_value=0.0)[source]

Apply an encoding map (produced by encode_high_cardinality) to new data.

Parameters:
  • X (pd.DataFrame)

  • encoding_map (dict) – Mapping returned by encode_high_cardinality.

  • fill_value (float) – Value for unseen categories.

Return type:

pd.DataFrame (copy)
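
The role of fill_value can be sketched with a frequency encoding learned on training data and applied to a test column that contains an unseen category. The helper names fit_frequency and apply_map are hypothetical stand-ins operating on plain lists, not the hugiml functions themselves.

```python
from collections import Counter


def fit_frequency(column):
    """Map each category to its relative frequency in the training column."""
    counts = Counter(column)
    total = len(column)
    return {cat: n / total for cat, n in counts.items()}


def apply_map(column, mapping, fill_value=0.0):
    """Encode a column with a fitted map; unseen categories get fill_value."""
    return [mapping.get(cat, fill_value) for cat in column]


train_col = ["x", "x", "y", "z"]
test_col = ["x", "w"]  # "w" never appeared in training
freq_map = fit_frequency(train_col)
encoded_test = apply_map(test_col, freq_map)
print(encoded_test)
```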

Pattern pruning

Regulated “remove / refit / calibrate” workflow for HUG-IML.

Explainable Boosting Machines (EBMs) are valued partly because model terms can be inspected and sometimes edited (e.g. to remove an ethically problematic interaction term). This module gives HUG-IML an analogous controlled editing workflow that is rigorous enough for regulated-domain review cycles.

Workflow

  1. Inspect patterns via clf.feature_importances() or clf.get_pattern_info().

  2. Create a PatternEditor and call remove() with a list of pattern indices (or keyword filters).

  3. Call refit(X_tr, y_tr) to re-train the downstream classifier on the pruned pattern matrix. The C++ mining results are unchanged.

  4. Optionally call calibrate(X_cal, y_cal) to wrap the refitted model with Platt scaling / isotonic regression.

  5. Call finalize() to get a new classifier instance with the edited pattern set baked in, and audit_report() for a JSON audit trail.

Example

from hugiml.pruning import PatternEditor

# clf is an already fitted classifier; it is not mutated by the editor.
editor = PatternEditor(clf)
editor.remove([3, 7, 12], reason="pattern references protected attribute 'gender'")
editor.remove_by_keyword("income", reason="unstable feature (high PSI)")

# Refit the downstream model, calibrate on a held-out set, then freeze the result.
new_clf = editor.refit(X_tr, y_tr).calibrate(X_cal, y_cal).finalize()

print(editor.audit_report())
new_clf.predict_proba(X_te)

class hugiml.pruning.PatternEditor(clf, operator_name='analyst')[source]

Bases: object

Controlled pattern editing with full audit trail.

Parameters:
  • clf (fitted HUGIMLClassifierNative) – The original model. This object is not mutated; all edits produce a fresh copy stored internally.

  • operator_name (str) – Human-readable identifier of the person/process making the edits (for the audit trail).

remove(pattern_indices, reason='unspecified')[source]

Remove patterns by index (0-based, relative to the current working set).

Parameters:
  • pattern_indices (list of int) – Indices into the current pattern list. Use list_patterns() to preview indices.

  • reason (str) – Audit reason (e.g. ‘protected attribute’, ‘operationally invalid’).

Return type:

self (for method chaining)

remove_by_keyword(keyword, reason='keyword match', case_sensitive=False)[source]

Remove all patterns whose label contains keyword.

Parameters:
  • keyword (str)

  • reason (str)

  • case_sensitive (bool)

Return type:

self
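
The keyword matching described for remove_by_keyword can be pictured as a substring test over pattern labels. This is a minimal sketch: the function name match_labels and the label strings are hypothetical, and the real label format may differ.

```python
def match_labels(labels, keyword, case_sensitive=False):
    """Return the indices of labels that contain keyword as a substring."""
    if not case_sensitive:
        keyword = keyword.lower()
    hits = []
    for i, label in enumerate(labels):
        text = label if case_sensitive else label.lower()
        if keyword in text:
            hits.append(i)
    return hits


# Hypothetical pattern labels, for illustration only.
labels = ["Income<=30k AND age>50", "tenure>5", "INCOME>90k"]
print(match_labels(labels, "income"))
print(match_labels(labels, "income", case_sensitive=True))
```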

remove_low_support(min_support=0.01, reason='support below threshold')[source]

Remove patterns with training support below min_support.

Parameters:
  • min_support (float) – Minimum fraction of training samples (0 to 1).

  • reason (str)

Return type:

self
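
A support threshold of this kind can be sketched as follows, assuming that the support of a pattern is the fraction of training rows in which its binary column equals 1. The function name low_support_indices is hypothetical and the matrix is a plain list of rows.

```python
def low_support_indices(pattern_matrix, min_support=0.01):
    """Return column indices whose fraction of 1-entries is below min_support."""
    n_rows = len(pattern_matrix)
    n_patterns = len(pattern_matrix[0])
    flagged = []
    for j in range(n_patterns):
        support = sum(row[j] for row in pattern_matrix) / n_rows
        if support < min_support:
            flagged.append(j)
    return flagged


# Four training rows, three patterns; column supports are 0.75, 0.25, 0.75.
matrix = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 0, 1],
    [0, 1, 1],
]
print(low_support_indices(matrix, min_support=0.3))
```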

refit(X_tr, y_tr, estimator=None)[source]

Refit the downstream classifier on the (pruned) pattern matrix.

The HUG mining results (patterns_) are unchanged; only the downstream Pipeline (model_) is replaced.

Parameters:
  • X_tr (array-like or DataFrame) – Training data (should be the same split used to fit the original model).

  • y_tr (array-like)

  • estimator (sklearn estimator, optional) – If None, uses the original downstream estimator class with the same hyperparameters.

Return type:

self

calibrate(X_cal, y_cal, method='isotonic')[source]

Wrap the refitted downstream model with probability calibration.

Uses sklearn.calibration.CalibratedClassifierCV applied post-fit to a calibration set that should be held out from both training and test.

Parameters:
  • X_cal (array-like or DataFrame)

  • y_cal (array-like)

  • method ({'sigmoid', 'isotonic'})

Return type:

self
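
Since the calibration set should overlap with neither training nor test data, one way to obtain it is a three-way index split. The sketch below is a plain, non-stratified shuffle using only the standard library; the function name three_way_split and the fractions are illustrative assumptions.

```python
import random


def three_way_split(n_rows, cal_frac=0.2, test_frac=0.2, seed=42):
    """Shuffle row indices and cut them into disjoint train/cal/test lists."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    n_cal = int(n_rows * cal_frac)
    n_test = int(n_rows * test_frac)
    cal = idx[:n_cal]
    test = idx[n_cal:n_cal + n_test]
    train = idx[n_cal + n_test:]
    return train, cal, test


train_idx, cal_idx, test_idx = three_way_split(100)
print(len(train_idx), len(cal_idx), len(test_idx))
```

For imbalanced targets a stratified split is usually preferable; the sketch omits stratification to stay short.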

finalize()[source]

Return the edited classifier as a new standalone instance.

After calling finalize(), further edits on this editor are blocked. The returned object is a fully independent copy.

Return type:

HUGIMLClassifierNative (edited copy)

list_patterns()[source]

Return editable HUG patterns in the current working model.

PatternEditor edits mined HUG patterns only. Original features and augmented-pair downstream features are visible through list_downstream_features() but are not directly removable by this editor.

Return type:

DataFrame

list_downstream_features()[source]

Return all downstream features with PatternEditor editability.

The returned table includes original features, HUG patterns, and augmented-pair transforms when present. Only rows with feature_type == 'pattern' are directly editable through remove() and related PatternEditor methods.

Return type:

DataFrame

diff()[source]

Return a summary of changes made relative to the original model.

Returns:

dict with keys n_original, n_current, n_removed, removed_patterns

Return type:

dict

audit_report(indent=2)[source]

Return a JSON string describing all edits made.

The report includes operator name, timestamps, reasons, and the diff summary.

Parameters:

indent (int)

Return type:

str

save_audit_report(path)[source]

Write the audit report to a JSON file.

Parameters:

path (str)

Return type:

None

class hugiml.pruning.RemovalRecord(timestamp, pattern_indices, pattern_labels, reason, removed_by)[source]

Bases: object

Audit record for a single pattern-removal action.

Parameters:
  • timestamp (str)

  • pattern_indices (list[int])

  • pattern_labels (list[str])

  • reason (str)

  • removed_by (str)
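
The fields above map naturally onto a dataclass that can be serialised into a JSON audit trail. The class below is a stand-in named RemovalRecordSketch, not the hugiml class, and the top-level report keys ("operator", "removals") are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class RemovalRecordSketch:
    """Stand-in mirroring the documented RemovalRecord fields."""
    timestamp: str
    pattern_indices: list
    pattern_labels: list
    reason: str
    removed_by: str


record = RemovalRecordSketch(
    timestamp=datetime.now(timezone.utc).isoformat(),
    pattern_indices=[3, 7],
    pattern_labels=["hypothetical label A", "hypothetical label B"],
    reason="protected attribute",
    removed_by="analyst",
)

# Hypothetical report layout; the real audit_report() structure may differ.
report = json.dumps({"operator": "analyst", "removals": [asdict(record)]}, indent=2)
print(report)
```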

Serialization

Versioned serialization and SBOM generation for HUGIMLClassifierNative.

Format (v3 — default)

A ZIP archive containing JSON manifests and NumPy array bundles. No pickle is required to round-trip the model, eliminating the gadget-chain attack surface that exists in any pickle-based format.

Archive layout:

manifest.json          – format_version, schema_version, timestamp
clf_init.json          – __init__ hyperparameters
clf_fit.json           – scalar / list fitted attributes
patterns.json          – list of {utility, items, ig} dicts
arrays.npz             – cat_cols_mask_, is_int_mask_, classes_
td_config.json         – TransactionDataWrapper non-array state
td_arrays.npz          – TransactionDataWrapper numpy arrays
estimator.json         – downstream estimator class + parameters
estimator_arrays.npz   – downstream estimator numpy arrays
hmac.sig               – HMAC-SHA256 over all content files (hex)
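
To make the layout more concrete, the sketch below writes two of the listed members into an in-memory ZIP archive and reads them back with the standard library. Only the member names come from the layout above; the field values are placeholders, and the real writer may structure the JSON differently.

```python
import io
import json
import zipfile

# Placeholder contents; real manifests carry values chosen by the library.
manifest = {"format_version": 3, "schema_version": 3, "timestamp": "2024-01-01T00:00:00Z"}
patterns = [{"utility": 1.5, "items": [0, 4], "ig": 0.12}]

# Write the JSON members into a ZIP archive held in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("manifest.json", json.dumps(manifest))
    zf.writestr("patterns.json", json.dumps(patterns))

# Read the archive back without any use of pickle.
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    names = sorted(zf.namelist())
    loaded_patterns = json.loads(zf.read("patterns.json"))

print(names, loaded_patterns)
```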

Authentication

Set HUGIML_MODEL_HMAC_KEY (hex-encoded, 32+ bytes) before saving or loading. Files saved without a key have an all-zero hmac.sig and can still be loaded unless HUGIML_REQUIRE_MODEL_HMAC=true is set.
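
The following sketch shows how an HMAC-SHA256 signature over several content files could be computed and checked with a hex-encoded key. How hugiml orders and frames the files inside the MAC is not documented here, so the sorted name-then-bytes scheme, the function names, and the demo key are all assumptions.

```python
import hashlib
import hmac


def sign_members(members, key_hex):
    """HMAC-SHA256 over member names and bytes, visited in sorted name order."""
    key = bytes.fromhex(key_hex)
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for name in sorted(members):
        mac.update(name.encode("utf-8"))
        mac.update(members[name])
    return mac.hexdigest()


def verify_members(members, key_hex, signature_hex):
    """Constant-time comparison of the recomputed and the stored signature."""
    return hmac.compare_digest(sign_members(members, key_hex), signature_hex)


demo_key = "ab" * 32  # 32-byte demo key, hex-encoded; never use in production
members = {"manifest.json": b'{"format_version": 3}', "patterns.json": b"[]"}
signature = sign_members(members, demo_key)

tampered = dict(members, **{"patterns.json": b"[1]"})
print(verify_members(members, demo_key, signature))
print(verify_members(tampered, demo_key, signature))
```

Using hmac.compare_digest avoids leaking information through comparison timing, which is the usual reason to prefer it over == for signatures.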

Backward compatibility (v1/v2)

Models saved with schema version 1 or 2 (the legacy HMAC-pickle format) are still loadable via a restricted Unpickler that permits only known HUG-IML and sklearn modules. v1/v2 writing is not supported.
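
A restricted Unpickler of the kind mentioned above is typically built by overriding find_class, as described in the Python pickle documentation. The allow-list below is a toy example and does not reflect the modules hugiml actually permits.

```python
import io
import pickle
from collections import OrderedDict

# Toy allow-list; the real loader permits known HUG-IML and sklearn modules.
ALLOWED = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Resolve only explicitly allowed globals; reject everything else.
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data):
    """Unpickle bytes while refusing any global outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


safe = restricted_loads(pickle.dumps(OrderedDict(a=1)))
print(safe)

try:
    restricted_loads(pickle.dumps(len))  # references builtins.len
    blocked = False
except pickle.UnpicklingError:
    blocked = True
print(blocked)
```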

hugiml.serialization.save_model(clf, path)[source]

Persist a fitted classifier to a v3 ZIP/JSON/NumPy model file.

Parameters:
  • clf – Fitted classifier to persist.

  • path – Destination path of the model file.

Raises:

HUGIMLSerializationError – When the model is unfitted, a component cannot be serialized, or the write fails.

Return type:

None

hugiml.serialization.load_model(path, expected_type=None)[source]

Load a classifier from a file saved by save_model().

Supports:

  • v3: ZIP/JSON/NumPy format (default since 2.1)

  • v1/v2: legacy HMAC-pickle format (read-only; still authenticated)

Parameters:
  • path (str or Path)

  • expected_type (type, optional)

Return type:

HUGIMLClassifierNative

Raises:

hugiml.serialization.generate_sbom(output_path=None)[source]

Generate a Software Bill of Materials for the installed hugiml-core.

Parameters:

output_path (str, optional)

Return type:

dict — CycloneDX-lite SBOM document.

Telemetry

OpenTelemetry and Prometheus instrumentation for HUGIMLClassifierNative.

Both integrations are strictly optional: if the respective packages are not installed the module degrades gracefully to no-op stubs. Import and use of this module never breaks the classifier itself.

OpenTelemetry

Wraps fit(), predict_proba(), and predict() with OTEL spans and attributes. Set HUGIML_OTEL_ENABLED=1 to activate.

Prometheus

Exposes prediction count, latency histogram, and confidence gauge. Set HUGIML_PROMETHEUS_ENABLED=1 to activate.

Debug logging

All non-fatal telemetry and metrics failures are logged at DEBUG level on logging.getLogger("hugiml.telemetry") with exc_info=True. Stack traces are therefore available when the root logger is configured at DEBUG, with no user-visible noise at INFO or above.
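
The logging behaviour described above follows a familiar pattern: catch the failure, log it at DEBUG with the traceback attached, and carry on. The wrapper name safe_emit and the failing exporter are hypothetical; only the logger name comes from the text.

```python
import logging

logger = logging.getLogger("hugiml.telemetry")


def safe_emit(emit, *args):
    """Call a telemetry hook; log any failure at DEBUG and never re-raise."""
    try:
        emit(*args)
        return True
    except Exception:
        # exc_info=True attaches the active traceback to the log record.
        logger.debug("telemetry emit failed", exc_info=True)
        return False


def broken_exporter(value):
    """Hypothetical exporter that always fails."""
    raise ConnectionError("collector unreachable")


ok = safe_emit(broken_exporter, 1)
print(ok)
```

With default logging configuration nothing is printed by the logger, since DEBUG records fall below the default WARNING threshold; calling logging.basicConfig(level=logging.DEBUG) would surface the traceback.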

class hugiml.telemetry.HUGIMLTracer[source]

Bases: object

OpenTelemetry tracer wrapper for HUGIMLClassifierNative.

Emits spans for fit, predict_proba, and predict with attributes including n_samples, n_patterns, and latency.

When opentelemetry-api is not installed all operations are no-ops.

classmethod span(name, attributes=None)[source]

Context manager yielding an OTEL span (or no-op).

Parameters:
  • name (str)

  • attributes (dict | None)

Return type:

Generator[Any, None, None]
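
Graceful degradation to a no-op span can be sketched with a guarded import and a generator-based context manager. This is an illustrative stand-in, not the HUGIMLTracer implementation; the tracer name "hugiml.sketch" is made up for the example.

```python
from contextlib import contextmanager

try:
    from opentelemetry import trace
    _TRACER = trace.get_tracer("hugiml.sketch")
except ImportError:
    # opentelemetry-api is not installed: fall back to no-op behaviour.
    _TRACER = None


@contextmanager
def span(name, attributes=None):
    """Yield an OTEL span when available, otherwise yield None."""
    if _TRACER is None:
        yield None
        return
    with _TRACER.start_as_current_span(name, attributes=attributes) as active:
        yield active


# The calling code is identical whether or not OpenTelemetry is installed.
with span("predict_proba", {"n_samples": 3}) as current:
    result = sum([1, 2, 3])
print(result)
```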

class hugiml.telemetry.HUGIMLMetrics[source]

Bases: object

Prometheus metrics for HUGIMLClassifierNative.

Exposes:
  • hugiml_predictions_total counter

  • hugiml_prediction_latency_seconds histogram

  • hugiml_confidence_mean gauge

  • hugiml_drift_psi gauge (per-feature)

When prometheus_client is not installed all metrics are no-ops.

classmethod record_prediction(model_id, n_samples, latency_s, mean_confidence, success=True)[source]

Record prediction metrics.

Parameters:
  • model_id (str)

  • n_samples (int)

  • latency_s (float)

  • mean_confidence (float)

  • success (bool)

Return type:

None

classmethod record_drift(model_id, psi_dict)[source]

Update per-feature PSI gauges.

Parameters:
  • model_id (str)

  • psi_dict (dict)

Return type:

None
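
For orientation, the Population Stability Index compares two binned distributions as the sum over bins of (actual - expected) * ln(actual / expected). The sketch below computes it for pre-binned proportions and builds a dictionary shaped like the psi_dict argument; how hugiml bins features is not specified here, and the feature names are hypothetical.

```python
import math


def psi(expected, actual, eps=1e-6):
    """Population Stability Index over aligned lists of bin proportions."""
    total = 0.0
    for e, a in zip(expected, actual):
        # Clamp to eps so that empty bins do not cause log(0) or division by zero.
        e, a = max(e, eps), max(a, eps)
        total += (a - e) * math.log(a / e)
    return total


baseline = [0.25, 0.25, 0.25, 0.25]
shifted = [0.40, 0.30, 0.20, 0.10]

# Hypothetical feature names mapped to their PSI values.
psi_dict = {"age": psi(baseline, baseline), "income": psi(baseline, shifted)}
print(psi_dict)
```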

hugiml.telemetry.instrument_classifier(classifier, model_id='default')[source]

Wrap a fitted classifier with telemetry instrumentation.

Patches predict_proba and predict methods in-place to emit OTEL spans and Prometheus metrics. The classifier itself is modified and returned.

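Parameters:
  • classifier – Fitted classifier to instrument.

  • model_id (str) – Identifier attached to the emitted telemetry.

Return type: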

The same classifier instance with patched methods.
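
In-place patching of this kind can be illustrated with a dummy classifier whose predict method is replaced by a timing wrapper on the instance. The names DummyClassifier and instrument are hypothetical, and a plain list stands in for the OTEL and Prometheus sinks.

```python
import time
import types


class DummyClassifier:
    """Stand-in for a fitted classifier."""

    def predict(self, X):
        return [0 for _ in X]


def instrument(clf, log):
    """Patch clf.predict in place so each call appends a timing record to log."""
    original = clf.predict  # bound method captured before patching

    def timed_predict(self, X):
        start = time.perf_counter()
        try:
            return original(X)
        finally:
            log.append({"n_samples": len(X), "latency_s": time.perf_counter() - start})

    clf.predict = types.MethodType(timed_predict, clf)
    return clf


log = []
clf = DummyClassifier()
same = instrument(clf, log)
predictions = clf.predict([10, 20, 30])
print(predictions, log)
```

Because the patch is applied to the instance and not to the class, other instances of the same class stay uninstrumented.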

Exceptions

Structured exception and warning hierarchy for HUG-IML.

Taxonomy:

HUGIMLError (base)
├── HUGIMLFitError          — any failure during fit()
│   ├── HUGIMLMiningError   — pattern mining specifically
│   ├── HUGIMLTimeoutError  — max_fit_seconds exceeded
│   └── HUGIMLMemoryError   — native/Python memory budget exceeded
├── HUGIMLValidationError   — input data / param validation
│   ├── HUGIMLSchemaError   — column mismatch at predict time
│   └── HUGIMLParamError    — bad hyperparameter values / types
├── HUGIMLSerializationError — load/save failures
│   └── HUGIMLVersionError  — schema version incompatibility
└── HUGIMLPredictionError   — failures during predict/transform

HUGIMLWarning (base, UserWarning subclass)
├── HUGIMLConvergenceWarning — model converged to minimal patterns
├── HUGIMLDtypeDriftWarning  — categorical column dtype changed
├── HUGIMLRangeWarning       — feature values outside training range
├── HUGIMLDegradedWarning    — model degraded due to timeout/memory
└── HUGIMLDeprecationWarning — deprecated API usage

exception hugiml.exceptions.HUGIMLError[source]

Bases: Exception

Base exception for all HUG-IML errors.

exception hugiml.exceptions.HUGIMLFitError[source]

Bases: HUGIMLError

Raised when fit() fails for any reason.

exception hugiml.exceptions.HUGIMLMiningError[source]

Bases: HUGIMLFitError

Raised when pattern mining fails or produces zero patterns.

exception hugiml.exceptions.HUGIMLTimeoutError[source]

Bases: HUGIMLFitError

Raised when fit exceeds max_fit_seconds.

exception hugiml.exceptions.HUGIMLMemoryError[source]

Bases: HUGIMLFitError, MemoryError

Raised when fit cannot safely allocate required memory.

exception hugiml.exceptions.HUGIMLValidationError[source]

Bases: HUGIMLError, ValueError

Raised when input data or configuration is invalid.

Inherits from ValueError for backward compatibility with existing except-ValueError handlers.

exception hugiml.exceptions.HUGIMLSchemaError[source]

Bases: HUGIMLValidationError

Raised when predict-time data does not match training schema (wrong columns, wrong order, wrong count).

exception hugiml.exceptions.HUGIMLParamError[source]

Bases: HUGIMLValidationError, TypeError

Raised when hyperparameters have wrong types or values.

Inherits from both TypeError and ValueError for backward compatibility.

exception hugiml.exceptions.HUGIMLSerializationError[source]

Bases: HUGIMLError

Raised when model save/load fails.

exception hugiml.exceptions.HUGIMLVersionError[source]

Bases: HUGIMLSerializationError

Raised when loading a model whose schema version is incompatible.

exception hugiml.exceptions.HUGIMLPredictionError[source]

Bases: HUGIMLError, RuntimeError

Raised when predict/transform fails on a fitted model.

Inherits from RuntimeError for backward compatibility.

exception hugiml.exceptions.HUGIMLWarning[source]

Bases: UserWarning

Base warning for all HUG-IML warnings.

exception hugiml.exceptions.HUGIMLConvergenceWarning[source]

Bases: HUGIMLWarning

Issued when the model converges to a minimal number of patterns (e.g. due to very restrictive G or low-information data).

exception hugiml.exceptions.HUGIMLDtypeDriftWarning[source]

Bases: HUGIMLWarning

Issued when a categorical column is passed as numeric at predict time.

exception hugiml.exceptions.HUGIMLRangeWarning[source]

Bases: HUGIMLWarning

Issued when feature values fall far outside the training range.

exception hugiml.exceptions.HUGIMLDegradedWarning[source]

Bases: HUGIMLWarning

Issued when the model entered degraded mode due to timeout or memory pressure during fit().

exception hugiml.exceptions.HUGIMLDeprecationWarning[source]

Bases: HUGIMLWarning, DeprecationWarning

Issued for deprecated API usage.
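
The practical effect of the mixed-in built-in bases can be sketched with local stand-in classes that mirror part of the documented hierarchy. These classes are redefined here for illustration only; real code would import them from hugiml.exceptions.

```python
# Local stand-ins mirroring the documented bases (not the real hugiml classes).
class HUGIMLError(Exception):
    pass


class HUGIMLValidationError(HUGIMLError, ValueError):
    pass


class HUGIMLSchemaError(HUGIMLValidationError):
    pass


class HUGIMLParamError(HUGIMLValidationError, TypeError):
    pass


def handle(exc):
    """Report which handler catches the given exception instance."""
    try:
        raise exc
    except ValueError:
        return "caught as ValueError"
    except HUGIMLError:
        return "caught as HUGIMLError"


print(handle(HUGIMLSchemaError("column mismatch")))
print(handle(HUGIMLError("generic failure")))
print(isinstance(HUGIMLParamError("bad value"), (TypeError, ValueError)))
```

Handlers written against ValueError therefore keep working, while new code can catch HUGIMLError to cover the whole family.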