Model-Agnostic Assurance (ALSP)

Assurance scores across explainability, fairness, and security for deployed ML
📄 Paper (PDF) 💻 Code (GitHub)

Abstract

State-of-the-art AI assurance is often model- and domain-specific. We present two model-agnostic pipelines, the Adversarial Logging Scoring Pipeline (ALSP) and the Requirements Feedback Scoring Pipeline (RFSP), that score explainability, safety, security, fairness, trustworthiness, and ethics. ALSP uses game-theoretic weighting, adversarial logging, and secret inversion to detect malicious inputs and quantify assurance. RFSP is user-driven: it gathers assurance weight preferences, segments data, and optimizes hyper-parameters (grid search and Bayesian optimization) to reflect the desired goals. Both pipelines are validated on SCADA (critical water network), telco, healthcare, and banking datasets, producing quantifiable assurance scores and surfacing trade-offs among AI goals.

Highlights

Unified assurance

Combines XAI, fairness, and security signals into a single actionable index for each prediction.

Game-theoretic weights

Shapley-based weighting of indices prioritizes features and outcomes that most impact assurance.

Adversarial logging

Tracks adversarial examples and secret inversion attempts to flag drift and malicious behavior.

Production-friendly

Model-agnostic hooks for tabular and vision models with per-sample reporting.

Approach

ALSP (model-driven). Three algorithms: (1) Weight Assessment multiplies Shapley values by domain-provided assurance labels to yield per-sample scores; (2) Reverse Learning logs every boosting epoch to surface loss minima and drift; (3) Secret Inversion trains an autoencoder to detect adversarial data via reconstruction errors (SAI/CAI).
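
The Weight Assessment step can be sketched in a few lines. The snippet below is illustrative only: it assumes a fitted LightGBM model, and the `assurance_labels` vector stands in for the domain-provided labels; it is not the paper's implementation.

```python
# Illustrative sketch of ALSP Weight Assessment: per-sample assurance scores
# from Shapley values weighted by domain-provided assurance labels.
import numpy as np
import shap
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = LGBMClassifier(n_estimators=50).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
if isinstance(shap_values, list):  # some shap versions return [class0, class1]
    shap_values = shap_values[1]

# Hypothetical domain-provided assurance labels, one weight per feature.
assurance_labels = np.full(X.shape[1], 1.0 / X.shape[1])

# Per-sample score: assurance-weighted sum of absolute Shapley attributions.
scores = np.abs(shap_values) @ assurance_labels
print(scores[:5])
```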

RFSP (user-driven). Three algorithms: (1) Economic Equilibrium collects user-weighted goals (the weights must sum to 100); (2) Extreme Data Segmentation builds a dedicated AIA set and maps statistical measures to assurance goals; (3) Model Optimization tunes hyper-parameters via grid search and Bayesian optimization, reporting trust scores from F1 deltas.
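
For concreteness, a minimal sketch of the Economic Equilibrium weight collection with the sum-to-100 check. The goal acronyms follow the paper; the function and its validation details are assumptions.

```python
# Sketch of RFSP's Economic Equilibrium step: collect per-goal weights and
# enforce the sum-to-100 constraint. The validation logic is an assumption.
GOALS = ("XAI", "SAI", "CAI", "FAI", "TAI", "EAI")

def collect_goal_weights(weights):
    """Validate user-supplied assurance weights and normalize to fractions."""
    missing = set(GOALS) - weights.keys()
    if missing:
        raise ValueError(f"missing weights for goals: {sorted(missing)}")
    total = sum(weights.values())
    if abs(total - 100.0) > 1e-6:
        raise ValueError(f"weights must sum to 100, got {total}")
    return {g: weights[g] / 100.0 for g in GOALS}

# Equal weighting across the six goals, as in the RFSP experiment below.
prefs = collect_goal_weights({g: 100 / 6 for g in GOALS})
```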

Experiments & Results

Datasets. SCADA (critical water network, intrusion detection), Telco (churn/plan selection, bias tests), Pima diabetes (8 features, 768 samples), Bank loans (GBDT logging), and synthetic water/telco benchmarks for assurance stress tests.

Weight Assessment. Shapley-weighted AI assurance columns (AIAC) produce per-sample scores; injected Gaussian bias visibly shifts the score distributions (see the index distribution and histogram figures).
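
A hedged sketch of the bias-injection probe, assuming Gaussian noise added to a random fraction of rows in one feature column (the noise scale and fractions are illustrative):

```python
# Sketch: inject Gaussian bias into one column, then rescore with the
# Weight Assessment step to compare score distributions before/after.
import numpy as np

rng = np.random.default_rng(42)

def inject_gaussian_bias(X, col, frac, sigma=3.0):
    """Perturb a random `frac` of rows in column `col` with Gaussian noise."""
    X_biased = X.copy()
    rows = rng.choice(len(X), size=int(frac * len(X)), replace=False)
    X_biased[rows, col] += rng.normal(0.0, sigma, size=len(rows))
    return X_biased

X = rng.normal(size=(768, 8))  # stand-in for a tabular dataset
X_biased = inject_gaussian_bias(X, col=0, frac=0.03)
# Rescoring X_biased with Weight Assessment should visibly shift the
# per-sample score histogram relative to the clean X.
```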

Reverse Learning. Custom GBDT logging finds the loss minimum at epoch 13; epochs beyond it degrade accuracy and are pruned.
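
One way to reproduce this per-epoch logging with off-the-shelf LightGBM callbacks (an assumption; the paper uses its own custom GBDT logging):

```python
# Sketch: record per-iteration validation loss, find its minimum, and
# prune the boosting rounds beyond it.
import lightgbm as lgb
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

history = {}
model = lgb.LGBMClassifier(n_estimators=200)
model.fit(
    X_tr, y_tr,
    eval_set=[(X_val, y_val)],
    eval_metric="binary_logloss",
    callbacks=[lgb.record_evaluation(history)],
)

losses = np.array(history["valid_0"]["binary_logloss"])
best_epoch = int(losses.argmin())  # e.g. epoch 13 in the SCADA run
pruned = lgb.LGBMClassifier(n_estimators=best_epoch + 1).fit(X_tr, y_tr)
```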

Secret Inversion. An autoencoder detects adversarial SCADA inputs; thresholding at the top 1% of reconstruction errors separates attack traffic with >91% accuracy.
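
A minimal sketch of this detector, assuming a small Keras autoencoder trained on benign traffic only; the architecture, sizes, and data are illustrative stand-ins.

```python
# Sketch of Secret Inversion: flag inputs whose reconstruction error falls
# in the top 1% of errors observed on benign data.
import numpy as np
import tensorflow as tf

def build_autoencoder(n_features):
    inp = tf.keras.Input(shape=(n_features,))
    h = tf.keras.layers.Dense(8, activation="relu")(inp)  # bottleneck
    out = tf.keras.layers.Dense(n_features)(h)
    model = tf.keras.Model(inp, out)
    model.compile(optimizer="adam", loss="mse")
    return model

rng = np.random.default_rng(0)
X_benign = rng.normal(size=(1000, 20)).astype("float32")  # stand-in data

ae = build_autoencoder(X_benign.shape[1])
ae.fit(X_benign, X_benign, epochs=20, batch_size=64, verbose=0)

# Per-sample reconstruction error and the top-1% threshold.
errors = np.mean((X_benign - ae.predict(X_benign, verbose=0)) ** 2, axis=1)
threshold = np.quantile(errors, 0.99)

def is_adversarial(x):
    err = np.mean((x - ae.predict(x, verbose=0)) ** 2, axis=1)
    return err > threshold
```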

RFSP. User-weighted goals (equal weights of 16.6 each across the six goals) are combined with statistical measures (ANOVA-F, Kendall, mutual information, chi-squared, KS test, outlier rate, bias/variance) to yield weighted AIA scores; the measure-to-goal mapping is tabulated below. Bayesian hyper-parameter search improves trust (TAI) over both the defaults and grid search.
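
A sketch of the Bayesian search and the F1-delta trust score, using Optuna as the optimizer (an assumption; the paper does not name a specific library):

```python
# Sketch: Bayesian hyper-parameter search over LightGBM, with TAI taken
# as the F1 improvement over the default configuration.
import optuna
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

def f1_of(params):
    model = LGBMClassifier(**params).fit(X_tr, y_tr)
    return f1_score(y_val, model.predict(X_val))

def objective(trial):
    return f1_of({
        "learning_rate": trial.suggest_float("learning_rate", 0.01, 0.3, log=True),
        "max_depth": trial.suggest_int("max_depth", 3, 15),
        "subsample": trial.suggest_float("subsample", 0.5, 1.0),  # bagging fraction
        "subsample_freq": 1,
        "n_estimators": trial.suggest_int("n_estimators", 50, 500),
        "num_leaves": trial.suggest_int("num_leaves", 8, 64),
    })

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)

tai_delta = study.best_value - f1_of({})  # TAI gain over default hyper-parameters
print(study.best_params, tai_delta)
```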

Statistical measures mapped to assurance goals

| Measure | Implication | Limitation | Related goals |
| --- | --- | --- | --- |
| ANOVA-F | Linear dependence | Numeric → categorical | XAI, TAI |
| Kendall | Nonlinear dependence | Numeric → categorical | XAI, TAI |
| Mutual information | General dependence | Data-agnostic | XAI, TAI |
| Chi-squared | Category dependence | Categorical → categorical | XAI, TAI |
| KS test | Distribution distance | Numeric only | SAI, CAI |
| Outlier rate | Points outside 3σ | Numeric only | SAI, CAI |
| Bias | Prediction accuracy | All columns | FAI, EAI |
| Variance | Prediction stability | All columns | FAI, EAI |
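
The measures in this table can be computed with standard scipy/scikit-learn calls; the snippet below is a sketch on stand-in data, with each line tagged by its mapped goals.

```python
# Sketch: computing the mapped statistical measures; comments tag the
# assurance goals from the table above.
import numpy as np
from scipy import stats
from sklearn.feature_selection import chi2, f_classif, mutual_info_classif

rng = np.random.default_rng(0)
X = rng.random((768, 8))     # numeric features (stand-in data)
y = rng.integers(0, 2, 768)  # categorical target

anova_f, _ = f_classif(X, y)                                   # XAI, TAI
kendall = [stats.kendalltau(X[:, j], y)[0] for j in range(8)]  # XAI, TAI
mi = mutual_info_classif(X, y, random_state=0)                 # XAI, TAI
chi2_stat, _ = chi2(X, y)  # needs non-negative X              # XAI, TAI
ks = stats.ks_2samp(X[y == 0, 0], X[y == 1, 0])[0]             # SAI, CAI
outlier_rate = np.mean(np.abs(stats.zscore(X[:, 0])) > 3)      # SAI, CAI
```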

Intrusion detection via Secret Inversion (autoencoder)

| Test set | Accuracy | F1 | Precision | Recall |
| --- | --- | --- | --- | --- |
| SCADA-1 | 0.9466 | 0.7194 | 0.9438 | 0.5813 |
| SCADA-2 | 0.9128 | 0.7182 | 0.9707 | 0.5700 |

CAI score vs. injected bias (SCADA)

| Bias injected | CAI score |
| --- | --- |
| 3.33% | 54.4 |
| 2.67% | 52.7 |
| 2.00% | 53.4 |
| 1.34% | 57.3 |
| 0.67% | 63.6 |
| 0% | 71.5 |

LightGBM hyper-parameters (Telco)

| Hyper-parameter | Default | Bayesian | Grid |
| --- | --- | --- | --- |
| Learning rate | 0.10 | 0.05 | 0.07 |
| Max depth | -1 (unlimited) | 14 | 9 |
| Bagging fraction | 1.0 | 0.8 | 1.0 |
| Number of trees | 100 | 396 | 100 |
| Number of leaves | 31 | 25 | 12 |

TAI (F1) improvements

| Model setup | Telco F1 | SCADA F1 |
| --- | --- | --- |
| Default | 0.767 | 0.802 |
| Grid search | 0.746 | 0.794 |
| Bayesian opt. | 0.772 | 0.842 |