Distributionally Robust Statistical Verification
with Imprecise Neural Networks
Souradeep Dutta*, Michele Caprio*, Vivian Lin, Matthew Cleaveland*, Kuk Jin Jang,
Ivan Ruchkin*, Oleg Sokolsky, Insup Lee
PRECISE Center, University of Pennsylvania
* former members
HSCC 2025
Self-Evident Truths
• Safety guarantees for autonomous systems with
high-dimensional states are important — but difficult
○ Reachability analysis struggles to scale in state
dimensions
○ Statistical methods often assume fixed input
distributions → vulnerable to distribution shifts
Our Contributions (= Agenda)
1. Formulation of distributionally robust statistical
verification
2. Imprecise neural networks (INNs) to represent
state-wise performance uncertainty
3. Scalable uncertainty-guided active learning algorithm
4. Empirical results on MuJoCo with RL controllers
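Distributionally Robust Statistical Verification
• Autonomous system, observed through its trajectories from initial states
• Aspiration: guarantee a lower bound ε on the performance ψ
– Find ε and a distribution set s.t. the above holds over any distribution in the set
• Formally: the guarantee must hold for every distribution in the set
[Figure: distributions]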
Imprecise Neural Networks (INNs)
• INN := a set of N feedforward NNs
• Ensemble bounds:
– Property: contains the predicted function with avg. chance
– Uncertainty := the max disagreement among models:
• Idea: use an INN to:
(a) give performance guarantees; (b) guide sampling of states
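A minimal sketch of the ensemble view above, assuming the bounds are the pointwise minimum and maximum over the N member networks and the uncertainty is the gap between them. The untrained random ReLU networks and the names `make_member` and `inn_bounds` are illustrative stand-ins, not the authors' implementation.

```python
import numpy as np

def make_member(rng, in_dim, width=50):
    """Return one small feedforward ReLU network as a callable (random, untrained weights)."""
    w1 = rng.normal(size=(in_dim, width))
    b1 = rng.normal(size=width)
    w2 = rng.normal(size=(width, 1))
    b2 = rng.normal(size=1)

    def forward(x):
        hidden = np.maximum(x @ w1 + b1, 0.0)  # ReLU layer
        return (hidden @ w2 + b2).ravel()

    return forward

def inn_bounds(members, x):
    """Pointwise lower/upper ensemble bounds and their gap (assumed uncertainty measure)."""
    preds = np.stack([m(x) for m in members])  # shape: (N, num_states)
    lower = preds.min(axis=0)
    upper = preds.max(axis=0)
    return lower, upper, upper - lower

rng = np.random.default_rng(0)
members = [make_member(rng, in_dim=4) for _ in range(3)]  # N = 3 members
states = rng.uniform(-1.0, 1.0, size=(5, 4))
lower, upper, uncertainty = inn_bounds(members, states)
```

States where `uncertainty` is large are the ones the members disagree on most, which is what makes the same object usable both for bounds and for guiding sampling.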
Construction of Distribution Sets
• Convex combination of uniform distributions
• “Contamination” with an arbitrary distribution Q
[Figure: uniform distributions and their combined region on unit axes; a region that is not sampled, but within our distribution set]
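One way to picture a single member of such a distribution set is as a sampler. The sketch below assumes the standard contamination form (1 - α) · Σ wᵢ · Uniform(Bᵢ) + α · Q, with axis-aligned boxes Bᵢ and α as the contamination level; the function name, the boxes, and the Gaussian contaminant are hypothetical choices for illustration.

```python
import numpy as np

def sample_from_set_member(rng, boxes, weights, alpha, contaminant, n):
    """Draw n states from (1 - alpha) * sum_i weights[i] * Uniform(boxes[i]) + alpha * Q."""
    weights = np.asarray(weights, dtype=float)
    assert np.isclose(weights.sum(), 1.0) and np.all(weights >= 0)
    dim = boxes[0][0].shape[0]
    out = np.empty((n, dim))
    from_q = rng.random(n) < alpha                      # draws taken from the contaminant Q
    which_box = rng.choice(len(boxes), size=n, p=weights)
    for k in range(n):
        if from_q[k]:
            out[k] = contaminant(rng)
        else:
            low, high = boxes[which_box[k]]
            out[k] = rng.uniform(low, high)
    return out, from_q

rng = np.random.default_rng(1)
boxes = [
    (np.array([0.0, 0.0]), np.array([0.4, 0.4])),
    (np.array([0.6, 0.6]), np.array([1.0, 1.0])),
]
contaminant = lambda r: r.normal(loc=0.5, scale=0.05, size=2)  # arbitrary Q (assumption)
samples, from_q = sample_from_set_member(
    rng, boxes, [0.5, 0.5], alpha=0.1, contaminant=contaminant, n=1000
)
```

Varying the weights and the contaminant sweeps over different members of the set; a guarantee over the set has to hold for all of them.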
Guarantee on System Performance
• Given a region, the Sherlock verifier computes the INN’s worst-case lower estimate over it
• Set the performance threshold ε, which applies to all
distributions in the distribution set
• Our distributional guarantee: the expected probability
of meeting the threshold ε is greater than the level set by λ, for all distributions in the set
Two-Step Overall Approach
1. Active learning: Sample a high-uncertainty batch & train an INN
2. Verification: Instantiate performance bounds with the INN
Active Learning Algorithm
Greedy strategy for exploring the high-dimensional space:
1. Draw initial samples of states and label them with the true
performance function ψ
2. Train an INN to estimate the bounds on ψ and compute
uncertainty
3. Use Sherlock verifier to identify the point with highest uncertainty
4. Sample a batch from the point’s δ-ball neighborhood
5. Retrain the INN with the updated dataset
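The five steps can be sketched as a short loop. This is an illustration under loud assumptions: a toy `true_performance` replaces the real ψ, random-feature ridge regressors replace trained NNs, random search over candidate states replaces the Sherlock verifier, and the δ-ball is taken as an axis-aligned box. All function names are hypothetical.

```python
import numpy as np

def true_performance(x):
    """Hypothetical stand-in for the true performance function psi."""
    return np.sin(3.0 * x[:, 0]) + 0.5 * x[:, 1] ** 2

def fit_member(rng, x, y, width=50, ridge=1e-3):
    """Fit one random-feature ReLU regressor (cheap stand-in for training one NN)."""
    w = rng.normal(size=(x.shape[1], width))
    b = rng.normal(size=width)
    feats = np.maximum(x @ w + b, 0.0)
    coef = np.linalg.solve(feats.T @ feats + ridge * np.eye(width), feats.T @ y)
    return lambda q: np.maximum(q @ w + b, 0.0) @ coef

def uncertainty(members, x):
    """Max disagreement among ensemble members at each state."""
    preds = np.stack([m(x) for m in members])
    return preds.max(axis=0) - preds.min(axis=0)

def active_learning(rng, low, high, n_init=50, batch=20, delta=0.05, iters=20, n_members=3):
    x = rng.uniform(low, high, size=(n_init, low.size))           # step 1: initial samples
    y = true_performance(x)
    members = [fit_member(rng, x, y) for _ in range(n_members)]   # step 2: train ensemble
    for _ in range(iters):
        # step 3: random search stands in for the Sherlock verifier (assumption)
        candidates = rng.uniform(low, high, size=(2000, low.size))
        worst = candidates[np.argmax(uncertainty(members, candidates))]
        # step 4: batch from a delta-box around the most uncertain point
        new_x = np.clip(worst + rng.uniform(-delta, delta, size=(batch, low.size)), low, high)
        x = np.vstack([x, new_x])
        y = np.concatenate([y, true_performance(new_x)])
        members = [fit_member(rng, x, y) for _ in range(n_members)]  # step 5: retrain
    return members, x, y

rng = np.random.default_rng(0)
low, high = np.array([-1.0, -1.0]), np.array([1.0, 1.0])
members, x, y = active_learning(rng, low, high, iters=5)
```

The loop is greedy: each iteration spends its whole batch near the single most uncertain point found, rather than spreading samples across the space.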
Experiments: Environments
• 10 MuJoCo environments from OpenAI Gym
• Control policies trained using Deep Deterministic Policy Gradient
• Performance function to verify: average reward over T steps
[Figure: Half-Cheetah (18 dimensions), Ant (29 dimensions), Humanoid (47 dimensions)]
Experiments: Baselines
• For training distribution …
• To obtain similar guarantees, the test point must be...
o Conformal Prediction: … from the training distribution.
o Robust Conformal Prediction: … a bounded distance from the training distribution.
o INN (Ours): … from a family of distributions.
Experiments: Methodology
• Evaluation procedure:
o Set tightness β, exploration radius δ, and target coverage 1 − λ⁻¹ = 95%
o Collect a starting set of states
o Compute the distribution and performance threshold ε
o Sample initial states from that distribution
o Get the true ψ values, compare them to the INN/baseline intervals
• Evaluation metrics:
o Coverage should be above the target 95%
o Interval width should be low
• Two distributional settings for initial states:
o In-distribution: matches the INN training distribution
o Out-of-distribution: not the INN training distribution (but inside our set)
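The two evaluation metrics are simple to state in code. A minimal sketch, assuming coverage is the percentage of true ψ values that fall inside the predicted interval and interval size is the mean width; the function name and the toy numbers are illustrative only.

```python
import numpy as np

def coverage_and_width(lower, upper, truth):
    """Percentage of true values inside [lower, upper] and the mean interval width."""
    lower, upper, truth = map(np.asarray, (lower, upper, truth))
    covered = (lower <= truth) & (truth <= upper)
    return 100.0 * covered.mean(), float((upper - lower).mean())

lam = 20
target = 100.0 * (1.0 - 1.0 / lam)  # target coverage in percent for lambda = 20

# Toy intervals: the third one misses its true value.
truth = np.array([1.0, 2.0, 3.0, 4.0])
lower = np.array([0.5, 1.5, 3.2, 3.0])
upper = np.array([1.5, 2.5, 4.0, 5.0])
cov, width = coverage_and_width(lower, upper, truth)
meets_target = cov >= target
```

In this toy case three of four values are covered, so the target is missed; widening the intervals would raise coverage at the cost of interval size, which is the trade-off the two metrics expose.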
Evaluation Highlights
• INNs maintain ≥99% coverage, unlike baselines
• INNs yield slightly larger intervals than baselines
• INNs demonstrate robustness to distribution shift
• Our verification takes dozens of seconds
Results: In-Distribution
Environment            Dim.  CP Cov.  CP Size     INN Cov.  INN Size
Ant                    29    93       3.2         99        4.2
Half-Cheetah           18    93       4.9 × 10⁻¹  100       3
Hopper                 12    97       2.9         100       2.2
Humanoid               47    97       2.7         100       3
Humanoid-Standup       47    96       143         99        260
Inverted Double Pend.  6     95       6.7 × 10⁻³  100       5.1 × 10⁻²
Inverted Pendulum      4     95       1.2 × 10⁻³  100       4.1 × 10⁻²
Reacher                8     93       3.3 × 10⁻²  100       7.6 × 10⁻²
Swimmer                10    93       2.8 × 10⁻³  100       4.9 × 10⁻²
Walker2d               18    95       5.2         99        5.8
(Dim.: state dimension; CP: Conformal Prediction; INN: ours; Cov.: coverage in %; Size: interval size)
Conformal prediction does
not always achieve the
target coverage of 95%
INNs always
exceed the target
coverage rate...
… with some
small cost in
interval size.
Results: Out-of-Distribution
Environment            Dim.  CP Cov.  CP Size     RCP-1 Cov.  RCP-1 Size  RCP-2 Cov.  RCP-2 Size  INN Cov.  INN Size
Ant                    29    94       3.2         95          3.5         98          4.8         100       4.3
Half-Cheetah           18    93       4.9 × 10⁻¹  95          5.2 × 10⁻¹  98          6.5 × 10⁻¹  100       1.9
Hopper                 12    97       2.9         98          3.3         99          3.3         100       2.5
Humanoid               47    98       2.7         97          2.9         99          3.5         99        2.8
Humanoid-Standup       47    95       143         97          167         99          297         99        305
Inverted Double Pend.  6     94       6.7 × 10⁻³  97          7.5 × 10⁻³  99          8.8 × 10⁻³  100       5.1 × 10⁻²
Inverted Pendulum      4     100      1.2 × 10⁻³  96          1.2 × 10⁻³  99          1.2 × 10⁻³  100       4.2 × 10⁻²
Reacher                8     91       3.3 × 10⁻²  94          3.4 × 10⁻²  99          4.2 × 10⁻²  100       8.5 × 10⁻²
Swimmer                10    92       2.8 × 10⁻³  94          3.0 × 10⁻³  98          9.0 × 10⁻³  100       5 × 10⁻²
Walker2d               18    94       5.2         95          5.4         99          7.1         99        5.2
(Dim.: state dimension; CP: Conformal Prediction; RCP-1, RCP-2: the two Robust Conformal Prediction column groups; INN: ours; Cov.: coverage in %; Size: interval size)
Conformal prediction
does not always achieve
the target coverage
Robust conformal prediction
almost always achieves the
target coverage
Coverage increases with
the amount of allowable
distribution shift
… but so does
interval size.
INNs always
achieve the target
coverage of 95%
Results: Time to Verify
Environment            Dim.  Execution Time (s), Mean ± Std  Execution Time (s), Mean ± Std
Ant                    29    127 ± 199                       184 ± 289
Half-Cheetah           18    10 ± 3                          13 ± 7
Hopper                 12    15 ± 13                         6 ± 3
Humanoid               47    31 ± 11                         149 ± 209
Humanoid-Standup       47    20 ± 5                          51 ± 46
Inverted Double Pend.  6     2.4 ± 1                         2.5 ± 1
Inverted Pendulum      4     1.3 ± 0.3                       1.5 ± 0.3
Reacher                8     9 ± 3.7                         40 ± 46
Swimmer                10    5 ± 1.4                         6 ± 2.6
Walker2d               18    8.5 ± 3.8                       11.6 ± 5.1
At design stage,
execution times
are reasonable...
even for higher
dimensions
Execution takes
longer when the
search space is larger
Limitations
Strengths:
• Robustness to distributional (epistemic) uncertainty
• No assumptions on the system dynamics or performance function
• Handles dozens of state dimensions
Limitations:
• Conservatism in both coverage and intervals
• A quite particular shape of the distribution set
• Many hyperparameters to tune
Summary
1. Formulation of distributionally robust statistical verification
2. INNs: imprecise neural networks
3. Scalable active learning
4. MuJoCo experiments
Backup
Experimental Setup
• Evaluated on 10 MuJoCo environments: Ant, Humanoid,
Hopper, …
• Control policies trained using Deep Deterministic Policy
Gradient (DDPG).
• Each policy’s performance is the temporal average reward
R_avg over T steps.
• INN architecture: ensemble of 3 DNNs (2 layers, width 50
neurons, ReLU activations)
• Confidence level λ = 20 (95%), exploration δ = 0.05, M = 20
active learning iterations.
• Comparison with Conformal Prediction (CP) and Robust CP
(RCP) under in- and out-of-distribution.
Parameters
• α (alpha) – Contamination level: controls robustness to
distribution shift; higher α means more adversarial tolerance.
• β (beta) – Interval tightness: balances precision of predictions
against conservativeness; higher β yields wider, safer intervals.
• λ (lambda) – Confidence level: sets the strength of probabilistic
guarantees; higher λ gives stronger but wider guarantees.
• δ (delta) – Exploration radius: defines neighborhood size in active
learning; controls granularity of sampling around uncertain points.
• ε (epsilon) – Performance threshold: the guaranteed lower bound
on system performance, derived from learned model and
confidence level.
Parameter Relationships
• β – λ: Together they set the half‑width λβ of the confidence band
around the INN’s central estimate.
• λ – ε: Larger λ (stronger confidence) → wider band λβ →
smaller ε (more conservative).
• α – robustness: α directly controls how adversarial your
allowed shift in the marginal can be.
• δ – exploration: δ fixes the radius of regions you sample
around points of high uncertainty.
• ε – ψ-threshold: ε is finally set to the INN’s worst‑case lower
estimate minus λβ, leading to the main guarantee.
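A small worked example of these relationships, assuming ε is the INN's worst-case lower estimate minus λβ and the target coverage is 1 − λ⁻¹. The lower estimates and the value of β are hypothetical numbers chosen only for illustration.

```python
def performance_threshold(lower_estimates, lam, beta):
    """epsilon = worst-case lower estimate of the INN minus the half-width lam * beta."""
    return min(lower_estimates) - lam * beta

def confidence_from_lambda(lam):
    """Target coverage 1 - 1/lambda; lambda = 20 corresponds to 95%."""
    return 1.0 - 1.0 / lam

# Hypothetical numbers for illustration only.
lower_estimates = [2.4, 2.1, 2.7]  # INN lower estimates over a region
lam, beta = 20, 0.01

epsilon = performance_threshold(lower_estimates, lam, beta)  # 2.1 minus 20 * 0.01
wider = performance_threshold(lower_estimates, 40, beta)     # larger lambda
```

Doubling λ here doubles the half-width λβ, so `wider` is smaller than `epsilon`: a stronger confidence level buys a more conservative threshold.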
Distributional Upper Probabilities
• Convex combination of uniform distributions
• “Contamination” with an arbitrary distribution
• Leads to an upper probability
[Figure: sampling distributions on unit axes; a region that is not sampled, but where the guarantee holds]

More Related Content

Similar to Distributionally Robust Statistical Verification with Imprecise Neural Networks (20)

Handling displacement effects in on-body sensor-based activity recognition
Handling displacement effects in on-body sensor-based activity recognitionHandling displacement effects in on-body sensor-based activity recognition
Handling displacement effects in on-body sensor-based activity recognition
Oresti Banos
 
A Study of Wearable Accelerometers Layout for Human Activity Recognition(Asia...
A Study of Wearable Accelerometers Layout for Human Activity Recognition(Asia...A Study of Wearable Accelerometers Layout for Human Activity Recognition(Asia...
A Study of Wearable Accelerometers Layout for Human Activity Recognition(Asia...
sugiuralab
 
5-Propability-2-87.pdf
5-Propability-2-87.pdf5-Propability-2-87.pdf
5-Propability-2-87.pdf
elenashahriari
 
Lesson04_new
Lesson04_newLesson04_new
Lesson04_new
shengvn
 
CMU Trecvid sed11
CMU Trecvid sed11CMU Trecvid sed11
CMU Trecvid sed11
Lu Jiang
 
Normal Distribution
Normal DistributionNormal Distribution
Normal Distribution
CIToolkit
 
Chapter 7 note Estimation.ppt biostatics
Chapter 7 note Estimation.ppt biostaticsChapter 7 note Estimation.ppt biostatics
Chapter 7 note Estimation.ppt biostatics
mohammedibrahim237048
 
Bayesian Inference for front-tracking problems - 2013 IPDO conference
Bayesian Inference for front-tracking problems - 2013 IPDO conferenceBayesian Inference for front-tracking problems - 2013 IPDO conference
Bayesian Inference for front-tracking problems - 2013 IPDO conference
Mélanie Rochoux
 
Sampling Theory Part 3
Sampling Theory Part 3Sampling Theory Part 3
Sampling Theory Part 3
FellowBuddy.com
 
BUS173 Lecture 5.pdf
BUS173 Lecture 5.pdfBUS173 Lecture 5.pdf
BUS173 Lecture 5.pdf
SusantoSaha1
 
Estimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or Variance Estimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or Variance
Long Beach City College
 
Estimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or VarianceEstimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or Variance
Long Beach City College
 
Normal distribution
Normal distribution  Normal distribution
Normal distribution
Unitedworld School Of Business
 
ANSWERS
ANSWERSANSWERS
ANSWERS
Yogi Sarumaha
 
Estimation powerpoint presentation statistics
Estimation powerpoint presentation statisticsEstimation powerpoint presentation statistics
Estimation powerpoint presentation statistics
TiffanyGailClamucha
 
2-Measures_of_Spreadddddddddddddddd-K.pptx
2-Measures_of_Spreadddddddddddddddd-K.pptx2-Measures_of_Spreadddddddddddddddd-K.pptx
2-Measures_of_Spreadddddddddddddddd-K.pptx
nupuraajesh0202
 
Normal distribution - Unitedworld School of Business
Normal distribution - Unitedworld School of BusinessNormal distribution - Unitedworld School of Business
Normal distribution - Unitedworld School of Business
Arnab Roy Chowdhury
 
Estimating a Population Mean
Estimating a Population MeanEstimating a Population Mean
Estimating a Population Mean
Long Beach City College
 
Business Statistics Chapter 8
Business Statistics Chapter 8Business Statistics Chapter 8
Business Statistics Chapter 8
Lux PP
 
Estimation part I
Estimation part IEstimation part I
Estimation part I
Nadeem Uddin
 
Handling displacement effects in on-body sensor-based activity recognition
Handling displacement effects in on-body sensor-based activity recognitionHandling displacement effects in on-body sensor-based activity recognition
Handling displacement effects in on-body sensor-based activity recognition
Oresti Banos
 
A Study of Wearable Accelerometers Layout for Human Activity Recognition(Asia...
A Study of Wearable Accelerometers Layout for Human Activity Recognition(Asia...A Study of Wearable Accelerometers Layout for Human Activity Recognition(Asia...
A Study of Wearable Accelerometers Layout for Human Activity Recognition(Asia...
sugiuralab
 
5-Propability-2-87.pdf
5-Propability-2-87.pdf5-Propability-2-87.pdf
5-Propability-2-87.pdf
elenashahriari
 
Lesson04_new
Lesson04_newLesson04_new
Lesson04_new
shengvn
 
CMU Trecvid sed11
CMU Trecvid sed11CMU Trecvid sed11
CMU Trecvid sed11
Lu Jiang
 
Normal Distribution
Normal DistributionNormal Distribution
Normal Distribution
CIToolkit
 
Chapter 7 note Estimation.ppt biostatics
Chapter 7 note Estimation.ppt biostaticsChapter 7 note Estimation.ppt biostatics
Chapter 7 note Estimation.ppt biostatics
mohammedibrahim237048
 
Bayesian Inference for front-tracking problems - 2013 IPDO conference
Bayesian Inference for front-tracking problems - 2013 IPDO conferenceBayesian Inference for front-tracking problems - 2013 IPDO conference
Bayesian Inference for front-tracking problems - 2013 IPDO conference
Mélanie Rochoux
 
BUS173 Lecture 5.pdf
BUS173 Lecture 5.pdfBUS173 Lecture 5.pdf
BUS173 Lecture 5.pdf
SusantoSaha1
 
Estimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or Variance Estimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or Variance
Long Beach City College
 
Estimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or VarianceEstimating a Population Standard Deviation or Variance
Estimating a Population Standard Deviation or Variance
Long Beach City College
 
Estimation powerpoint presentation statistics
Estimation powerpoint presentation statisticsEstimation powerpoint presentation statistics
Estimation powerpoint presentation statistics
TiffanyGailClamucha
 
2-Measures_of_Spreadddddddddddddddd-K.pptx
2-Measures_of_Spreadddddddddddddddd-K.pptx2-Measures_of_Spreadddddddddddddddd-K.pptx
2-Measures_of_Spreadddddddddddddddd-K.pptx
nupuraajesh0202
 
Normal distribution - Unitedworld School of Business
Normal distribution - Unitedworld School of BusinessNormal distribution - Unitedworld School of Business
Normal distribution - Unitedworld School of Business
Arnab Roy Chowdhury
 
Business Statistics Chapter 8
Business Statistics Chapter 8Business Statistics Chapter 8
Business Statistics Chapter 8
Lux PP
 

More from Ivan Ruchkin (20)

Four Principles for Physically Interpretable World Models
Four Principles for Physically Interpretable World ModelsFour Principles for Physically Interpretable World Models
Four Principles for Physically Interpretable World Models
Ivan Ruchkin
 
Accelerating Neural Policy Repair with Preservation via Stability-Plasticity ...
Accelerating Neural Policy Repair with Preservation via Stability-Plasticity ...Accelerating Neural Policy Repair with Preservation via Stability-Plasticity ...
Accelerating Neural Policy Repair with Preservation via Stability-Plasticity ...
Ivan Ruchkin
 
Autonomous Drift Detection and Online Road Friction Estimation
Autonomous Drift Detection and Online Road Friction EstimationAutonomous Drift Detection and Online Road Friction Estimation
Autonomous Drift Detection and Online Road Friction Estimation
Ivan Ruchkin
 
Neuro-Symbolic Bridge: From Perception to Estimation & Control
Neuro-Symbolic Bridge: From Perception to Estimation & ControlNeuro-Symbolic Bridge: From Perception to Estimation & Control
Neuro-Symbolic Bridge: From Perception to Estimation & Control
Ivan Ruchkin
 
Towards Physically Interpretable World Models: Meaningful Weakly Supervised R...
Towards Physically Interpretable World Models: Meaningful Weakly Supervised R...Towards Physically Interpretable World Models: Meaningful Weakly Supervised R...
Towards Physically Interpretable World Models: Meaningful Weakly Supervised R...
Ivan Ruchkin
 
How Safe Will I Be Given What I See? Calibrated Visual Safety Chance Predict...
How Safe Will I Be Given What I See?  Calibrated Visual Safety Chance Predict...How Safe Will I Be Given What I See?  Calibrated Visual Safety Chance Predict...
How Safe Will I Be Given What I See? Calibrated Visual Safety Chance Predict...
Ivan Ruchkin
 
Bridging Dimensions: Confident Reachability for High-Dimensional Controllers...
Bridging Dimensions:  Confident Reachability for High-Dimensional Controllers...Bridging Dimensions:  Confident Reachability for High-Dimensional Controllers...
Bridging Dimensions: Confident Reachability for High-Dimensional Controllers...
Ivan Ruchkin
 
Poster: Bridging Dimensions: Confident Reachability for High-Dimensional Cont...
Poster: Bridging Dimensions: Confident Reachability for High-Dimensional Cont...Poster: Bridging Dimensions: Confident Reachability for High-Dimensional Cont...
Poster: Bridging Dimensions: Confident Reachability for High-Dimensional Cont...
Ivan Ruchkin
 
Bridging Dimensions: Confident Reachability for High-Dimensional Controllers
Bridging Dimensions: Confident Reachability for High-Dimensional ControllersBridging Dimensions: Confident Reachability for High-Dimensional Controllers
Bridging Dimensions: Confident Reachability for High-Dimensional Controllers
Ivan Ruchkin
 
Poster: How Safe Am I Given What I See? Calibrated Prediction of Safety Chanc...
Poster: How Safe Am I Given What I See? Calibrated Prediction of Safety Chanc...Poster: How Safe Am I Given What I See? Calibrated Prediction of Safety Chanc...
Poster: How Safe Am I Given What I See? Calibrated Prediction of Safety Chanc...
Ivan Ruchkin
 
Language-Enhanced Latent Representations for Out-of-Distribution Detection in...
Language-Enhanced Latent Representations for Out-of-Distribution Detection in...Language-Enhanced Latent Representations for Out-of-Distribution Detection in...
Language-Enhanced Latent Representations for Out-of-Distribution Detection in...
Ivan Ruchkin
 
​Poster: Zero-shot Safety Prediction for Autonomous Robots with Foundation Wo...
​Poster: Zero-shot Safety Prediction for Autonomous Robots with Foundation Wo...​Poster: Zero-shot Safety Prediction for Autonomous Robots with Foundation Wo...
​Poster: Zero-shot Safety Prediction for Autonomous Robots with Foundation Wo...
Ivan Ruchkin
 
Curating Naturally Adversarial Datasets for Learning-Enabled Medical Cyber-Ph...
Curating Naturally Adversarial Datasets for Learning-Enabled Medical Cyber-Ph...Curating Naturally Adversarial Datasets for Learning-Enabled Medical Cyber-Ph...
Curating Naturally Adversarial Datasets for Learning-Enabled Medical Cyber-Ph...
Ivan Ruchkin
 
Repairing Learning-Enabled Controllers While Preserving What Works
Repairing Learning-Enabled Controllers While Preserving What WorksRepairing Learning-Enabled Controllers While Preserving What Works
Repairing Learning-Enabled Controllers While Preserving What Works
Ivan Ruchkin
 
Poster: Conservative Safety Monitors of Stochastic Dynamical Systems
Poster: Conservative Safety Monitors of Stochastic Dynamical SystemsPoster: Conservative Safety Monitors of Stochastic Dynamical Systems
Poster: Conservative Safety Monitors of Stochastic Dynamical Systems
Ivan Ruchkin
 
Poster: How Safe Am I Given What I See? Calibrated Prediction of Safety Chanc...
Poster: How Safe Am I Given What I See? Calibrated Prediction of Safety Chanc...Poster: How Safe Am I Given What I See? Calibrated Prediction of Safety Chanc...
Poster: How Safe Am I Given What I See? Calibrated Prediction of Safety Chanc...
Ivan Ruchkin
 
Verify-then-Monitor: Calibration Guarantees for Safety Confidence
Verify-then-Monitor: Calibration Guarantees for Safety ConfidenceVerify-then-Monitor: Calibration Guarantees for Safety Confidence
Verify-then-Monitor: Calibration Guarantees for Safety Confidence
Ivan Ruchkin
 
Causal Repair of Learning-Enabled Cyber-physical Systems
Causal Repair of Learning-Enabled Cyber-physical SystemsCausal Repair of Learning-Enabled Cyber-physical Systems
Causal Repair of Learning-Enabled Cyber-physical Systems
Ivan Ruchkin
 
Conservative Safety Monitors of Stochastic Dynamical Systems
Conservative Safety Monitors of Stochastic Dynamical SystemsConservative Safety Monitors of Stochastic Dynamical Systems
Conservative Safety Monitors of Stochastic Dynamical Systems
Ivan Ruchkin
 
Confidence Composition for Monitors of Verification Assumptions
Confidence Composition for Monitors of Verification AssumptionsConfidence Composition for Monitors of Verification Assumptions
Confidence Composition for Monitors of Verification Assumptions
Ivan Ruchkin
 
Four Principles for Physically Interpretable World Models
Four Principles for Physically Interpretable World ModelsFour Principles for Physically Interpretable World Models
Four Principles for Physically Interpretable World Models
Ivan Ruchkin
 
Accelerating Neural Policy Repair with Preservation via Stability-Plasticity ...
Accelerating Neural Policy Repair with Preservation via Stability-Plasticity ...Accelerating Neural Policy Repair with Preservation via Stability-Plasticity ...
Accelerating Neural Policy Repair with Preservation via Stability-Plasticity ...
Ivan Ruchkin
 
Autonomous Drift Detection and Online Road Friction Estimation
Autonomous Drift Detection and Online Road Friction EstimationAutonomous Drift Detection and Online Road Friction Estimation
Autonomous Drift Detection and Online Road Friction Estimation
Ivan Ruchkin
 
Neuro-Symbolic Bridge: From Perception to Estimation & Control
Neuro-Symbolic Bridge: From Perception to Estimation & ControlNeuro-Symbolic Bridge: From Perception to Estimation & Control
Neuro-Symbolic Bridge: From Perception to Estimation & Control
Ivan Ruchkin
 
Towards Physically Interpretable World Models: Meaningful Weakly Supervised R...
Towards Physically Interpretable World Models: Meaningful Weakly Supervised R...Towards Physically Interpretable World Models: Meaningful Weakly Supervised R...
Towards Physically Interpretable World Models: Meaningful Weakly Supervised R...
Ivan Ruchkin
 
How Safe Will I Be Given What I See? Calibrated Visual Safety Chance Predict...
How Safe Will I Be Given What I See?  Calibrated Visual Safety Chance Predict...How Safe Will I Be Given What I See?  Calibrated Visual Safety Chance Predict...
How Safe Will I Be Given What I See? Calibrated Visual Safety Chance Predict...
Ivan Ruchkin
 
Bridging Dimensions: Confident Reachability for High-Dimensional Controllers...
Bridging Dimensions:  Confident Reachability for High-Dimensional Controllers...Bridging Dimensions:  Confident Reachability for High-Dimensional Controllers...
Bridging Dimensions: Confident Reachability for High-Dimensional Controllers...
Ivan Ruchkin
 
Poster: Bridging Dimensions: Confident Reachability for High-Dimensional Cont...
Poster: Bridging Dimensions: Confident Reachability for High-Dimensional Cont...Poster: Bridging Dimensions: Confident Reachability for High-Dimensional Cont...


Distributionally Robust Statistical Verification with Imprecise Neural Networks

  • 1. Distributionally Robust Statistical Verification with Imprecise Neural Networks
      Souradeep Dutta*, Michele Caprio*, Vivian Lin, Matthew Cleaveland*, Kuk Jin Jang, Ivan Ruchkin*, Oleg Sokolsky, Insup Lee
      PRECISE Center, University of Pennsylvania (* former members)
      HSCC 2025
  • 2. Self-Evident Truths
      • Safety guarantees for autonomous systems with high-dimensional states are important but difficult
          ○ Reachability analysis struggles to scale with the state dimension
          ○ Statistical methods often assume a fixed input distribution, which makes them vulnerable to distribution shift
  • 3. Our Contributions (= Agenda)
      1. Formulation of distributionally robust statistical verification
      2. Imprecise neural networks (INNs) to represent state-wise performance uncertainty
      3. Scalable uncertainty-guided active learning algorithm
      4. Empirical results on MuJoCo with RL controllers
  • 5. Distributionally Robust Statistical Verification
      • Autonomous system and its trajectory
      • Aspiration: a guarantee on the system's performance
          – Find a performance threshold and a distribution set such that the guarantee holds over any distribution in the set
  • 7. Imprecise Neural Networks (INNs)
      • INN := a set of N feedforward NNs
      • Ensemble bounds:
          – Property: the bounds contain the predicted function with a given average chance
          – Uncertainty := the max disagreement among models
      • Idea: use an INN to (a) give performance guarantees; (b) guide sampling of states
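
The ensemble quantities on slide 7 can be illustrated with a minimal sketch. It assumes that the ensemble bounds are the pointwise minimum and maximum over the N member predictions; the slide's own formulas are not preserved in this transcript, so this is an assumption rather than the authors' definition. The function name `inn_bounds` is hypothetical. Under that assumption, the max disagreement among models equals the upper bound minus the lower bound.

```python
import numpy as np

def inn_bounds(predictions):
    """Pointwise ensemble bounds and uncertainty.

    predictions: array of shape (n_models, n_states), one row per ensemble member.
    Assumption: bounds are the pointwise min/max over the members.
    """
    preds = np.asarray(predictions, dtype=float)
    lower = preds.min(axis=0)
    upper = preds.max(axis=0)
    # Max pairwise disagreement max_{i,j} |f_i(x) - f_j(x)| equals upper - lower.
    uncertainty = upper - lower
    return lower, upper, uncertainty

# Three ensemble members evaluated at three states (made-up numbers).
preds = np.array([[1.0, 2.0, 0.5],
                  [1.2, 1.5, 0.5],
                  [0.9, 2.4, 0.6]])
lower, upper, unc = inn_bounds(preds)
```

In this toy example the second state has the largest uncertainty, so it would be the natural candidate for further sampling.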
  • 8. Construction of Distribution Sets
      • Convex combination of uniform distributions
      • “Contamination” with an arbitrary distribution Q
      • Figure labels: uniform distributions; combined region; a point that is not sampled, but within our distribution set
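
One member of such a distribution set can be sampled as follows. This is a sketch under the assumption that a member has the form (1 − α) Σᵢ wᵢ Uᵢ + α Q, where the Uᵢ are uniform distributions over axis-aligned boxes, w are convex weights, and α is the contamination level from slide 33; the exact formulas are not preserved in this transcript. The function name, the boxes, and the sampler for Q are all hypothetical.

```python
import numpy as np

def sample_contaminated_mixture(boxes, weights, alpha, sample_q, n, rng):
    """Draw n points from (1 - alpha) * sum_i w_i * U_i + alpha * Q.

    boxes:    list of (low, high) corner pairs, one uniform box U_i each (assumed shape)
    weights:  convex weights w_i (non-negative, summing to 1)
    alpha:    contamination level in [0, 1]
    sample_q: callable rng -> one point drawn from the arbitrary distribution Q
    """
    boxes = [(np.asarray(lo, float), np.asarray(hi, float)) for lo, hi in boxes]
    weights = np.asarray(weights, float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    dim = boxes[0][0].shape[0]
    out = np.empty((n, dim))
    from_q = rng.random(n) < alpha                  # which draws come from Q
    which = rng.choice(len(boxes), size=n, p=weights)
    for k in range(n):
        if from_q[k]:
            out[k] = sample_q(rng)
        else:
            lo, hi = boxes[which[k]]
            out[k] = rng.uniform(lo, hi)
    return out, from_q

rng = np.random.default_rng(0)
boxes = [([0.0, 0.0], [0.5, 0.5]), ([0.5, 0.5], [1.0, 1.0])]
samples, from_q = sample_contaminated_mixture(
    boxes, [0.5, 0.5], alpha=0.1,
    sample_q=lambda r: r.uniform(0.0, 1.0, size=2),  # hypothetical choice of Q
    n=1000, rng=rng)
```

Points drawn from Q may land outside both boxes, which matches the figure label about points that are not sampled from the uniform regions yet still lie within the distribution set.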
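  • 9. Guarantee on System Performance
      • Given a region, the Sherlock verifier computes the INN's worst-case lower estimate over that region
      • Set the performance threshold ε to this estimate minus λβ; ε applies to all distributions in the distribution set
      • Our distributional guarantee: the expected probability that the performance ψ is above ε is greater than 1 − λ⁻¹ for all distributions in the set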
  • 11. Two-Step Overall Approach
      1. Active learning: sample a high-uncertainty batch and train an INN
      2. Verification: instantiate performance bounds with the INN
  • 12. Active Learning Algorithm
      Greedy strategy for exploring the high-dimensional space:
      1. Draw initial samples of states and label them with the true performance function ψ
      2. Train an INN to estimate the bounds on ψ and compute the uncertainty
      3. Use the Sherlock verifier to identify the point with the highest uncertainty
      4. Sample a batch from the point’s δ-ball neighborhood
      5. Retrain the INN with the updated dataset
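
The five steps on slide 12 can be sketched as a loop. Everything concrete here is a stand-in and not the authors' implementation: `true_performance` is a made-up ψ on a 2-D unit box, the "INN" is replaced by a small ensemble of random-feature regressors fitted on bootstrap resamples, the Sherlock verifier is replaced by a random search over candidate points, and the δ-ball is taken in the ∞-norm.

```python
import numpy as np

def true_performance(x):
    # Hypothetical stand-in for the true performance function psi.
    return np.sin(3.0 * x[:, 0]) + 0.5 * x[:, 1]

def fit_ensemble(X, y, n_models, rng, n_features=30):
    # Stand-in for INN training: random cosine features + least squares per member.
    models = []
    for _ in range(n_models):
        W = rng.normal(size=(X.shape[1], n_features))
        b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
        idx = rng.integers(0, len(X), size=len(X))      # bootstrap resample
        coef, *_ = np.linalg.lstsq(np.cos(X[idx] @ W + b), y[idx], rcond=None)
        models.append((W, b, coef))
    return models

def ensemble_uncertainty(models, X):
    # Max disagreement among ensemble members at each state.
    preds = np.stack([np.cos(X @ W + b) @ coef for W, b, coef in models])
    return preds.max(axis=0) - preds.min(axis=0)

def active_learning(n_init=20, n_iters=5, batch=10, delta=0.05, n_models=3, seed=0):
    rng = np.random.default_rng(seed)
    low, high = np.zeros(2), np.ones(2)
    X = rng.uniform(low, high, size=(n_init, 2))        # step 1: initial samples
    y = true_performance(X)                             #         labelled with psi
    for _ in range(n_iters):
        models = fit_ensemble(X, y, n_models, rng)      # steps 2 and 5: (re)train
        cand = rng.uniform(low, high, size=(2000, 2))   # step 3: random search,
        x_star = cand[np.argmax(ensemble_uncertainty(models, cand))]  # not Sherlock
        new = x_star + rng.uniform(-delta, delta, size=(batch, 2))    # step 4
        new = np.clip(new, low, high)
        X = np.vstack([X, new])
        y = np.concatenate([y, true_performance(new)])
    models = fit_ensemble(X, y, n_models, rng)          # final retraining
    return X, y, models

X, y, models = active_learning()
```

A verifier such as Sherlock would search for the maximum-uncertainty point with guarantees; the random search above only approximates that step and is used here to keep the sketch self-contained.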
  • 14. Experiments: Environments
      • 10 MuJoCo environments from OpenAI Gym
      • Control policies trained using Deep Deterministic Policy Gradient
      • Performance function to verify: average reward over T steps
      • Pictured environments: Half-Cheetah (18 dimensions), Ant (29 dimensions), Humanoid (47 dimensions)
  • 15. Experiments: Baselines
      • For a given training distribution, to obtain similar guarantees, the test point must be:
          – Conformal Prediction: from the training distribution
          – Robust Conformal Prediction: within a bounded distance from the training distribution
          – INN (Ours): from a family of distributions
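
For context on the first baseline, here is a minimal sketch of split conformal prediction with absolute-residual scores. The slides do not say which conformal variant was used, so this standard form is an assumption, and the function name is hypothetical. Its coverage guarantee relies on the test point being exchangeable with the calibration data, which is the "from the training distribution" requirement above.

```python
import numpy as np

def split_conformal_interval(cal_pred, cal_true, test_pred, miscoverage=0.05):
    """Symmetric prediction intervals from held-out calibration residuals."""
    scores = np.abs(np.asarray(cal_true, float) - np.asarray(cal_pred, float))
    n = scores.size
    # Rank of the conformal quantile: ceil((n + 1) * (1 - miscoverage)).
    k = int(np.ceil((n + 1) * (1.0 - miscoverage)))
    if k > n:
        raise ValueError("calibration set too small for this miscoverage level")
    q = np.sort(scores)[k - 1]
    test_pred = np.asarray(test_pred, float)
    return test_pred - q, test_pred + q

# Toy calibration set: predictions of 0 with residuals 1, 2, ..., 10.
cal_pred = np.zeros(10)
cal_true = np.arange(1.0, 11.0)
lo, hi = split_conformal_interval(cal_pred, cal_true, [0.0, 5.0], miscoverage=0.2)
```

With 10 calibration points and 20% miscoverage, the rank is ceil(11 × 0.8) = 9, so the half-width is the 9th smallest residual.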
  • 16. Experiments: Methodology
      • Evaluation procedure:
          ○ Set tightness β, exploration radius δ, and target coverage 1 − λ⁻¹ = 95%
          ○ Collect a starting set of states
          ○ Compute the distribution and performance threshold ε
          ○ Sample initial states from that distribution
          ○ Get the true ψ values and compare them to the INN/baseline intervals
      • Evaluation metrics:
          ○ Coverage should be above the target 95%
          ○ Interval width should be low
      • Two distributional settings for initial states:
          ○ In-distribution: matches the INN training distribution
          ○ Out-of-distribution: not the INN training distribution (but inside our set)
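
The two evaluation metrics can be computed as in the sketch below. It assumes two-sided intervals and reports the mean width; the slides do not state how interval size is aggregated, so the averaging and the function name `coverage_and_width` are assumptions.

```python
import numpy as np

def coverage_and_width(lower, upper, truth):
    """Empirical coverage (in %) and mean interval width."""
    lower, upper, truth = (np.asarray(a, float) for a in (lower, upper, truth))
    covered = (truth >= lower) & (truth <= upper)
    return 100.0 * covered.mean(), float((upper - lower).mean())

# Four test states with unit-width intervals; one true value falls outside.
cov, width = coverage_and_width(
    lower=[0.0, 0.0, 0.0, 0.0],
    upper=[1.0, 1.0, 1.0, 1.0],
    truth=[0.5, 0.2, 1.5, 1.0])
```

Here three of the four true values are covered, so the coverage would fall short of a 95% target even though the intervals are narrow.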
  • 17. Evaluation Highlights
      • INNs maintain ≥99% coverage, unlike baselines
      • INNs yield slightly larger intervals than baselines
      • INNs demonstrate robustness to distribution shift
      • Our verification takes dozens of seconds
  • 18. Results: In-Distribution
      CP = Conformal Prediction; INN = INN (Ours). Coverage in %.

      Environment             State dim.   CP coverage   CP interval size   INN coverage   INN interval size
      Ant                     29           93            3.2                99             4.2
      Half-Cheetah            18           93            4.9 × 10⁻¹         100            3
      Hopper                  12           97            2.9                100            2.2
      Humanoid                47           97            2.7                100            3
      Humanoid-Standup        47           96            143                99             260
      Inverted Double Pend.   6            95            6.7 × 10⁻³         100            5.1 × 10⁻²
      Inverted Pendulum       4            95            1.2 × 10⁻³         100            4.1 × 10⁻²
      Reacher                 8            93            3.3 × 10⁻²         100            7.6 × 10⁻²
      Swimmer                 10           93            2.8 × 10⁻³         100            4.9 × 10⁻²
      Walker2d                18           95            5.2                99             5.8
  • 19. Results: In-Distribution (same table as slide 18)
      • Conformal prediction does not always achieve the target coverage of 95%
  • 20. Results: In-Distribution (same table as slide 18)
      • INNs always exceed the target coverage rate, with some small cost in interval size
  • 21. Results: Out-of-Distribution
      CP = Conformal Prediction; RCP-1 and RCP-2 = the two Robust Conformal Prediction column pairs; INN = INN (Ours). Each cell gives coverage (%) / interval size.

      Environment             State dim.   CP                 RCP-1              RCP-2              INN
      Ant                     29           94 / 3.2           95 / 3.5           98 / 4.8           100 / 4.3
      Half-Cheetah            18           93 / 4.9 × 10⁻¹    95 / 5.2 × 10⁻¹    98 / 6.5 × 10⁻¹    100 / 1.9
      Hopper                  12           97 / 2.9           98 / 3.3           99 / 3.3           100 / 2.5
      Humanoid                47           98 / 2.7           97 / 2.9           99 / 3.5           99 / 2.8
      Humanoid-Standup        47           95 / 143           97 / 167           99 / 297           99 / 305
      Inverted Double Pend.   6            94 / 6.7 × 10⁻³    97 / 7.5 × 10⁻³    99 / 8.8 × 10⁻³    100 / 5.1 × 10⁻²
      Inverted Pendulum       4            100 / 1.2 × 10⁻³   96 / 1.2 × 10⁻³    99 / 1.2 × 10⁻³    100 / 4.2 × 10⁻²
      Reacher                 8            91 / 3.3 × 10⁻²    94 / 3.4 × 10⁻²    99 / 4.2 × 10⁻²    100 / 8.5 × 10⁻²
      Swimmer                 10           92 / 2.8 × 10⁻³    94 / 3.0 × 10⁻³    98 / 9.0 × 10⁻³    100 / 5 × 10⁻²
      Walker2d                18           94 / 5.2           95 / 5.4           99 / 7.1           99 / 5.2
  • 22. Results: Out-of-Distribution (same table as slide 21)
      • Conformal prediction does not always achieve the target coverage
  • 23. Results: Out-of-Distribution (same table as slide 21)
      • Robust conformal prediction almost always achieves the target coverage
  • 24. Results: Out-of-Distribution (same table as slide 21)
      • Robust conformal prediction almost always achieves the target coverage
      • Coverage increases with the amount of allowable distribution shift
  • 25. Results: Out-of-Distribution (same table as slide 21)
      • Robust conformal prediction almost always achieves the target coverage
      • Coverage increases with the amount of allowable distribution shift, but so does interval size
  • 26. Results: Out-of-Distribution (same table as slide 21)
      • INNs always achieve the target coverage of 95%
  • 27. Results: Time to Verify
      Both time columns are headed "Execution Time (s), Mean ± Std" in the source.

      Environment             State dim.   Execution time (s)   Execution time (s)
      Ant                     29           127 ± 199            184 ± 289
      Half-Cheetah            18           10 ± 3               13 ± 7
      Hopper                  12           15 ± 13              6 ± 3
      Humanoid                47           31 ± 11              149 ± 209
      Humanoid-Standup        47           20 ± 5               51 ± 46
      Inverted Double Pend.   6            2.4 ± 1              2.5 ± 1
      Inverted Pendulum       4            1.3 ± 0.3            1.5 ± 0.3
      Reacher                 8            9 ± 3.7              40 ± 46
      Swimmer                 10           5 ± 1.4              6 ± 2.6
      Walker2d                18           8.5 ± 3.8            11.6 ± 5.1

      • At the design stage, execution times are reasonable, even for higher dimensions
  • 28. Results: Time to Verify (same table as slide 27)
      • Execution takes longer when the search space is larger
  • 29. Strengths and Limitations
      Strengths:
      • Robustness to distributional (epistemic) uncertainty
      • No assumptions on the system dynamics or performance function
      • Handles dozens of state dimensions
      Limitations:
      • Conservatism in both coverage and intervals
      • A quite particular shape of the distribution set
      • Many hyperparameters to tune
  • 30. Summary
      1. Formulation of distributionally robust statistical verification
      2. INNs: imprecise neural networks
      3. Scalable active learning
      4. MuJoCo experiments
  • 32. Experimental Setup
      • Evaluated on 10 MuJoCo environments: Ant, Humanoid, Hopper, …
      • Control policies trained using Deep Deterministic Policy Gradient (DDPG)
      • Each policy’s performance is the temporal average reward R_avg over T steps
      • INN architecture: ensemble of 3 DNNs (2 layers, width 50 neurons, ReLU activations)
      • Confidence level λ = 20 (95%), exploration δ = 0.05, M = 20 active learning iterations
      • Comparison with Conformal Prediction (CP) and Robust CP (RCP) under in- and out-of-distribution settings
  • 33. Parameters
      • α (alpha) – Contamination level: controls robustness to distribution shift; higher α means more adversarial tolerance.
      • β (beta) – Interval tightness: balances precision of predictions against conservativeness; higher β yields wider, safer intervals.
      • λ (lambda) – Confidence level: sets the strength of probabilistic guarantees; higher λ gives stronger but wider guarantees.
      • δ (delta) – Exploration radius: defines neighborhood size in active learning; controls granularity of sampling around uncertain points.
      • ε (epsilon) – Performance threshold: the guaranteed lower bound on system performance, derived from the learned model and confidence level.
  • 34. Parameter Relationships
      • β – λ: Together they set the half‑width λβ of the confidence band around the INN’s central estimate.
      • λ – ε: Larger λ (stronger confidence) → wider band λβ → smaller ε (more conservative).
      • α – robustness: α directly controls how adversarial the allowed shift in the marginal can be.
      • δ – exploration: δ fixes the radius of the regions sampled around points of high uncertainty.
      • ε – ψ-threshold: ε is finally set to the INN’s worst‑case lower estimate minus λβ, leading to the main guarantee.
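
The arithmetic behind these relationships is short enough to write out. The sketch uses the rule stated on slide 34 (ε is the INN's worst-case lower estimate minus λβ) and the target coverage 1 − λ⁻¹ from slide 16. The function names and the numbers for the lower estimate and β are made up for illustration; only λ = 20 comes from the slides.

```python
import math

def target_coverage(lam):
    # Target coverage 1 - 1/lambda, e.g. lambda = 20 corresponds to 95%.
    return 1.0 - 1.0 / lam

def performance_threshold(worst_case_lower, lam, beta):
    # epsilon = INN worst-case lower estimate minus the half-width lambda * beta.
    return worst_case_lower - lam * beta

coverage = target_coverage(20)                                             # about 0.95
eps = performance_threshold(worst_case_lower=2.0, lam=20, beta=0.01)       # about 1.8
eps_stricter = performance_threshold(worst_case_lower=2.0, lam=40, beta=0.01)
```

Doubling λ from 20 to 40 widens the band λβ from 0.2 to 0.4 and lowers ε accordingly, which is the "stronger confidence, more conservative threshold" trade-off described above.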
  • 35. Distributional Upper Probabilities
      • Convex combination of uniform distributions
      • “Contamination” with an arbitrary distribution
      • Leads to an upper probability
      • Figure labels: sampling distributions; a point that is not sampled, but for which the guarantee holds
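
As a worked illustration of how such a construction leads to an upper probability, suppose the distribution set consists of all distributions (1 − α) Σᵢ wᵢ Uᵢ + α Q, with uniform distributions Uᵢ, convex weights w, contamination level α, and an arbitrary distribution Q. This form is an assumption, since the slide's formulas are not preserved in this transcript. For any nonempty event A, the supremum over the set then has a closed form:

```latex
\overline{P}(A)
  = \sup_{w \in \Delta,\; Q}
    \Big[ (1-\alpha) \sum_{i} w_i \, U_i(A) + \alpha \, Q(A) \Big]
  = (1-\alpha) \max_{i} U_i(A) + \alpha ,
  \qquad A \neq \emptyset .
```

The first term is linear in w, so it is maximized at a vertex of the simplex Δ, and Q(A) reaches 1 by placing all of Q's mass inside A.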