PRINS: Scalable Model Inference for
Component-based System Logs*
Donghwan Shin1), Domenico Bianculli2), and Lionel Briand2,3)
1) University of Sheffield
2) University of Luxembourg
3) University of Ottawa
* This presentation is for the Journal-First Track at ICSE 2023; the original paper was accepted in the Empirical Software Engineering (EMSE) journal.
[Figure: System Logs (one log per execution, e.g. "20190621.001 A", "20190621.002 B", "20190621.002 Z", "20190621.002 B", …) → Model Inference Technique → System Model]
Log = A sequence of log entries representing a single execution flow
Figure annotations: "Too large" (system logs), "Not Scalable Enough" (model inference technique), "No Models" (output)
081111 090711 25010 INFO dfs.DataNode$DataXceiver: Receiving block blk_5652408071925555972 src: /10.251.65.203:38382 dest: /10.251.65.203:50010
081111 090711 25181 INFO dfs.DataNode$DataXceiver: Receiving block blk_5652408071925555972 src: /10.251.27.63:54730 dest: /10.251.27.63:50010
081111 090711 25487 INFO dfs.DataNode$DataXceiver: Receiving block blk_5652408071925555972 src: /10.251.65.203:40305 dest: /10.251.65.203:50010
081111 090711 00031 INFO dfs.FSNamesystem: BLOCK* NameSystem.allocateBlock: /user/root/rand8/_temporary/part-00156. blk_5652408071925555972
081111 090756 25011 INFO dfs.DataNode$PacketResponder: PacketResponder 2 for block blk_5652408071925555972 terminating
081111 090756 25011 INFO dfs.DataNode$PacketResponder: Received block blk_5652408071925555972 of size 67108864 from /10.251.65.203
081111 090756 25184 INFO dfs.DataNode$PacketResponder: PacketResponder 0 for block blk_5652408071925555972 terminating
081111 090756 25184 INFO dfs.DataNode$PacketResponder: Received block blk_5652408071925555972 of size 67108864 from /10.251.27.63
081111 090756 25488 INFO dfs.DataNode$PacketResponder: PacketResponder 1 for block blk_5652408071925555972 terminating
081111 090756 25488 INFO dfs.DataNode$PacketResponder: Received block blk_5652408071925555972 of size 67108864 from /10.251.65.203
081111 090756 00027 INFO dfs.FSNamesystem: BLOCK* NameSystem.addStoredBlock: blockMap updated: 10.251.71.16:50010 is added to blk_5652408071925555972
081111 111345 00013 INFO dfs.DataBlockScanner: Verification succeeded for blk_5652408071925555972
Example HDFS Log
Component IDs
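To make the component field in the example log concrete, here is a minimal Python sketch that splits one HDFS line into fields. It assumes the layout "date time pid level component: message" and treats the token before the first ": " (e.g. dfs.DataNode$DataXceiver) as the component ID; the names LogEntry, LINE_RE, and parse_line are hypothetical and not part of PRINS.

```python
import re
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    date: str
    time: str
    pid: str
    level: str
    component: str  # assumed to be the component ID
    message: str


# Assumed layout: "<date> <time> <pid> <level> <component>: <message>"
LINE_RE = re.compile(
    r"^(?P<date>\d{6}) (?P<time>\d{6}) (?P<pid>\d+) (?P<level>[A-Z]+) "
    r"(?P<component>[^:\s]+): (?P<message>.*)$"
)


def parse_line(line: str) -> Optional[LogEntry]:
    """Return the parsed entry, or None if the line does not match the layout."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return LogEntry(**match.groupdict())


entry = parse_line(
    "081111 090756 00027 INFO dfs.FSNamesystem: BLOCK* NameSystem.addStoredBlock: "
    "blockMap updated: 10.251.71.16:50010 is added to blk_5652408071925555972"
)
print(entry.component)
```

Lines that do not follow the assumed layout yield None rather than raising, so a caller can decide how to handle them.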
Observation: Systems are often composed of multiple components
What if we infer INDIVIDUAL component
models and then stitch them together?
PRINS: PRojection-INference-Stitching
[Figure: PRINS overview]
PRojection: the System Logs (e.g. "ax bx dy dy", "ax bx dy", "ax cx ey") are split per component into logs of Component x ("ax bx", "ax bx", "ax cx") and logs of Component y ("dy dy", "dy", "ey")
INference: a model is inferred separately for each component (Model of x, Model of y)
Stitching: the component models are combined into the System Model
+ (optional) Heuristic Determinisation (HD)
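The projection stage can be sketched in Python as follows. This is only an illustration under assumptions: each log entry is modelled as a hypothetical (component, event) pair, and the function name project is invented here. The inference and stitching stages are not shown.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Entry = Tuple[str, str]  # (component, event), an assumed representation


def project(logs: List[List[Entry]]) -> Dict[str, List[List[str]]]:
    """Split each system log into one event sequence per component."""
    projected: Dict[str, List[List[str]]] = defaultdict(list)
    for log in logs:
        per_component: Dict[str, List[str]] = defaultdict(list)
        for component, event in log:
            per_component[component].append(event)  # keeps the original order
        for component, events in per_component.items():
            projected[component].append(events)
    return dict(projected)


# The three example logs from the slide: "ax bx dy dy", "ax bx dy", "ax cx ey"
system_logs = [
    [("x", "a"), ("x", "b"), ("y", "d"), ("y", "d")],
    [("x", "a"), ("x", "b"), ("y", "d")],
    [("x", "a"), ("x", "c"), ("y", "e")],
]
component_logs = project(system_logs)
print(component_logs["x"])
print(component_logs["y"])
```

Each component's logs could then be passed to a model inference tool independently, which is what makes the inference stage parallelisable.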
Research Questions
Parallel inference
• RQ1: How does the execution time of PRINS change according to the parallel inference tasks in the inference stage?
Heuristic Determinisation
• RQ2: How does the execution time of HD_u change according to parameter u?
• RQ3: How does the accuracy of the models (in the form of gFSMs) generated by HD_u change according to parameter u?
PRINS (compared to MINT)
• RQ4: How fast is PRINS when compared to state-of-the-art model inference techniques?
• RQ5: How accurate are the models generated by PRINS compared to those generated by state-of-the-art model inference techniques?
RQ4: Execution Time of PRINS compared to MINT
[Figure: Execution Time (s) versus Duplication Factor for MINT, PRINS-N, and PRINS-P; one panel per system: Hadoop, HDFS, Linux, Zookeeper, CoreSync, NGLClient, Oobelib, PDApp]
PRINS-N = PRINS with No parallel inference (HD is enabled to be fair with MINT)
PRINS-P = PRINS with Parallel inference (HD is enabled to be fair with MINT)
Duplication Factor = How many times each log is duplicated to increase the input log size systematically
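As a rough illustration of the duplication factor, the Python sketch below repeats every log a fixed number of times. The function name duplicate_logs is hypothetical, and reading a factor of 1 as "the original logs only" is an assumption rather than something stated on the slide.

```python
from typing import List, Sequence


def duplicate_logs(logs: Sequence[Sequence[str]], factor: int) -> List[List[str]]:
    """Return a log set in which each input log appears `factor` times."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    # Copy each log so the duplicates do not share list objects.
    return [list(log) for log in logs for _ in range(factor)]


logs = [["a", "b", "d"], ["a", "c", "e"]]
scaled = duplicate_logs(logs, 4)
print(len(scaled))  # the number of logs grows linearly with the factor
```

Duplicating logs this way increases the input size without adding new behaviour, so the model to be inferred stays the same while the workload grows.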
RQ5: Accuracy of PRINS compared to MINT
Downside: Size of System Models
Contributions
• Tame the scalability issue of model
inference using divide-and-conquer.
• Present an empirical evaluation of
PRINS and its comparison with the
state-of-the-art model inference tool.
• PRINS works especially well when the
components appearing in different
executions are similar.
• Provide a publicly available
implementation of PRINS.
Paper (Open Access) Replication Package
Philip Schwarz
 
Multi-Agent Era will Define the Future of Software
Multi-Agent Era will Define the Future of SoftwareMulti-Agent Era will Define the Future of Software
Multi-Agent Era will Define the Future of Software
Ivo Andreev
 
Ad

PRINS: Scalable Model Inference for Component-based System Logs

  • 1. PRINS: Scalable Model Inference for Component-based System Logs* Donghwan Shin1), Domenico Bianculli2), and Lionel Briand2,3) 1) University of Sheffield 2) University of Luxembourg 3) University of Ottawa * This presentation is for the Journal-First Track at ICSE 2023; the original paper was accepted in the Empirical Software Engineering (EMSE) journal.
  • 2. System Logs → Model Inference Technique → System Model (states A, B, Y, Z, …). Log = a sequence of log entries representing a single execution flow. Example logs, one per execution: (20190621.001 A, 20190621.002 B, 20190621.002 Z, 20190621.002 B, …); (20221101.001 A, 20221101.004 B, 20221101.011 Z, 20221101.013 B, 20221101.101 Y, …). Callouts: Too large (system logs); Not scalable enough (model inference technique); No models (system model).
  • 3. Example HDFS log, annotated with component IDs. Observation: Systems are often composed of multiple components.
      081111 090711 25010 INFO dfs.DataNode$DataXceiver: Receiving block blk_5652408071925555972 src: /10.251.65.203:38382 dest: /10.251.65.203:50010
      081111 090711 25181 INFO dfs.DataNode$DataXceiver: Receiving block blk_5652408071925555972 src: /10.251.27.63:54730 dest: /10.251.27.63:50010
      081111 090711 25487 INFO dfs.DataNode$DataXceiver: Receiving block blk_5652408071925555972 src: /10.251.65.203:40305 dest: /10.251.65.203:50010
      081111 090711 00031 INFO dfs.FSNamesystem: BLOCK* NameSystem.allocateBlock: /user/root/rand8/_temporary/part-00156. blk_5652408071925555972
      081111 090756 25011 INFO dfs.DataNode$PacketResponder: PacketResponder 2 for block blk_5652408071925555972 terminating
      081111 090756 25011 INFO dfs.DataNode$PacketResponder: Received block blk_5652408071925555972 of size 67108864 from /10.251.65.203
      081111 090756 25184 INFO dfs.DataNode$PacketResponder: PacketResponder 0 for block blk_5652408071925555972 terminating
      081111 090756 25184 INFO dfs.DataNode$PacketResponder: Received block blk_5652408071925555972 of size 67108864 from /10.251.27.63
      081111 090756 25488 INFO dfs.DataNode$PacketResponder: PacketResponder 1 for block blk_5652408071925555972 terminating
      081111 090756 25488 INFO dfs.DataNode$PacketResponder: Received block blk_5652408071925555972 of size 67108864 from /10.251.65.203
      081111 090756 00027 INFO dfs.FSNamesystem: BLOCK* NameSystem.addStoredBlock: blockMap updated: 10.251.71.16:50010 is added to blk_5652408071925555972
      081111 111345 00013 INFO dfs.DataBlockScanner: Verification succeeded for blk_5652408071925555972
  • 4. What if we infer INDIVIDUAL component models and then stitch them together?
  • 5. PRINS: PRojection-INference-Stitching. PRojection: the system logs (ax bx dy dy), (ax bx dy), (ax cx ey) are split per component, giving (ax bx), (ax bx), (ax cx) for component x and (dy dy), (dy), (ey) for component y. INference: a model is inferred for each component (model of x, model of y). Stitching: the component models are stitched together into the system model, with (optional) Heuristic Determinisation (HD).
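
The projection stage on slide 5 can be sketched in a few lines of Python. This is only an illustration of the idea, not the authors' implementation: the representation of a log as a list of `(component, event)` pairs and the function name `project` are assumptions made here. The input mirrors the three example logs on the slide.

```python
from collections import defaultdict

def project(logs):
    """Split each system log into per-component logs, preserving entry order.

    Assumption (not from the slides): a log is a list of (component, event) pairs.
    Returns a dict mapping each component to its list of projected logs.
    """
    projected = defaultdict(list)
    for log in logs:
        per_component = defaultdict(list)
        for component, event in log:
            per_component[component].append(event)
        # A component contributes one projected log per execution it appears in.
        for component, events in per_component.items():
            projected[component].append(events)
    return dict(projected)

# The three example executions from slide 5: ax bx dy dy / ax bx dy / ax cx ey.
system_logs = [
    [("x", "a"), ("x", "b"), ("y", "d"), ("y", "d")],
    [("x", "a"), ("x", "b"), ("y", "d")],
    [("x", "a"), ("x", "c"), ("y", "e")],
]
component_logs = project(system_logs)
print(component_logs["x"])  # [['a', 'b'], ['a', 'b'], ['a', 'c']]
print(component_logs["y"])  # [['d', 'd'], ['d'], ['e']]
```

Each per-component list could then be handed to a model inference technique independently, which is what makes the inference stage parallelisable.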
  • 6. Research Questions • RQ1 (parallel inference): How does the execution time of PRINS change according to the parallel inference tasks in the inference stage? • RQ2 (Heuristic Determinisation): How does the execution time of HD_u change according to parameter u? • RQ3 (Heuristic Determinisation): How does the accuracy of the models (in the form of gFSMs) generated by HD_u change according to parameter u? • RQ4 (PRINS compared to MINT): How fast is PRINS when compared to state-of-the-art model inference techniques? • RQ5 (PRINS compared to MINT): How accurate are the models generated by PRINS compared to those generated by state-of-the-art model inference techniques?
  • 8. RQ4: Execution Time of PRINS compared to MINT. [Figure: execution time (s) against duplication factor for MINT, PRINS-N, and PRINS-P, one panel per subject system: Hadoop, HDFS, Linux, Zookeeper, CoreSync, NGLClient, Oobelib, PDApp.] PRINS-N = PRINS with no parallel inference (HD is enabled for a fair comparison with MINT). PRINS-P = PRINS with parallel inference (HD is enabled for a fair comparison with MINT). Duplication Factor = how many times each log is duplicated to increase the input log size systematically.
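
The duplication factor defined on slide 8 can be illustrated with a small sketch. The function name `duplicate_logs` is hypothetical. Reading a factor of k as "k consecutive copies of each log in the input" is also an assumption, since the slide does not say how the copies are ordered or whether the original counts as one of them.

```python
def duplicate_logs(logs, factor):
    """Return an enlarged log set in which each log appears `factor` times.

    Assumption: a factor of k means k copies of every log, placed consecutively.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    # Copy each log so the duplicates do not share the same list object.
    return [list(log) for log in logs for _ in range(factor)]

logs = [["A", "B", "Z", "B"], ["A", "B", "Z", "B", "Y"]]
enlarged = duplicate_logs(logs, 2)
print(len(enlarged))  # 4
```

Under this reading, the input size grows linearly with the factor while the set of distinct behaviours stays the same, so the experiment varies log volume without changing the behaviour to be inferred.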
  • 9. RQ5: Accuracy of PRINS compared to MINT
  • 10. Downside: Size of System Models
  • 11. Contributions • Tame the scalability issue of model inference using divide-and-conquer. • Present an empirical evaluation of PRINS and its comparison with the state-of-the-art model inference tool. • PRINS works especially well when the components appearing in different executions are similar. • Provide a publicly available implementation of PRINS. Links: Paper (Open Access); Replication Package