Workflow Programming & Provenance Query Model - The Theory
Rayhan Ferdous
SR Lab
Dept. of CS
U of S
Definitions
• Workflow
• Data
• Module
• Dataflow
• Module Invocation
• Query
Workflow (Definition)
• A Workflow W = (V, E) is a Directed Acyclic Graph where,
• V = {v : v ∈ D ∪ M}
• E = {e : e ∈ F}
• D = A set to represent Data (Defined later)
• M = A set to represent Module (Defined later)
• F = A set to represent Dataflow (Defined later)
Data (Definition)
• Data is a workflow element from W = (V, E) where,
• D = {d : d ∈ V, d = (p1, p2, …, pn) }
• p ∈ P
• P = An ordered set to represent properties (defined later)
Module (definition)
• Module is a workflow element from W = (V, E) where,
• M = {m : m ∈ V, m = (p1, p2, …, pn) }
• p ∈ P
• P = An ordered set to represent properties (defined later)
Properties (definition)
• Properties P = (p1, p2, …, pn) is an ordered set associated with any
data d ∈ D and module m ∈ M from a workflow W where,
• p = Null | Any
Dataflow (definition)
• A Dataflow F is an ordered tuple containing two workflow elements to
represent a directed edge in workflow W where,
• F = (d, m) | (m, d)
• d ∈ D
• m ∈ M
• (d, m) is considered as input dataflow, written as d → m
• (m, d) is considered as output dataflow, written as m → d
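The definitions so far can be sketched in Python. This is only an illustration under stated assumptions, not the author's implementation: the class names `Data`, `Module` and `Workflow`, the `name` field, the `add_dataflow` method and the example property values are all hypothetical. The sketch enforces that every edge is either (d, m) or (m, d); it does not check acyclicity.

```python
from dataclasses import dataclass, field
from typing import Any, Optional, Tuple

Property = Optional[Any]  # p = Null | Any


@dataclass(frozen=True)
class Data:
    name: str                          # hypothetical identifier, not in the slides
    props: Tuple[Property, ...] = ()   # ordered properties (p1, ..., pn)


@dataclass(frozen=True)
class Module:
    name: str                          # hypothetical identifier, not in the slides
    props: Tuple[Property, ...] = ()   # ordered properties (p1, ..., pn)


@dataclass
class Workflow:
    vertices: set = field(default_factory=set)  # V, drawn from D and M
    edges: set = field(default_factory=set)     # E, drawn from F

    def add_dataflow(self, src, dst):
        """Add a directed edge, accepting only (d, m) or (m, d)."""
        is_input = isinstance(src, Data) and isinstance(dst, Module)
        is_output = isinstance(src, Module) and isinstance(dst, Data)
        if not (is_input or is_output):
            raise TypeError("a dataflow must be (d, m) or (m, d)")
        self.vertices.update((src, dst))
        self.edges.add((src, dst))


# Illustrative elements; the property values are made up for this example.
raw = Data("raw", ("csv", None))
clean = Module("clean", ("v1",))
tidy = Data("tidy", ("csv", 100))

w = Workflow()
w.add_dataflow(raw, clean)   # input dataflow  d -> m
w.add_dataflow(clean, tidy)  # output dataflow m -> d
```

Restricting edges this way makes the graph bipartite between data and modules, which is what the two dataflow forms imply.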
Module invocation
• A module invocation K is an ordered tuple containing two dataflows
where,
• K = { (f1 , f2) : f1 = (di , m), f2 = (m, do) }
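As a rough sketch of this definition, the invocations in a workflow can be enumerated by pairing each input dataflow with every output dataflow of the same module. The function name `invocations`, the use of plain strings for workflow elements and the example edges are assumptions made for illustration.

```python
def invocations(edges, modules):
    """Return every pair (f1, f2) with f1 = (d_i, m) and f2 = (m, d_o)."""
    inputs = [(s, t) for (s, t) in edges if t in modules]   # d -> m
    outputs = [(s, t) for (s, t) in edges if s in modules]  # m -> d
    return [(f1, f2) for f1 in inputs for f2 in outputs if f1[1] == f2[0]]


# Hypothetical workflow: raw -> clean -> tidy -> train -> model
modules = {"clean", "train"}
edges = [
    ("raw", "clean"),
    ("clean", "tidy"),
    ("tidy", "train"),
    ("train", "model"),
]

K = invocations(edges, modules)
for f1, f2 in K:
    print(f1, f2)
```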
Mathematical Rules Support
• Workflow W containing all the defined elements is a graph itself
• So, it supports all the rules of Graph Theory inherently
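One concrete consequence is that standard graph algorithms apply directly. The sketch below uses Python's standard-library `graphlib.TopologicalSorter` to derive an execution order and to reject cyclic input. The helper name `execution_order` and the example edges are assumptions for illustration.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(edges):
    """Topologically sort workflow elements given (src, dst) edges."""
    preds = {}
    for src, dst in edges:
        preds.setdefault(src, set())
        preds.setdefault(dst, set()).add(src)  # dst depends on src
    # static_order() raises CycleError if the graph is not acyclic
    return list(TopologicalSorter(preds).static_order())


edges = [
    ("raw", "clean"),
    ("clean", "tidy"),
    ("tidy", "train"),
    ("train", "model"),
]
order = execution_order(edges)
print(order)

try:
    execution_order([("a", "b"), ("b", "a")])
except CycleError:
    print("not a valid workflow: the graph contains a cycle")
```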
Data Analytics Lifecycle and Query
• Workflow can be used in any phase of data analytics
• If a query system can be applied to every phase of the analytics
lifecycle, it can be taken as elementary
• We present one fundamental query
• Two further queries are derived from it (3 queries in total)
• Together they are intended to answer any provenance question
Answering Query
• In our approach, to answer a query we use data visualization
• Data visualization enables storytelling about data and can answer
multiple queries at the same time
Analytics Lifecycle phase – Base question – Problem type
1. Discovery (Do I have enough information?) (Discovery)
2. Data Preparation (Do I have enough good quality data?) (Discovery)
3. Model Planning (Can I refine the analytics pipeline?) (Prediction/Recommendation)
4. Model Building (Is the model robust enough?) (Prediction/Recommendation)
5. Communicate Results (Reporting) (Prediction/Recommendation)
6. Operationalize (Execution)
Category of questions that queries could answer
Analytics:
• Past event – Discovery – Elementary Query
• Present event – Monitoring – Online Time Series Query
• Future event – Prediction/Recommendation – Mapping Query
The elementary Query (our proposal)
• An elementary query Q is a function that returns an ordered element
of type R satisfying certain condition(s) from any Workflow W.
• Q = f(R, W, C)
• R = (r1, r2, … , rk) is an ordered set where r ∈ P
• C = Boolean condition
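A minimal sketch of Q = f(R, W, C) follows. It makes several simplifying assumptions that are not in the slides: W is reduced to a list of workflow elements, each element is a dict of named properties (the slides only define properties by position), and the function name `elementary_query` is hypothetical.

```python
def elementary_query(R, W, C):
    """Return the properties named in R for every element of W satisfying C."""
    return [tuple(element.get(r) for r in R) for element in W if C(element)]


# Hypothetical workflow elements with made-up property values.
W = [
    {"name": "raw", "kind": "data", "rows": 120},
    {"name": "clean", "kind": "module", "rows": None},
    {"name": "tidy", "kind": "data", "rows": 100},
]

# R selects which properties to return; C is the Boolean condition.
result = elementary_query(("name", "rows"), W, lambda el: el["kind"] == "data")
print(result)
```

Passing C as a callable keeps the condition arbitrary, which matches the slide's description of C as any Boolean condition.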
A workflow can be regenerated with a sequence of Query results
• We can use the first elementary query to generate a sequence of
dataflows.
• Example:
• Q1 = A → B
• Q2 = B → C
• Q3 = C → D
• That gives A → B → C → D
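The chaining step in this example can be sketched as follows, assuming each query result is a single dataflow given as a (source, target) pair. The function name `sequence` and the error handling are illustrative choices, not part of the slides.

```python
def sequence(flows):
    """Join consecutive dataflows (a, b), (b, c), ... into one path."""
    if not flows:
        return []
    path = list(flows[0])
    for src, dst in flows[1:]:
        if src != path[-1]:
            raise ValueError(f"dataflow {src!r} -> {dst!r} does not continue the path")
        path.append(dst)
    return path


q1, q2, q3 = ("A", "B"), ("B", "C"), ("C", "D")
path = sequence([q1, q2, q3])
print(" → ".join(path))
```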
A dataset can be prepared by mapping results from two distinct queries
• The mapping function can be regenerated using the first elementary
query
• Example:
• Q1 = (x1 , x2 , … , xn)
• Q2 = (y1 , y2 , … , yn)
• Q1 ↔ Q2 is a one-to-one mapping between the ordered sets that gives
( (x1 , y1) , (x2 , y2) , … , (xn , yn) )
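The positional mapping above corresponds to zipping two equal-length results. In this sketch the function name `map_results` and the length check are assumptions; the slides only state that the mapping is one-to-one.

```python
def map_results(q1, q2):
    """Pair the i-th element of q1 with the i-th element of q2."""
    if len(q1) != len(q2):
        raise ValueError("a one-to-one mapping needs results of equal length")
    return tuple(zip(q1, q2))


q1 = ("x1", "x2", "x3")
q2 = ("y1", "y2", "y3")
pairs = map_results(q1, q2)
print(pairs)
```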
Proposal
• We propose a fundamental set of 3 queries for answering provenance
questions.
• Decide (the first query)
• Sequence (the second query)
• Map (the third query)
How complete are the 3 queries?
• Check with different research works
Ghoshal et al., "Provenance from log files: a BigData problem." Proceedings of the Joint EDBT/ICDT 2013 Workshops.
• Match – Select Rule: Decide
• Link Rule: Map
• Remap Rule: Map
Akidau et al., "The dataflow model: a practical approach to balancing correctness, latency, and cost in massive-scale, unbounded, out-of-order data processing." Proceedings of the VLDB Endowment, 2015.
• What results are being computed?: Decide
• Where in event time they are being computed?: Decide
• When in processing time they are materialized?: Decide
• How earlier results relate to later refinements?: Sequence
Anand et al., "Techniques for efficiently querying scientific workflow provenance graphs." EDBT 2010.
• Lineage query: Sequence
• Function query: Decide
Buneman et al., "Why and where: A characterization of data provenance." International Conference on Database Theory, 2001.
• Why provenance (source data query) : Sequence
• Where provenance (Mapping with database) : Map
Cheney et al., "Provenance in databases: Why, how, and where." Foundations and Trends® in Databases, 2009.
• Why provenance (source) : Decide
• How provenance (lineage) : Sequence
• Where provenance (source in database): Decide
Answering through Visualization
• Decide – Visualization not necessary or Tabular format
• Sequence – DAG
• Map – Tabular format
To visualize analytical results
• This is query dependent
Conclusion
