SlideShare a Scribd company logo
WIFI SSID:Spark+AISummit | Password: UnifiedDataAnalytics
Alicia Frame, PhD
Senior Data Scientist, Neo4j
Transforming AI with Graphs:
Real World Examples with Spark & Neo4j
#UnifiedDataAnalytics #SparkAISummit
Transforming AI with Graphs: Real World Examples using Spark and Neo4j
Financial Services Drug Discovery Recommendations
Cybersecurity Predictive Maintenance
Customer Segmentation
Churn Prediction Search/MDM
Graph Data Science Applications
CAR
DRIVES
name: “Dan”
born: May 29, 1970
twitter: “@dan”
name: “Ann”
born: Dec 5, 1975
since:
Jan 10, 2011
brand: “Volvo”
model: “V70”
Latitude: 37.5629900°
Longitude: -122.3255300°
Nodes
• Can have Labels to classify nodes
• Labels have native indexes
Relationships
• Relate nodes by type and direction
Properties
• Attributes of Nodes & Relationships
• Stored as Name/Value pairs
• Can have indexes and composite indexes
MARRIED TO
LIVES WITH
OW
NS
PERSON PERSON
7
Labeled Property Graphs
• Current data science models ignore network structure
• Graphs add highly predictive features to existing ML models
• Otherwise unattainable predictions based on relationships
Novel & More Accurate Predictions
with the Data You Already Have
Machine Learning Pipeline
“The idea is that graph networks are bigger than
any one machine-learning approach.
Graphs bring an ability to generalize about
structure that the individual neural nets don't have.”
"Where do the
graphs come from
that
graph networks
operate over?”
Building a Graph ML Model
Data
Sources
Native Graph
Platform
Machine
Learning
Aggregate Disparate Data
and Cleanse
Build Predictive ModelsUnify Graphs and Engineer
Features
Parquet JSON
and more…
MLlib
and more…
Spark Graph Native Graph
Platform
Machine Learning
Example: Spark & Neo4j Workflow
Graph
Transactions
Graph
Analytics
Cypher 9 in Spark 3.0
to create non-
persistent graphs
MLlib to Train Models
Native Graph Algorithms,
Processing, and Storage
Explore Graphs
Build Graph
Solutions
• Massively scalable
• Powerful data pipelining
• Robust ML Libraries
• Non-persistent, non-native graphs
• Persistent, dynamic graphs
• Graph native query and algorithm
performance
• Constantly growing list of graph
algorithms and embeddings
The Steps of Graph Data Science
Query Based
Knowledge
Graph
Query Based
Feature
Engineering
Graph
Algorithm
Feature
Engineering
Graph
Embeddings
Graph Neural
Networks
DataScienceComplexity
Knowledge
Graphs
Graph Feature
Engineering
Graph Native
Learning
Graph Persistence
Steps Forward in Graph Data Science
Query Based
Knowledge
Graph
Query Based
Feature
Engineering
Graph Algorithm
Feature
Engineering
Graph
Embeddings
Graph Neural
Networks
Enterprise Maturity
DataScienceComplexity
Query based knowledge graphs:
Connecting the Dots at NASA
“Using Neo4j someone from our Orion project found information from the Apollo
project that prevented an issue, saving well over two years of work and one
million dollars of taxpayer funds.”
Steps Forward in Graph Data Science
Query Based
Knowledge
Graph
Graph
Algorithm
Feature
Engineering
Graph
Embeddings
Graph Neural
Networks
Query Based
Feature
Engineering
Enterprise Maturity
DataScienceComplexity
Churn prediction research
has found that simple hand-
engineered features are highly
predictive
• How many calls/texts has
an account made?
• How many of their contacts
have churned?
Query-Based Feature Engineering
Telecom-churn prediction
Telecommunication
networks are easily
represented as graphs
Query-Based Feature Engineering
Telecom-churn prediction
23
Add connected features
based on graph queries to
tabular data
Khan et al, 2015
Spark Graph Native Graph
Platform
Machine Learning
• Merge distributed data
into DataFrames
• Reshape your tables
into graphs
• Explore cypher queries
• Move to Neo4j to build
expert queries
• Persist your graph
Knowledge Graphs:
Getting Started Example with Spark
• Bring query based
graph features to ML
pipeline
Graph
Transactions
Graph
Analytics
Steps Forward in Graph Data Science
Query Based
Feature
Engineering
Graph
Embeddings
Graph Neural
Networks
Query Based
Knowledge
Graph
Graph
Algorithm
Feature
Engineering
Enterprise Maturity
DataScienceComplexity
Feature Engineering is how we combine and process the
data to create new, more meaningful features, such as
clustering or connectivity metrics.
Graph Feature Engineering
Add More Descriptive Features:
- Influence
- Relationships
- Communities
Graph Feature Categories & Algorithms
Pathfinding
& Search
Finds the optimal paths or
evaluates
route availability and quality
Centrality /
Importance
Determines the importance of
distinct nodes in the network
Community
Detection
Detects group clustering or
partition options
Heuristic
Link Prediction
Estimates the likelihood of nodes
forming a relationship
Evaluates how alike nodes
are
Similarity
Embeddings
Learned representations
of connectivity or topology
• Connected components to identify
disjointed graphs sharing identifiers
• PageRank to measure influence and
transaction volumes
• Louvain to identify communities
that frequently interact
• Jaccard to measure account
similarity based on relationships
Financial Crime: Detecting Fraud
Large financial institutions already have existing pipelines
to identify fraud via heuristics and models
Graph based features improve accuracy:
+142,000 Peer Reviewed Publications
Graph Fraud / Anomaly Detection
in the last 10 years
Spark Graph Native Graph
Platform
Machine Learning
• Merge distributed data
into DataFrames
• Reshape your tables
into graphs
• Explore cypher queries
and simple algorithms
• Persist your graph
• Create rule based
features
• Run native graph
algorithms and write to
graph or stream
Graph Feature Engineering:
Getting Started Example with Spark
• Bring graph features
to ML pipeline for
training
Graph
Transactions
Graph
Analytics
Graph Algorithms in Neo4J
• Parallel Breadth First Search
• Parallel Depth First Search
• Shortest Path
• Single-Source Shortest Path
• All Pairs Shortest Path
• Minimum Spanning Tree
• A* Shortest Path
• Yen’s K Shortest Path
• K-Spanning Tree (MST)
• Random Walk
• Degree Centrality
• Closeness Centrality
• CC Variations: Harmonic, Dangalchev,
Wasserman & Faust
• Betweenness Centrality
• Approximate Betweenness Centrality
• PageRank
• Personalized PageRank
• ArticleRank
• Eigenvector Centrality
• Triangle Count
• Clustering Coefficients
• Connected Components (Union Find)
• Strongly Connected Components
• Label Propagation
• Louvain Modularity – 1 Step & Multi-Step
• Balanced Triad (identification)
• Euclidean Distance
• Cosine Similarity
• Jaccard Similarity
• Overlap Similarity
• Pearson Similarity
Pathfinding
& Search
Centrality /
Importance
Community
Detection
Similarity
neo4j.com/docs/
graph-algorithms/current/
Link
Prediction
• Adamic Adar
• Common Neighbors
• Preferential Attachment
• Resource Allocations
• Same Community
• Total Neighbors
Graph Algorithms in Neo4J
• Parallel Breadth First Search
• Parallel Depth First Search
• Shortest Path
• Single-Source Shortest Path
• All Pairs Shortest Path
• Minimum Spanning Tree
• A* Shortest Path
• Yen’s K Shortest Path
• K-Spanning Tree (MST)
• Random Walk
• Degree Centrality
• Closeness Centrality
• CC Variations: Harmonic, Dangalchev,
Wasserman & Faust
• Betweenness Centrality
• Approximate Betweenness Centrality
• PageRank
• Personalized PageRank
• ArticleRank
• Eigenvector Centrality
• Triangle Count
• Clustering Coefficients
• Connected Components (Union Find)
• Strongly Connected Components
• Label Propagation
• Louvain Modularity – 1 Step & Multi-Step
• Balanced Triad (identification)
• Euclidean Distance
• Cosine Similarity
• Jaccard Similarity
• Overlap Similarity
• Pearson Similarity
Pathfinding
& Search
Centrality /
Importance
Community
Detection
Similarity
neo4j.com/docs/
graph-algorithms/current/
Link
Prediction
• Adamic Adar
• Common Neighbors
• Preferential Attachment
• Resource Allocations
• Same Community
• Total Neighbors
Steps Forward in Graph Data Science
Query Based
Knowledge
Graph
Graph
Algorithm
Feature
Engineering
Graph Neural
Networks
Query Based
Feature
Engineering
Graph
Embeddings
Enterprise Maturity
DataScienceComplexity
Embedding transforms graphs into a vector, or set of
vectors, describing topology, connectivity, or attributes of
nodes and edges in the graph
Graph Embeddings
• Vertex embeddings: describe connectivity of each node
• Path embeddings: traversals across the graph
• Graph embeddings: encode an entire graph into a single vector
Explainable Reasoning over Knowledge Graphs for
Recommendation
Graph Embeddings - Recommendations
Explainable Reasoning over Knowledge Graphs for
Recommendation
Graph Embeddings - Recommendations
Spark Graph Native Graph
Platform
Machine Learning
• Merge distributed data into
DataFrames
• Reshape your tables
into graphs
• Explore cypher queries and
simple algorithms
• Move to Neo4j to build
expert queries
• Write to persist
• Stay tuned for DeepWalk
and DeepGL algorithms
Graph Feature Engineering:
Getting Started Example with Spark
• Bring graph features to
ML pipeline for training
Graph
Transactions
Graph
Analytics
Steps Forward in Graph Data Science
Query Based
Knowledge
Graph
Graph
Algorithm
Feature
Engineering
Query Based
Feature
Engineering
Graph Neural
Networks
Graph
Embeddings
Enterprise Maturity
DataScienceComplexity
Deep Learning refers to training multi-layer neural
networks using gradient descent
Graph Native Learning
Graph Native Learning refers to deep learning models
that take a graph as an input, performs computations,
and return a graph
Graph Native Learning
Battaglia et al, 2018
Example: electron path prediction
Bradshaw et al, 2019
Graph Native Learning
Given reactants and reagents, what will the
products be?
Given reactants and reagents, what will the
products be?
Example: electron path prediction
Graph Native Learning
Progressing in Graph Data Science
Query Based
Knowledge
Graph
Query Based
Feature
Engineering
Graph
Algorithm
Feature
Engineering
Graph
Embeddings
Graph Neural
Networks
Enterprise Maturity
DataScienceComplexity
Knowledge
Graphs
Graph Feature
Engineering
Graph Native
Learning
Graph Persistence
Resources
Business
• neo4j.com/use-cases/artificial-intelligence-analytics/
Data Scientists/Developers
• neo4j.com/sandbox
• neo4j.com/developer/
• community.neo4j.com
alicia.frame@neo4j.com
@aliciaframe1
44#UnifiedAnalytics #SparkAISummit
DON’T FORGET TO RATE
AND REVIEW THE SESSIONS
SEARCH SPARK + AI SUMMIT
Ad

More Related Content

What's hot (20)

AI Modernization at AT&T and the Application to Fraud with Databricks
AI Modernization at AT&T and the Application to Fraud with DatabricksAI Modernization at AT&T and the Application to Fraud with Databricks
AI Modernization at AT&T and the Application to Fraud with Databricks
Databricks
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at Scale
DATAVERSITY
 
Data Modeling with Neo4j
Data Modeling with Neo4jData Modeling with Neo4j
Data Modeling with Neo4j
Neo4j
 
Intro to Neo4j and Graph Databases
Intro to Neo4j and Graph DatabasesIntro to Neo4j and Graph Databases
Intro to Neo4j and Graph Databases
Neo4j
 
Graph databases
Graph databasesGraph databases
Graph databases
Vinoth Kannan
 
Introduction to Neo4j and .Net
Introduction to Neo4j and .NetIntroduction to Neo4j and .Net
Introduction to Neo4j and .Net
Neo4j
 
The Knowledge Graph Explosion
The Knowledge Graph ExplosionThe Knowledge Graph Explosion
The Knowledge Graph Explosion
Neo4j
 
Relational to Big Graph
Relational to Big GraphRelational to Big Graph
Relational to Big Graph
Neo4j
 
Neo4j Graph Use Cases, Bruno Ungermann, Neo4j
Neo4j Graph Use Cases, Bruno Ungermann, Neo4jNeo4j Graph Use Cases, Bruno Ungermann, Neo4j
Neo4j Graph Use Cases, Bruno Ungermann, Neo4j
Neo4j
 
Intro to Graphs and Neo4j
Intro to Graphs and Neo4jIntro to Graphs and Neo4j
Intro to Graphs and Neo4j
jexp
 
Neo4j Graph Data Science - Webinar
Neo4j Graph Data Science - WebinarNeo4j Graph Data Science - Webinar
Neo4j Graph Data Science - Webinar
Neo4j
 
Scaling into Billions of Nodes and Relationships with Neo4j Graph Data Science
Scaling into Billions of Nodes and Relationships with Neo4j Graph Data ScienceScaling into Billions of Nodes and Relationships with Neo4j Graph Data Science
Scaling into Billions of Nodes and Relationships with Neo4j Graph Data Science
Neo4j
 
Graph-Powered Machine Learning
Graph-Powered Machine LearningGraph-Powered Machine Learning
Graph-Powered Machine Learning
Databricks
 
Graphs for Finance - AML with Neo4j Graph Data Science
Graphs for Finance - AML with Neo4j Graph Data Science Graphs for Finance - AML with Neo4j Graph Data Science
Graphs for Finance - AML with Neo4j Graph Data Science
Neo4j
 
NoSQL Graph Databases - Why, When and Where
NoSQL Graph Databases - Why, When and WhereNoSQL Graph Databases - Why, When and Where
NoSQL Graph Databases - Why, When and Where
Eugene Hanikblum
 
Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit...
 Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit... Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit...
Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit...
Databricks
 
Databricks Platform.pptx
Databricks Platform.pptxDatabricks Platform.pptx
Databricks Platform.pptx
Alex Ivy
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
James Serra
 
A Universe of Knowledge Graphs
A Universe of Knowledge GraphsA Universe of Knowledge Graphs
A Universe of Knowledge Graphs
Neo4j
 
The columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache Arrow
Julien Le Dem
 
AI Modernization at AT&T and the Application to Fraud with Databricks
AI Modernization at AT&T and the Application to Fraud with DatabricksAI Modernization at AT&T and the Application to Fraud with Databricks
AI Modernization at AT&T and the Application to Fraud with Databricks
Databricks
 
How a Semantic Layer Makes Data Mesh Work at Scale
How a Semantic Layer Makes  Data Mesh Work at ScaleHow a Semantic Layer Makes  Data Mesh Work at Scale
How a Semantic Layer Makes Data Mesh Work at Scale
DATAVERSITY
 
Data Modeling with Neo4j
Data Modeling with Neo4jData Modeling with Neo4j
Data Modeling with Neo4j
Neo4j
 
Intro to Neo4j and Graph Databases
Intro to Neo4j and Graph DatabasesIntro to Neo4j and Graph Databases
Intro to Neo4j and Graph Databases
Neo4j
 
Introduction to Neo4j and .Net
Introduction to Neo4j and .NetIntroduction to Neo4j and .Net
Introduction to Neo4j and .Net
Neo4j
 
The Knowledge Graph Explosion
The Knowledge Graph ExplosionThe Knowledge Graph Explosion
The Knowledge Graph Explosion
Neo4j
 
Relational to Big Graph
Relational to Big GraphRelational to Big Graph
Relational to Big Graph
Neo4j
 
Neo4j Graph Use Cases, Bruno Ungermann, Neo4j
Neo4j Graph Use Cases, Bruno Ungermann, Neo4jNeo4j Graph Use Cases, Bruno Ungermann, Neo4j
Neo4j Graph Use Cases, Bruno Ungermann, Neo4j
Neo4j
 
Intro to Graphs and Neo4j
Intro to Graphs and Neo4jIntro to Graphs and Neo4j
Intro to Graphs and Neo4j
jexp
 
Neo4j Graph Data Science - Webinar
Neo4j Graph Data Science - WebinarNeo4j Graph Data Science - Webinar
Neo4j Graph Data Science - Webinar
Neo4j
 
Scaling into Billions of Nodes and Relationships with Neo4j Graph Data Science
Scaling into Billions of Nodes and Relationships with Neo4j Graph Data ScienceScaling into Billions of Nodes and Relationships with Neo4j Graph Data Science
Scaling into Billions of Nodes and Relationships with Neo4j Graph Data Science
Neo4j
 
Graph-Powered Machine Learning
Graph-Powered Machine LearningGraph-Powered Machine Learning
Graph-Powered Machine Learning
Databricks
 
Graphs for Finance - AML with Neo4j Graph Data Science
Graphs for Finance - AML with Neo4j Graph Data Science Graphs for Finance - AML with Neo4j Graph Data Science
Graphs for Finance - AML with Neo4j Graph Data Science
Neo4j
 
NoSQL Graph Databases - Why, When and Where
NoSQL Graph Databases - Why, When and WhereNoSQL Graph Databases - Why, When and Where
NoSQL Graph Databases - Why, When and Where
Eugene Hanikblum
 
Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit...
 Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit... Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit...
Near Real-Time Netflix Recommendations Using Apache Spark Streaming with Nit...
Databricks
 
Databricks Platform.pptx
Databricks Platform.pptxDatabricks Platform.pptx
Databricks Platform.pptx
Alex Ivy
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
James Serra
 
A Universe of Knowledge Graphs
A Universe of Knowledge GraphsA Universe of Knowledge Graphs
A Universe of Knowledge Graphs
Neo4j
 
The columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache Arrow
Julien Le Dem
 

Similar to Transforming AI with Graphs: Real World Examples using Spark and Neo4j (20)

Leveraging Graphs for Better AI
Leveraging Graphs for Better AILeveraging Graphs for Better AI
Leveraging Graphs for Better AI
Neo4j
 
How Graph Technology is Changing AI
How Graph Technology is Changing AIHow Graph Technology is Changing AI
How Graph Technology is Changing AI
Databricks
 
Leveraging Graphs for Better AI
Leveraging Graphs for Better AILeveraging Graphs for Better AI
Leveraging Graphs for Better AI
Neo4j
 
How Graphs Enhance AI
How Graphs Enhance AIHow Graphs Enhance AI
How Graphs Enhance AI
Neo4j
 
The Analytics Frontier of the Hadoop Eco-System
The Analytics Frontier of the Hadoop Eco-SystemThe Analytics Frontier of the Hadoop Eco-System
The Analytics Frontier of the Hadoop Eco-System
inside-BigData.com
 
Introduction to Neo4j for the Emirates & Bahrain
Introduction to Neo4j for the Emirates & BahrainIntroduction to Neo4j for the Emirates & Bahrain
Introduction to Neo4j for the Emirates & Bahrain
Neo4j
 
How Graphs are Changing AI
How Graphs are Changing AIHow Graphs are Changing AI
How Graphs are Changing AI
Neo4j
 
Neo4j GraphTalk Oslo - Introduction to Graphs
Neo4j GraphTalk Oslo - Introduction to GraphsNeo4j GraphTalk Oslo - Introduction to Graphs
Neo4j GraphTalk Oslo - Introduction to Graphs
Neo4j
 
How Graph Algorithms Answer your Business Questions in Banking and Beyond
How Graph Algorithms Answer your Business Questions in Banking and BeyondHow Graph Algorithms Answer your Business Questions in Banking and Beyond
How Graph Algorithms Answer your Business Questions in Banking and Beyond
Neo4j
 
What Is GDS and Neo4j’s GDS Library
What Is GDS and Neo4j’s GDS LibraryWhat Is GDS and Neo4j’s GDS Library
What Is GDS and Neo4j’s GDS Library
Neo4j
 
GraphTour 2020 - Graphs & AI: A Path for Data Science
GraphTour 2020 - Graphs & AI: A Path for Data ScienceGraphTour 2020 - Graphs & AI: A Path for Data Science
GraphTour 2020 - Graphs & AI: A Path for Data Science
Neo4j
 
State of Florida Neo4j Graph Briefing - Cyber IAM
State of Florida Neo4j Graph Briefing - Cyber IAMState of Florida Neo4j Graph Briefing - Cyber IAM
State of Florida Neo4j Graph Briefing - Cyber IAM
Neo4j
 
Graph Databases
Graph DatabasesGraph Databases
Graph Databases
Girish Khanzode
 
Morpheus - SQL and Cypher in Apache Spark
Morpheus - SQL and Cypher in Apache SparkMorpheus - SQL and Cypher in Apache Spark
Morpheus - SQL and Cypher in Apache Spark
Henning Kropp
 
Morpheus SQL and Cypher® in Apache® Spark - Big Data Meetup Munich
Morpheus SQL and Cypher® in Apache® Spark - Big Data Meetup MunichMorpheus SQL and Cypher® in Apache® Spark - Big Data Meetup Munich
Morpheus SQL and Cypher® in Apache® Spark - Big Data Meetup Munich
Martin Junghanns
 
Odsc 2019 entity_reputation_knowledge_graph
Odsc 2019 entity_reputation_knowledge_graphOdsc 2019 entity_reputation_knowledge_graph
Odsc 2019 entity_reputation_knowledge_graph
venkatramanJ4
 
Neo4j Training Introduction
Neo4j Training IntroductionNeo4j Training Introduction
Neo4j Training Introduction
Max De Marzi
 
Nodes2020 | Graph of enterprise_metadata | NEO4J Conference
Nodes2020 | Graph of enterprise_metadata | NEO4J ConferenceNodes2020 | Graph of enterprise_metadata | NEO4J Conference
Nodes2020 | Graph of enterprise_metadata | NEO4J Conference
Deepak Chandramouli
 
La bi, l'informatique décisionnelle et les graphes
La bi, l'informatique décisionnelle et les graphesLa bi, l'informatique décisionnelle et les graphes
La bi, l'informatique décisionnelle et les graphes
Cédric Fauvet
 
3. Relationships Matter: Using Connected Data for Better Machine Learning
3. Relationships Matter: Using Connected Data for Better Machine Learning3. Relationships Matter: Using Connected Data for Better Machine Learning
3. Relationships Matter: Using Connected Data for Better Machine Learning
Neo4j
 
Leveraging Graphs for Better AI
Leveraging Graphs for Better AILeveraging Graphs for Better AI
Leveraging Graphs for Better AI
Neo4j
 
How Graph Technology is Changing AI
How Graph Technology is Changing AIHow Graph Technology is Changing AI
How Graph Technology is Changing AI
Databricks
 
Leveraging Graphs for Better AI
Leveraging Graphs for Better AILeveraging Graphs for Better AI
Leveraging Graphs for Better AI
Neo4j
 
How Graphs Enhance AI
How Graphs Enhance AIHow Graphs Enhance AI
How Graphs Enhance AI
Neo4j
 
The Analytics Frontier of the Hadoop Eco-System
The Analytics Frontier of the Hadoop Eco-SystemThe Analytics Frontier of the Hadoop Eco-System
The Analytics Frontier of the Hadoop Eco-System
inside-BigData.com
 
Introduction to Neo4j for the Emirates & Bahrain
Introduction to Neo4j for the Emirates & BahrainIntroduction to Neo4j for the Emirates & Bahrain
Introduction to Neo4j for the Emirates & Bahrain
Neo4j
 
How Graphs are Changing AI
How Graphs are Changing AIHow Graphs are Changing AI
How Graphs are Changing AI
Neo4j
 
Neo4j GraphTalk Oslo - Introduction to Graphs
Neo4j GraphTalk Oslo - Introduction to GraphsNeo4j GraphTalk Oslo - Introduction to Graphs
Neo4j GraphTalk Oslo - Introduction to Graphs
Neo4j
 
How Graph Algorithms Answer your Business Questions in Banking and Beyond
How Graph Algorithms Answer your Business Questions in Banking and BeyondHow Graph Algorithms Answer your Business Questions in Banking and Beyond
How Graph Algorithms Answer your Business Questions in Banking and Beyond
Neo4j
 
What Is GDS and Neo4j’s GDS Library
What Is GDS and Neo4j’s GDS LibraryWhat Is GDS and Neo4j’s GDS Library
What Is GDS and Neo4j’s GDS Library
Neo4j
 
GraphTour 2020 - Graphs & AI: A Path for Data Science
GraphTour 2020 - Graphs & AI: A Path for Data ScienceGraphTour 2020 - Graphs & AI: A Path for Data Science
GraphTour 2020 - Graphs & AI: A Path for Data Science
Neo4j
 
State of Florida Neo4j Graph Briefing - Cyber IAM
State of Florida Neo4j Graph Briefing - Cyber IAMState of Florida Neo4j Graph Briefing - Cyber IAM
State of Florida Neo4j Graph Briefing - Cyber IAM
Neo4j
 
Morpheus - SQL and Cypher in Apache Spark
Morpheus - SQL and Cypher in Apache SparkMorpheus - SQL and Cypher in Apache Spark
Morpheus - SQL and Cypher in Apache Spark
Henning Kropp
 
Morpheus SQL and Cypher® in Apache® Spark - Big Data Meetup Munich
Morpheus SQL and Cypher® in Apache® Spark - Big Data Meetup MunichMorpheus SQL and Cypher® in Apache® Spark - Big Data Meetup Munich
Morpheus SQL and Cypher® in Apache® Spark - Big Data Meetup Munich
Martin Junghanns
 
Odsc 2019 entity_reputation_knowledge_graph
Odsc 2019 entity_reputation_knowledge_graphOdsc 2019 entity_reputation_knowledge_graph
Odsc 2019 entity_reputation_knowledge_graph
venkatramanJ4
 
Neo4j Training Introduction
Neo4j Training IntroductionNeo4j Training Introduction
Neo4j Training Introduction
Max De Marzi
 
Nodes2020 | Graph of enterprise_metadata | NEO4J Conference
Nodes2020 | Graph of enterprise_metadata | NEO4J ConferenceNodes2020 | Graph of enterprise_metadata | NEO4J Conference
Nodes2020 | Graph of enterprise_metadata | NEO4J Conference
Deepak Chandramouli
 
La bi, l'informatique décisionnelle et les graphes
La bi, l'informatique décisionnelle et les graphesLa bi, l'informatique décisionnelle et les graphes
La bi, l'informatique décisionnelle et les graphes
Cédric Fauvet
 
3. Relationships Matter: Using Connected Data for Better Machine Learning
3. Relationships Matter: Using Connected Data for Better Machine Learning3. Relationships Matter: Using Connected Data for Better Machine Learning
3. Relationships Matter: Using Connected Data for Better Machine Learning
Neo4j
 
Ad

More from Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
Databricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
Databricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
Databricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
Databricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
Databricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
Databricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Databricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
Databricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Databricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
Databricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Databricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
Databricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
Databricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
Databricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
Databricks
 
DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
Databricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
Databricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
Databricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
Databricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
Databricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
Databricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Databricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
Databricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Databricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
Databricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Databricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
Databricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
Databricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
Databricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
Databricks
 
Ad

Recently uploaded (20)

2-Raction quotient_١٠٠١٤٦.ppt of physical chemisstry
2-Raction quotient_١٠٠١٤٦.ppt of physical chemisstry2-Raction quotient_١٠٠١٤٦.ppt of physical chemisstry
2-Raction quotient_١٠٠١٤٦.ppt of physical chemisstry
bastakwyry
 
L1_Slides_Foundational Concepts_508.pptx
L1_Slides_Foundational Concepts_508.pptxL1_Slides_Foundational Concepts_508.pptx
L1_Slides_Foundational Concepts_508.pptx
38NoopurPatel
 
CS-404 COA COURSE FILE JAN JUN 2025.docx
CS-404 COA COURSE FILE JAN JUN 2025.docxCS-404 COA COURSE FILE JAN JUN 2025.docx
CS-404 COA COURSE FILE JAN JUN 2025.docx
nidarizvitit
 
hersh's midterm project.pdf music retail and distribution
hersh's midterm project.pdf music retail and distributionhersh's midterm project.pdf music retail and distribution
hersh's midterm project.pdf music retail and distribution
hershtara1
 
Understanding Complex Development Processes
Understanding Complex Development ProcessesUnderstanding Complex Development Processes
Understanding Complex Development Processes
Process mining Evangelist
 
2024-Media-Literacy-Index-Of-Ukrainians-ENG-SHORT.pdf
2024-Media-Literacy-Index-Of-Ukrainians-ENG-SHORT.pdf2024-Media-Literacy-Index-Of-Ukrainians-ENG-SHORT.pdf
2024-Media-Literacy-Index-Of-Ukrainians-ENG-SHORT.pdf
OlhaTatokhina1
 
Process Mining and Official Statistics - CBS
Process Mining and Official Statistics - CBSProcess Mining and Official Statistics - CBS
Process Mining and Official Statistics - CBS
Process mining Evangelist
 
录取通知书加拿大TMU毕业证多伦多都会大学电子版毕业证成绩单
录取通知书加拿大TMU毕业证多伦多都会大学电子版毕业证成绩单录取通知书加拿大TMU毕业证多伦多都会大学电子版毕业证成绩单
录取通知书加拿大TMU毕业证多伦多都会大学电子版毕业证成绩单
Taqyea
 
Sets theories and applications that can used to imporve knowledge
Sets theories and applications that can used to imporve knowledgeSets theories and applications that can used to imporve knowledge
Sets theories and applications that can used to imporve knowledge
saumyasl2020
 
What is ETL? Difference between ETL and ELT?.pdf
What is ETL? Difference between ETL and ELT?.pdfWhat is ETL? Difference between ETL and ELT?.pdf
What is ETL? Difference between ETL and ELT?.pdf
SaikatBasu37
 
Ann Naser Nabil- Data Scientist Portfolio.pdf
Ann Naser Nabil- Data Scientist Portfolio.pdfAnn Naser Nabil- Data Scientist Portfolio.pdf
Ann Naser Nabil- Data Scientist Portfolio.pdf
আন্ নাসের নাবিল
 
TOAE201-Slides-Chapter 4. Sample theoretical basis (1).pdf
TOAE201-Slides-Chapter 4. Sample theoretical basis (1).pdfTOAE201-Slides-Chapter 4. Sample theoretical basis (1).pdf
TOAE201-Slides-Chapter 4. Sample theoretical basis (1).pdf
NhiV747372
 
problem solving.presentation slideshow bsc nursing
problem solving.presentation slideshow bsc nursingproblem solving.presentation slideshow bsc nursing
problem solving.presentation slideshow bsc nursing
vishnudathas123
 
Process Mining at Dimension Data - Jan vermeulen
Process Mining at Dimension Data - Jan vermeulenProcess Mining at Dimension Data - Jan vermeulen
Process Mining at Dimension Data - Jan vermeulen
Process mining Evangelist
 
Feature Engineering for Electronic Health Record Systems
Feature Engineering for Electronic Health Record SystemsFeature Engineering for Electronic Health Record Systems
Feature Engineering for Electronic Health Record Systems
Process mining Evangelist
 
AWS RDS Presentation to make concepts easy.pptx
AWS RDS Presentation to make concepts easy.pptxAWS RDS Presentation to make concepts easy.pptx
AWS RDS Presentation to make concepts easy.pptx
bharatkumarbhojwani
 
Automated Melanoma Detection via Image Processing.pptx
Automated Melanoma Detection via Image Processing.pptxAutomated Melanoma Detection via Image Processing.pptx
Automated Melanoma Detection via Image Processing.pptx
handrymaharjan23
 
Process Mining as Enabler for Digital Transformations
Process Mining as Enabler for Digital TransformationsProcess Mining as Enabler for Digital Transformations
Process Mining as Enabler for Digital Transformations
Process mining Evangelist
 
Adopting Process Mining at the Rabobank - use case
Adopting Process Mining at the Rabobank - use caseAdopting Process Mining at the Rabobank - use case
Adopting Process Mining at the Rabobank - use case
Process mining Evangelist
 
Chapter 6-3 Introducingthe Concepts .pptx
Chapter 6-3 Introducingthe Concepts .pptxChapter 6-3 Introducingthe Concepts .pptx
Chapter 6-3 Introducingthe Concepts .pptx
PermissionTafadzwaCh
 
2-Raction quotient_١٠٠١٤٦.ppt of physical chemisstry
2-Raction quotient_١٠٠١٤٦.ppt of physical chemisstry2-Raction quotient_١٠٠١٤٦.ppt of physical chemisstry
2-Raction quotient_١٠٠١٤٦.ppt of physical chemisstry
bastakwyry
 
L1_Slides_Foundational Concepts_508.pptx
L1_Slides_Foundational Concepts_508.pptxL1_Slides_Foundational Concepts_508.pptx
L1_Slides_Foundational Concepts_508.pptx
38NoopurPatel
 
CS-404 COA COURSE FILE JAN JUN 2025.docx
CS-404 COA COURSE FILE JAN JUN 2025.docxCS-404 COA COURSE FILE JAN JUN 2025.docx
CS-404 COA COURSE FILE JAN JUN 2025.docx
nidarizvitit
 
hersh's midterm project.pdf music retail and distribution
hersh's midterm project.pdf music retail and distributionhersh's midterm project.pdf music retail and distribution
hersh's midterm project.pdf music retail and distribution
hershtara1
 
2024-Media-Literacy-Index-Of-Ukrainians-ENG-SHORT.pdf
2024-Media-Literacy-Index-Of-Ukrainians-ENG-SHORT.pdf2024-Media-Literacy-Index-Of-Ukrainians-ENG-SHORT.pdf
2024-Media-Literacy-Index-Of-Ukrainians-ENG-SHORT.pdf
OlhaTatokhina1
 
Process Mining and Official Statistics - CBS
Process Mining and Official Statistics - CBSProcess Mining and Official Statistics - CBS
Process Mining and Official Statistics - CBS
Process mining Evangelist
 
录取通知书加拿大TMU毕业证多伦多都会大学电子版毕业证成绩单
录取通知书加拿大TMU毕业证多伦多都会大学电子版毕业证成绩单录取通知书加拿大TMU毕业证多伦多都会大学电子版毕业证成绩单
录取通知书加拿大TMU毕业证多伦多都会大学电子版毕业证成绩单
Taqyea
 
Sets theories and applications that can used to imporve knowledge
Sets theories and applications that can used to imporve knowledgeSets theories and applications that can used to imporve knowledge
Sets theories and applications that can used to imporve knowledge
saumyasl2020
 
What is ETL? Difference between ETL and ELT?.pdf
What is ETL? Difference between ETL and ELT?.pdfWhat is ETL? Difference between ETL and ELT?.pdf
What is ETL? Difference between ETL and ELT?.pdf
SaikatBasu37
 
TOAE201-Slides-Chapter 4. Sample theoretical basis (1).pdf
TOAE201-Slides-Chapter 4. Sample theoretical basis (1).pdfTOAE201-Slides-Chapter 4. Sample theoretical basis (1).pdf
TOAE201-Slides-Chapter 4. Sample theoretical basis (1).pdf
NhiV747372
 
problem solving.presentation slideshow bsc nursing
problem solving.presentation slideshow bsc nursingproblem solving.presentation slideshow bsc nursing
problem solving.presentation slideshow bsc nursing
vishnudathas123
 
Process Mining at Dimension Data - Jan vermeulen
Process Mining at Dimension Data - Jan vermeulenProcess Mining at Dimension Data - Jan vermeulen
Process Mining at Dimension Data - Jan vermeulen
Process mining Evangelist
 
Feature Engineering for Electronic Health Record Systems
Feature Engineering for Electronic Health Record SystemsFeature Engineering for Electronic Health Record Systems
Feature Engineering for Electronic Health Record Systems
Process mining Evangelist
 
AWS RDS Presentation to make concepts easy.pptx
AWS RDS Presentation to make concepts easy.pptxAWS RDS Presentation to make concepts easy.pptx
AWS RDS Presentation to make concepts easy.pptx
bharatkumarbhojwani
 
Automated Melanoma Detection via Image Processing.pptx
Automated Melanoma Detection via Image Processing.pptxAutomated Melanoma Detection via Image Processing.pptx
Automated Melanoma Detection via Image Processing.pptx
handrymaharjan23
 
Process Mining as Enabler for Digital Transformations
Process Mining as Enabler for Digital TransformationsProcess Mining as Enabler for Digital Transformations
Process Mining as Enabler for Digital Transformations
Process mining Evangelist
 
Adopting Process Mining at the Rabobank - use case
Adopting Process Mining at the Rabobank - use caseAdopting Process Mining at the Rabobank - use case
Adopting Process Mining at the Rabobank - use case
Process mining Evangelist
 
Chapter 6-3 Introducingthe Concepts .pptx
Chapter 6-3 Introducingthe Concepts .pptxChapter 6-3 Introducingthe Concepts .pptx
Chapter 6-3 Introducingthe Concepts .pptx
PermissionTafadzwaCh
 

Transforming AI with Graphs: Real World Examples using Spark and Neo4j

  • 1. WIFI SSID:Spark+AISummit | Password: UnifiedDataAnalytics
  • 2. Alicia Frame, PhD Senior Data Scientist, Neo4j Transforming AI with Graphs: Real World Examples with Spark & Neo4j #UnifiedDataAnalytics #SparkAISummit
  • 4. Financial Services Drug Discovery Recommendations Cybersecurity Predictive Maintenance Customer Segmentation Churn Prediction Search/MDM Graph Data Science Applications
  • 5. CAR DRIVES name: “Dan” born: May 29, 1970 twitter: “@dan” name: “Ann” born: Dec 5, 1975 since: Jan 10, 2011 brand: “Volvo” model: “V70” Latitude: 37.5629900° Longitude: -122.3255300° Nodes • Can have Labels to classify nodes • Labels have native indexes Relationships • Relate nodes by type and direction Properties • Attributes of Nodes & Relationships • Stored as Name/Value pairs • Can have indexes and composite indexes MARRIED TO LIVES WITH OW NS PERSON PERSON 7 Labeled Property Graphs
  • 6. • Current data science models ignore network structure • Graphs add highly predictive features to existing ML models • Otherwise unattainable predictions based on relationships Novel & More Accurate Predictions with the Data You Already Have Machine Learning Pipeline
  • 7. “The idea is that graph networks are bigger than any one machine-learning approach. Graphs bring an ability to generalize about structure that the individual neural nets don't have.” "Where do the graphs come from that graph networks operate over?”
  • 8. Building a Graph ML Model Data Sources Native Graph Platform Machine Learning Aggregate Disparate Data and Cleanse Build Predictive ModelsUnify Graphs and Engineer Features Parquet JSON and more… MLlib and more…
  • 9. Spark Graph Native Graph Platform Machine Learning Example: Spark & Neo4j Workflow Graph Transactions Graph Analytics Cypher 9 in Spark 3.0 to create non- persistent graphs MLlib to Train Models Native Graph Algorithms, Processing, and Storage
  • 10. Explore Graphs Build Graph Solutions • Massively scalable • Powerful data pipelining • Robust ML Libraries • Non-persistent, non-native graphs • Persistent, dynamic graphs • Graph native query and algorithm performance • Constantly growing list of graph algorithms and embeddings
  • 11. The Steps of Graph Data Science Query Based Knowledge Graph Query Based Feature Engineering Graph Algorithm Feature Engineering Graph Embeddings Graph Neural Networks DataScienceComplexity Knowledge Graphs Graph Feature Engineering Graph Native Learning Graph Persistence
  • 12. Steps Forward in Graph Data Science Query Based Knowledge Graph Query Based Feature Engineering Graph Algorithm Feature Engineering Graph Embeddings Graph Neural Networks Enterprise Maturity DataScienceComplexity
  • 13. Query based knowledge graphs: Connecting the Dots at NASA “Using Neo4j someone from our Orion project found information from the Apollo project that prevented an issue, saving well over two years of work and one million dollars of taxpayer funds.”
  • 14. Steps Forward in Graph Data Science Query Based Knowledge Graph Graph Algorithm Feature Engineering Graph Embeddings Graph Neural Networks Query Based Feature Engineering Enterprise Maturity DataScienceComplexity
  • 15. Churn prediction research has found that simple hand- engineered features are highly predictive • How many calls/texts has an account made? • How many of their contacts have churned? Query-Based Feature Engineering Telecom-churn prediction Telecommunication networks are easily represented as graphs
  • 16. Query-Based Feature Engineering Telecom-churn prediction 23 Add connected features based on graph queries to tabular data Khan et al, 2015
  • 17. Spark Graph Native Graph Platform Machine Learning • Merge distributed data into DataFrames • Reshape your tables into graphs • Explore cypher queries • Move to Neo4j to build expert queries • Persist your graph Knowledge Graphs: Getting Started Example with Spark • Bring query based graph features to ML pipeline Graph Transactions Graph Analytics
  • 18. Steps Forward in Graph Data Science Query Based Feature Engineering Graph Embeddings Graph Neural Networks Query Based Knowledge Graph Graph Algorithm Feature Engineering Enterprise Maturity DataScienceComplexity
  • 19. Feature Engineering is how we combine and process the data to create new, more meaningful features, such as clustering or connectivity metrics. Graph Feature Engineering Add More Descriptive Features: - Influence - Relationships - Communities
  • 20. Graph Feature Categories & Algorithms Pathfinding & Search Finds the optimal paths or evaluates route availability and quality Centrality / Importance Determines the importance of distinct nodes in the network Community Detection Detects group clustering or partition options Heuristic Link Prediction Estimates the likelihood of nodes forming a relationship Evaluates how alike nodes are Similarity Embeddings Learned representations of connectivity or topology
  • 21. • Connected components to identify disjointed graphs sharing identifiers • PageRank to measure influence and transaction volumes • Louvain to identify communities that frequently interact • Jaccard to measure account similarity based on relationships Financial Crime: Detecting Fraud Large financial institutions already have existing pipelines to identify fraud via heuristics and models Graph based features improve accuracy:
  • 22. +142,000 Peer Reviewed Publications Graph Fraud / Anomaly Detection in the last 10 years
  • 23. Spark Graph Native Graph Platform Machine Learning • Merge distributed data into DataFrames • Reshape your tables into graphs • Explore cypher queries and simple algorithms • Persist your graph • Create rule based features • Run native graph algorithms and write to graph or stream Graph Feature Engineering: Getting Started Example with Spark • Bring graph features to ML pipeline for training Graph Transactions Graph Analytics
  • 24. Graph Algorithms in Neo4J • Parallel Breadth First Search • Parallel Depth First Search • Shortest Path • Single-Source Shortest Path • All Pairs Shortest Path • Minimum Spanning Tree • A* Shortest Path • Yen’s K Shortest Path • K-Spanning Tree (MST) • Random Walk • Degree Centrality • Closeness Centrality • CC Variations: Harmonic, Dangalchev, Wasserman & Faust • Betweenness Centrality • Approximate Betweenness Centrality • PageRank • Personalized PageRank • ArticleRank • Eigenvector Centrality • Triangle Count • Clustering Coefficients • Connected Components (Union Find) • Strongly Connected Components • Label Propagation • Louvain Modularity – 1 Step & Multi-Step • Balanced Triad (identification) • Euclidean Distance • Cosine Similarity • Jaccard Similarity • Overlap Similarity • Pearson Similarity Pathfinding & Search Centrality / Importance Community Detection Similarity neo4j.com/docs/ graph-algorithms/current/ Link Prediction • Adamic Adar • Common Neighbors • Preferential Attachment • Resource Allocations • Same Community • Total Neighbors
  • 25. Graph Algorithms in Neo4J • Parallel Breadth First Search • Parallel Depth First Search • Shortest Path • Single-Source Shortest Path • All Pairs Shortest Path • Minimum Spanning Tree • A* Shortest Path • Yen’s K Shortest Path • K-Spanning Tree (MST) • Random Walk • Degree Centrality • Closeness Centrality • CC Variations: Harmonic, Dangalchev, Wasserman & Faust • Betweenness Centrality • Approximate Betweenness Centrality • PageRank • Personalized PageRank • ArticleRank • Eigenvector Centrality • Triangle Count • Clustering Coefficients • Connected Components (Union Find) • Strongly Connected Components • Label Propagation • Louvain Modularity – 1 Step & Multi-Step • Balanced Triad (identification) • Euclidean Distance • Cosine Similarity • Jaccard Similarity • Overlap Similarity • Pearson Similarity Pathfinding & Search Centrality / Importance Community Detection Similarity neo4j.com/docs/ graph-algorithms/current/ Link Prediction • Adamic Adar • Common Neighbors • Preferential Attachment • Resource Allocations • Same Community • Total Neighbors
  • 26. Steps Forward in Graph Data Science Query Based Knowledge Graph Graph Algorithm Feature Engineering Graph Neural Networks Query Based Feature Engineering Graph Embeddings Enterprise Maturity DataScienceComplexity
  • 27. Embedding transforms graphs into a vector, or set of vectors, describing topology, connectivity, or attributes of nodes and edges in the graph Graph Embeddings • Vertex embeddings: describe connectivity of each node • Path embeddings: traversals across the graph • Graph embeddings: encode an entire graph into a single vector
  • 28. Explainable Reasoning over Knowledge Graphs for Recommendation Graph Embeddings - Recommendations
  • 29. Explainable Reasoning over Knowledge Graphs for Recommendation Graph Embeddings - Recommendations
  • 30. Spark Graph Native Graph Platform Machine Learning • Merge distributed data into DataFrames • Reshape your tables into graphs • Explore cypher queries and simple algorithms • Move to Neo4j to build expert queries • Write to persist • Stay tuned for DeepWalk and DeepGL algorithms Graph Feature Engineering: Getting Started Example with Spark • Bring graph features to ML pipeline for training Graph Transactions Graph Analytics
  • 31. Steps Forward in Graph Data Science Query Based Knowledge Graph Graph Algorithm Feature Engineering Query Based Feature Engineering Graph Neural Networks Graph Embeddings Enterprise Maturity DataScienceComplexity
  • 32. Deep Learning refers to training multi-layer neural networks using gradient descent Graph Native Learning
  • 33. Graph Native Learning refers to deep learning models that take a graph as an input, performs computations, and return a graph Graph Native Learning Battaglia et al, 2018
  • 34. Example: electron path prediction Bradshaw et al, 2019 Graph Native Learning Given reactants and reagents, what will the products be? Given reactants and reagents, what will the products be?
  • 35. Example: electron path prediction Graph Native Learning
  • 36. Progressing in Graph Data Science Query Based Knowledge Graph Query Based Feature Engineering Graph Algorithm Feature Engineering Graph Embeddings Graph Neural Networks Enterprise Maturity DataScienceComplexity Knowledge Graphs Graph Feature Engineering Graph Native Learning Graph Persistence
  • 37. Resources Business • neo4j.com/use-cases/artificial-intelligence-analytics/ Data Scientists/Developers • neo4j.com/sandbox • neo4j.com/developer/ • community.neo4j.com alicia.frame@neo4j.com @aliciaframe1 44#UnifiedAnalytics #SparkAISummit
  • 38. DON’T FORGET TO RATE AND REVIEW THE SESSIONS SEARCH SPARK + AI SUMMIT
  翻译: