SlideShare a Scribd company logo
ML at Facebook:
An Infrastructure View
Yangqing Jia
Director, Facebook AI Infra
(* This is not
Facebook AI Infra)
Facebook ML Infrastructure - 2018 slides
The Machine Learning Moore’s Law?
0
500
1000
1500
2000
2500
2001 2003 2005 2007 2009 2011 2013 2015 2017
NumberofCitations
Machine Learning Execution Flow
Data Features Training Evaluation Inference
Offline Online
Machine Learning Execution Flow
Data Features Training Eval Inference PredictionsModel
It’s an infrastructure challenge
Data
Offline Training Online Inference
Storage
Challenges!
Network
Challenges!
Compute
Challenges!
Let’s Answer Some Pressing Questions
• Does Facebook leverage machine learning?
• Does Facebook design hardware?
• Does Facebook design hardware for machine learning?
• What platforms and frameworks exist; can the community use them?
• What assumptions break when supporting 2B people?
Does Facebook Use Machine Learning?
News Feed Ads Search
Language Translation
Sigma Facer Lumos
Speech Recognition
Content
Understanding
Classification
& Ranking
Services
What ML Models Do We Leverage?
GBDTSVM MLP CNN RNN
Support
Vector
Machines
Gradient-
Boosted
Decision Trees
Multi-Layer
Perceptron
Convolutional
Neural Nets
Recurrent
Neural Nets
Facer Sigma News Feed
Ads
Search
Sigma
Facer
Lumos
Language
Translation
Content
Understanding
Speech Rec
How Often Do We Train Models?
minutes hours days months
How Long Does Training Take?
seconds minutes hours days
How Much Compute Does Inference Consume?
100X 10x 1x
Does Facebook Design Hardware?
• Yes! Since 2010! All designs released through open compute!
• Facebook Server Design Philosophy
• Identify a small number of major services with unique resource requirements
• Design servers for those major services
One major
server design
versus
A B
D
C
Customized, dedicated hardwareGlobal shared pool
Does Facebook Design Hardware?
Yosemite/Twin Lakes:
For the web tier and
other “stateless services”
Open Compute “Sleds”
are 2U x 3 Across in an
Open Compute Rack
Server Card
Chassis
4-way
Shared NIC
SSD Boot
Rack
Does Facebook Design Hardware?
Tioga Pass:
For compute or memory-
intensive workloads:
Bryce Canyon:
For storage-heavy workloads:
Does Facebook Design Hardware for AI/ML?
• HP SL270s (2013): learning serviceability, thermal, perf, reliability, cluster mgmt.
• Big Sur (M40) -> Big Basin (P100) -> Big Basin Volta (V100)
Big Sur
Integrated Compute
8 Nvidia M40 GPUs
Big Basin
JBOG Design (CPU headnode)
8 Nvidia P100 / V100 GPUs
Putting it Together
Data Features Training Evaluation Inference
Bryce Canyon Big Basin, Tioga Pass
Tioga Pass
Twin Lakes
Let’s Answer Some Pressing Questions
• Does Facebook leverage machine learning?
• Does Facebook design hardware?
• Does Facebook design hardware for machine learning?
• What platforms and frameworks exist; can the community use them?
• What assumptions break when supporting 2B people?
Facebook AI Frameworks
Infra Efficiency for Production
• Stability
• Scale & Speed
• Data Integration
• Relatively Fixed
Developer Efficiency for Research
• Flexible
• Fast Iteration
• Highly Debuggable
• Less Robust
Facebook AI Frameworks
Infra Efficiency for Production
• Stability
• Scale & Speed
• Data Integration
• Relatively Fixed
Developer Efficiency for Research
• Flexible
• Fast Iteration
• Highly Debuggable
• Less Robust
OpenGL ESNNPACK
Metal™/
MPSCNN
Qualcomm
Snapdragon
NPE
CUDA/cuDNN
Facebook ML Infrastructure - 2018 slides
Facebook ML Infrastructure - 2018 slides
Facebook ML Infrastructure - 2018 slides
Deep Learning Frameworks
Framework
backends
Vendor and numeric libraries
Apple CoreML Nvidia TensorRT
Intel/Nervana
ngraph
Qualcom
SNPE
…
O(n2) pairs
Shared model and operator representation
Open Neural Network Exchange
Framework
backends
Vendor and numeric libraries
Apple CoreML Nvidia TensorRT
Intel/Nervana
ngraph
Qualcom
SNPE
…
From O(n2) to O(n) pairs
Putting it Together
Data Features Training Evaluation Inference
Bryce Canyon Big Basin, Tioga Pass
Tioga Pass
Twin Lakes
Data APIs Caffe2, PyTorch, ONNX C2 Predictor
Facebook AI Ecosystem
Frameworks: Core ML Software
Caffe2 / PyTorch / ONNX
Platforms: Workflow Management, Deployment
FB Learner
Infrastructure: Servers, Storage, Network Strategy
Open Compute Project
FB Learner Platform
FB Learner
Feature Store
FB Learner
Flow
FB Learner
Predictor
• AI Workflow
• Model Management and Deployment
FBLearner in ML
Data Features Training Eval Inference PredictionsModel
FBLearner
Flow
FBLearner
Predictor
FBLearner
Feature
Store
CPU+GPUCPU CPU
Putting it All Together
Data Features Training Evaluation Inference
Bryce Canyon Big Basin, Tioga Pass
Tioga Pass
Twin Lakes
Feature Store FBLearner Flow FBL Predictor
Data APIs Caffe2, PyTorch, ONNX C2 Predictor
2 Billion People
What changes when you scale to over
Scaling Challenges / Opportunities
Lots of Data Lots of Compute
Scaling Opportunity: Free Compute!
Santa Clara, California
Ashburn, Virginia
Prineville, Oregon
Forest City, North Carolina
Lulea, Sweden
Altoona, Iowa
Fort Worth, Texas
Clonee, Ireland
Los Lunas, New Mexico
Odense, Denmark
New Albany, Ohio
Papillion, Nebraska
Key Takeaways
Facebook AI
Lots of
Data
Wide variety
of models
Full stack
challenges Global scale
Kim Hazelwood Sarah Bird David Brooks Soumith Chintala Utku Diril Dmytro Dzhulgakov
Mohamed Fawzy Bill Jia Yangqing Jia Aditya Kalro James Law
Kevin Lee Jason Lu Pieter Noordhuis Misha Smelyanskiy Liang Xiong Xiaodong Wang
Facebook ML Infrastructure - 2018 slides
Ad

More Related Content

What's hot (20)

Cloud Computing in the Real-World 1.pptx
Cloud Computing in the Real-World 1.pptxCloud Computing in the Real-World 1.pptx
Cloud Computing in the Real-World 1.pptx
ArifKhursheedAcademi
 
Cloud Computing Architecture
Cloud Computing Architecture Cloud Computing Architecture
Cloud Computing Architecture
Vasu Jain
 
Microsoft Azure - Introduction
Microsoft Azure - IntroductionMicrosoft Azure - Introduction
Microsoft Azure - Introduction
Pranav Ainavolu
 
Cloud Computing Security
Cloud Computing SecurityCloud Computing Security
Cloud Computing Security
Ninh Nguyen
 
Application Assessment - Executive Summary Report
Application Assessment - Executive Summary ReportApplication Assessment - Executive Summary Report
Application Assessment - Executive Summary Report
CAST
 
F5 DDoS Protection
F5 DDoS ProtectionF5 DDoS Protection
F5 DDoS Protection
MarketingArrowECS_CZ
 
Third Party Cloud Management
Third Party Cloud ManagementThird Party Cloud Management
Third Party Cloud Management
Orchestrate Mortgage and Title Solutions, LLC
 
Cloud computing risk & challenges
Cloud computing risk & challengesCloud computing risk & challenges
Cloud computing risk & challenges
Parag Deodhar
 
Introduction to Cloud Computing
Introduction to Cloud Computing Introduction to Cloud Computing
Introduction to Cloud Computing
CloudSyntrix
 
Cloud Computing_Unit 1- Part 1.pptx
Cloud Computing_Unit 1- Part 1.pptxCloud Computing_Unit 1- Part 1.pptx
Cloud Computing_Unit 1- Part 1.pptx
Vivek Shelke
 
Cloud security and security architecture
Cloud security and security architectureCloud security and security architecture
Cloud security and security architecture
Vladimir Jirasek
 
Cloud Computing Fundamentals
Cloud Computing FundamentalsCloud Computing Fundamentals
Cloud Computing Fundamentals
Sonia Nagpal
 
Building Real-Time Travel Alerts
Building Real-Time Travel AlertsBuilding Real-Time Travel Alerts
Building Real-Time Travel Alerts
Timothy Spann
 
cloud computing ppt
cloud computing pptcloud computing ppt
cloud computing ppt
himanshuawasthi2109
 
CLOUD ARCHITECTURE AND SERVICES.pptx
CLOUD ARCHITECTURE AND SERVICES.pptxCLOUD ARCHITECTURE AND SERVICES.pptx
CLOUD ARCHITECTURE AND SERVICES.pptx
Dr Geetha Mohan
 
Introducing Neo4j
Introducing Neo4jIntroducing Neo4j
Introducing Neo4j
Neo4j
 
Virtualization and cloud computing
Virtualization and cloud computingVirtualization and cloud computing
Virtualization and cloud computing
Deep Gupta
 
Cloud Security
Cloud SecurityCloud Security
Cloud Security
AWS User Group Bengaluru
 
Implementing zero trust architecture in azure hybrid cloud
Implementing zero trust architecture in azure hybrid cloudImplementing zero trust architecture in azure hybrid cloud
Implementing zero trust architecture in azure hybrid cloud
Ajit Bhingarkar
 
Edge Computing & AI
Edge Computing & AIEdge Computing & AI
Edge Computing & AI
Paul O'Hagan
 
Cloud Computing in the Real-World 1.pptx
Cloud Computing in the Real-World 1.pptxCloud Computing in the Real-World 1.pptx
Cloud Computing in the Real-World 1.pptx
ArifKhursheedAcademi
 
Cloud Computing Architecture
Cloud Computing Architecture Cloud Computing Architecture
Cloud Computing Architecture
Vasu Jain
 
Microsoft Azure - Introduction
Microsoft Azure - IntroductionMicrosoft Azure - Introduction
Microsoft Azure - Introduction
Pranav Ainavolu
 
Cloud Computing Security
Cloud Computing SecurityCloud Computing Security
Cloud Computing Security
Ninh Nguyen
 
Application Assessment - Executive Summary Report
Application Assessment - Executive Summary ReportApplication Assessment - Executive Summary Report
Application Assessment - Executive Summary Report
CAST
 
Cloud computing risk & challenges
Cloud computing risk & challengesCloud computing risk & challenges
Cloud computing risk & challenges
Parag Deodhar
 
Introduction to Cloud Computing
Introduction to Cloud Computing Introduction to Cloud Computing
Introduction to Cloud Computing
CloudSyntrix
 
Cloud Computing_Unit 1- Part 1.pptx
Cloud Computing_Unit 1- Part 1.pptxCloud Computing_Unit 1- Part 1.pptx
Cloud Computing_Unit 1- Part 1.pptx
Vivek Shelke
 
Cloud security and security architecture
Cloud security and security architectureCloud security and security architecture
Cloud security and security architecture
Vladimir Jirasek
 
Cloud Computing Fundamentals
Cloud Computing FundamentalsCloud Computing Fundamentals
Cloud Computing Fundamentals
Sonia Nagpal
 
Building Real-Time Travel Alerts
Building Real-Time Travel AlertsBuilding Real-Time Travel Alerts
Building Real-Time Travel Alerts
Timothy Spann
 
CLOUD ARCHITECTURE AND SERVICES.pptx
CLOUD ARCHITECTURE AND SERVICES.pptxCLOUD ARCHITECTURE AND SERVICES.pptx
CLOUD ARCHITECTURE AND SERVICES.pptx
Dr Geetha Mohan
 
Introducing Neo4j
Introducing Neo4jIntroducing Neo4j
Introducing Neo4j
Neo4j
 
Virtualization and cloud computing
Virtualization and cloud computingVirtualization and cloud computing
Virtualization and cloud computing
Deep Gupta
 
Implementing zero trust architecture in azure hybrid cloud
Implementing zero trust architecture in azure hybrid cloudImplementing zero trust architecture in azure hybrid cloud
Implementing zero trust architecture in azure hybrid cloud
Ajit Bhingarkar
 
Edge Computing & AI
Edge Computing & AIEdge Computing & AI
Edge Computing & AI
Paul O'Hagan
 

Similar to Facebook ML Infrastructure - 2018 slides (20)

Kubernetes and AI - Beauty and the Beast - Tobias Schneck - DOAG 24 NUE - 20....
Kubernetes and AI - Beauty and the Beast - Tobias Schneck - DOAG 24 NUE - 20....Kubernetes and AI - Beauty and the Beast - Tobias Schneck - DOAG 24 NUE - 20....
Kubernetes and AI - Beauty and the Beast - Tobias Schneck - DOAG 24 NUE - 20....
Tobias Schneck
 
Containers & AI - Beauty and the Beast!?!
Containers & AI - Beauty and the Beast!?!Containers & AI - Beauty and the Beast!?!
Containers & AI - Beauty and the Beast!?!
Tobias Schneck
 
Containers & AI - Beauty and the Beast !?! @MLCon - 27.6.2024
Containers & AI - Beauty and the Beast !?! @MLCon - 27.6.2024Containers & AI - Beauty and the Beast !?! @MLCon - 27.6.2024
Containers & AI - Beauty and the Beast !?! @MLCon - 27.6.2024
Tobias Schneck
 
2018 .NET Conf - 利用Machine Learning .NET整合機器學習至應用程式
2018 .NET Conf - 利用Machine Learning .NET整合機器學習至應用程式2018 .NET Conf - 利用Machine Learning .NET整合機器學習至應用程式
2018 .NET Conf - 利用Machine Learning .NET整合機器學習至應用程式
Alan Tsai
 
Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Ian Gomez
 
Tour de France Azure PaaS 6/7 Ajouter de l'intelligence
Tour de France Azure PaaS 6/7 Ajouter de l'intelligenceTour de France Azure PaaS 6/7 Ajouter de l'intelligence
Tour de France Azure PaaS 6/7 Ajouter de l'intelligence
Alex Danvy
 
Data Agility—A Journey to Advanced Analytics and Machine Learning at Scale
Data Agility—A Journey to Advanced Analytics and Machine Learning at ScaleData Agility—A Journey to Advanced Analytics and Machine Learning at Scale
Data Agility—A Journey to Advanced Analytics and Machine Learning at Scale
Databricks
 
Introduction to ML.NET
Introduction to ML.NETIntroduction to ML.NET
Introduction to ML.NET
Gianni Rosa Gallina
 
Technology and AI sharing - From 2016 to Y2017 and Beyond
Technology and AI sharing - From 2016 to Y2017 and BeyondTechnology and AI sharing - From 2016 to Y2017 and Beyond
Technology and AI sharing - From 2016 to Y2017 and Beyond
James Huang
 
Paige Roberts: Shortcut MLOps with In-Database Machine Learning
Paige Roberts: Shortcut MLOps with In-Database Machine LearningPaige Roberts: Shortcut MLOps with In-Database Machine Learning
Paige Roberts: Shortcut MLOps with In-Database Machine Learning
Edunomica
 
Session 1 - The Current Landscape of Big Data Benchmarks
Session 1 - The Current Landscape of Big Data BenchmarksSession 1 - The Current Landscape of Big Data Benchmarks
Session 1 - The Current Landscape of Big Data Benchmarks
DataBench
 
Microsoft power platform
Microsoft power platformMicrosoft power platform
Microsoft power platform
Jenkins NS
 
Architecting an Open Source AI Platform 2018 edition
Architecting an Open Source AI Platform   2018 editionArchitecting an Open Source AI Platform   2018 edition
Architecting an Open Source AI Platform 2018 edition
David Talby
 
Introduction to PowerAI - The Enterprise AI Platform
Introduction to PowerAI - The Enterprise AI PlatformIntroduction to PowerAI - The Enterprise AI Platform
Introduction to PowerAI - The Enterprise AI Platform
Indrajit Poddar
 
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
James Anderson
 
Maintainable Machine Learning Products
Maintainable Machine Learning ProductsMaintainable Machine Learning Products
Maintainable Machine Learning Products
Andrew Musselman
 
Microsoft AI Platform Overview
Microsoft AI Platform OverviewMicrosoft AI Platform Overview
Microsoft AI Platform Overview
David Chou
 
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & Deep Learning ...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & Deep Learning ...A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & Deep Learning ...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & Deep Learning ...
Databricks
 
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, and Deep Learnin...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, and Deep Learnin...A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, and Deep Learnin...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, and Deep Learnin...
Databricks
 
Data Con LA 2018 - A Tale of DL Frameworks: TensorFlow, Keras, & Deep Learnin...
Data Con LA 2018 - A Tale of DL Frameworks: TensorFlow, Keras, & Deep Learnin...Data Con LA 2018 - A Tale of DL Frameworks: TensorFlow, Keras, & Deep Learnin...
Data Con LA 2018 - A Tale of DL Frameworks: TensorFlow, Keras, & Deep Learnin...
Data Con LA
 
Kubernetes and AI - Beauty and the Beast - Tobias Schneck - DOAG 24 NUE - 20....
Kubernetes and AI - Beauty and the Beast - Tobias Schneck - DOAG 24 NUE - 20....Kubernetes and AI - Beauty and the Beast - Tobias Schneck - DOAG 24 NUE - 20....
Kubernetes and AI - Beauty and the Beast - Tobias Schneck - DOAG 24 NUE - 20....
Tobias Schneck
 
Containers & AI - Beauty and the Beast!?!
Containers & AI - Beauty and the Beast!?!Containers & AI - Beauty and the Beast!?!
Containers & AI - Beauty and the Beast!?!
Tobias Schneck
 
Containers & AI - Beauty and the Beast !?! @MLCon - 27.6.2024
Containers & AI - Beauty and the Beast !?! @MLCon - 27.6.2024Containers & AI - Beauty and the Beast !?! @MLCon - 27.6.2024
Containers & AI - Beauty and the Beast !?! @MLCon - 27.6.2024
Tobias Schneck
 
2018 .NET Conf - 利用Machine Learning .NET整合機器學習至應用程式
2018 .NET Conf - 利用Machine Learning .NET整合機器學習至應用程式2018 .NET Conf - 利用Machine Learning .NET整合機器學習至應用程式
2018 .NET Conf - 利用Machine Learning .NET整合機器學習至應用程式
Alan Tsai
 
Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Start Getting Your Feet Wet in Open Source Machine and Deep Learning
Ian Gomez
 
Tour de France Azure PaaS 6/7 Ajouter de l'intelligence
Tour de France Azure PaaS 6/7 Ajouter de l'intelligenceTour de France Azure PaaS 6/7 Ajouter de l'intelligence
Tour de France Azure PaaS 6/7 Ajouter de l'intelligence
Alex Danvy
 
Data Agility—A Journey to Advanced Analytics and Machine Learning at Scale
Data Agility—A Journey to Advanced Analytics and Machine Learning at ScaleData Agility—A Journey to Advanced Analytics and Machine Learning at Scale
Data Agility—A Journey to Advanced Analytics and Machine Learning at Scale
Databricks
 
Technology and AI sharing - From 2016 to Y2017 and Beyond
Technology and AI sharing - From 2016 to Y2017 and BeyondTechnology and AI sharing - From 2016 to Y2017 and Beyond
Technology and AI sharing - From 2016 to Y2017 and Beyond
James Huang
 
Paige Roberts: Shortcut MLOps with In-Database Machine Learning
Paige Roberts: Shortcut MLOps with In-Database Machine LearningPaige Roberts: Shortcut MLOps with In-Database Machine Learning
Paige Roberts: Shortcut MLOps with In-Database Machine Learning
Edunomica
 
Session 1 - The Current Landscape of Big Data Benchmarks
Session 1 - The Current Landscape of Big Data BenchmarksSession 1 - The Current Landscape of Big Data Benchmarks
Session 1 - The Current Landscape of Big Data Benchmarks
DataBench
 
Microsoft power platform
Microsoft power platformMicrosoft power platform
Microsoft power platform
Jenkins NS
 
Architecting an Open Source AI Platform 2018 edition
Architecting an Open Source AI Platform   2018 editionArchitecting an Open Source AI Platform   2018 edition
Architecting an Open Source AI Platform 2018 edition
David Talby
 
Introduction to PowerAI - The Enterprise AI Platform
Introduction to PowerAI - The Enterprise AI PlatformIntroduction to PowerAI - The Enterprise AI Platform
Introduction to PowerAI - The Enterprise AI Platform
Indrajit Poddar
 
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
GDG Cloud Southlake #16: Priyanka Vergadia: Scalable Data Analytics in Google...
James Anderson
 
Maintainable Machine Learning Products
Maintainable Machine Learning ProductsMaintainable Machine Learning Products
Maintainable Machine Learning Products
Andrew Musselman
 
Microsoft AI Platform Overview
Microsoft AI Platform OverviewMicrosoft AI Platform Overview
Microsoft AI Platform Overview
David Chou
 
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & Deep Learning ...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & Deep Learning ...A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & Deep Learning ...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & Deep Learning ...
Databricks
 
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, and Deep Learnin...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, and Deep Learnin...A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, and Deep Learnin...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, and Deep Learnin...
Databricks
 
Data Con LA 2018 - A Tale of DL Frameworks: TensorFlow, Keras, & Deep Learnin...
Data Con LA 2018 - A Tale of DL Frameworks: TensorFlow, Keras, & Deep Learnin...Data Con LA 2018 - A Tale of DL Frameworks: TensorFlow, Keras, & Deep Learnin...
Data Con LA 2018 - A Tale of DL Frameworks: TensorFlow, Keras, & Deep Learnin...
Data Con LA
 
Ad

More from Karthik Murugesan (20)

Rakuten - Recommendation Platform
Rakuten - Recommendation PlatformRakuten - Recommendation Platform
Rakuten - Recommendation Platform
Karthik Murugesan
 
Yahoo's Knowledge Graph - 2014 slides
Yahoo's Knowledge Graph - 2014 slidesYahoo's Knowledge Graph - 2014 slides
Yahoo's Knowledge Graph - 2014 slides
Karthik Murugesan
 
Free servers to build Big Data Systems on: Bing's Approach
Free servers to build Big Data Systems on: Bing's  Approach Free servers to build Big Data Systems on: Bing's  Approach
Free servers to build Big Data Systems on: Bing's Approach
Karthik Murugesan
 
Microsoft cosmos
Microsoft cosmosMicrosoft cosmos
Microsoft cosmos
Karthik Murugesan
 
Microsoft AI Platform - AETHER Introduction
Microsoft AI Platform - AETHER IntroductionMicrosoft AI Platform - AETHER Introduction
Microsoft AI Platform - AETHER Introduction
Karthik Murugesan
 
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
Karthik Murugesan
 
Lyft data Platform - 2019 slides
Lyft data Platform - 2019 slidesLyft data Platform - 2019 slides
Lyft data Platform - 2019 slides
Karthik Murugesan
 
The Evolution of Spotify Home Architecture - Qcon 2019
The Evolution of Spotify Home Architecture - Qcon 2019The Evolution of Spotify Home Architecture - Qcon 2019
The Evolution of Spotify Home Architecture - Qcon 2019
Karthik Murugesan
 
Unifying Twitter around a single ML platform - Twitter AI Platform 2019
Unifying Twitter around a single ML platform  - Twitter AI Platform 2019Unifying Twitter around a single ML platform  - Twitter AI Platform 2019
Unifying Twitter around a single ML platform - Twitter AI Platform 2019
Karthik Murugesan
 
The magic behind your Lyft ride prices: A case study on machine learning and ...
The magic behind your Lyft ride prices: A case study on machine learning and ...The magic behind your Lyft ride prices: A case study on machine learning and ...
The magic behind your Lyft ride prices: A case study on machine learning and ...
Karthik Murugesan
 
The journey toward a self-service data platform at Netflix - sf 2019
The journey toward a self-service data platform at Netflix - sf 2019The journey toward a self-service data platform at Netflix - sf 2019
The journey toward a self-service data platform at Netflix - sf 2019
Karthik Murugesan
 
2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber
2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber
2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber
Karthik Murugesan
 
Developing a ML model using TF Estimator
Developing a ML model using TF EstimatorDeveloping a ML model using TF Estimator
Developing a ML model using TF Estimator
Karthik Murugesan
 
Production Model Deployment - StitchFix - 2018
Production Model Deployment - StitchFix - 2018Production Model Deployment - StitchFix - 2018
Production Model Deployment - StitchFix - 2018
Karthik Murugesan
 
Netflix factstore for recommendations - 2018
Netflix factstore  for recommendations - 2018Netflix factstore  for recommendations - 2018
Netflix factstore for recommendations - 2018
Karthik Murugesan
 
Trends in Music Recommendations 2018
Trends in Music Recommendations 2018Trends in Music Recommendations 2018
Trends in Music Recommendations 2018
Karthik Murugesan
 
Netflix Ads Personalization Solution - 2017
Netflix Ads Personalization Solution - 2017Netflix Ads Personalization Solution - 2017
Netflix Ads Personalization Solution - 2017
Karthik Murugesan
 
State Of AI 2018
State Of AI 2018State Of AI 2018
State Of AI 2018
Karthik Murugesan
 
Spotify Machine Learning Solution for Music Discovery
Spotify Machine Learning Solution for Music DiscoverySpotify Machine Learning Solution for Music Discovery
Spotify Machine Learning Solution for Music Discovery
Karthik Murugesan
 
AirBNB - Zipline: Airbnb’s Machine Learning Data Management Platform
AirBNB - Zipline: Airbnb’s Machine Learning Data Management Platform AirBNB - Zipline: Airbnb’s Machine Learning Data Management Platform
AirBNB - Zipline: Airbnb’s Machine Learning Data Management Platform
Karthik Murugesan
 
Rakuten - Recommendation Platform
Rakuten - Recommendation PlatformRakuten - Recommendation Platform
Rakuten - Recommendation Platform
Karthik Murugesan
 
Yahoo's Knowledge Graph - 2014 slides
Yahoo's Knowledge Graph - 2014 slidesYahoo's Knowledge Graph - 2014 slides
Yahoo's Knowledge Graph - 2014 slides
Karthik Murugesan
 
Free servers to build Big Data Systems on: Bing's Approach
Free servers to build Big Data Systems on: Bing's  Approach Free servers to build Big Data Systems on: Bing's  Approach
Free servers to build Big Data Systems on: Bing's Approach
Karthik Murugesan
 
Microsoft AI Platform - AETHER Introduction
Microsoft AI Platform - AETHER IntroductionMicrosoft AI Platform - AETHER Introduction
Microsoft AI Platform - AETHER Introduction
Karthik Murugesan
 
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
BIng NLP Expert - Dl summer-school-2017.-jianfeng-gao.v2
Karthik Murugesan
 
Lyft data Platform - 2019 slides
Lyft data Platform - 2019 slidesLyft data Platform - 2019 slides
Lyft data Platform - 2019 slides
Karthik Murugesan
 
The Evolution of Spotify Home Architecture - Qcon 2019
The Evolution of Spotify Home Architecture - Qcon 2019The Evolution of Spotify Home Architecture - Qcon 2019
The Evolution of Spotify Home Architecture - Qcon 2019
Karthik Murugesan
 
Unifying Twitter around a single ML platform - Twitter AI Platform 2019
Unifying Twitter around a single ML platform  - Twitter AI Platform 2019Unifying Twitter around a single ML platform  - Twitter AI Platform 2019
Unifying Twitter around a single ML platform - Twitter AI Platform 2019
Karthik Murugesan
 
The magic behind your Lyft ride prices: A case study on machine learning and ...
The magic behind your Lyft ride prices: A case study on machine learning and ...The magic behind your Lyft ride prices: A case study on machine learning and ...
The magic behind your Lyft ride prices: A case study on machine learning and ...
Karthik Murugesan
 
The journey toward a self-service data platform at Netflix - sf 2019
The journey toward a self-service data platform at Netflix - sf 2019The journey toward a self-service data platform at Netflix - sf 2019
The journey toward a self-service data platform at Netflix - sf 2019
Karthik Murugesan
 
2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber
2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber
2019 Slides - Michelangelo Palette: A Feature Engineering Platform at Uber
Karthik Murugesan
 
Developing a ML model using TF Estimator
Developing a ML model using TF EstimatorDeveloping a ML model using TF Estimator
Developing a ML model using TF Estimator
Karthik Murugesan
 
Production Model Deployment - StitchFix - 2018
Production Model Deployment - StitchFix - 2018Production Model Deployment - StitchFix - 2018
Production Model Deployment - StitchFix - 2018
Karthik Murugesan
 
Netflix factstore for recommendations - 2018
Netflix factstore  for recommendations - 2018Netflix factstore  for recommendations - 2018
Netflix factstore for recommendations - 2018
Karthik Murugesan
 
Trends in Music Recommendations 2018
Trends in Music Recommendations 2018Trends in Music Recommendations 2018
Trends in Music Recommendations 2018
Karthik Murugesan
 
Netflix Ads Personalization Solution - 2017
Netflix Ads Personalization Solution - 2017Netflix Ads Personalization Solution - 2017
Netflix Ads Personalization Solution - 2017
Karthik Murugesan
 
Spotify Machine Learning Solution for Music Discovery
Spotify Machine Learning Solution for Music DiscoverySpotify Machine Learning Solution for Music Discovery
Spotify Machine Learning Solution for Music Discovery
Karthik Murugesan
 
AirBNB - Zipline: Airbnb’s Machine Learning Data Management Platform
AirBNB - Zipline: Airbnb’s Machine Learning Data Management Platform AirBNB - Zipline: Airbnb’s Machine Learning Data Management Platform
AirBNB - Zipline: Airbnb’s Machine Learning Data Management Platform
Karthik Murugesan
 
Ad

Recently uploaded (20)

Q1 2025 Dropbox Earnings and Investor Presentation
Q1 2025 Dropbox Earnings and Investor PresentationQ1 2025 Dropbox Earnings and Investor Presentation
Q1 2025 Dropbox Earnings and Investor Presentation
Dropbox
 
Webinar - Top 5 Backup Mistakes MSPs and Businesses Make .pptx
Webinar - Top 5 Backup Mistakes MSPs and Businesses Make   .pptxWebinar - Top 5 Backup Mistakes MSPs and Businesses Make   .pptx
Webinar - Top 5 Backup Mistakes MSPs and Businesses Make .pptx
MSP360
 
Hybridize Functions: A Tool for Automatically Refactoring Imperative Deep Lea...
Hybridize Functions: A Tool for Automatically Refactoring Imperative Deep Lea...Hybridize Functions: A Tool for Automatically Refactoring Imperative Deep Lea...
Hybridize Functions: A Tool for Automatically Refactoring Imperative Deep Lea...
Raffi Khatchadourian
 
AI Agents at Work: UiPath, Maestro & the Future of Documents
AI Agents at Work: UiPath, Maestro & the Future of DocumentsAI Agents at Work: UiPath, Maestro & the Future of Documents
AI Agents at Work: UiPath, Maestro & the Future of Documents
UiPathCommunity
 
Cybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and MitigationCybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and Mitigation
VICTOR MAESTRE RAMIREZ
 
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Raffi Khatchadourian
 
Build With AI - In Person Session Slides.pdf
Build With AI - In Person Session Slides.pdfBuild With AI - In Person Session Slides.pdf
Build With AI - In Person Session Slides.pdf
Google Developer Group - Harare
 
Optima Cyber - Maritime Cyber Security - MSSP Services - Manolis Sfakianakis ...
Optima Cyber - Maritime Cyber Security - MSSP Services - Manolis Sfakianakis ...Optima Cyber - Maritime Cyber Security - MSSP Services - Manolis Sfakianakis ...
Optima Cyber - Maritime Cyber Security - MSSP Services - Manolis Sfakianakis ...
Mike Mingos
 
GDG Cloud Southlake #42: Suresh Mathew: Autonomous Resource Optimization: How...
GDG Cloud Southlake #42: Suresh Mathew: Autonomous Resource Optimization: How...GDG Cloud Southlake #42: Suresh Mathew: Autonomous Resource Optimization: How...
GDG Cloud Southlake #42: Suresh Mathew: Autonomous Resource Optimization: How...
James Anderson
 
Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?
Eric Torreborre
 
Unlocking Generative AI in your Web Apps
Unlocking Generative AI in your Web AppsUnlocking Generative AI in your Web Apps
Unlocking Generative AI in your Web Apps
Maximiliano Firtman
 
Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...
Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...
Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...
Safe Software
 
Slack like a pro: strategies for 10x engineering teams
Slack like a pro: strategies for 10x engineering teamsSlack like a pro: strategies for 10x engineering teams
Slack like a pro: strategies for 10x engineering teams
Nacho Cougil
 
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
Lorenzo Miniero
 
Zilliz Cloud Monthly Technical Review: May 2025
Zilliz Cloud Monthly Technical Review: May 2025Zilliz Cloud Monthly Technical Review: May 2025
Zilliz Cloud Monthly Technical Review: May 2025
Zilliz
 
Jignesh Shah - The Innovator and Czar of Exchanges
Jignesh Shah - The Innovator and Czar of ExchangesJignesh Shah - The Innovator and Czar of Exchanges
Jignesh Shah - The Innovator and Czar of Exchanges
Jignesh Shah Innovator
 
Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Everything You Need to Know About Agentforce? (Put AI Agents to Work)Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Cyntexa
 
AI You Can Trust: The Critical Role of Governance and Quality.pdf
AI You Can Trust: The Critical Role of Governance and Quality.pdfAI You Can Trust: The Critical Role of Governance and Quality.pdf
AI You Can Trust: The Critical Role of Governance and Quality.pdf
Precisely
 
AsyncAPI v3 : Streamlining Event-Driven API Design
AsyncAPI v3 : Streamlining Event-Driven API DesignAsyncAPI v3 : Streamlining Event-Driven API Design
AsyncAPI v3 : Streamlining Event-Driven API Design
leonid54
 
UiPath Agentic Automation: Community Developer Opportunities
UiPath Agentic Automation: Community Developer OpportunitiesUiPath Agentic Automation: Community Developer Opportunities
UiPath Agentic Automation: Community Developer Opportunities
DianaGray10
 
Q1 2025 Dropbox Earnings and Investor Presentation
Q1 2025 Dropbox Earnings and Investor PresentationQ1 2025 Dropbox Earnings and Investor Presentation
Q1 2025 Dropbox Earnings and Investor Presentation
Dropbox
 
Webinar - Top 5 Backup Mistakes MSPs and Businesses Make .pptx
Webinar - Top 5 Backup Mistakes MSPs and Businesses Make   .pptxWebinar - Top 5 Backup Mistakes MSPs and Businesses Make   .pptx
Webinar - Top 5 Backup Mistakes MSPs and Businesses Make .pptx
MSP360
 
Hybridize Functions: A Tool for Automatically Refactoring Imperative Deep Lea...
Hybridize Functions: A Tool for Automatically Refactoring Imperative Deep Lea...Hybridize Functions: A Tool for Automatically Refactoring Imperative Deep Lea...
Hybridize Functions: A Tool for Automatically Refactoring Imperative Deep Lea...
Raffi Khatchadourian
 
AI Agents at Work: UiPath, Maestro & the Future of Documents
AI Agents at Work: UiPath, Maestro & the Future of DocumentsAI Agents at Work: UiPath, Maestro & the Future of Documents
AI Agents at Work: UiPath, Maestro & the Future of Documents
UiPathCommunity
 
Cybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and MitigationCybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and Mitigation
VICTOR MAESTRE RAMIREZ
 
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Raffi Khatchadourian
 
Optima Cyber - Maritime Cyber Security - MSSP Services - Manolis Sfakianakis ...
Optima Cyber - Maritime Cyber Security - MSSP Services - Manolis Sfakianakis ...Optima Cyber - Maritime Cyber Security - MSSP Services - Manolis Sfakianakis ...
Optima Cyber - Maritime Cyber Security - MSSP Services - Manolis Sfakianakis ...
Mike Mingos
 
GDG Cloud Southlake #42: Suresh Mathew: Autonomous Resource Optimization: How...
GDG Cloud Southlake #42: Suresh Mathew: Autonomous Resource Optimization: How...GDG Cloud Southlake #42: Suresh Mathew: Autonomous Resource Optimization: How...
GDG Cloud Southlake #42: Suresh Mathew: Autonomous Resource Optimization: How...
James Anderson
 
Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?Shoehorning dependency injection into a FP language, what does it take?
Shoehorning dependency injection into a FP language, what does it take?
Eric Torreborre
 
Unlocking Generative AI in your Web Apps
Unlocking Generative AI in your Web AppsUnlocking Generative AI in your Web Apps
Unlocking Generative AI in your Web Apps
Maximiliano Firtman
 
Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...
Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...
Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...
Safe Software
 
Slack like a pro: strategies for 10x engineering teams
Slack like a pro: strategies for 10x engineering teamsSlack like a pro: strategies for 10x engineering teams
Slack like a pro: strategies for 10x engineering teams
Nacho Cougil
 
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
RTP Over QUIC: An Interesting Opportunity Or Wasted Time?
Lorenzo Miniero
 
Zilliz Cloud Monthly Technical Review: May 2025
Zilliz Cloud Monthly Technical Review: May 2025Zilliz Cloud Monthly Technical Review: May 2025
Zilliz Cloud Monthly Technical Review: May 2025
Zilliz
 
Jignesh Shah - The Innovator and Czar of Exchanges
Jignesh Shah - The Innovator and Czar of ExchangesJignesh Shah - The Innovator and Czar of Exchanges
Jignesh Shah - The Innovator and Czar of Exchanges
Jignesh Shah Innovator
 
Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Everything You Need to Know About Agentforce? (Put AI Agents to Work)Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Everything You Need to Know About Agentforce? (Put AI Agents to Work)
Cyntexa
 
AI You Can Trust: The Critical Role of Governance and Quality.pdf
AI You Can Trust: The Critical Role of Governance and Quality.pdfAI You Can Trust: The Critical Role of Governance and Quality.pdf
AI You Can Trust: The Critical Role of Governance and Quality.pdf
Precisely
 
AsyncAPI v3 : Streamlining Event-Driven API Design
AsyncAPI v3 : Streamlining Event-Driven API DesignAsyncAPI v3 : Streamlining Event-Driven API Design
AsyncAPI v3 : Streamlining Event-Driven API Design
leonid54
 
UiPath Agentic Automation: Community Developer Opportunities
UiPath Agentic Automation: Community Developer OpportunitiesUiPath Agentic Automation: Community Developer Opportunities
UiPath Agentic Automation: Community Developer Opportunities
DianaGray10
 

Facebook ML Infrastructure - 2018 slides

  • 1. ML at Facebook: An Infrastructure View Yangqing Jia Director, Facebook AI Infra
  • 2. (* This is not Facebook AI Infra)
  • 4. The Machine Learning Moore’s Law? 0 500 1000 1500 2000 2500 2001 2003 2005 2007 2009 2011 2013 2015 2017 NumberofCitations
  • 5. Machine Learning Execution Flow Data Features Training Evaluation Inference Offline Online
  • 6. Machine Learning Execution Flow Data Features Training Eval Inference PredictionsModel
  • 7. It’s an infrastructure challenge Data Offline Training Online Inference Storage Challenges! Network Challenges! Compute Challenges!
  • 8. Let’s Answer Some Pressing Questions • Does Facebook leverage machine learning? • Does Facebook design hardware? • Does Facebook design hardware for machine learning? • What platforms and frameworks exist; can the community use them? • What assumptions break when supporting 2B people?
  • 9. Does Facebook Use Machine Learning? News Feed Ads Search Language Translation Sigma Facer Lumos Speech Recognition Content Understanding Classification & Ranking Services
  • 10. What ML Models Do We Leverage? GBDTSVM MLP CNN RNN Support Vector Machines Gradient- Boosted Decision Trees Multi-Layer Perceptron Convolutional Neural Nets Recurrent Neural Nets Facer Sigma News Feed Ads Search Sigma Facer Lumos Language Translation Content Understanding Speech Rec
  • 11. How Often Do We Train Models? minutes hours days months
  • 12. How Long Does Training Take? seconds minutes hours days
  • 13. How Much Compute Does Inference Consume? 100X 10x 1x
  • 14. Does Facebook Design Hardware? • Yes! Since 2010! All designs released through open compute! • Facebook Server Design Philosophy • Identify a small number of major services with unique resource requirements • Design servers for those major services One major server design versus A B D C Customized, dedicated hardwareGlobal shared pool
  • 15. Does Facebook Design Hardware? Yosemite/Twin Lakes: For the web tier and other “stateless services” Open Compute “Sleds” are 2U x 3 Across in an Open Compute Rack Server Card Chassis 4-way Shared NIC SSD Boot Rack
  • 16. Does Facebook Design Hardware? Tioga Pass: For compute or memory- intensive workloads: Bryce Canyon: For storage-heavy workloads:
  • 17. Does Facebook Design Hardware for AI/ML? • HP SL270s (2013): learning serviceability, thermal, perf, reliability, cluster mgmt. • Big Sur (M40) -> Big Basin (P100) -> Big Basin Volta (V100) Big Sur Integrated Compute 8 Nvidia M40 GPUs Big Basin JBOG Design (CPU headnode) 8 Nvidia P100 / V100 GPUs
  • 18. Putting it Together Data Features Training Evaluation Inference Bryce Canyon Big Basin, Tioga Pass Tioga Pass Twin Lakes
  • 19. Let’s Answer Some Pressing Questions • Does Facebook leverage machine learning? • Does Facebook design hardware? • Does Facebook design hardware for machine learning? • What platforms and frameworks exist; can the community use them? • What assumptions break when supporting 2B people?
  • 20. Facebook AI Frameworks Infra Efficiency for Production • Stability • Scale & Speed • Data Integration • Relatively Fixed Developer Efficiency for Research • Flexible • Fast Iteration • Highly Debuggable • Less Robust
  • 21. Facebook AI Frameworks Infra Efficiency for Production • Stability • Scale & Speed • Data Integration • Relatively Fixed Developer Efficiency for Research • Flexible • Fast Iteration • Highly Debuggable • Less Robust OpenGL ESNNPACK Metal™/ MPSCNN Qualcomm Snapdragon NPE CUDA/cuDNN
  • 25. Deep Learning Frameworks Framework backends Vendor and numeric libraries Apple CoreML Nvidia TensorRT Intel/Nervana ngraph Qualcom SNPE … O(n2) pairs
  • 26. Shared model and operator representation Open Neural Network Exchange Framework backends Vendor and numeric libraries Apple CoreML Nvidia TensorRT Intel/Nervana ngraph Qualcom SNPE … From O(n2) to O(n) pairs
  • 27. Putting it Together Data Features Training Evaluation Inference Bryce Canyon Big Basin, Tioga Pass Tioga Pass Twin Lakes Data APIs Caffe2, PyTorch, ONNX C2 Predictor
  • 28. Facebook AI Ecosystem Frameworks: Core ML Software Caffe2 / PyTorch / ONNX Platforms: Workflow Management, Deployment FB Learner Infrastructure: Servers, Storage, Network Strategy Open Compute Project
  • 29. FB Learner Platform FB Learner Feature Store FB Learner Flow FB Learner Predictor • AI Workflow • Model Management and Deployment
  • 30. FBLearner in ML Data Features Training Eval Inference PredictionsModel FBLearner Flow FBLearner Predictor FBLearner Feature Store CPU+GPUCPU CPU
  • 31. Putting it All Together Data Features Training Evaluation Inference Bryce Canyon Big Basin, Tioga Pass Tioga Pass Twin Lakes Feature Store FBLearner Flow FBL Predictor Data APIs Caffe2, PyTorch, ONNX C2 Predictor
  • 32. 2 Billion People What changes when you scale to over
  • 33. Scaling Challenges / Opportunities Lots of Data Lots of Compute
  • 35. Santa Clara, California Ashburn, Virginia Prineville, Oregon Forest City, North Carolina Lulea, Sweden Altoona, Iowa Fort Worth, Texas Clonee, Ireland Los Lunas, New Mexico Odense, Denmark New Albany, Ohio Papillion, Nebraska
  • 36. Key Takeaways Facebook AI Lots of Data Wide variety of models Full stack challenges Global scale
  • 37. Kim Hazelwood Sarah Bird David Brooks Soumith Chintala Utku Diril Dmytro Dzhulgakov Mohamed Fawzy Bill Jia Yangqing Jia Aditya Kalro James Law Kevin Lee Jason Lu Pieter Noordhuis Misha Smelyanskiy Liang Xiong Xiaodong Wang
  翻译: