TensorFlow Extended (TFX)
An End-to-End ML Platform
Konstantinos (Gus) Katsiapis
Google
Ahmet Altay
Google
Data Ingestion
Data Analysis + Validation
Feature Engineering
Trainer
Model Evaluation and Validation
Serving
Logging
Shared Utilities for Garbage Collection, Data Access Controls
Pipeline Storage
Tuner
Shared Configuration Framework and Job Orchestration
Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization
TensorFlow Extended (TFX) is an end-to-end ML pipeline for TensorFlow
TFX powers our most important bets and products...
[Figure: product logos, grouped as "Major Products" and "AlphaBets"]
… and some of our most important partners.
What is Apache Beam?
- A unified batch and stream distributed processing API
- A set of SDK frontends: Java, Python, Go, Scala, SQL
- A set of Runners which can execute Beam jobs on various backends: Local, Apache Flink, Apache Spark, Apache Gearpump, Apache Samza, Apache Hadoop, Google Cloud Dataflow, …
Building Components out of Libraries
Libraries: Data Ingestion, TensorFlow Data Validation, TensorFlow Transform, Estimator Model, TensorFlow Model Analysis, Honoring Validation Outcomes, TensorFlow Serving
Components: ExampleGen, StatisticsGen, SchemaGen, Example Validator, Transform, Trainer, Evaluator, Model Validator, Pusher, Model Server
Powered by Beam
What makes a Component
Example: Model Validator
- Packaged binary or container
- Well defined inputs and outputs
  Inputs: Last Validated Model, New (Candidate) Model
  Output: Validation Outcome
- Well defined configuration (Config)
- Context (Metadata Store)
[Figure: Trainer -> New Model -> Model Validator (with Last Validated Model, Config, Metadata Store) -> Validation Outcome -> Pusher, which receives the New (Candidate) Model and the Validation Outcome]
Deployment targets:
TensorFlow Serving
TensorFlow Lite
TensorFlow JS
TensorFlow Hub
Metadata Store? That’s new
Task-Aware Pipelines
[Figure: Input Data -> Transform -> Transformed Data -> Trainer -> Trained Models -> Deployment]
Task- and Data-Aware Pipelines
[Figure: repeated Transform and Trainer steps over Training Data, backed by Pipeline + Metadata Storage]
What’s in the Metadata Store?
Type definitions of Artifacts and their Properties
E.g., Models, Data, Evaluation Metrics
Execution Records (Runs) of Components
E.g., Runtime Configuration, Inputs + Outputs
Lineage Tracking Across All Executions
E.g., to recurse back to all inputs of a specific artifact
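The three kinds of records above (artifacts, execution records, lineage) can be pictured with a small in-memory sketch. This is only an illustration under assumptions: `ToyMetadataStore`, its methods, and the example URIs are hypothetical names invented here and are not the API of the actual metadata store.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    id: int
    type_name: str  # e.g. "Data", "Model", "EvaluationMetrics"
    properties: dict = field(default_factory=dict)


@dataclass
class Execution:
    id: int
    component: str  # e.g. "Trainer"
    config: dict    # runtime configuration of this run
    input_ids: list
    output_ids: list


class ToyMetadataStore:
    """Hypothetical stand-in for a metadata store (illustration only)."""

    def __init__(self):
        self.artifacts = {}
        self.executions = []

    def put_artifact(self, type_name, **properties):
        artifact = Artifact(len(self.artifacts) + 1, type_name, properties)
        self.artifacts[artifact.id] = artifact
        return artifact.id

    def record_execution(self, component, config, input_ids, output_ids):
        execution = Execution(len(self.executions) + 1, component, config,
                              list(input_ids), list(output_ids))
        self.executions.append(execution)
        return execution.id

    def lineage(self, artifact_id):
        """Recurse back to every artifact the given artifact was derived from."""
        ancestors = set()
        stack = [artifact_id]
        while stack:
            current = stack.pop()
            for execution in self.executions:
                if current in execution.output_ids:
                    for parent in execution.input_ids:
                        if parent not in ancestors:
                            ancestors.add(parent)
                            stack.append(parent)
        return ancestors


store = ToyMetadataStore()
raw = store.put_artifact("Data", uri="/data/raw")                  # made-up URI
transformed = store.put_artifact("Data", uri="/data/transformed")  # made-up URI
model = store.put_artifact("Model", uri="/models/1")               # made-up URI
store.record_execution("Transform", {}, [raw], [transformed])
store.record_execution("Trainer", {"train_steps": 10000}, [transformed], [model])

# All inputs, direct and indirect, that led to the model.
print(sorted(store.lineage(model)))
```

Because every execution records both its inputs and its outputs, the lineage query needs no extra bookkeeping: it simply walks the execution records backwards.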
List all training runs and attributes
Visualize lineage of a specific model (model artifact that was created)
Visualize data a model was trained on
Visualize sliced eval metrics associated with a model
Launch TensorBoard for a specific model run
Compare multiple model runs
Compare data statistics for multiple models
Examples of Metadata-Powered Functionality
Use-cases enabled by lineage tracking
- Compare previous model runs
- Carry-over state from previous models
- Re-use previously computed outputs
How do we orchestrate TFX?
[Figure: a chain of Components, each consisting of a Driver, an Executor, and a Publisher, all connected to the Metadata Store]
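One way to read the Driver / Executor / Publisher split is sketched below: the driver consults the metadata store before any work happens, the executor does the work, and the publisher records the result. The sketch also shows how this enables re-use of previously computed outputs. Everything here is an assumption made for illustration: `ToyComponent`, the dict-based store, and the hashing scheme are hypothetical, not TFX code.

```python
import hashlib
import json


class ToyComponent:
    """Hypothetical component wrapper: driver -> executor -> publisher."""

    def __init__(self, name, executor, store):
        self.name = name
        self.executor = executor  # callable(inputs, config) -> outputs
        self.store = store        # dict standing in for the metadata store
        self.executor_calls = 0

    def _driver(self, inputs, config):
        # Build a key from the component name, its inputs, and its config,
        # then look for a previous execution with the same key.
        material = json.dumps([self.name, inputs, config], sort_keys=True)
        key = hashlib.sha256(material.encode("utf-8")).hexdigest()
        return key, self.store.get(key)

    def _publisher(self, key, outputs):
        # Record the outputs so later runs (and other components) can find them.
        self.store[key] = outputs

    def run(self, inputs, config):
        key, cached = self._driver(inputs, config)
        if cached is not None:
            return cached  # re-use previously computed outputs
        self.executor_calls += 1
        outputs = self.executor(inputs, config)
        self._publisher(key, outputs)
        return outputs


def scale_executor(inputs, config):
    # The "real work": here just a multiplication, for illustration.
    return {"values": [v * config["factor"] for v in inputs["values"]]}


metadata_store = {}
scaler = ToyComponent("Scaler", scale_executor, metadata_store)
first = scaler.run({"values": [1, 2, 3]}, {"factor": 2})
second = scaler.run({"values": [1, 2, 3]}, {"factor": 2})  # served from the store
print(first, scaler.executor_calls)
```

The design point is that the orchestrator only has to call `run`; whether the executor actually runs is decided from recorded metadata, not from the orchestrator's own state.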
Executors do the work
[Figure: Driver -> Executor (Transform, etc.) -> Publisher, with the Metadata Store; the executor runs on Beam (Spark, Dataflow)]
class Executor(base_executor.BaseExecutor):
  """Generic TFX statsgen executor."""
  ...
  def Do(...) -> None:
    """Computes stats for each split of input using tensorflow_data_validation.
    ...
    """
    with beam.Pipeline(argv=self._get_beam_pipeline_args()) as p:
      for split, instance in split_to_instance.items():
        ...
        output_path = os.path.join(output_uri, _DEFAULT_FILE_NAME)
        _ = (
            p
            | 'ReadData.' + split >> beam.io.ReadFromTFRecord(file_pattern=input_uri)
            | 'DecodeData.' + split >> tf_example_decoder.DecodeTFExample()
            | 'GenerateStatistics.' + split >> stats_api.GenerateStatistics(stats_options)
            | 'WriteStatsOutput.' + split >> beam.io.WriteToTFRecord(
                output_path,
                shard_name_template='',
                coder=beam.coders.ProtoCoder(
                    statistics_pb2.DatasetFeatureStatisticsList)))
        tf.logging.info('Statistics written to {}.'.format(output_uri))
Executors do the work
[Figure: Driver -> Executor (Trainer) -> Publisher, with the Metadata Store; the executor runs on TensorFlow]
Executors do the work
[Figure: Driver -> Executor (Pusher, etc.) -> Publisher, with the Metadata Store]
TFX Config
def _create_pipeline():
  """Implements the chicago taxi pipeline with TFX."""
  examples = csv_input(os.path.join(data_root, 'simple'))
  example_gen = CsvExampleGen(input_base=examples)
  statistics_gen = StatisticsGen(input_data=...)
  infer_schema = SchemaGen(stats=...)
  validate_stats = ExampleValidator(stats=..., schema=...)

  # Performs transformations and feature engineering in training and serving
  transform = Transform(
      input_data=example_gen.outputs.examples,
      schema=infer_schema.outputs.output,
      module_file=_taxi_module_file)
  trainer = Trainer(...)
  model_analyzer = Evaluator(examples=..., model_exports=...)
  model_validator = ModelValidator(examples=..., model=...)
  pusher = Pusher(model_export=..., model_blessing=..., serving_model_dir=...)

  return [example_gen, statistics_gen, infer_schema, validate_stats, transform, trainer,
          model_analyzer, model_validator, pusher]

pipeline = AirflowDAGRunner(_airflow_config).run(_create_pipeline())
… Back to orchestration
Bring your very own favorite orchestrator
[Figure: the same Components (Driver, Executor, Publisher) with TFX Config and the Metadata Store, running on the Airflow Runtime, the Kubeflow Runtime, or your own runtime...]
Examples of orchestrated TFX pipelines
Airflow, Kubeflow Pipelines
TFX: Putting it all together.
Overview
Data Ingestion
Data Analysis & Validation
Data Transformation
ExampleGen
Component: ExampleGen
Inputs and Outputs
Raw Data (CSV, TF Record) -> ExampleGen -> Split TF Record Data (Training, Eval)
Configuration
examples = csv_input(os.path.join(data_root, 'simple'))
example_gen = CsvExampleGen(input_base=examples)
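The slides do not say how ExampleGen decides which records go to Training and which to Eval. A common way to get a stable split is to hash each record into a bucket, sketched below. The 2:1 bucket ratio, the `assign_split` helper, and the use of CRC32 are assumptions chosen for illustration, not a description of the component's internals.

```python
import zlib


def assign_split(record: bytes, eval_buckets: int = 1, total_buckets: int = 3) -> str:
    """Deterministically map a serialized record to "train" or "eval".

    Hypothetical helper: hashing the record keeps the assignment stable
    across reruns, so the same record always lands in the same split.
    """
    bucket = zlib.crc32(record) % total_buckets
    return "eval" if bucket < eval_buckets else "train"


# Made-up records standing in for rows of the raw CSV data.
records = ["trip-{}".format(i).encode("utf-8") for i in range(1000)]
splits = {"train": [], "eval": []}
for record in records:
    splits[assign_split(record)].append(record)

print(len(splits["train"]), len(splits["eval"]))
```

A hash-based split needs no shared state between workers, which fits a data-parallel job where records are processed independently.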
Data Analysis & Validation
StatisticsGen
Component: StatisticsGen
Inputs and Outputs
Data (from ExampleGen) -> StatisticsGen -> Statistics
Data
● Training
● Eval
● Serving logs (for skew detection)
Statistics
● Captures shape of data
● Visualization highlights unusual stats
● Overlay helps with comparison
Configuration
statistics_gen = StatisticsGen(input_data=example_gen.outputs.examples)
Visualization
Why are my tip predictions bad in the morning hours?
SchemaGen
Component: SchemaGen
Inputs and Outputs
Statistics (from StatisticsGen) -> SchemaGen -> Schema
Schema
● High-level description of the data
○ Expected features
○ Expected value domains
○ Expected constraints
○ and much more!
● Codifies expectations of “good” data
● Initially inferred, then user-curated
Configuration
infer_schema = SchemaGen(stats=statistics_gen.outputs.output)
Visualization
What are expected values for payment types?
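A question like this is what an inferred schema answers. The sketch below infers a tiny schema (type, presence, and value domain for string features) directly from example rows. It is a simplification under assumptions: the real component works from statistics rather than raw rows, and `infer_schema`, the dict layout, and the sample rows are hypothetical.

```python
def infer_schema(examples, max_domain_size=10):
    """Infer a minimal schema from a list of feature dicts (illustration only)."""
    schema = {}
    feature_names = sorted({name for example in examples for name in example})
    for name in feature_names:
        values = [example[name] for example in examples if name in example]
        feature = {
            "type": type(values[0]).__name__,
            # A feature is "required" if every example carries it.
            "required": len(values) == len(examples),
        }
        if isinstance(values[0], str):
            distinct = sorted(set(values))
            # Only small sets of strings are treated as a value domain.
            if len(distinct) <= max_domain_size:
                feature["domain"] = distinct
        schema[name] = feature
    return schema


# Made-up taxi rows for illustration.
examples = [
    {"payment_type": "Cash", "fare": 12.5},
    {"payment_type": "Credit Card", "fare": 7.0, "tips": 1.5},
    {"payment_type": "Cash", "fare": 30.25},
]
schema = infer_schema(examples)
print(schema["payment_type"])
```

An inferred schema like this is only a starting point; as the slide says, it is initially inferred and then curated by the user.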
Example Validator
Component: ExampleValidator
Inputs and Outputs
Statistics (from StatisticsGen) + Schema (from SchemaGen) -> ExampleValidator -> Anomalies Report
Anomalies Report
● Missing features
● Wrong feature valency
● Training/serving skew
● Data distribution drift
● ...
Configuration
validate_stats = ExampleValidator(
    stats=statistics_gen.outputs.output,
    schema=infer_schema.outputs.output)
Visualization
Is this new taxi company name a typo or a new company?
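The taxi-company question is a domain anomaly: a value shows up that the schema does not list. A minimal sketch of that check follows, covering only two of the anomaly kinds named above (missing features and unexpected values). The `find_anomalies` helper, the schema layout, and the company names are hypothetical, and the sketch checks rows directly whereas the component compares statistics against the schema.

```python
def find_anomalies(examples, schema):
    """Return (row index, feature, description) for each schema violation."""
    anomalies = []
    for index, example in enumerate(examples):
        for name, feature in schema.items():
            if name not in example:
                if feature.get("required", False):
                    anomalies.append((index, name, "missing feature"))
                continue
            domain = feature.get("domain")
            if domain is not None and example[name] not in domain:
                anomalies.append(
                    (index, name, "unexpected value: " + repr(example[name])))
    return anomalies


# Hypothetical schema and rows; the company names are made up.
schema = {
    "company": {"required": True, "domain": ["Blue Cab", "Yellow Cab"]},
    "fare": {"required": True},
}
new_examples = [
    {"company": "Blue Cab", "fare": 9.0},
    {"company": "Yelow Cab", "fare": 11.0},  # typo, or a new company?
    {"company": "Yellow Cab"},                # fare is missing
]
anomalies = find_anomalies(new_examples, schema)
for anomaly in anomalies:
    print(anomaly)
```

The check can only flag the unseen name; deciding whether to fix the data or extend the schema's domain remains a human decision.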
Transform
Using tf.Transform for feature transformations.
[Figure: the same tf.Transform feature transformations used in Training and Serving]
Component: Transform
Inputs and Outputs
Data (from ExampleGen) + Schema (from SchemaGen) + Code -> Transform -> Transform Graph + Transformed Data -> Trainer
● User-provided transform code (TF Transform)
● Schema for parsing
Transform Graph
● Applied at training time
● Embedded in serving graph
Transformed Data
● Optional, for performance optimization
Configuration
transform = Transform(
    input_data=example_gen.outputs.examples,
    schema=infer_schema.outputs.output,
    module_file=taxi_module_file)
Code
for key in _DENSE_FLOAT_FEATURE_KEYS:
  outputs[_transformed_name(key)] = transform.scale_to_z_score(
      _fill_in_missing(inputs[key]))
# ...
outputs[_transformed_name(_LABEL_KEY)] = tf.where(
    tf.is_nan(taxi_fare),
    tf.cast(tf.zeros_like(taxi_fare), tf.int64),
    # Test if the tip was > 20% of the fare.
    tf.cast(
        tf.greater(tips, tf.multiply(taxi_fare, tf.constant(0.2))), tf.int64))
# ...
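To make the two transformations in the Code slide concrete, here is a plain-Python sketch of the arithmetic they perform. It assumes z-score scaling uses the population standard deviation and that the label is 1 when the tip exceeds 20% of the fare (0 when the fare is NaN), as the slide's comment suggests. The function names mirror the slide but are re-implemented here for illustration and are not the tf.Transform API.

```python
import math


def scale_to_z_score(values):
    """Scale values to mean 0 and standard deviation 1.

    The mean and standard deviation need a full pass over the dataset
    before any single value can be scaled.
    """
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(variance)
    return [(v - mean) / std if std > 0 else 0.0 for v in values]


def tip_label(fare, tips):
    """Return 1 if the tip was more than 20% of the fare, else 0."""
    if math.isnan(fare):
        return 0
    return int(tips > fare * 0.2)


fares = [2.0, 4.0, 6.0]  # made-up fares
z_scores = scale_to_z_score(fares)
labels = [tip_label(10.0, 2.5), tip_label(10.0, 1.0), tip_label(float("nan"), 5.0)]
print(z_scores, labels)
```

The full-pass requirement is why the transform is computed as a data-parallel job and then frozen into a graph: the statistics learned at training time are reused unchanged at serving time.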
Trainer
Component: Trainer
Inputs and Outputs
Data + Transform Graph (from Transform) + Schema (from SchemaGen) + Code -> Trainer -> Model(s) -> Evaluator, Model Validator, Pusher
● User-provided training code (TensorFlow)
● Optionally, transformed data
Highlight: SavedModel Format
[Figure labels: Train, Eval, and Inference Graphs; SignatureDef; Eval Metadata; SignatureDef; consumers: TensorFlow Serving, TensorFlow Model Analysis]
Configuration
trainer = Trainer(
    module_file=taxi_module_file,
    transformed_examples=transform.outputs.transformed_examples,
    schema=infer_schema.outputs.output,
    transform_output=transform.outputs.transform_output,
    train_steps=10000,
    eval_steps=5000,
    warm_starting=True)
Code: Just TensorFlow :)
Model Analysis & Validation
Evaluator
Component: Evaluator
Inputs and Outputs
Data (from ExampleGen) + Model (from Trainer) -> Evaluator -> Evaluation Metrics
● Evaluation split of data
● Eval spec for slicing of metrics
Configuration
model_analyzer = Evaluator(
    examples=example_gen.outputs.examples,
    eval_spec=taxi_eval_spec,
    model_exports=trainer.outputs.output)
Visualization
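Slicing is what turns a single overall metric into an answer to questions such as why predictions are worse for certain hours. The sketch below computes accuracy per slice in plain Python. The `sliced_accuracy` helper, the feature name `trip_start_hour`, and the rows are assumptions for illustration and do not reflect the eval spec format.

```python
from collections import defaultdict


def sliced_accuracy(examples, slice_key):
    """Compute accuracy separately for each value of slice_key."""
    totals = defaultdict(lambda: [0, 0])  # slice value -> [correct, count]
    for example in examples:
        bucket = totals[example[slice_key]]
        bucket[0] += int(example["label"] == example["prediction"])
        bucket[1] += 1
    return {value: correct / count for value, (correct, count) in totals.items()}


# Made-up evaluation rows: predictions are worse in the morning slice.
eval_examples = [
    {"trip_start_hour": 8, "label": 1, "prediction": 0},
    {"trip_start_hour": 8, "label": 0, "prediction": 0},
    {"trip_start_hour": 18, "label": 1, "prediction": 1},
    {"trip_start_hour": 18, "label": 0, "prediction": 0},
]
by_hour = sliced_accuracy(eval_examples, "trip_start_hour")
overall = sum(e["label"] == e["prediction"] for e in eval_examples) / len(eval_examples)
print(by_hour, overall)
```

In this toy data the overall accuracy hides that one slice performs no better than chance, which is the kind of problem sliced metrics are meant to surface.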
Model Validator
Component: ModelValidator
Inputs and Outputs
Data (from ExampleGen) + Model (x2) (from Trainer) -> ModelValidator -> Validation Outcome
● Evaluation split of data
● Last validated model
● New candidate model
Configuration
model_validator = ModelValidator(
    examples=example_gen.outputs.examples,
    model=trainer.outputs.output,
    eval_spec=taxi_mv_spec)
● Configuration options
○ Validate using current eval data
○ “Next-day eval”, validate using unseen data
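The slides do not spell out the rule that turns two models into a Validation Outcome. One plausible reading is a metric comparison between the new candidate and the last validated model, sketched below. The `validate_model` function, the choice of accuracy, the threshold, and the decision to accept the very first model are all assumptions made for illustration.

```python
def validate_model(candidate_metrics, baseline_metrics,
                   metric="accuracy", min_improvement=0.0):
    """Return True if the candidate should be accepted (hypothetical rule).

    With no previously validated model there is nothing to compare
    against, so this sketch accepts the candidate.
    """
    if baseline_metrics is None:
        return True
    return candidate_metrics[metric] >= baseline_metrics[metric] + min_improvement


last_validated = {"accuracy": 0.80}  # made-up metric values
better = validate_model({"accuracy": 0.82}, last_validated)
worse = validate_model({"accuracy": 0.78}, last_validated)
first_ever = validate_model({"accuracy": 0.50}, None)
print(better, worse, first_ever)
```

Both models must be evaluated on the same data for the comparison to be meaningful, which is why the evaluation split appears as an input alongside the two models.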
Pusher
Component: Pusher
Inputs and Outputs
Validation Outcome (from Model Validator) -> Pusher -> Deployment Options
Configuration
pusher = Pusher(
    model_export=trainer.outputs.output,
    model_blessing=model_validator.outputs.blessing,
    serving_model_dir=serving_model_dir)
● Block push on validation outcome
● Push destinations supported today
○ Filesystem (TensorFlow Lite, TensorFlow JS)
○ TensorFlow Serving
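"Block push on validation outcome" with a filesystem destination can be sketched as a guarded directory copy. The example assumes a numbered version subdirectory under the serving directory, which is the layout TensorFlow Serving watches. The `push_if_blessed` helper, the file names, and the temporary directories are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def push_if_blessed(model_dir, serving_model_dir, blessed, version):
    """Copy the model into serving_model_dir/<version> only if it was blessed."""
    if not blessed:
        return None  # push is blocked by the validation outcome
    destination = Path(serving_model_dir) / str(version)
    shutil.copytree(model_dir, destination)
    return destination


with tempfile.TemporaryDirectory() as workdir:
    # A made-up exported model consisting of a single placeholder file.
    model_dir = Path(workdir) / "trainer_output"
    model_dir.mkdir()
    (model_dir / "saved_model.pb").write_bytes(b"placeholder")
    serving_dir = Path(workdir) / "serving"

    skipped = push_if_blessed(model_dir, serving_dir, blessed=False, version=1)
    pushed = push_if_blessed(model_dir, serving_dir, blessed=True, version=1)
    pushed_file_exists = (serving_dir / "1" / "saved_model.pb").exists()

print(skipped, pushed_file_exists)
```

Keeping the gate inside the pusher means a model that failed validation never reaches the serving directory, regardless of how the pipeline is orchestrated.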
TFX Data Parallel Processing
Apache Beam and Apache Spark
[Figure: the TFX platform diagram with libraries filled in: Data Ingestion, TensorFlow Data Validation, TensorFlow Transform, Estimator or Keras Model, TensorFlow Model Analysis, TensorFlow Serving, Logging, Tuner; Shared Utilities for Garbage Collection, Data Access Controls; Pipeline Storage; Shared Configuration Framework and Job Orchestration; Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization]
TFX + Apache Beam
Beam Vision
Provide a comprehensive portability framework for data processing pipelines, one that allows you to write your pipeline once in your language of choice and run it with minimal effort on the execution engine of choice.
Apache Beam
Sum Per Key
Python: input | Sum.PerKey()
Java: input.apply(Sum.integersPerKey())
Go: stats.Sum(s, input)
SQL: SELECT key, SUM(value) FROM input GROUP BY key
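All four snippets express the same aggregation. Its meaning, independent of SDK and runner, can be written in a few lines of plain Python, as sketched below. The `sum_per_key` helper and the sample pairs are made up for illustration; in the Beam Python SDK this aggregation is typically written with `beam.CombinePerKey(sum)`.

```python
from collections import defaultdict


def sum_per_key(pairs):
    """Add up the values of (key, value) pairs, grouped by key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


# Made-up (key, value) pairs.
pairs = [("cash", 3), ("card", 5), ("cash", 4), ("card", 1), ("other", 2)]
totals = sum_per_key(pairs)
print(totals)
```

Because addition is associative and commutative, a runner is free to compute partial sums on separate workers and merge them afterwards, which is what makes this operation easy to distribute.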
Runners: Cloud Dataflow, Apache Spark, Apache Flink, Apache Apex, Gearpump, Apache Samza, Apache Nemo (incubating), IBM Streams
How does Beam map to Flink?
How does Beam (Java) map to Spark?
Beam Java: Already runs on Spark!
How does Beam (Python) map to Spark?
Beam Portability (Python, …)
• Active work in progress!
• Several PRs are already in!
– Supports: Impulse, ParDo, GroupByKey, Combine, Flatten, PAssert, Metrics, Side inputs, ...
– Missing: State/Timers, SDF, ...
How does Beam map to Spark?
Call to action!
• Help with code, reviews, testing.
• Tracking JIRA(s)
– BEAM-2891
– BEAM-2590
Get started with TensorFlow Extended (TFX)
An End-to-End ML Platform
github.com/tensorflow/tfx
tensorflow.org/tfx
TransmogrifAI - Automate Machine Learning Workflow with the power of Scala an...
Chetan Khatri
 
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & KubeflowMLOps - Build pipelines with Tensor Flow Extended & Kubeflow
MLOps - Build pipelines with Tensor Flow Extended & Kubeflow
Jan Kirenz
 

TensorFlow Extended: An End-to-End Machine Learning Platform for TensorFlow

  • 1. TensorFlow Extended (TFX): An End-to-End ML Platform. Konstantinos (Gus) Katsiapis (Google), Ahmet Altay (Google)
  • 2. TensorFlow Extended (TFX) is an end-to-end ML pipeline for TensorFlow. Pipeline stages: Data Ingestion; Data Analysis + Validation; Feature Engineering; Trainer; Tuner; Model Evaluation and Validation; Serving; Logging. Shared layers: Shared Utilities for Garbage Collection, Data Access Controls; Pipeline Storage; Shared Configuration Framework and Job Orchestration; Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization.
  • 3. TFX powers our most important bets and products... Major Products; AlphaBets.
  • 4. TensorFlow Extended (TFX) is an end-to-end ML pipeline for TensorFlow. Pipeline stages: Data Ingestion; Data Analysis + Validation; Feature Engineering; Trainer; Tuner; Model Evaluation and Validation; Serving; Logging. Shared layers: Shared Utilities for Garbage Collection, Data Access Controls; Pipeline Storage; Shared Configuration Framework and Job Orchestration; Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization.
  • 5. … and some of our most important partners.
  • 6. What is Apache Beam? - A unified batch and stream distributed processing API - A set of SDK frontends: Java, Python, Go, Scala, SQL - A set of Runners which can execute Beam jobs on various backends: Local, Apache Flink, Apache Spark, Apache Gearpump, Apache Samza, Apache Hadoop, Google Cloud Dataflow, …
  • 7. Building Components out of Libraries. Libraries: Data Ingestion; TensorFlow Data Validation; TensorFlow Transform; Estimator Model; TensorFlow Model Analysis; Honoring Validation Outcomes; TensorFlow Serving. Components: ExampleGen; StatisticsGen; SchemaGen; ExampleValidator; Transform; Trainer; Evaluator; ModelValidator; Pusher; Model Server. Label: "Powered by Beam" (shown twice).
  • 8. What makes a Component: a packaged binary or container (example: Model Validator).
  • 9. What makes a Component: well-defined inputs and outputs. Model Validator inputs: Last Validated Model, New (Candidate) Model. Output: Validation Outcome.
  • 10. What makes a Component: well-defined configuration. Model Validator takes a Config in addition to the Last Validated Model and New (Candidate) Model, and produces a Validation Outcome.
  • 11. What makes a Component: context. The Metadata Store provides the context for the Model Validator's Config, inputs (Last Validated Model, New (Candidate) Model), and output (Validation Outcome).
  • 12. What makes a Component: components connected through the Metadata Store. Trainer: New Model. Model Validator (with Config): Last Validated Model, New (Candidate) Model, Validation Outcome. Pusher: New (Candidate) Model, Validation Outcome. Deployment targets: TensorFlow Serving, TensorFlow Lite, TensorFlow JS, TensorFlow Hub.
  • 14. Metadata Store? That’s new. Task-Aware Pipelines: Transform, Trainer.
  • 15. Metadata Store? That’s new. Task-Aware Pipelines: Transform, Trainer. Task- and Data-Aware Pipelines: Transform, Trainer, plus Pipeline + Metadata Storage holding Input Data, Training Data, Transformed Data, Trained Models, Deployment.
  • 16. What’s in the Metadata Store? Type definitions of Artifacts and their Properties, e.g., Models, Data, Evaluation Metrics (pictured: Trained Models).
  • 17. What’s in the Metadata Store? Type definitions of Artifacts and their Properties, e.g., Models, Data, Evaluation Metrics (pictured: Trained Models). Execution Records (Runs) of Components, e.g., Runtime Configuration, Inputs + Outputs (pictured: Trainer).
  • 18. What’s in the Metadata Store? Type definitions of Artifacts and their Properties, e.g., Models, Data, Evaluation Metrics (pictured: Trained Models). Execution Records (Runs) of Components, e.g., Runtime Configuration, Inputs + Outputs (pictured: Trainer). Lineage Tracking Across All Executions, e.g., to recurse back to all inputs of a specific artifact.
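The three kinds of records listed on slide 18 (artifacts, execution records, lineage) can be illustrated with a small in-memory sketch. This is not the TFX or ML Metadata API; the class `InMemoryMetadataStore` and its methods are hypothetical names chosen for illustration, and the sketch only shows how recording inputs and outputs per execution is enough to recurse back to all inputs of an artifact.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    """A typed output such as data, a model, or evaluation metrics."""
    id: int
    type_name: str
    properties: dict = field(default_factory=dict)


@dataclass
class Execution:
    """One run of a component: its config, input ids, and output ids."""
    component: str
    config: dict
    input_ids: list
    output_ids: list


class InMemoryMetadataStore:
    """Hypothetical stand-in for a metadata store; not the TFX API."""

    def __init__(self):
        self.artifacts = {}
        self.executions = []

    def put_artifact(self, type_name, **properties):
        artifact = Artifact(len(self.artifacts) + 1, type_name, properties)
        self.artifacts[artifact.id] = artifact
        return artifact.id

    def record_execution(self, component, config, input_ids, output_ids):
        self.executions.append(
            Execution(component, config, list(input_ids), list(output_ids)))

    def lineage(self, artifact_id):
        """Recurse back to every artifact that contributed to artifact_id."""
        ancestors = set()
        for execution in self.executions:
            if artifact_id in execution.output_ids:
                for input_id in execution.input_ids:
                    if input_id not in ancestors:
                        ancestors.add(input_id)
                        ancestors |= self.lineage(input_id)
        return ancestors


store = InMemoryMetadataStore()
raw = store.put_artifact("Data", split="train")
transformed = store.put_artifact("Data", transformed=True)
model = store.put_artifact("Model")
store.record_execution("Transform", {}, [raw], [transformed])
store.record_execution("Trainer", {"train_steps": 10000}, [transformed], [model])
model_lineage = store.lineage(model)  # ids of the raw and transformed data
```

Because every execution names its inputs and outputs, questions such as "which data was this model trained on" reduce to a graph walk over the recorded executions.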
  • 19. List all training runs and attributes
  • 20. Visualize lineage of a specific model Model artifact that was created
  • 21. Visualize data a model was trained on
  • 22. Visualize sliced eval metrics associated with a model
  • 23. Launch TensorBoard for a specific model run
  • 25. Compare data statistics for multiple models
  • 26. Examples of Metadata-Powered Functionality: use-cases enabled by lineage tracking.
  • 27. Examples of Metadata-Powered Functionality: use-cases enabled by lineage tracking; compare previous model runs.
  • 28. Examples of Metadata-Powered Functionality: use-cases enabled by lineage tracking; compare previous model runs; carry-over state from previous models.
  • 29. Examples of Metadata-Powered Functionality: use-cases enabled by lineage tracking; compare previous model runs; carry-over state from previous models; re-use previously computed outputs.
  • 30. How do we orchestrate TFX? Three Components in sequence.
  • 31. How do we orchestrate TFX? Each Component has a Driver and a Publisher, connected to the Metadata Store.
  • 32. How do we orchestrate TFX? Each Component has a Driver, an Executor, and a Publisher, connected to the Metadata Store.
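The Driver / Executor / Publisher split can be sketched as follows. The slides do not spell out each role, so this sketch assumes that the driver consults the metadata store before running, the executor does the actual work, and the publisher writes the outputs back; `MetadataStore`, `run_component`, and `count_chars_executor` are hypothetical names, not TFX classes. It also illustrates the "re-use previously computed outputs" use-case from slide 29.

```python
class MetadataStore:
    """Hypothetical in-memory store keyed by component name and inputs."""

    def __init__(self):
        self._records = {}

    def lookup(self, key):
        return self._records.get(key)

    def publish(self, key, outputs):
        self._records[key] = outputs


def run_component(name, executor, inputs, store):
    """Run one component; returns (outputs, was_cached)."""
    key = (name, tuple(sorted(inputs.items())))
    # Driver (assumed role): check the store for an identical previous run.
    cached = store.lookup(key)
    if cached is not None:
        return cached, True
    # Executor: the actual work of the component.
    outputs = executor(inputs)
    # Publisher (assumed role): record outputs for downstream components.
    store.publish(key, outputs)
    return outputs, False


calls = []


def count_chars_executor(inputs):
    """Toy executor that records each call and returns a fake statistic."""
    calls.append(inputs["uri"])
    return {"uri_length": len(inputs["uri"])}


store = MetadataStore()
first, first_cached = run_component(
    "StatisticsGen", count_chars_executor, {"uri": "data/simple"}, store)
second, second_cached = run_component(
    "StatisticsGen", count_chars_executor, {"uri": "data/simple"}, store)
```

The second call returns the stored outputs without invoking the executor again, which is the behavior a metadata-backed orchestrator can offer regardless of which runtime schedules the components.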
  • 33. Executors do the work. Metadata Store; Driver, Executor (Transform, etc.), Publisher. Shown with: Beam, Spark, Dataflow.
  • 34. Code (StatisticsGen executor):
        class Executor(base_executor.BaseExecutor):
          """Generic TFX statsgen executor."""
          ...
          def Do(...) -> None:
            """Computes stats for each split of input using tensorflow_data_validation.
            ...
            """
            with beam.Pipeline(argv=self._get_beam_pipeline_args()) as p:
              for split, instance in split_to_instance.items():
                ...
                output_path = os.path.join(output_uri, _DEFAULT_FILE_NAME)
                # One read -> decode -> statistics -> write chain per split.
                _ = (
                    p
                    | 'ReadData.' + split >> beam.io.ReadFromTFRecord(file_pattern=input_uri)
                    | 'DecodeData.' + split >> tf_example_decoder.DecodeTFExample()
                    | 'GenerateStatistics.' + split >> stats_api.GenerateStatistics(stats_options)
                    | 'WriteStatsOutput.' + split >> beam.io.WriteToTFRecord(
                        output_path,
                        shard_name_template='',
                        coder=beam.coders.ProtoCoder(
                            statistics_pb2.DatasetFeatureStatisticsList)))
                tf.logging.info('Statistics written to {}.'.format(output_uri))
  • 35. Executors do the work. Metadata Store; Driver, Executor (Trainer), Publisher. Shown with: TensorFlow.
  • 36. Executors do the work. Metadata Store; Driver, Executor (Pusher, etc.), Publisher.
  • 37. How do we orchestrate TFX? A TFX Config is added alongside the Metadata Store and the Components (each with Driver, Executor, and Publisher).
  • 38. Code (pipeline definition):
        def _create_pipeline():
          """Implements the chicago taxi pipeline with TFX."""
          examples = csv_input(os.path.join(data_root, 'simple'))
          example_gen = CsvExampleGen(input_base=examples)
          statistics_gen = StatisticsGen(input_data=...)
          infer_schema = SchemaGen(stats=...)
          validate_stats = ExampleValidator(stats=..., schema=...)
          # Performs transformations and feature engineering in training and serving
          transform = Transform(
              input_data=example_gen.outputs.examples,
              schema=infer_schema.outputs.output,
              module_file=_taxi_module_file)
          trainer = Trainer(...)
          model_analyzer = Evaluator(examples=..., model_exports=...)
          model_validator = ModelValidator(examples=..., model=...)
          pusher = Pusher(model_export=..., model_blessing=..., serving_model_dir=...)
          return [example_gen, statistics_gen, infer_schema, validate_stats, transform,
                  trainer, model_analyzer, model_validator, pusher]

        pipeline = AirflowDAGRunner(_airflow_config).run(_create_pipeline())
  • 39. … Back to orchestration: TFX Config, Metadata Store, and Components (each with Driver, Executor, and Publisher).
  • 40. Bring your very own favorite orchestrator. TFX Config, Metadata Store, and Components (each with Driver, Executor, and Publisher) on top of: Airflow Runtime, Kubeflow Runtime, your own runtime...
  • 41. Examples of orchestrated TFX pipelines Airflow Kubeflow Pipelines
  • 42. TFX: Putting it all together.
  • 43. Overview Data Ingestion Data Analysis & Validation Data Transformation
  • 45. Component: ExampleGen. Inputs and Outputs: Raw Data (CSV, TF Record) → ExampleGen → Split TF Record Data (Training, Eval).
  • 46. Component: ExampleGen. Inputs and Outputs: Raw Data (CSV, TF Record) → ExampleGen → Split TF Record Data (Training, Eval). Configuration:
        examples = csv_input(os.path.join(data_root, 'simple'))
        example_gen = CsvExampleGen(input_base=examples)
  • 47. Data Analysis & Validation Data Analysis & Validation
  • 49. Component: StatisticsGen. Inputs and Outputs: Data (from ExampleGen) → StatisticsGen → Statistics. ● Training ● Eval ● Serving logs (for skew detection)
  • 50. Component: StatisticsGen. Inputs and Outputs: Data (from ExampleGen) → StatisticsGen → Statistics. ● Captures shape of data ● Visualization highlights unusual stats ● Overlay helps with comparison
  • 52. Why are my tip predictions bad in the morning hours?
  • 54. Component: SchemaGen. Inputs and Outputs: Statistics (from StatisticsGen) → SchemaGen → Schema. ● High-level description of the data ○ Expected features ○ Expected value domains ○ Expected constraints ○ and much more! ● Codifies expectations of “good” data ● Initially inferred, then user-curated
  • 55. Component: SchemaGen. Inputs and Outputs: Statistics (from StatisticsGen) → SchemaGen → Schema. Visualization. Configuration:
        infer_schema = SchemaGen(stats=statistics_gen.outputs.output)
  • 56. What are expected values for payment types?
  • 58. Component: ExampleValidator. Inputs and Outputs: Statistics (from StatisticsGen) and Schema (from SchemaGen) → ExampleValidator → Anomalies Report. ● Missing features ● Wrong feature valency ● Training/serving skew ● Data distribution drift ● ...
  • 59. Component: ExampleValidator. Inputs and Outputs: Statistics (from StatisticsGen) and Schema (from SchemaGen) → ExampleValidator → Anomalies Report. Visualization. Configuration:
        validate_stats = ExampleValidator(
            stats=statistics_gen.outputs.output,
            schema=infer_schema.outputs.output)
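The SchemaGen and ExampleValidator slides describe inferring expectations from data and then reporting anomalies against them. A minimal pure-Python sketch of that idea follows, covering only missing features and unexpected categorical values. It does not use TensorFlow Data Validation; `infer_schema`, `validate`, and the sample records (including the misspelled company name) are hypothetical and for illustration only.

```python
def infer_schema(examples):
    """Infer, per feature, the set of observed values (a crude value domain)."""
    schema = {}
    for example in examples:
        for feature, value in example.items():
            schema.setdefault(feature, set()).add(value)
    return schema


def validate(examples, schema):
    """Return human-readable anomalies for examples that violate the schema."""
    anomalies = []
    for index, example in enumerate(examples):
        for feature, domain in schema.items():
            if feature not in example:
                anomalies.append(
                    f"example {index}: missing feature '{feature}'")
            elif example[feature] not in domain:
                anomalies.append(
                    f"example {index}: unexpected value "
                    f"'{example[feature]}' for '{feature}'")
    return anomalies


# Hypothetical training data used to infer the schema.
training = [
    {"payment_type": "Cash", "company": "Taxi Affiliation Services"},
    {"payment_type": "Credit Card", "company": "Taxi Affiliation Services"},
]
# Hypothetical new data: one misspelled company, one record missing a feature.
new_data = [
    {"payment_type": "Cash", "company": "Taxi Afiliation Services"},
    {"payment_type": "Credit Card"},
]

schema = infer_schema(training)
report = validate(new_data, schema)
```

A report like this cannot decide whether an unseen company name is a typo or a genuinely new company; that is why the slides describe the schema as initially inferred and then user-curated.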
  • 60. Is this new taxi company name a typo or a new company?
  • 62. Using tf.Transform for feature transformations.
  • 63. Using tf.Transform for feature transformations.
  • 64. Using tf.Transform for feature transformations. Training; Serving.
  • 65. Component: Transform. Inputs and Outputs: Data (from ExampleGen), Schema (from SchemaGen), and Code → Transform → Transform Graph and Transformed Data (to Trainer). ● User-provided transform code (TF Transform) ● Schema for parsing
  • 66. Component: Transform. Inputs and Outputs: Data (from ExampleGen), Schema (from SchemaGen), and Code → Transform → Transform Graph and Transformed Data (to Trainer). Transform Graph: ● Applied at training time ● Embedded in serving graph. Transformed Data: ● Optional, for performance optimization
  • 67. Component: Transform. Inputs and Outputs: Data (from ExampleGen), Schema (from SchemaGen), and Code → Transform → Transform Graph and Transformed Data (to Trainer). Configuration:
        transform = Transform(
            input_data=example_gen.outputs.examples,
            schema=infer_schema.outputs.output,
            module_file=taxi_module_file)
    Code:
        for key in _DENSE_FLOAT_FEATURE_KEYS:
          outputs[_transformed_name(key)] = transform.scale_to_z_score(
              _fill_in_missing(inputs[key]))
        # ...
        outputs[_transformed_name(_LABEL_KEY)] = tf.where(
            tf.is_nan(taxi_fare),
            tf.cast(tf.zeros_like(taxi_fare), tf.int64),
            # Test if the tip was > 20% of the fare.
            tf.cast(
                tf.greater(tips, tf.multiply(taxi_fare, tf.constant(0.2))), tf.int64))
        # ...
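The two transformations on slide 67 can be mirrored in plain Python to show what they compute. This is a sketch, not the tf.Transform implementation: `scale_to_z_score` here works on an in-memory list and assumes population variance, and `tip_label` is a hypothetical helper reproducing the label rule (1 if the tip exceeds 20% of the fare, 0 otherwise or when the fare is NaN).

```python
import math


def scale_to_z_score(values):
    """Standardize values to zero mean and unit variance (population variance)."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(variance)
    if std == 0:
        # A constant feature carries no signal; map every value to 0.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def tip_label(fare, tip):
    """Return 1 if the tip was > 20% of the fare, else 0 (also 0 for NaN fare)."""
    if math.isnan(fare):
        return 0
    return int(tip > fare * 0.2)


scaled = scale_to_z_score([1.0, 2.0, 3.0])
labels = [
    tip_label(10.0, 2.5),          # tip is 25% of the fare
    tip_label(10.0, 1.0),          # tip is 10% of the fare
    tip_label(float("nan"), 5.0),  # missing fare
]
```

The mean and standard deviation require a full pass over the data, which is why such statistics have to be computed once over the training set and then applied identically at training and serving time.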
  • 69. Component: Trainer. Inputs and Outputs: Data and Transform Graph (from Transform), Schema (from SchemaGen), and Code → Trainer → Model(s) (to Evaluator, Model Validator, Pusher). ● User-provided training code (TensorFlow) ● Optionally, transformed data
  • 70. Component: Trainer. Inputs and Outputs: Data and Transform Graph (from Transform), Schema (from SchemaGen), and Code → Trainer → Model(s) (to Evaluator, Model Validator, Pusher). Highlight: SavedModel Format. Train, Eval, and Inference Graphs; SignatureDef; Eval Metadata SignatureDef; TensorFlow Serving; TensorFlow Model Analysis.
  • 71. Component: Trainer. Inputs and Outputs: Data and Transform Graph (from Transform), Schema (from SchemaGen), and Code → Trainer → Model(s) (to Evaluator, Model Validator, Pusher). Code: Just TensorFlow :) Configuration:
        trainer = Trainer(
            module_file=taxi_module_file,
            transformed_examples=transform.outputs.transformed_examples,
            schema=infer_schema.outputs.output,
            transform_output=transform.outputs.transform_output,
            train_steps=10000,
            eval_steps=5000,
            warm_starting=True)
  • 72. Model Analysis & Validation
  • 74. Component: Evaluator. Inputs and Outputs: Data (from ExampleGen) and Model (from Trainer) → Evaluator → Evaluation Metrics. ● Evaluation split of data ● Eval spec for slicing of metrics
  • 75. Component: Evaluator. Inputs and Outputs: Data (from ExampleGen) and Model (from Trainer) → Evaluator → Evaluation Metrics. Visualization. Configuration:
        model_analyzer = Evaluator(
            examples=example_gen.outputs.examples,
            eval_spec=taxi_eval_spec,
            model_exports=trainer.outputs.output)
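Slicing of metrics, as mentioned for the Evaluator, means computing the same metric separately for each value of a feature. The sketch below does this for accuracy in plain Python; it does not use TensorFlow Model Analysis, and `sliced_accuracy`, the `trip_start_hour` feature name, and the sample records are hypothetical.

```python
from collections import defaultdict


def sliced_accuracy(examples, slice_key):
    """Compute accuracy separately for each value of slice_key."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for example in examples:
        slice_value = example[slice_key]
        total[slice_value] += 1
        if example["label"] == example["prediction"]:
            correct[slice_value] += 1
    return {value: correct[value] / total[value] for value in total}


# Hypothetical evaluation records: morning trips are predicted poorly.
eval_examples = [
    {"trip_start_hour": 8, "label": 1, "prediction": 0},
    {"trip_start_hour": 8, "label": 0, "prediction": 1},
    {"trip_start_hour": 8, "label": 1, "prediction": 1},
    {"trip_start_hour": 18, "label": 1, "prediction": 1},
    {"trip_start_hour": 18, "label": 0, "prediction": 0},
]

accuracy_by_hour = sliced_accuracy(eval_examples, "trip_start_hour")
overall = sum(
    e["label"] == e["prediction"] for e in eval_examples) / len(eval_examples)
```

An aggregate accuracy can look acceptable while one slice performs much worse, which is the kind of question ("why are my tip predictions bad in the morning hours?") that sliced metrics are meant to answer.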
  • 77. 77 Component: ModelValidator Model Validator Data ExampleGen Trainer Inputs and Outputs Validation Outcome Model (x2) ● Evaluation split of data ● Last validated model ● New candidate model
  • 78. Component: ModelValidator
    Configuration:
      model_validator = ModelValidator(
          examples=example_gen.outputs.examples,
          model=trainer.outputs.output,
          eval_spec=taxi_mv_spec)
    ● Configuration options
      ○ Validate using current eval data
      ○ “Next-day eval”, validate using unseen data
    Inputs and Outputs:
      Inputs: Data (from ExampleGen), Model (x2) (from Trainer)
      Outputs: Validation Outcome
  • 81. Component: Pusher
    Configuration:
      pusher = Pusher(
          model_export=trainer.outputs.output,
          model_blessing=model_validator.outputs.blessing,
          serving_model_dir=serving_model_dir)
    ● Block push on validation outcome
    ● Push destinations supported today
      ○ Filesystem (TensorFlow Lite, TensorFlow JS)
      ○ TensorFlow Serving
    Inputs and Outputs:
      Inputs: Validation Outcome (from Model Validator)
      Outputs: Deployment Options
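The "block push on validation outcome" behaviour can be sketched as a two-step gate: compare the candidate model against the last validated one, then copy the model to the serving directory only if it was blessed. This is a minimal plain-Python sketch under stated assumptions, not the TFX ModelValidator or Pusher implementation: the function names, the single-metric "at least as good" rule, and the metric dictionaries are all hypothetical. The numbered version subdirectory follows the layout TensorFlow Serving expects under a model base path.

```python
import shutil
from pathlib import Path


def is_blessed(candidate_metrics, baseline_metrics, metric="accuracy"):
    """Bless the candidate if it is at least as good as the last validated model."""
    if baseline_metrics is None:
        # Assumption: with no previously validated model, the candidate passes.
        return True
    return candidate_metrics[metric] >= baseline_metrics[metric]


def push_if_blessed(model_dir, serving_model_dir, blessed, version):
    """Copy the model into a versioned serving subdirectory only when blessed."""
    if not blessed:
        return None  # Push is blocked; the serving directory is left untouched.
    destination = Path(serving_model_dir) / str(version)
    shutil.copytree(model_dir, destination)
    return destination
```

A typical call would be `push_if_blessed(export_dir, serving_model_dir, is_blessed(new_metrics, old_metrics), version=2)`, where every argument name is illustrative.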
  • 82. TFX Data Parallel Processing: Apache Beam and Apache Spark
  • 83. TFX + Apache Beam
    Libraries: Data Ingestion; TensorFlow Data Validation; TensorFlow Transform; Estimator or Keras Model; TensorFlow Model Analysis; TensorFlow Serving; Logging
    Shared Utilities for Garbage Collection, Data Access Controls
    Pipeline Storage
    Tuner
    Shared Configuration Framework and Job Orchestration
    Integrated Frontend for Job Management, Monitoring, Debugging, Data/Model/Evaluation Visualization
  • 84. Beam Vision: Provide a comprehensive portability framework for data processing pipelines, one that allows you to write your pipeline once in your language of choice and run it with minimal effort on the execution engine of choice.
  • 85. Apache Beam: Sum Per Key
    Python: input | Sum.PerKey()
    Java: input.apply(Sum.integersPerKey())
    Go: stats.Sum(s, input)
    SQL: SELECT key, SUM(value) FROM input GROUP BY key
    Runners: Cloud Dataflow, Apache Spark, Apache Flink, Apache Apex, Gearpump, Apache Samza, Apache Nemo (incubating), IBM Streams
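All four snippets on the Sum Per Key slide express the same operation: group (key, value) pairs by key and add up the values in each group. A minimal plain-Python sketch of those semantics follows; it runs on a single machine and is not Beam code, and the function name is hypothetical. In the Beam Python SDK the equivalent is commonly written as `pcoll | beam.CombinePerKey(sum)`.

```python
from collections import defaultdict


def sum_per_key(pairs):
    """Group (key, value) pairs by key and sum the values of each group."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


result = sum_per_key([("a", 1), ("b", 2), ("a", 3)])
print(result)  # {'a': 4, 'b': 2}
```

Because addition is associative and commutative, a runner is free to compute partial sums on each worker before shuffling, which is what makes this operation scale across the backends listed on the slide.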
  • 86. How does Beam map to Flink?
  • 87. How does Beam (Java) map to Spark? Beam Java: Already runs on Spark!
  • 88. How does Beam (Python) map to Spark?
    Beam Portability (Python, …)
    • Active work in progress!
    • Several PRs are already in!
      – Supports: Impulse, ParDo, GroupByKey, Combine, Flatten, PAssert, Metrics, Side inputs, ...
      – Missing: State/Timers, SDF, ...
  • 89. How does Beam map to Spark?
    Call to action!
    • Help with code, reviews, testing.
    • Tracking JIRA(s)
      – BEAM-2891
      – BEAM-2590
  • 90. Get started with TensorFlow Extended (TFX), An End-to-End ML Platform
    github.com/tensorflow/tfx
    tensorflow.org/tfx