Practical Machine Learning and Rails
Andrew Cantino
  VP Engineering, Mavenlink    @tectonic

Ryan Stout
  Founder, Agile Productions   @ryanstout
This talk will
- introduce machine learning

- make you ML-aware

- have examples
This talk will not
- give you a PhD

- implement algorithms

- cover collaborative filtering, optimization, clustering,
  advanced statistics, genetic algorithms, classical AI, NLP, ...
What is Machine Learning?
Many different algorithms

that predict data

from other data

using applied statistics.
"Enhance and rotate 20 degrees"
What data?
The web is data.

- User decisions
- APIs
- A/B tests
- Databases
- Logs
- Streams
- Browser versions
- Reviews
- Clicktrails
Okay. We have data.
What do we do with it?


We classify it.
Classification



    :)      OR   :(
Classification
• Documents
    o Sort email (Gmail's importance filter)
    o Route questions to appropriate expert (Aardvark)
    o Categorize reviews (Amazon)

• Users
    o Expertise; interests; pro vs free; likelihood of paying;
      expected future karma

• Events
    o Abnormal vs. normal
Algorithms:
     Decision Tree Learning

Features (the questions asked at each node):

                  Email contains word "viagra"?
                   /                        \
                 no                          yes
                 /                             \
    Email contains word "Ruby"?      Email contains attachment?
         /          \                    /          \
       no           yes                no           yes
       |             |                  |             |
  P(Spam)=10%   P(Spam)=5%        P(Spam)=70%   P(Spam)=95%

Labels (the P(Spam) values at the leaves)
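
As a rough sketch, the tree above can be written as nested conditionals in Python. The function name `spam_probability`, its parameters, and the simple whitespace tokenization are assumptions made for illustration. A learned tree would derive both the questions and the leaf probabilities from training data rather than hard-coding them.

```python
def spam_probability(email_text, has_attachment):
    """Walk the tree from the slide and return the P(Spam) at the leaf reached.

    Hypothetical helper: the questions and leaf values are copied from the
    slide, and words are found by a naive lowercase whitespace split.
    """
    words = email_text.lower().split()
    if "viagra" in words:
        # Right branch: the next feature is whether there is an attachment.
        return 0.95 if has_attachment else 0.70
    # Left branch: the next feature is whether the word "ruby" appears.
    return 0.05 if "ruby" in words else 0.10


if __name__ == "__main__":
    print(spam_probability("cheap viagra here", has_attachment=True))
    print(spam_probability("I love Ruby", has_attachment=False))
```

Each `if` corresponds to a feature test, and each returned value corresponds to a label at a leaf.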
Algorithms:
     Support Vector Machines (SVMs)




                          Graphics from Wikipedia
Algorithms:
           Naive Bayes

•   Break documents into words and treat each
    word as an independent feature

•   Surprisingly effective on simple text and
    document classification

•   Works well when you have lots of data
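
The first bullet can be illustrated with a small sketch that turns a document into word counts. The helper name `word_features` and the regular expression used for splitting are assumptions chosen for illustration, not something taken from the talk.

```python
import re
from collections import Counter


def word_features(document):
    """Lowercase the text, pull out runs of letters, and count each word.

    Every distinct word becomes one feature; word order is discarded,
    which is the "independent feature" simplification Naive Bayes makes.
    """
    words = re.findall(r"[a-z']+", document.lower())
    return Counter(words)


if __name__ == "__main__":
    print(word_features("Hello, hello Ruby!"))
```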



Algorithms:
             Naive Bayes

You received 100 emails, 70 of which were spam (so 30 were ham).

Word      Spam with this word   Ham with this word
viagra    42 (60%)              1 (3.3%)
ruby      7 (10%)               15 (50%)
hello     35 (50%)              24 (80%)

A new email contains "hello" and "viagra". Treating the words as
independent, the probability that it is spam is:

P(S|hello,viagra) = P(S) * P(hello,viagra|S) / P(hello,viagra)
                  = 0.7 * (0.5 * 0.6)        / (0.59 * 0.43)
                  = 0.21 / 0.2537
                  = 83% (approximately)

Here P(hello) = (35 + 24) / 100 = 0.59 and
P(viagra) = (42 + 1) / 100 = 0.43.
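
The arithmetic in this worked example can be reproduced in a few lines of Python. This is a sketch under stated assumptions: the counts come from the table, while the function names are hypothetical. The second function is a variation not shown on the slide. Instead of dividing by P(hello) * P(viagra), it normalizes over both classes, which is a common way to apply Naive Bayes and gives a noticeably higher estimate for this email.

```python
# Counts from the table: 100 emails, 70 spam and 30 ham.
N_SPAM, N_HAM = 70, 30
COUNTS = {
    "viagra": {"spam": 42, "ham": 1},
    "ruby": {"spam": 7, "ham": 15},
    "hello": {"spam": 35, "ham": 24},
}


def slide_estimate(words):
    """P(S) * product of P(w|S), divided by the product of P(w), as on the slide."""
    total = N_SPAM + N_HAM
    numerator = N_SPAM / total
    denominator = 1.0
    for w in words:
        numerator *= COUNTS[w]["spam"] / N_SPAM
        denominator *= (COUNTS[w]["spam"] + COUNTS[w]["ham"]) / total
    return numerator / denominator


def normalized_estimate(words):
    """Variation (an assumption, not from the slide): normalize over spam and ham."""
    total = N_SPAM + N_HAM
    spam_score = N_SPAM / total
    ham_score = N_HAM / total
    for w in words:
        spam_score *= COUNTS[w]["spam"] / N_SPAM
        ham_score *= COUNTS[w]["ham"] / N_HAM
    return spam_score / (spam_score + ham_score)


if __name__ == "__main__":
    email = ["hello", "viagra"]
    print(round(slide_estimate(email), 3))       # about 0.828
    print(round(normalized_estimate(email), 3))  # about 0.963
```

The slide's form is quick to compute by hand, but because it approximates P(hello,viagra) by a product, its result is not guaranteed to stay at or below 1 for every combination of words. The normalized form always yields a value between 0 and 1.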
Algorithms:
               Neural Nets

Input layer (features) -> Hidden layer -> Output layer (classification)
Curse of Dimensionality

The more features and labels that you have,
the more data that you need.

Image: http://www.iro.umontreal.ca/~bengioy/yoshua_en/research_files/CurseDimensionality.jpg
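
One way to get a feel for this is to count how many distinct inputs are possible. The sketch below assumes yes/no features like the "contains word" tests in the spam example. The function names are hypothetical, and real features are rarely this tidy.

```python
def distinct_inputs(num_binary_features):
    """Number of different feature combinations when every feature is yes/no."""
    return 2 ** num_binary_features


def average_examples_per_input(num_examples, num_binary_features):
    """How thinly a fixed data set is spread over all possible combinations."""
    return num_examples / distinct_inputs(num_binary_features)


if __name__ == "__main__":
    for n in (3, 10, 20):
        print(n, distinct_inputs(n), average_examples_per_input(1000, n))
```

With 1,000 examples and 10 binary features there is already less than one example per possible combination on average, so each added feature calls for more data.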
Overfitting
•   With enough parameters, a model can fit
    anything, including noise.

•   We want our algorithms to generalize and
    infer, not memorize specific training
    examples.

•   Therefore, we test our algorithms on
    different data than we train them on.
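
The last bullet can be sketched with a deliberately bad "model" that only memorizes its training examples. Everything here is a hypothetical illustration: the data is random noise generated on the spot, and `train_memorizer` and `accuracy` are made-up helper names.

```python
import random


def train_memorizer(examples):
    """'Train' by storing every (features, label) pair verbatim."""
    return dict(examples)


def accuracy(model, examples, default_label=0):
    """Fraction of examples the model labels correctly; unseen inputs get default_label."""
    correct = sum(1 for x, y in examples if model.get(x, default_label) == y)
    return correct / len(examples)


rng = random.Random(0)
# Hypothetical data: 200 distinct inputs whose labels are pure noise.
data = [(i, rng.randint(0, 1)) for i in range(200)]

# Hold out the last 50 examples; the model never sees them during training.
train, test = data[:150], data[150:]
model = train_memorizer(train)

train_accuracy = accuracy(model, train)  # perfect, because everything was memorized
test_accuracy = accuracy(model, test)    # no better than guessing on unseen data

if __name__ == "__main__":
    print(train_accuracy, test_accuracy)
```

Measuring only on the training data would make this model look perfect. The held-out examples show that it has learned nothing that generalizes.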