Hadoop Con2015 - The Data Scientist’s Toolbox
LEN CHANG
• MACHINE LEARNING & DATA MINING
• DISTRIBUTED SYSTEMS & NOSQL
• CRAWLER & CHINESE MINING
• Communication Engineering, General Study - CCU
• Software Engineering, Master Study - NCU
• Pixnet Hackathon 2014 – EXIF MINING
• Pixnet Hackathon 2015 – Spam User Detection
• Taipei Open Data Hackathon 2015
– The relation between Religion and Taipei City
• BI SYSTEM & DATA VISUALIZATION
• FINANCE & EDUCATION & ART & SPORT
• THE PLAYER OF BLIZZARD GAMES
AGENDA
• A GOOD STORY
• TOOL 1 : DATABASE
• TOOL 2 : COLLECTION AND REPLICATION
• TOOL 3 : VISUALIZATION
• TOOL 4 : MACHINE LEARNING
• SAMPLE
• SUMMARY
A GOOD STORY
DIGITAL CUSTOMER EXPERIENCE
How much are you willing to pay?
45 NT per latte vs. 95 NT per latte
WHY ?
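If home is the "first place" for human connection and the workplace is the
"second place," then a public space like a coffeehouse (Starbucks, for example)
is what I often call the "third place." The coffeehouse sits between home and
office: you can socialize there or be alone, connect with others or reconnect
with yourself. Starbucks was founded to offer ordinary people this precious
opportunity.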
~Howard Schultz
• the loyalty card
• pay in advance on mobile
• wireless device charging
Digital customer experience
Chief Digital Officer: Adam Brotman
Location
Mobile pay
loyalty card
A Good
Digital Customer Experience
Social network
BI System, Data
warehousing…etc
A GOOD STORY TELLS US…
• FIND YOUR “UNIQUE CUSTOMER DATA”.
• USE “CUSTOMER DATA” TO IMPROVE THE “DIGITAL CUSTOMER EXPERIENCE”.
• USE THE “DIGITAL CUSTOMER EXPERIENCE” TO HELP THE ORGANIZATION “MAKE MONEY”.
TOOL 1: DATABASE
OLAP AND NOSQL
BI System, Data
warehousing…etc
Relational DB vs. NOSQL
How to choose?
THE PURPOSE IS IMPORTANT
CDC
ETL
SQL
A 100% accurate answer when I read the report
THE PURPOSE IS IMPORTANT
Machine Learning
Real-time feedback
Real-time dashboard
A less accurate but faster response when I need a rough answer
THE PURPOSE IS IMPORTANT
Machine Learning
Powerful at full-text search, weak at numeric computation.
THE PURPOSE IS IMPORTANT
High frequency
Real-time dashboard
When both accuracy and speed are required, cost is secondary.
DATABASE
• 100% ACCURATE
• RELATIONAL DATABASE
• LESS ACCURATE, FASTER
• HBASE, SPARK, CASSANDRA, MONGODB, OTHERS…
• SPECIAL CASES
• FULL-TEXT SEARCH: ELASTICSEARCH
• ACCURACY AND SPEED: REDIS OR ANOTHER IN-MEMORY DB
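The rules of thumb on this slide can be written down as a small decision function. This is only a sketch of the slide's guidance, not a sizing tool: the function name `suggest_store` and its three flags are hypothetical and do not come from the talk.

```python
def suggest_store(exact_answers: bool,
                  full_text_search: bool = False,
                  low_latency: bool = False) -> str:
    """Map workload needs to the database family named on the slide.

    All parameter names are illustrative assumptions.
    """
    if full_text_search:
        # Special case: search-heavy workloads.
        return "Elasticsearch"
    if exact_answers and low_latency:
        # Special case: accuracy and speed together, cost is secondary.
        return "Redis or another in-memory DB"
    if exact_answers:
        # Reports that must be 100% accurate.
        return "relational database"
    # A rough answer is acceptable, so favor speed and scale.
    return "HBase, Spark, Cassandra, MongoDB or similar"


if __name__ == "__main__":
    print(suggest_store(exact_answers=True))
    print(suggest_store(exact_answers=False))
```

The order of the checks mirrors the slide: the two special cases are tested first, then the general split between exact and approximate answers.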
COLLECTION AND REPLICATION
LOGSTASH AND FLUENTD
REPLICATION TOOL
Location
Mobile pay
loyalty card
A Good
Digital Customer Experience
Social network
BI System, Data
warehousing…etc
Collection: Any Data in, Any Data out
FLUENTD: BUILD YOUR UNIFIED LOGGING LAYER
LOGSTASH: COLLECT, ENRICH & TRANSPORT DATA
COMPARISON
FLUENTD
• LANGUAGE: RUBY, WITH C FOR PERFORMANCE-CRITICAL PARTS
• PLATFORM: LINUX
• MAJOR OUTPUT DB: MONGODB
LOGSTASH
• LANGUAGE: JRUBY (RUNS ON THE JVM)
• PLATFORM: LINUX AND WINDOWS
• MAJOR OUTPUT DB: ELASTICSEARCH
• ELK ARCHITECTURE (ELASTICSEARCH, LOGSTASH, KIBANA)
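The "any data in, any data out" idea behind both collectors can be illustrated in a few lines of plain Python. This is a minimal sketch, not how Fluentd or Logstash is implemented. It loosely follows Fluentd's event model (a tag, a time and a record); the function names, the sample tags and the output names are hypothetical.

```python
import json
import time


def to_event(tag, raw, now=None):
    """Normalize one raw log line into a unified {tag, time, record} event.

    Accepts either a JSON object or space-separated key=value pairs.
    """
    now = time.time() if now is None else now
    raw = raw.strip()
    if raw.startswith("{"):
        record = json.loads(raw)
    else:
        record = dict(pair.split("=", 1) for pair in raw.split() if "=" in pair)
    return {"tag": tag, "time": now, "record": record}


def route(event, outputs):
    """Return the names of all outputs whose tag prefix matches the event."""
    return [name for prefix, name in outputs if event["tag"].startswith(prefix)]


if __name__ == "__main__":
    outputs = [("app.", "mongodb"), ("app.pay", "elasticsearch")]  # hypothetical
    event = to_event("app.pay", '{"user": 1, "amount": 95}')
    print(event["record"], route(event, outputs))
```

Once every source is reduced to the same event shape, adding a new input or output only touches one side of the layer.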
Replicate: replicate data from DB_A to DB_B
Case 1: RDB => RDB
Case 2: NOSQL => NOSQL
Case 3: NOSQL => RDB
Transaction DB
ETL: Extract-Transform-Load
Case 1: RDB => RDB
Case 2: NOSQL => NOSQL
Case 3: NOSQL => RDB
[Diagram: mongo cluster nodes with a Postgres node]
COMPARISON
RDB TO RDB
• TRADITIONAL MECHANISM
• ENSURES “DATA CONSISTENCY”
• FINANCIAL INDUSTRY
NOSQL TO NOSQL
• HUGE DATA ANALYSIS
• LOW-COST HARDWARE, POWERFUL AND FAST COMPUTATION
• NEEDS PROGRAMMING SKILL, NOT ONLY SQL
NOSQL TO RDB
• MAKES AN RDB A NODE OF THE NOSQL CLUSTER
• POSSIBLY A BALANCE BETWEEN NOSQL AND RDB
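The NoSQL-to-RDB case comes down to flattening nested documents into fixed columns. The sketch below shows that step with Python's built-in `sqlite3` standing in for the relational side. It is not how MoSQL or any other replication tool works internally; the document shape, table name and column names are made up for illustration.

```python
import sqlite3


def flatten(doc, prefix=""):
    """Turn a nested document into a flat dict, e.g. user.id -> user_id."""
    row = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            row.update(flatten(value, prefix=name + "_"))
        else:
            row[name] = value
    return row


def load(conn, table, docs, columns):
    """Create the table if needed and insert one row per document.

    Fields missing from a document become NULL. The table and column
    names are trusted here; a real tool would validate them.
    """
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({', '.join(columns)})")
    placeholders = ", ".join("?" for _ in columns)
    rows = [tuple(flatten(d).get(c) for c in columns) for d in docs]
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    docs = [{"_id": "a1", "user": {"id": 7, "city": "Taipei"}, "kind": "open"}]
    load(conn, "events", docs, ["_id", "user_id", "user_city", "kind"])
    print(conn.execute("SELECT * FROM events").fetchall())
```

Once the data sits in relational tables, ordinary SQL and BI tools can query it, which is the balance the last bullet describes.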
VISUALIZATION
VISUALIZE YOUR DATA
1,999 USD
MACHINE LEARNING
GENETIC ALGORITHM
Genetic algorithm
Travelling salesman problem
Self-guided travel itinerary scheduling
Genetic Algorithm
System
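For readers new to the technique, here is a small genetic algorithm applied to the travelling salesman problem. It is a generic textbook-style sketch, not the scheduling system from the talk. The operators (order crossover, swap mutation, keeping the better half as parents) and all parameter values are assumptions chosen for brevity.

```python
import math
import random


def tour_length(tour, cities):
    """Total length of the closed tour that visits cities in the given order."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def order_crossover(a, b, rng):
    """Copy a slice from parent a, fill the remaining slots in parent b's order."""
    i, j = sorted(rng.sample(range(len(a)), 2))
    child = [None] * len(a)
    child[i:j] = a[i:j]
    fill = [c for c in b if c not in child[i:j]]
    k = 0
    for idx in range(len(a)):
        if child[idx] is None:
            child[idx] = fill[k]
            k += 1
    return child


def mutate(tour, rng, rate=0.2):
    """With probability `rate`, swap two cities in place."""
    if rng.random() < rate:
        i, j = rng.sample(range(len(tour)), 2)
        tour[i], tour[j] = tour[j], tour[i]


def solve_tsp(cities, pop_size=30, generations=100, seed=0):
    """Evolve a population of tours and return the shortest one found."""
    rng = random.Random(seed)
    n = len(cities)
    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: the shorter half of the population survives as parents.
        population.sort(key=lambda t: tour_length(t, cities))
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = order_crossover(a, b, rng)
            mutate(child, rng)
            children.append(child)
        population = parents + children
    return min(population, key=lambda t: tour_length(t, cities))


if __name__ == "__main__":
    cities = [(0, 0), (0, 1), (1, 1), (1, 0), (0.5, 1.5)]
    best = solve_tsp(cities)
    print(best, round(tour_length(best, cities), 3))
```

Because the parents are carried over unchanged, the best tour never gets worse from one generation to the next. An itinerary scheduler would replace `tour_length` with a cost that also accounts for opening hours, travel time and so on.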
Linear algebra and probability are important
Bayesian probability
Decision Tree
Regression
Support Vector Machine
SAMPLE
SOME INTERESTING SAMPLE APPLICATIONS…
FINANCIAL DISTRESS PREDICTION SYSTEM
Inputs: financial index, company share price
3000 financial indices => Genetic Algorithm => 20 financial indices => Support Vector Machine
Matlab & C# & ASP.NET
GAME TREND MONITOR SYSTEM
Crawler System (x4) => DB => Text Mining System (Article => Emotional Value) => DB
C# & MSSQL & SSRS
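The "Article => Emotional Value" step can be pictured as a function from text to a score. The slides do not say which method the system used, so the sketch below uses the simplest possible one, a word-list (lexicon) count. The word lists and the function name are invented for illustration.

```python
import re

# Hypothetical lexicons; a real system would use a much larger, curated list.
POSITIVE = {"fun", "great", "love", "smooth"}
NEGATIVE = {"lag", "bug", "boring", "crash"}


def emotional_value(article: str) -> float:
    """Score an article from -1.0 (all negative) to 1.0 (all positive).

    Returns 0.0 when no lexicon word appears in the text.
    """
    words = re.findall(r"[a-z']+", article.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


if __name__ == "__main__":
    print(emotional_value("Great patch, love the new map"))
    print(emotional_value("Constant lag and one crash after another"))
```

Stored per article with a timestamp, scores like this can be aggregated into the trend reports the monitor system produces.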
APP BEHAVIOR ANALYSIS SYSTEM
[Diagram components: RDB, s3fs, mongo cluster nodes with a Postgres node, Pentaho, R]
R & RUBY & MONGODB & POSTGRES & Pentaho & MOSQL & FLUENTD & s3fs
SUMMARY
FOR THE SAME PROBLEM, YOU WILL BUILD A BETTER SOLUTION OR MECHANISM WHEN YOU ARE A
MULTI-DOMAIN EXPERT.
Crawler System
Text Mining
System
Article => Emotional Value
8 years up…
Shortcut?
What’s the fastest way to understand zombies?
Editor's Notes

  • #4: Introduction (5 min) TOOL 1 (5 min) TOOL 2 (10 min) TOOL 3 (10 min) SUMMARY (5 min)
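  • #7: Question 1: Why does it feel reasonable that a Starbucks latte costs more? Question 2: Can anyone tell the difference between an ordinary latte and a Starbucks latte? Question 3: So what does Starbucks do about this? (Animation)
  • #8: Starbucks 1.0: the relationship between people. Starbucks 2.0: giving customers a good digital experience.
  • #14: There is no perfect database; every database has things it is good at and things it is not. In banking, data errors are not tolerated, even in reports. A NOSQL store with "weak" consistency is therefore a poor fit, and an RDB with "strong" consistency is required. Because banks read their reports at fixed times, the remaining time can be used to run heavy batch jobs, or even to build many cubes for the reports to use.
  • #15: - Take a mobile voice assistant as an example. The biggest difference from the banking case above is that small deviations in the data have little effect on the analysis, so we can use the large-scale computing power of NOSQL to get the answers we need quickly.
  • #16: Sometimes a NOSQL store is designed to handle a special case, such as text search. Elasticsearch's text search is very powerful and fast; you could say the whole database exists for text search. It is not very good at numeric processing, however.
  • #17: The data format is simple, but it needs frequent, large-scale updates and consistency checks between the front end and the back end. Use an in-memory database.
  • #31: - Whether you do analysis as a personal hobby, work as a data consultant, or run a startup of 5 to 10 people, this tool can greatly improve the accuracy of your judgments and greatly reduce in-house development of visualization tools. A worthwhile investment.
  • #33: For monitoring only
  • #36: - Don't waste the nearly endless computing power that distributed systems give us
  • #47: - There is no shortcut
  • #48: - Domain knowledge isn’t a tool. It’s common sense.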