SlideShare a Scribd company logo
Roma – 8 Febbraio 2017
presenta Alberto Paro, Seacom
ElasticSearch 5.x
New Tricks
Alberto Paro
 Laureato in Ingegneria Informatica (POLIMI)
 Autore di 3 libri su ElasticSearch da 1 a 5.x + 6 Tech
review
 Lavoro principalmente in Scala e su tecnologie BD
(Akka, Spray.io, Playframework, Apache Spark) e NoSQL
(Accumulo, Cassandra, ElasticSearch e MongoDB)
 Evangelist linguaggio Scala e Scala.JS
Tip 1: Shrink - 1/5
Why?
 The wrong number of shards during the initial
design sizing. Often sizing the shards without
knowing the correct data/text distribution tends to
oversize the number of shards
 Reducing the number of shards to reduce memory
and resource usage
 Reducing the number of shards to speed up
searching
Tip 1: Shrink - 2/5 - Where is your data?
We can retrieve it via the _nodes API:
curl -XGET 'http://localhost:9200/_nodes?pretty'
In the result there will be a similar section:
.... "nodes" : {
"5Sei9ip8Qhee3J0o9dTV4g" : {
"name" : "Gin Genie",
"transport_address" : "127.0.0.1:9300",
"host" : "127.0.0.1",
"ip" : "127.0.0.1",
"version" : "5.1.1",....
The name of my node is Gin Genie
Tip 1: Shrink - 3/5 - Relocate your data
We can change the index settings, forcing allocation to a single node for
our index, and disabling the writing for the index.
curl -XPUT 'http://localhost:9200/myindex/_settings' -d ’
{
"settings": {
"index.routing.allocation.require._name": "Gin Genie", "index.blocks.write":
true
}
}’
We can check for the green status:
curl -XGET 'http://localhost:9200/_cluster/health?pretty'
Tip 1: Shrink - 4/5 – Shrink our shards
We need to disable the writing for the index via:
curl -XPUT 'http://localhost:9200/myindex/_settings?index.blocks.write=true'
The shrink call for creating the reduced_index, will be:
curl -XPOST 'http://localhost:9200/myindex/_shrink/reduced_index' -d '{
"settings": {
"index.number_of_replicas": 1,
"index.number_of_shards": 1,
"index.codec": "best_compression”
},
"aliases": {"my_search_indices": {}}
}'
Tip 1: Shrink - 5/5 – Post Shrinking
We can also wait for a yellow status if the index it is ready to work:
curl -XGET 'http://localhost:9200/_cluster/health? wait_for_status=yellow’
Now we can remove the read-only by changing the index settings:
curl -XPUT 'http://localhost:9200/myindex/_settings? index.blocks.write=true'
Tip 2: Reindex - 1/2
Why?
 Changing an analyzer for a mapping
 Adding a new subfield to a mapping and you need
to reprocess all the records to search for the new
subfield
 Removing an unused mapping
 Changing a record structure that requires a new
mapping
Tip 2: Reindex - 2/2
curl -XPOST 'http://localhost:9200/_reindex?pretty=true' -d '{
"source": {
"index": "myindex”
"type": "mytype",
"query": "…"
},
"dest": {
"index": "myindex2",
"script": "…"
}
}'
Tip 3: Update By Query with painless
Add a new Field
1. Create your mapping (i.e modified: date)
2. Call an update by query
curl -XPOST http://$server/$index/$mapping/_update_by_query -d '{
"script": {
"inline": "ctx._source.modified="2015-10-06T00:00:00.000+00:00"",
"lang": "painless”
},
"query": {
"bool": {"must_not":[{"exists":{"field":"modified"} }]}
}
}'
Tip 4: Use search_after
Step 1:
curl -XGET 'http://$server/$index/$type/_search' -d ’{
"size": 100,
"query": { "match_all" : {} },
"sort": [{"_uid": "desc"} ]
}’
Step n, n>1:
curl -XGET 'http://$server/$index/$type/_search' -d ’{
"size": 100,
"query": { "match_all" : {} },
"search_after": ["$type#100"],
"sort": [{"_uid": "desc"} ]
}’
Tip 5: Reindex for a remote node – 1/2
Why?
 The backup is a safe Lucene index copy, so it depends on the
Elasticsearch version used. If you are switching from a version
of Elastisearch that is prior to version 5.x, it's not possible to
restore old indices.
 It's not possible to restore backups of a newer Elasticsearch
version in an older version. The restore is only forward-
compatible.
 It's not possible to restore partial data from a backup.
Tip 5: Reindex for a remote node – 2/2
In config/elasticsearch.yml add:
reindex.remote.whitelist: ["192.168.1.227:9200"]
Then:
curl -XPOST "http://$server/_reindex" -d' {
"source": {
"remote": { "host": "http://192.168.1.227:9200" },
"index": "test-source”
},
"dest": {
"index": "test-dest”
}
}'
Tip 6: Ingest Pipeline – 1/2
Why
 Adding/Removing fields without changing your code
 Manipulate your records before ingesting
 Computed fields
 Also supports scripting
Tip 6: Ingest Pipeline – 2/2
curl -XPUT 'http://127.0.0.1:9200/_ingest/pipeline/add-user-john' -d '{
"description" : "Add user john field",
"processors" : [
{
"set" : {
"field": "user",
"value": "john"} }
],
"version":1
}’
curl -XPUT http://$server/$index/$type/$id?pipeline=add-user-john -d '{}'
Grazie per
l’attenzione
Alberto Paro
Q&A
Ad

More Related Content

What's hot (20)

Making an Object System with Tcl 8.5
Making an Object System with Tcl 8.5Making an Object System with Tcl 8.5
Making an Object System with Tcl 8.5
Donal Fellows
 
Neatly Hashing a Tree: FP tree-fold in Perl5 & Perl6
Neatly Hashing a Tree: FP tree-fold in Perl5 & Perl6Neatly Hashing a Tree: FP tree-fold in Perl5 & Perl6
Neatly Hashing a Tree: FP tree-fold in Perl5 & Perl6
Workhorse Computing
 
ES6 in Real Life
ES6 in Real LifeES6 in Real Life
ES6 in Real Life
Domenic Denicola
 
Manifests of Future Past
Manifests of Future PastManifests of Future Past
Manifests of Future Past
Puppet
 
ES6: Features + Rails
ES6: Features + RailsES6: Features + Rails
ES6: Features + Rails
Santosh Wadghule
 
Solr & Lucene @ Etsy by Gregg Donovan
Solr & Lucene @ Etsy by Gregg DonovanSolr & Lucene @ Etsy by Gregg Donovan
Solr & Lucene @ Etsy by Gregg Donovan
Gregg Donovan
 
Apache avro and overview hadoop tools
Apache avro and overview hadoop toolsApache avro and overview hadoop tools
Apache avro and overview hadoop tools
alireza alikhani
 
Modern technologies in data science
Modern technologies in data science Modern technologies in data science
Modern technologies in data science
Chucheng Hsieh
 
多治見IT勉強会 Groovy Grails
多治見IT勉強会 Groovy Grails多治見IT勉強会 Groovy Grails
多治見IT勉強会 Groovy Grails
Tsuyoshi Yamamoto
 
Indices APIs - Elasticsearch Reference
Indices APIs - Elasticsearch ReferenceIndices APIs - Elasticsearch Reference
Indices APIs - Elasticsearch Reference
Daniel Ku
 
MongoDB - Aggregation Pipeline
MongoDB - Aggregation PipelineMongoDB - Aggregation Pipeline
MongoDB - Aggregation Pipeline
Jason Terpko
 
JSDC 2014 - functional java script, why or why not
JSDC 2014 - functional java script, why or why notJSDC 2014 - functional java script, why or why not
JSDC 2014 - functional java script, why or why not
ChengHui Weng
 
Redis for the Everyday Developer
Redis for the Everyday DeveloperRedis for the Everyday Developer
Redis for the Everyday Developer
Ross Tuck
 
From Zero to Application Delivery with NixOS
From Zero to Application Delivery with NixOSFrom Zero to Application Delivery with NixOS
From Zero to Application Delivery with NixOS
Susan Potter
 
PuppetDB: A Single Source for Storing Your Puppet Data - PUG NY
PuppetDB: A Single Source for Storing Your Puppet Data - PUG NYPuppetDB: A Single Source for Storing Your Puppet Data - PUG NY
PuppetDB: A Single Source for Storing Your Puppet Data - PUG NY
Puppet
 
Ramda, a functional JavaScript library
Ramda, a functional JavaScript libraryRamda, a functional JavaScript library
Ramda, a functional JavaScript library
Derek Willian Stavis
 
Designing Opeation Oriented Web Applications / YAPC::Asia Tokyo 2011
Designing Opeation Oriented Web Applications / YAPC::Asia Tokyo 2011Designing Opeation Oriented Web Applications / YAPC::Asia Tokyo 2011
Designing Opeation Oriented Web Applications / YAPC::Asia Tokyo 2011
Masahiro Nagano
 
Sql cheat sheet
Sql cheat sheetSql cheat sheet
Sql cheat sheet
solgenomics
 
Php 102: Out with the Bad, In with the Good
Php 102: Out with the Bad, In with the GoodPhp 102: Out with the Bad, In with the Good
Php 102: Out with the Bad, In with the Good
Jeremy Kendall
 
Living with garbage
Living with garbageLiving with garbage
Living with garbage
lucenerevolution
 
Making an Object System with Tcl 8.5
Making an Object System with Tcl 8.5Making an Object System with Tcl 8.5
Making an Object System with Tcl 8.5
Donal Fellows
 
Neatly Hashing a Tree: FP tree-fold in Perl5 & Perl6
Neatly Hashing a Tree: FP tree-fold in Perl5 & Perl6Neatly Hashing a Tree: FP tree-fold in Perl5 & Perl6
Neatly Hashing a Tree: FP tree-fold in Perl5 & Perl6
Workhorse Computing
 
Manifests of Future Past
Manifests of Future PastManifests of Future Past
Manifests of Future Past
Puppet
 
Solr & Lucene @ Etsy by Gregg Donovan
Solr & Lucene @ Etsy by Gregg DonovanSolr & Lucene @ Etsy by Gregg Donovan
Solr & Lucene @ Etsy by Gregg Donovan
Gregg Donovan
 
Apache avro and overview hadoop tools
Apache avro and overview hadoop toolsApache avro and overview hadoop tools
Apache avro and overview hadoop tools
alireza alikhani
 
Modern technologies in data science
Modern technologies in data science Modern technologies in data science
Modern technologies in data science
Chucheng Hsieh
 
多治見IT勉強会 Groovy Grails
多治見IT勉強会 Groovy Grails多治見IT勉強会 Groovy Grails
多治見IT勉強会 Groovy Grails
Tsuyoshi Yamamoto
 
Indices APIs - Elasticsearch Reference
Indices APIs - Elasticsearch ReferenceIndices APIs - Elasticsearch Reference
Indices APIs - Elasticsearch Reference
Daniel Ku
 
MongoDB - Aggregation Pipeline
MongoDB - Aggregation PipelineMongoDB - Aggregation Pipeline
MongoDB - Aggregation Pipeline
Jason Terpko
 
JSDC 2014 - functional java script, why or why not
JSDC 2014 - functional java script, why or why notJSDC 2014 - functional java script, why or why not
JSDC 2014 - functional java script, why or why not
ChengHui Weng
 
Redis for the Everyday Developer
Redis for the Everyday DeveloperRedis for the Everyday Developer
Redis for the Everyday Developer
Ross Tuck
 
From Zero to Application Delivery with NixOS
From Zero to Application Delivery with NixOSFrom Zero to Application Delivery with NixOS
From Zero to Application Delivery with NixOS
Susan Potter
 
PuppetDB: A Single Source for Storing Your Puppet Data - PUG NY
PuppetDB: A Single Source for Storing Your Puppet Data - PUG NYPuppetDB: A Single Source for Storing Your Puppet Data - PUG NY
PuppetDB: A Single Source for Storing Your Puppet Data - PUG NY
Puppet
 
Ramda, a functional JavaScript library
Ramda, a functional JavaScript libraryRamda, a functional JavaScript library
Ramda, a functional JavaScript library
Derek Willian Stavis
 
Designing Opeation Oriented Web Applications / YAPC::Asia Tokyo 2011
Designing Opeation Oriented Web Applications / YAPC::Asia Tokyo 2011Designing Opeation Oriented Web Applications / YAPC::Asia Tokyo 2011
Designing Opeation Oriented Web Applications / YAPC::Asia Tokyo 2011
Masahiro Nagano
 
Php 102: Out with the Bad, In with the Good
Php 102: Out with the Bad, In with the GoodPhp 102: Out with the Bad, In with the Good
Php 102: Out with the Bad, In with the Good
Jeremy Kendall
 

Viewers also liked (14)

Elasticsearch - Zero to Hero
Elasticsearch - Zero to HeroElasticsearch - Zero to Hero
Elasticsearch - Zero to Hero
Daniel Ziv
 
Elastic meetup june16
Elastic meetup june16Elastic meetup june16
Elastic meetup june16
Miguel Bosin
 
Configuring elasticsearch for performance and scale
Configuring elasticsearch for performance and scaleConfiguring elasticsearch for performance and scale
Configuring elasticsearch for performance and scale
Bharvi Dixit
 
Elasticsearch in Production (London version)
Elasticsearch in Production (London version)Elasticsearch in Production (London version)
Elasticsearch in Production (London version)
foundsearch
 
You know, for search. Querying 24 Billion Documents in 900ms
You know, for search. Querying 24 Billion Documents in 900msYou know, for search. Querying 24 Billion Documents in 900ms
You know, for search. Querying 24 Billion Documents in 900ms
Jodok Batlogg
 
Search and analyze your data with elasticsearch
Search and analyze your data with elasticsearchSearch and analyze your data with elasticsearch
Search and analyze your data with elasticsearch
Anton Udovychenko
 
Introduction à ElasticSearch
Introduction à ElasticSearchIntroduction à ElasticSearch
Introduction à ElasticSearch
Fadel Chafai
 
Elasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & AggregationsElasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & Aggregations
Alaa Elhadba
 
ElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learnedElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learned
BeyondTrees
 
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Spark Summit
 
Elasticsearch in Zalando
Elasticsearch in ZalandoElasticsearch in Zalando
Elasticsearch in Zalando
Alaa Elhadba
 
Data modeling for Elasticsearch
Data modeling for ElasticsearchData modeling for Elasticsearch
Data modeling for Elasticsearch
Florian Hopf
 
Scaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - SematextScaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - Sematext
Rafał Kuć
 
An Introduction to Elastic Search.
An Introduction to Elastic Search.An Introduction to Elastic Search.
An Introduction to Elastic Search.
Jurriaan Persyn
 
Elasticsearch - Zero to Hero
Elasticsearch - Zero to HeroElasticsearch - Zero to Hero
Elasticsearch - Zero to Hero
Daniel Ziv
 
Elastic meetup june16
Elastic meetup june16Elastic meetup june16
Elastic meetup june16
Miguel Bosin
 
Configuring elasticsearch for performance and scale
Configuring elasticsearch for performance and scaleConfiguring elasticsearch for performance and scale
Configuring elasticsearch for performance and scale
Bharvi Dixit
 
Elasticsearch in Production (London version)
Elasticsearch in Production (London version)Elasticsearch in Production (London version)
Elasticsearch in Production (London version)
foundsearch
 
You know, for search. Querying 24 Billion Documents in 900ms
You know, for search. Querying 24 Billion Documents in 900msYou know, for search. Querying 24 Billion Documents in 900ms
You know, for search. Querying 24 Billion Documents in 900ms
Jodok Batlogg
 
Search and analyze your data with elasticsearch
Search and analyze your data with elasticsearchSearch and analyze your data with elasticsearch
Search and analyze your data with elasticsearch
Anton Udovychenko
 
Introduction à ElasticSearch
Introduction à ElasticSearchIntroduction à ElasticSearch
Introduction à ElasticSearch
Fadel Chafai
 
Elasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & AggregationsElasticsearch Introduction to Data model, Search & Aggregations
Elasticsearch Introduction to Data model, Search & Aggregations
Alaa Elhadba
 
ElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learnedElasticSearch in Production: lessons learned
ElasticSearch in Production: lessons learned
BeyondTrees
 
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Building a Dataset Search Engine with Spark and Elasticsearch: Spark Summit E...
Spark Summit
 
Elasticsearch in Zalando
Elasticsearch in ZalandoElasticsearch in Zalando
Elasticsearch in Zalando
Alaa Elhadba
 
Data modeling for Elasticsearch
Data modeling for ElasticsearchData modeling for Elasticsearch
Data modeling for Elasticsearch
Florian Hopf
 
Scaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - SematextScaling massive elastic search clusters - Rafał Kuć - Sematext
Scaling massive elastic search clusters - Rafał Kuć - Sematext
Rafał Kuć
 
An Introduction to Elastic Search.
An Introduction to Elastic Search.An Introduction to Elastic Search.
An Introduction to Elastic Search.
Jurriaan Persyn
 
Ad

Similar to ElasticSearch 5.x - New Tricks - 2017-02-08 - Elasticsearch Meetup (20)

Null Bachaav - May 07 Attack Monitoring workshop.
Null Bachaav - May 07 Attack Monitoring workshop.Null Bachaav - May 07 Attack Monitoring workshop.
Null Bachaav - May 07 Attack Monitoring workshop.
Prajal Kulkarni
 
Mongodb in-anger-boston-rb-2011
Mongodb in-anger-boston-rb-2011Mongodb in-anger-boston-rb-2011
Mongodb in-anger-boston-rb-2011
bostonrb
 
Workshop: Learning Elasticsearch
Workshop: Learning ElasticsearchWorkshop: Learning Elasticsearch
Workshop: Learning Elasticsearch
Anurag Patel
 
Elasticsearch sur Azure : Make sense of your (BIG) data !
Elasticsearch sur Azure : Make sense of your (BIG) data !Elasticsearch sur Azure : Make sense of your (BIG) data !
Elasticsearch sur Azure : Make sense of your (BIG) data !
Microsoft
 
Service discovery and configuration provisioning
Service discovery and configuration provisioningService discovery and configuration provisioning
Service discovery and configuration provisioning
Source Ministry
 
Burn down the silos! Helping dev and ops gel on high availability websites
Burn down the silos! Helping dev and ops gel on high availability websitesBurn down the silos! Helping dev and ops gel on high availability websites
Burn down the silos! Helping dev and ops gel on high availability websites
Lindsay Holmwood
 
Itb session v_memcached
Itb session v_memcachedItb session v_memcached
Itb session v_memcached
Skills Matter
 
Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...
Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...
Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...
Codemotion
 
ELK: a log management framework
ELK: a log management frameworkELK: a log management framework
ELK: a log management framework
Giovanni Bechis
 
REST with Eve and Python
REST with Eve and PythonREST with Eve and Python
REST with Eve and Python
PiXeL16
 
Cloud Meetup - Automation in the Cloud
Cloud Meetup - Automation in the CloudCloud Meetup - Automation in the Cloud
Cloud Meetup - Automation in the Cloud
petriojala123
 
Elasticsearch in 15 Minutes
Elasticsearch in 15 MinutesElasticsearch in 15 Minutes
Elasticsearch in 15 Minutes
Karel Minarik
 
DevOps Fest 2019. Сергей Марченко. Terraform: a novel about modules, provider...
DevOps Fest 2019. Сергей Марченко. Terraform: a novel about modules, provider...DevOps Fest 2019. Сергей Марченко. Terraform: a novel about modules, provider...
DevOps Fest 2019. Сергей Марченко. Terraform: a novel about modules, provider...
DevOps_Fest
 
Mojolicious
MojoliciousMojolicious
Mojolicious
Marcos Rebelo
 
Es part 2 pdf no build
Es part 2 pdf no buildEs part 2 pdf no build
Es part 2 pdf no build
Erik Rose
 
Attack monitoring using ElasticSearch Logstash and Kibana
Attack monitoring using ElasticSearch Logstash and KibanaAttack monitoring using ElasticSearch Logstash and Kibana
Attack monitoring using ElasticSearch Logstash and Kibana
Prajal Kulkarni
 
Elasticsearch (R)Evolution — You Know, for Search… by Philipp Krenn at Big Da...
Elasticsearch (R)Evolution — You Know, for Search… by Philipp Krenn at Big Da...Elasticsearch (R)Evolution — You Know, for Search… by Philipp Krenn at Big Da...
Elasticsearch (R)Evolution — You Know, for Search… by Philipp Krenn at Big Da...
Big Data Spain
 
Voxxed Athens 2018 - Elasticsearch (R)Evolution — You Know, for Search...
Voxxed Athens 2018 - Elasticsearch (R)Evolution — You Know, for Search...Voxxed Athens 2018 - Elasticsearch (R)Evolution — You Know, for Search...
Voxxed Athens 2018 - Elasticsearch (R)Evolution — You Know, for Search...
Voxxed Athens
 
Fatc
FatcFatc
Fatc
Wade Arnold
 
Good practices for PrestaShop code security and optimization
Good practices for PrestaShop code security and optimizationGood practices for PrestaShop code security and optimization
Good practices for PrestaShop code security and optimization
PrestaShop
 
Null Bachaav - May 07 Attack Monitoring workshop.
Null Bachaav - May 07 Attack Monitoring workshop.Null Bachaav - May 07 Attack Monitoring workshop.
Null Bachaav - May 07 Attack Monitoring workshop.
Prajal Kulkarni
 
Mongodb in-anger-boston-rb-2011
Mongodb in-anger-boston-rb-2011Mongodb in-anger-boston-rb-2011
Mongodb in-anger-boston-rb-2011
bostonrb
 
Workshop: Learning Elasticsearch
Workshop: Learning ElasticsearchWorkshop: Learning Elasticsearch
Workshop: Learning Elasticsearch
Anurag Patel
 
Elasticsearch sur Azure : Make sense of your (BIG) data !
Elasticsearch sur Azure : Make sense of your (BIG) data !Elasticsearch sur Azure : Make sense of your (BIG) data !
Elasticsearch sur Azure : Make sense of your (BIG) data !
Microsoft
 
Service discovery and configuration provisioning
Service discovery and configuration provisioningService discovery and configuration provisioning
Service discovery and configuration provisioning
Source Ministry
 
Burn down the silos! Helping dev and ops gel on high availability websites
Burn down the silos! Helping dev and ops gel on high availability websitesBurn down the silos! Helping dev and ops gel on high availability websites
Burn down the silos! Helping dev and ops gel on high availability websites
Lindsay Holmwood
 
Itb session v_memcached
Itb session v_memcachedItb session v_memcached
Itb session v_memcached
Skills Matter
 
Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...
Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...
Making your elastic cluster perform - Jettro Coenradie - Codemotion Amsterdam...
Codemotion
 
ELK: a log management framework
ELK: a log management frameworkELK: a log management framework
ELK: a log management framework
Giovanni Bechis
 
REST with Eve and Python
REST with Eve and PythonREST with Eve and Python
REST with Eve and Python
PiXeL16
 
Cloud Meetup - Automation in the Cloud
Cloud Meetup - Automation in the CloudCloud Meetup - Automation in the Cloud
Cloud Meetup - Automation in the Cloud
petriojala123
 
Elasticsearch in 15 Minutes
Elasticsearch in 15 MinutesElasticsearch in 15 Minutes
Elasticsearch in 15 Minutes
Karel Minarik
 
DevOps Fest 2019. Сергей Марченко. Terraform: a novel about modules, provider...
DevOps Fest 2019. Сергей Марченко. Terraform: a novel about modules, provider...DevOps Fest 2019. Сергей Марченко. Terraform: a novel about modules, provider...
DevOps Fest 2019. Сергей Марченко. Terraform: a novel about modules, provider...
DevOps_Fest
 
Es part 2 pdf no build
Es part 2 pdf no buildEs part 2 pdf no build
Es part 2 pdf no build
Erik Rose
 
Attack monitoring using ElasticSearch Logstash and Kibana
Attack monitoring using ElasticSearch Logstash and KibanaAttack monitoring using ElasticSearch Logstash and Kibana
Attack monitoring using ElasticSearch Logstash and Kibana
Prajal Kulkarni
 
Elasticsearch (R)Evolution — You Know, for Search… by Philipp Krenn at Big Da...
Elasticsearch (R)Evolution — You Know, for Search… by Philipp Krenn at Big Da...Elasticsearch (R)Evolution — You Know, for Search… by Philipp Krenn at Big Da...
Elasticsearch (R)Evolution — You Know, for Search… by Philipp Krenn at Big Da...
Big Data Spain
 
Voxxed Athens 2018 - Elasticsearch (R)Evolution — You Know, for Search...
Voxxed Athens 2018 - Elasticsearch (R)Evolution — You Know, for Search...Voxxed Athens 2018 - Elasticsearch (R)Evolution — You Know, for Search...
Voxxed Athens 2018 - Elasticsearch (R)Evolution — You Know, for Search...
Voxxed Athens
 
Good practices for PrestaShop code security and optimization
Good practices for PrestaShop code security and optimizationGood practices for PrestaShop code security and optimization
Good practices for PrestaShop code security and optimization
PrestaShop
 
Ad

More from Alberto Paro (9)

Data streaming
Data streamingData streaming
Data streaming
Alberto Paro
 
LUISS - Deep Learning and data analyses - 09/01/19
LUISS - Deep Learning and data analyses - 09/01/19LUISS - Deep Learning and data analyses - 09/01/19
LUISS - Deep Learning and data analyses - 09/01/19
Alberto Paro
 
2018 07-11 - kafka integration patterns
2018 07-11 - kafka integration patterns2018 07-11 - kafka integration patterns
2018 07-11 - kafka integration patterns
Alberto Paro
 
Elasticsearch in architetture Big Data - EsInADay-2017
Elasticsearch in architetture Big Data - EsInADay-2017Elasticsearch in architetture Big Data - EsInADay-2017
Elasticsearch in architetture Big Data - EsInADay-2017
Alberto Paro
 
2017 02-07 - elastic & spark. building a search geo locator
2017 02-07 - elastic & spark. building a search geo locator2017 02-07 - elastic & spark. building a search geo locator
2017 02-07 - elastic & spark. building a search geo locator
Alberto Paro
 
2016 02-24 - Piattaforme per i Big Data
2016 02-24 - Piattaforme per i Big Data2016 02-24 - Piattaforme per i Big Data
2016 02-24 - Piattaforme per i Big Data
Alberto Paro
 
What's Big Data? - Big Data Tech - 2015 - Firenze
What's Big Data? - Big Data Tech - 2015 - FirenzeWhat's Big Data? - Big Data Tech - 2015 - Firenze
What's Big Data? - Big Data Tech - 2015 - Firenze
Alberto Paro
 
ElasticSearch Meetup 30 - 10 - 2014
ElasticSearch Meetup 30 - 10 - 2014ElasticSearch Meetup 30 - 10 - 2014
ElasticSearch Meetup 30 - 10 - 2014
Alberto Paro
 
Scala Italy 2015 - Hands On ScalaJS
Scala Italy 2015 - Hands On ScalaJSScala Italy 2015 - Hands On ScalaJS
Scala Italy 2015 - Hands On ScalaJS
Alberto Paro
 
LUISS - Deep Learning and data analyses - 09/01/19
LUISS - Deep Learning and data analyses - 09/01/19LUISS - Deep Learning and data analyses - 09/01/19
LUISS - Deep Learning and data analyses - 09/01/19
Alberto Paro
 
2018 07-11 - kafka integration patterns
2018 07-11 - kafka integration patterns2018 07-11 - kafka integration patterns
2018 07-11 - kafka integration patterns
Alberto Paro
 
Elasticsearch in architetture Big Data - EsInADay-2017
Elasticsearch in architetture Big Data - EsInADay-2017Elasticsearch in architetture Big Data - EsInADay-2017
Elasticsearch in architetture Big Data - EsInADay-2017
Alberto Paro
 
2017 02-07 - elastic & spark. building a search geo locator
2017 02-07 - elastic & spark. building a search geo locator2017 02-07 - elastic & spark. building a search geo locator
2017 02-07 - elastic & spark. building a search geo locator
Alberto Paro
 
2016 02-24 - Piattaforme per i Big Data
2016 02-24 - Piattaforme per i Big Data2016 02-24 - Piattaforme per i Big Data
2016 02-24 - Piattaforme per i Big Data
Alberto Paro
 
What's Big Data? - Big Data Tech - 2015 - Firenze
What's Big Data? - Big Data Tech - 2015 - FirenzeWhat's Big Data? - Big Data Tech - 2015 - Firenze
What's Big Data? - Big Data Tech - 2015 - Firenze
Alberto Paro
 
ElasticSearch Meetup 30 - 10 - 2014
ElasticSearch Meetup 30 - 10 - 2014ElasticSearch Meetup 30 - 10 - 2014
ElasticSearch Meetup 30 - 10 - 2014
Alberto Paro
 
Scala Italy 2015 - Hands On ScalaJS
Scala Italy 2015 - Hands On ScalaJSScala Italy 2015 - Hands On ScalaJS
Scala Italy 2015 - Hands On ScalaJS
Alberto Paro
 

Recently uploaded (20)

Analysis of Billboards hot 100 toop five hit makers on the chart.docx
Analysis of Billboards hot 100 toop five hit makers on the chart.docxAnalysis of Billboards hot 100 toop five hit makers on the chart.docx
Analysis of Billboards hot 100 toop five hit makers on the chart.docx
hershtara1
 
AI ------------------------------ W1L2.pptx
AI ------------------------------ W1L2.pptxAI ------------------------------ W1L2.pptx
AI ------------------------------ W1L2.pptx
AyeshaJalil6
 
AWS Certified Machine Learning Slides.pdf
AWS Certified Machine Learning Slides.pdfAWS Certified Machine Learning Slides.pdf
AWS Certified Machine Learning Slides.pdf
philsparkshome
 
Multi-tenant Data Pipeline Orchestration
Multi-tenant Data Pipeline OrchestrationMulti-tenant Data Pipeline Orchestration
Multi-tenant Data Pipeline Orchestration
Romi Kuntsman
 
50_questions_full.pptxdddddddddddddddddd
50_questions_full.pptxdddddddddddddddddd50_questions_full.pptxdddddddddddddddddd
50_questions_full.pptxdddddddddddddddddd
emir73065
 
Controlling Financial Processes at a Municipality
Controlling Financial Processes at a MunicipalityControlling Financial Processes at a Municipality
Controlling Financial Processes at a Municipality
Process mining Evangelist
 
Mining a Global Trade Process with Data Science - Microsoft
Mining a Global Trade Process with Data Science - MicrosoftMining a Global Trade Process with Data Science - Microsoft
Mining a Global Trade Process with Data Science - Microsoft
Process mining Evangelist
 
lecture_13 tree in mmmmmmmm mmmmmfftro.pptx
lecture_13 tree in mmmmmmmm     mmmmmfftro.pptxlecture_13 tree in mmmmmmmm     mmmmmfftro.pptx
lecture_13 tree in mmmmmmmm mmmmmfftro.pptx
sarajafffri058
 
national income & related aggregates (1)(1).pptx
national income & related aggregates (1)(1).pptxnational income & related aggregates (1)(1).pptx
national income & related aggregates (1)(1).pptx
j2492618
 
L1_Slides_Foundational Concepts_508.pptx
L1_Slides_Foundational Concepts_508.pptxL1_Slides_Foundational Concepts_508.pptx
L1_Slides_Foundational Concepts_508.pptx
38NoopurPatel
 
Automated Melanoma Detection via Image Processing.pptx
Automated Melanoma Detection via Image Processing.pptxAutomated Melanoma Detection via Image Processing.pptx
Automated Melanoma Detection via Image Processing.pptx
handrymaharjan23
 
Z14_IBM__APL_by_Christian_Demmer_IBM.pdf
Z14_IBM__APL_by_Christian_Demmer_IBM.pdfZ14_IBM__APL_by_Christian_Demmer_IBM.pdf
Z14_IBM__APL_by_Christian_Demmer_IBM.pdf
Fariborz Seyedloo
 
How to Set Up Process Mining in a Decentralized Organization?
How to Set Up Process Mining in a Decentralized Organization?How to Set Up Process Mining in a Decentralized Organization?
How to Set Up Process Mining in a Decentralized Organization?
Process mining Evangelist
 
Process Mining Machine Recoveries to Reduce Downtime
Process Mining Machine Recoveries to Reduce DowntimeProcess Mining Machine Recoveries to Reduce Downtime
Process Mining Machine Recoveries to Reduce Downtime
Process mining Evangelist
 
real illuminati Uganda agent 0782561496/0756664682
real illuminati Uganda agent 0782561496/0756664682real illuminati Uganda agent 0782561496/0756664682
real illuminati Uganda agent 0782561496/0756664682
way to join real illuminati Agent In Kampala Call/WhatsApp+256782561496/0756664682
 
2024 Digital Equity Accelerator Report.pdf
2024 Digital Equity Accelerator Report.pdf2024 Digital Equity Accelerator Report.pdf
2024 Digital Equity Accelerator Report.pdf
dominikamizerska1
 
hersh's midterm project.pdf music retail and distribution
hersh's midterm project.pdf music retail and distributionhersh's midterm project.pdf music retail and distribution
hersh's midterm project.pdf music retail and distribution
hershtara1
 
文凭证书美国SDSU文凭圣地亚哥州立大学学生证学历认证查询
文凭证书美国SDSU文凭圣地亚哥州立大学学生证学历认证查询文凭证书美国SDSU文凭圣地亚哥州立大学学生证学历认证查询
文凭证书美国SDSU文凭圣地亚哥州立大学学生证学历认证查询
Taqyea
 
Understanding Complex Development Processes
Understanding Complex Development ProcessesUnderstanding Complex Development Processes
Understanding Complex Development Processes
Process mining Evangelist
 
CS-404 COA COURSE FILE JAN JUN 2025.docx
CS-404 COA COURSE FILE JAN JUN 2025.docxCS-404 COA COURSE FILE JAN JUN 2025.docx
CS-404 COA COURSE FILE JAN JUN 2025.docx
nidarizvitit
 
Analysis of Billboards hot 100 toop five hit makers on the chart.docx
Analysis of Billboards hot 100 toop five hit makers on the chart.docxAnalysis of Billboards hot 100 toop five hit makers on the chart.docx
Analysis of Billboards hot 100 toop five hit makers on the chart.docx
hershtara1
 
AI ------------------------------ W1L2.pptx
AI ------------------------------ W1L2.pptxAI ------------------------------ W1L2.pptx
AI ------------------------------ W1L2.pptx
AyeshaJalil6
 
AWS Certified Machine Learning Slides.pdf
AWS Certified Machine Learning Slides.pdfAWS Certified Machine Learning Slides.pdf
AWS Certified Machine Learning Slides.pdf
philsparkshome
 
Multi-tenant Data Pipeline Orchestration
Multi-tenant Data Pipeline OrchestrationMulti-tenant Data Pipeline Orchestration
Multi-tenant Data Pipeline Orchestration
Romi Kuntsman
 
50_questions_full.pptxdddddddddddddddddd
50_questions_full.pptxdddddddddddddddddd50_questions_full.pptxdddddddddddddddddd
50_questions_full.pptxdddddddddddddddddd
emir73065
 
Controlling Financial Processes at a Municipality
Controlling Financial Processes at a MunicipalityControlling Financial Processes at a Municipality
Controlling Financial Processes at a Municipality
Process mining Evangelist
 
Mining a Global Trade Process with Data Science - Microsoft
Mining a Global Trade Process with Data Science - MicrosoftMining a Global Trade Process with Data Science - Microsoft
Mining a Global Trade Process with Data Science - Microsoft
Process mining Evangelist
 
lecture_13 tree in mmmmmmmm mmmmmfftro.pptx
lecture_13 tree in mmmmmmmm     mmmmmfftro.pptxlecture_13 tree in mmmmmmmm     mmmmmfftro.pptx
lecture_13 tree in mmmmmmmm mmmmmfftro.pptx
sarajafffri058
 
national income & related aggregates (1)(1).pptx
national income & related aggregates (1)(1).pptxnational income & related aggregates (1)(1).pptx
national income & related aggregates (1)(1).pptx
j2492618
 
L1_Slides_Foundational Concepts_508.pptx
L1_Slides_Foundational Concepts_508.pptxL1_Slides_Foundational Concepts_508.pptx
L1_Slides_Foundational Concepts_508.pptx
38NoopurPatel
 
Automated Melanoma Detection via Image Processing.pptx
Automated Melanoma Detection via Image Processing.pptxAutomated Melanoma Detection via Image Processing.pptx
Automated Melanoma Detection via Image Processing.pptx
handrymaharjan23
 
Z14_IBM__APL_by_Christian_Demmer_IBM.pdf
Z14_IBM__APL_by_Christian_Demmer_IBM.pdfZ14_IBM__APL_by_Christian_Demmer_IBM.pdf
Z14_IBM__APL_by_Christian_Demmer_IBM.pdf
Fariborz Seyedloo
 
How to Set Up Process Mining in a Decentralized Organization?
How to Set Up Process Mining in a Decentralized Organization?How to Set Up Process Mining in a Decentralized Organization?
How to Set Up Process Mining in a Decentralized Organization?
Process mining Evangelist
 
Process Mining Machine Recoveries to Reduce Downtime
Process Mining Machine Recoveries to Reduce DowntimeProcess Mining Machine Recoveries to Reduce Downtime
Process Mining Machine Recoveries to Reduce Downtime
Process mining Evangelist
 
2024 Digital Equity Accelerator Report.pdf
2024 Digital Equity Accelerator Report.pdf2024 Digital Equity Accelerator Report.pdf
2024 Digital Equity Accelerator Report.pdf
dominikamizerska1
 
hersh's midterm project.pdf music retail and distribution
hersh's midterm project.pdf music retail and distributionhersh's midterm project.pdf music retail and distribution
hersh's midterm project.pdf music retail and distribution
hershtara1
 
文凭证书美国SDSU文凭圣地亚哥州立大学学生证学历认证查询
文凭证书美国SDSU文凭圣地亚哥州立大学学生证学历认证查询文凭证书美国SDSU文凭圣地亚哥州立大学学生证学历认证查询
文凭证书美国SDSU文凭圣地亚哥州立大学学生证学历认证查询
Taqyea
 
CS-404 COA COURSE FILE JAN JUN 2025.docx
CS-404 COA COURSE FILE JAN JUN 2025.docxCS-404 COA COURSE FILE JAN JUN 2025.docx
CS-404 COA COURSE FILE JAN JUN 2025.docx
nidarizvitit
 

ElasticSearch 5.x - New Tricks - 2017-02-08 - Elasticsearch Meetup

  • 1. Roma – 8 Febbraio 2017 presenta Alberto Paro, Seacom ElasticSearch 5.x New Tricks
  • 2. Alberto Paro  Laureato in Ingegneria Informatica (POLIMI)  Autore di 3 libri su ElasticSearch da 1 a 5.x + 6 Tech review  Lavoro principalmente in Scala e su tecnologie BD (Akka, Spray.io, Playframework, Apache Spark) e NoSQL (Accumulo, Cassandra, ElasticSearch e MongoDB)  Evangelist linguaggio Scala e Scala.JS
  • 3. Tip 1: Shrink - 1/5 Why?  The wrong number of shards during the initial design sizing. Often sizing the shards without knowing the correct data/text distribution tends to oversize the number of shards  Reducing the number of shards to reduce memory and resource usage  Reducing the number of shards to speed up searching
  • 4. Tip 1: Shrink - 2/5 - Where is your data? We can retrieve it via the _nodes API: curl -XGET 'http://localhost:9200/_nodes?pretty' In the result there will be a similar section: .... "nodes" : { "5Sei9ip8Qhee3J0o9dTV4g" : { "name" : "Gin Genie", "transport_address" : "127.0.0.1:9300", "host" : "127.0.0.1", "ip" : "127.0.0.1", "version" : "5.1.1",.... The name of my node is Gin Genie
  • 5. Tip 1: Shrink - 3/5 - Relocate your data We can change the index settings, forcing allocation to a single node for our index, and disabling the writing for the index. curl -XPUT 'http://localhost:9200/myindex/_settings' -d ’ { "settings": { "index.routing.allocation.require._name": "Gin Genie", "index.blocks.write": true } }’ We can check for the green status: curl -XGET 'http://localhost:9200/_cluster/health?pretty'
  • 6. Tip 1: Shrink - 4/5 – Shrink our shards We need to disable the writing for the index via: curl -XPUT 'http://localhost:9200/myindex/_settings?index.blocks.write=true' The shrink call for creating the reduced_index, will be: curl -XPOST 'http://localhost:9200/myindex/_shrink/reduced_index' -d '{ "settings": { "index.number_of_replicas": 1, "index.number_of_shards": 1, "index.codec": "best_compression” }, "aliases": {"my_search_indices": {}} }'
  • 7. Tip 1: Shrink - 5/5 – Post Shrinking We can also wait for a yellow status if the index it is ready to work: curl -XGET 'http://localhost:9200/_cluster/health? wait_for_status=yellow’ Now we can remove the read-only by changing the index settings: curl -XPUT 'http://localhost:9200/myindex/_settings? index.blocks.write=true'
  • 8. Tip 2: Reindex - 1/2 Why?  Changing an analyzer for a mapping  Adding a new subfield to a mapping and you need to reprocess all the records to search for the new subfield  Removing an unused mapping  Changing a record structure that requires a new mapping
  • 9. Tip 2: Reindex - 2/2 curl -XPOST 'http://localhost:9200/_reindex?pretty=true' -d '{ "source": { "index": "myindex” "type": "mytype", "query": "…" }, "dest": { "index": "myindex2", "script": "…" } }'
  • 10. Tip 3: Update By Query with painless Add a new Field 1. Create your mapping (i.e modified: date) 2. Call an update by query curl -XPOST http://$server/$index/$mapping/_update_by_query -d '{ "script": { "inline": "ctx._source.modified="2015-10-06T00:00:00.000+00:00"", "lang": "painless” }, "query": { "bool": {"must_not":[{"exists":{"field":"modified"} }]} } }'
  • 11. Tip 4: Use search_after Step 1: curl -XGET 'http://$server/$index/$type/_search' -d ’{ "size": 100, "query": { "match_all" : {} }, "sort": [{"_uid": "desc"} ] }’ Step n, n>1: curl -XGET 'http://$server/$index/$type/_search' -d ’{ "size": 100, "query": { "match_all" : {} }, "search_after": ["$type#100"], "sort": [{"_uid": "desc"} ] }’
  • 12. Tip 5: Reindex for a remote node – 1/2 Why?  The backup is a safe Lucene index copy, so it depends on the Elasticsearch version used. If you are switching from a version of Elastisearch that is prior to version 5.x, it's not possible to restore old indices.  It's not possible to restore backups of a newer Elasticsearch version in an older version. The restore is only forward- compatible.  It's not possible to restore partial data from a backup.
  • 13. Tip 5: Reindex for a remote node – 2/2 In config/elasticsearch.yml add: reindex.remote.whitelist: ["192.168.1.227:9200"] Then: curl -XPOST "http://$server/_reindex" -d' { "source": { "remote": { "host": "http://192.168.1.227:9200" }, "index": "test-source” }, "dest": { "index": "test-dest” } }'
  • 14. Tip 6: Ingest Pipeline – 1/2 Why  Adding/Removing fields without changing your code  Manipulate your records before ingesting  Computed fields  Also supports scripting
  • 15. Tip 6: Ingest Pipeline – 2/2 curl -XPUT 'http://127.0.0.1:9200/_ingest/pipeline/add-user-john' -d '{ "description" : "Add user john field", "processors" : [ { "set" : { "field": "user", "value": "john"} } ], "version":1 }’ curl -XPUT http://$server/$index/$type/$id?pipeline=add-user-john -d '{}'
  • 17. Q&A

Editor's Notes

  • #4: Spark SQL (DB, Json, case class)
  • #5: Spark SQL (DB, Json, case class)
  • #6: Spark SQL (DB, Json, case class)
  • #7: Spark SQL (DB, Json, case class)
  • #8: Spark SQL (DB, Json, case class)
  • #9: Spark SQL (DB, Json, case class)
  • #10: Spark SQL (DB, Json, case class)
  • #11: Spark SQL (DB, Json, case class)
  • #12: Spark SQL (DB, Json, case class)
  翻译: