Introducing the
wildcard field
Mark Harwood
Developer, Elasticsearch
This presentation and the accompanying oral presentation contain forward-looking statements, including statements
concerning plans for future offerings; the expected strength, performance or benefits of our offerings; and our future
operations and expected performance. These forward-looking statements are subject to the safe harbor provisions
under the Private Securities Litigation Reform Act of 1995. Our expectations and beliefs in light of currently
available information regarding these matters may not materialize. Actual outcomes and results may differ materially
from those contemplated by these forward-looking statements due to uncertainties, risks, and changes in
circumstances, including, but not limited to those related to: the impact of the COVID-19 pandemic on our business
and our customers and partners; our ability to continue to deliver and improve our offerings and successfully
develop new offerings, including security-related product offerings and SaaS offerings; customer acceptance and
purchase of our existing offerings and new offerings, including the expansion and adoption of our SaaS offerings;
our ability to realize value from investments in the business, including R&D investments; our ability to maintain and
expand our user and customer base; our international expansion strategy; our ability to successfully execute our
go-to-market strategy and expand in our existing markets and into new markets, and our ability to forecast customer
retention and expansion; and general market, political, economic and business conditions.
Additional risks and uncertainties that could cause actual outcomes and results to differ materially are included in
our filings with the Securities and Exchange Commission (the “SEC”), including our Annual Report on Form 10-K for
the most recent fiscal year, our quarterly report on Form 10-Q for the most recent fiscal quarter, and any
subsequent reports filed with the SEC. SEC filings are available on the Investor Relations section of Elastic’s
website at ir.elastic.co and the SEC’s website at www.sec.gov.
Any features or functions of services or products referenced in this presentation, or in any presentations, press
releases or public statements, which are not currently available or not currently available as a general availability
release, may not be delivered on time or at all. The development, release, and timing of any features or functionality
described for our products remains at our sole discretion. Customers who purchase our products and services
should make the purchase decisions based upon services and product features and functions that are currently
available.
All statements are made only as of the date of the presentation, and Elastic assumes no obligation to, and does not
currently intend to, update any forward-looking statements or statements relating to features or functions of services
or products, except as required by law.
Forward-Looking Statements
When your string content
doesn’t work well with
keyword or text fields
A better choice for log messages?
Roots: text search
Text is split into words
The quick
brown fox
jumps over
the lazy dog.
Documents
brown 1
dog 1
fox 1
jump 1
lazy 1
over 1
quick 1
the 1,2
Search index
Indexer
The quick
brown fox
jumps over
the lazy
dog.
Roots: text search
User queries are split into words too
User queries
Search index
Searcher
Jumping foxes
Roots: text search
This works because user and index both agree what words are
User queries Index
the quick brown fox jumps over the lazy dog
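The indexing and query steps on the preceding slides can be sketched in a few lines of Python. This is a toy illustration only: the tokenizer and the suffix-stripping "stemmer" are simplifying assumptions, not the analysis chain Elasticsearch uses.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase, split on non-letters, then strip a few suffixes (toy stemmer)."""
    tokens = []
    for word in re.findall(r"[a-z]+", text.lower()):
        for suffix in ("ing", "es", "s"):
            if word.endswith(suffix) and len(word) - len(suffix) >= 3:
                word = word[: -len(suffix)]
                break
        tokens.append(word)
    return tokens

def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents that contain every token of the query."""
    postings = [index.get(token, set()) for token in tokenize(query)]
    return set.intersection(*postings) if postings else set()

docs = {1: "The quick brown fox jumps over the lazy dog."}
index = build_index(docs)
print(sorted(index))                    # the terms held in the search index
print(search(index, "Jumping foxes"))   # the query is tokenized the same way
```

The search only works because the query passes through the same tokenize function as the documents, which is the "both agree what words are" point.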
Newsflash!
Not everyone is interested in finding quick brown foxes...
The rise of machine-generated content
Log files have disrupted the conversation
User queries Index
???
Q Is this document:
● One word?
● Four words?
● Nine words?
● Ten words?
A The end user very rarely knows
CWindowsSystem32WindowsPowerShellv1.0powershell.exe
Will it match?
Matching machine-generated content is not straightforward
powershell.exe
CWindowsSystem32WindowsPowerShellv1.0powershell.exe
Q Will this security analyst’s search match this document?
A It depends…
3 text indexing choices, 3 different query options
The choice of tokenizer dictates the type of query required
Search term: powershell.exe

Index A
exe 1
powershell 1
Requires a phrase query to match. A “match” query can do the right thing for this index.

Index B
powershell.exe 1
Requires a term query to match. A “match” query can do the right thing for this index.

Index C
c:\windows\system32\windowspowershell\v1.0\powershell.exe 1
Requires a leading wildcard query to match. A match query can’t help with this index.
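A rough sketch of why the tokenizer choice dictates the query type. The three tokenizer functions below are hypothetical stand-ins for the three indexing choices on the slide, not real Elasticsearch analyzers.

```python
import re
from fnmatch import fnmatchcase

PATH = r"c:\windows\system32\windowspowershell\v1.0\powershell.exe"
SEARCH = "powershell.exe"

def index_a(value):
    """Index A: split on every non-alphanumeric character."""
    return re.findall(r"[a-z0-9]+", value.lower())

def index_b(value):
    """Index B: split on path separators only."""
    return [t for t in re.split(r"[\\/]", value.lower()) if t]

def index_c(value):
    """Index C: keep the whole value as a single token."""
    return [value.lower()]

def phrase_match(tokens, phrase):
    """True if the phrase tokens appear consecutively in the indexed tokens."""
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))

# A: the search term becomes two tokens, so it needs a phrase query.
print(phrase_match(index_a(PATH), index_a(SEARCH)))
# B: the search term is exactly one indexed token, so a term query works.
print(index_b(SEARCH)[0] in index_b(PATH))
# C: analyzing the search term does not help; only a leading wildcard matches.
print(index_c(SEARCH)[0] in index_c(PATH), fnmatchcase(index_c(PATH)[0], "*" + SEARCH))
```

In cases A and B, running the search term through the index's own tokenizer produces a query that matches, which is roughly what a match query does. In case C it cannot.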
Text field summary
Elasticsearch’s “text” field has some great features for dealing with
words found in human language:
● Tokenisation - punctuation removal
● Case normalisation
● Stemming
● Synonym expansion
But, these features get in the way of searching machine-generated
content such as:
● URLs
● File paths
● Stack traces
Is the keyword field
any better for log data?
Many in the security field today use keyword fields
Our own Elastic Common Schema (ECS) uses keyword fields
c:\documents\work\expenses.doc 1
c:\documents\work\presentation1.doc 3
c:\documents\work\presentation2.doc 2
c:\windows\system32\windowspowershell\v1.0\powershell.exe 5
Keyword index
Searchers run wildcard or regexp queries on untokenized strings
*powershell.exe
Two issues with keyword fields
Speed, size limits
c:\documents\work\expenses.doc 1
c:\documents\work\presentation1.doc 3
c:\documents\work\presentation2.doc 2
c:\windows\system32\windowspowershell\v1.0\powershell.exe 5
*powershell.exe
!! Linear scans of all entries in the index
!! Limits on string lengths = blind spots
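Both issues can be illustrated with a small Python sketch. The term list stands in for a keyword index, and the max_len parameter is a hypothetical limit chosen small enough for the demo to trigger it. It is not the real limit.

```python
from fnmatch import fnmatchcase

VALUES = [
    r"c:\documents\work\expenses.doc",
    r"c:\documents\work\presentation1.doc",
    r"c:\documents\work\presentation2.doc",
    r"c:\windows\system32\windowspowershell\v1.0\powershell.exe",
]

def index_keyword(values, max_len):
    """Sorted unique terms; values longer than max_len are never indexed."""
    return sorted({v for v in values if len(v) <= max_len})

def wildcard_scan(terms, pattern):
    """A leading wildcard gives no prefix to seek to, so every term is tested.

    Returns (matching terms, number of terms examined).
    """
    matches = [t for t in terms if fnmatchcase(t, pattern)]
    return matches, len(terms)

# Speed: the number of terms examined grows with the size of the index.
print(wildcard_scan(index_keyword(VALUES, max_len=100), "*powershell.exe"))
# Blind spot: the long path was dropped at index time, so it can never match.
print(wildcard_scan(index_keyword(VALUES, max_len=40), "*powershell.exe"))
```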
Summary: caught between ...
● Text field: fast search of “words” that don’t exist
● Keyword field: slow searches and blind spots
Enter the wildcard
field...
An alternative to word-based matching or brute-force scans
Wildcard field
Identical requests+results to keyword field - just faster
Keyword field:
{
  "query": {
    "wildcard": {
      "Myfield": { "value": "*.exe" }
    }
  }
}
==
Wildcard field:
{
  "query": {
    "wildcard": {
      "Myfield": { "value": "*.exe" }
    }
  }
}
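Since the request is the same for both field types, the only thing that changes is the mapping. A minimal sketch of the two mappings and the shared query body, built as plain Python dictionaries; "Myfield" comes from the slide, and sending these bodies to a cluster is left out.

```python
import json

def mapping_for(field_type):
    """Index mapping body declaring Myfield with the given field type."""
    return {"mappings": {"properties": {"Myfield": {"type": field_type}}}}

# The same query body is used whichever mapping the index has.
QUERY = {"query": {"wildcard": {"Myfield": {"value": "*.exe"}}}}

for field_type in ("keyword", "wildcard"):
    print(json.dumps(mapping_for(field_type), indent=2))
print(json.dumps(QUERY, indent=2))
```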
Wildcard field
.ex 1,133
1.0 1
dow 1
ell 1
em3 1
ers 1
exe 1,133
... ...
ngram index
Indexer
C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe
Compressed doc value store
Compression icon by twist.glyph from the Noun Project
Documents are stored in two data structures behind the scenes
Searcher
*.exe
Searches can start at any position, using any characters
The ngram index is fast - but can produce false positives
/System/Library/Apple Logic/Instruments/vexed/default.exs
.ex AND exe
A final check on match candidates is performed using the full values
Only fully verified matches are returned
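The two-stage search described on these slides (a fast ngram lookup, then verification against the stored full values) can be sketched as follows. This is a simplified illustration, not the actual implementation: the class name is hypothetical, a dictionary stands in for the compressed doc value store, and Python's fnmatch stands in for the wildcard matcher.

```python
from collections import defaultdict
from fnmatch import fnmatchcase

def ngrams(text, n=3):
    """All distinct substrings of length n."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

class WildcardFieldSketch:
    def __init__(self):
        self.postings = defaultdict(set)  # ngram -> doc ids (the ngram index)
        self.values = {}                  # doc id -> full value (the doc value store)

    def add(self, doc_id, value):
        self.values[doc_id] = value
        for gram in ngrams(value.lower()):  # lowercased, as in the slide's table
            self.postings[gram].add(doc_id)

    def candidates(self, pattern):
        """Docs containing every ngram of every literal run in the pattern."""
        required = set()
        for literal in pattern.lower().replace("?", "*").split("*"):
            required |= ngrams(literal)
        if not required:
            return set(self.values)  # nothing selective to filter on
        return set.intersection(*(self.postings.get(g, set()) for g in required))

    def search(self, pattern):
        """Verify each candidate against its full stored value."""
        return sorted(d for d in self.candidates(pattern)
                      if fnmatchcase(self.values[d], pattern))

field = WildcardFieldSketch()
field.add(1, r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe")
field.add(2, "/System/Library/Apple Logic/Instruments/vexed/default.exs")
print(field.candidates("*.exe"))  # both docs contain ".ex" and "exe"
print(field.search("*.exe"))      # only the verified match survives
```

Document 2 is a false positive at the ngram stage because ".ex" comes from "default.exs" and "exe" from "vexed". The verification step removes it.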
The real smarts in wildcard field...
GOAL
Minimise the number of
blocks we decompress
for verification purposes
Wildcard field
Regular expression     Equivalent ngram query
.*\.(dll|exe)          (.dl AND ll_) OR (.ex AND xe_)
Accelerating regular expression queries
Regular expressions are automatically parsed into the most selective equivalent
ngram query we can safely make.
Stricter ngram queries = fewer false positives = fewer verification checks = quicker searches.
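A sketch of how such an ngram query can pre-filter values before the regular expression itself is run. The ngram query is copied by hand from the slide rather than derived by a parser. Treating "_" as an end-of-value marker is an assumption about the slide's notation, and Python's re module stands in for the real regexp engine.

```python
import re

END = "_"  # assumed end-of-value marker, as in "ll_" and "xe_"

def ngrams(value, n=3):
    """Trigrams of the lowercased value with the end marker appended."""
    padded = value.lower() + END
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}

# (.dl AND ll_) OR (.ex AND xe_)
NGRAM_QUERY = ("OR", ("AND", ".dl", "ll_"), ("AND", ".ex", "xe_"))
REGEX = re.compile(r".*\.(dll|exe)")

def matches(query, grams):
    """Evaluate a nested AND/OR ngram query against a set of ngrams."""
    if isinstance(query, str):
        return query in grams
    op, *clauses = query
    results = [matches(clause, grams) for clause in clauses]
    return all(results) if op == "AND" else any(results)

def search(values):
    """Cheap ngram filter first, full regular expression check second."""
    candidates = [v for v in values if matches(NGRAM_QUERY, ngrams(v))]
    return [v for v in candidates if REGEX.fullmatch(v)]

VALUES = ["kernel32.dll", "powershell.exe", "default.exs", "notes.txt", "x.dlx.all"]
print(search(VALUES))
```

Here "x.dlx.all" passes the ngram filter (it contains ".dl" and ends in "ll") but fails the regular expression, so it costs one verification check. A marker character that cannot occur in real values would be needed in practice.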
Wildcard field - testing
Accelerating regular expression queries safely
Millions of randomly-generated regular expressions are used in searches, for example:
fHs^e([A|B]|w{1,5})AbA.* ?
Thorough (but slow) keyword field results are compared with wildcard field results.
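The same testing idea, comparing a slow but trustworthy search against the accelerated one on random inputs, can be sketched at toy scale. As a simplification this sketch generates random wildcard patterns rather than regular expressions, and both search functions are illustrative stand-ins for the keyword and wildcard fields.

```python
import random
from fnmatch import fnmatchcase

ALPHABET = "ab."

def ngrams(text, n=3):
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def slow_search(values, pattern):
    """Reference result: test every value against the pattern."""
    return {v for v in values if fnmatchcase(v, pattern)}

def fast_search(values, index, pattern):
    """Accelerated result: ngram pre-filter, then verification."""
    candidates = set(values)
    for literal in pattern.replace("?", "*").split("*"):
        for gram in ngrams(literal):
            candidates &= index.get(gram, set())
    return {v for v in candidates if fnmatchcase(v, pattern)}

def run_trials(trials=2000, seed=0):
    """Raise AssertionError on the first pattern where the two searches differ."""
    rng = random.Random(seed)
    values = {"".join(rng.choices(ALPHABET, k=rng.randint(1, 8))) for _ in range(200)}
    index = {}
    for value in values:
        for gram in ngrams(value):
            index.setdefault(gram, set()).add(value)
    for _ in range(trials):
        pattern = "".join(rng.choices(ALPHABET + "*?", k=rng.randint(1, 6)))
        assert slow_search(values, pattern) == fast_search(values, index, pattern), pattern
    return trials

print(run_trials())
```

Any disagreement means the ngram pre-filter discarded a true match, which is the kind of unsafe acceleration this testing is meant to catch.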
Wildcard field
Case insensitivity
Many security rules written today use regular expressions with this syntax:
Regular expression                 Equivalent ngram query
.*[Cc][Mm][Dd].[Ee][Xx][Ee].*      cmd AND d.e AND exe
In 7.10 querying will be simpler:
"regexp": {
  "my_wildcard_field": {
    "value": ".*cmd.exe.*",
    "case_insensitive": true
  }
}
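A quick way to see that the two forms express the same intent is to compare them on a few sample strings. The sketch below uses Python's re module with the IGNORECASE flag as a stand-in for the case_insensitive option, taking both patterns as printed on the slide. The sample strings are made up for illustration.

```python
import re

# Hand-written character classes, as used in many existing rules.
EXPLICIT = re.compile(r".*[Cc][Mm][Dd].[Ee][Xx][Ee].*")
# The simpler pattern plus a case-insensitivity flag.
SIMPLE = re.compile(r".*cmd.exe.*", re.IGNORECASE)

SAMPLES = [
    r"C:\Windows\System32\CMD.EXE /c whoami",
    "cmd.exe",
    "Cmd.Exe",
    "command.exe",
    "powershell.exe",
]

for sample in SAMPLES:
    explicit_hit = bool(EXPLICIT.fullmatch(sample))
    simple_hit = bool(SIMPLE.fullmatch(sample))
    print(f"{sample!r}: explicit={explicit_hit} simple={simple_hit}")
```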
Summary
When to use the wildcard field?
Which part of strings do you want to match?
● Whole value or beginning: use keyword field
● Anywhere: is it obvious where words begin and end?
  ● Yes, it’s everyday language: use the text field
  ● No, a layman might call it “gobbledygook”: how big is the largest value?
    ● Large (>32k): use the wildcard field
    ● Under 32k: how many unique values are there?
      ● Thousands or less: use keyword field
      ● Millions or more: use the wildcard field
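One reading of the decision flow above, written as a small Python function. The function and parameter names are hypothetical. The slide gives no guidance for unique-value counts between "thousands" and "millions", so the sketch takes a yes/no answer instead of a number.

```python
def choose_field(match_anywhere, everyday_language,
                 largest_value_over_32k, millions_of_unique_values):
    """Return the suggested field type: 'keyword', 'text' or 'wildcard'."""
    if not match_anywhere:
        return "keyword"      # matching the whole value or its beginning
    if everyday_language:
        return "text"         # it is obvious where words begin and end
    if largest_value_over_32k:
        return "wildcard"     # large values
    # Smaller values: the number of unique values decides.
    return "wildcard" if millions_of_unique_values else "keyword"

# Hypothetical example: file paths searched by a fragment such as "*powershell.exe",
# with no huge values but millions of distinct paths.
print(choose_field(match_anywhere=True, everyday_language=False,
                   largest_value_over_32k=False, millions_of_unique_values=True))
```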
Try it out!
Spin up a free trial at:
https://cloud.elastic.co/registration
Thank You!
Config 2025 presentation recap covering both daysConfig 2025 presentation recap covering both days
Config 2025 presentation recap covering both days
TrishAntoni1
 
Smart Investments Leveraging Agentic AI for Real Estate Success.pptx
Smart Investments Leveraging Agentic AI for Real Estate Success.pptxSmart Investments Leveraging Agentic AI for Real Estate Success.pptx
Smart Investments Leveraging Agentic AI for Real Estate Success.pptx
Seasia Infotech
 
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Wonjun Hwang
 
Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...
Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...
Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...
Safe Software
 
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Markus Eisele
 
Agentic Automation - Delhi UiPath Community Meetup
Agentic Automation - Delhi UiPath Community MeetupAgentic Automation - Delhi UiPath Community Meetup
Agentic Automation - Delhi UiPath Community Meetup
Manoj Batra (1600 + Connections)
 
Could Virtual Threads cast away the usage of Kotlin Coroutines - DevoxxUK2025
Could Virtual Threads cast away the usage of Kotlin Coroutines - DevoxxUK2025Could Virtual Threads cast away the usage of Kotlin Coroutines - DevoxxUK2025
Could Virtual Threads cast away the usage of Kotlin Coroutines - DevoxxUK2025
João Esperancinha
 
An Overview of Salesforce Health Cloud & How is it Transforming Patient Care
An Overview of Salesforce Health Cloud & How is it Transforming Patient CareAn Overview of Salesforce Health Cloud & How is it Transforming Patient Care
An Overview of Salesforce Health Cloud & How is it Transforming Patient Care
Cyntexa
 
AI-proof your career by Olivier Vroom and David WIlliamson
AI-proof your career by Olivier Vroom and David WIlliamsonAI-proof your career by Olivier Vroom and David WIlliamson
AI-proof your career by Olivier Vroom and David WIlliamson
UXPA Boston
 
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Raffi Khatchadourian
 
May Patch Tuesday
May Patch TuesdayMay Patch Tuesday
May Patch Tuesday
Ivanti
 
Unlocking Generative AI in your Web Apps
Unlocking Generative AI in your Web AppsUnlocking Generative AI in your Web Apps
Unlocking Generative AI in your Web Apps
Maximiliano Firtman
 
AI 3-in-1: Agents, RAG, and Local Models - Brent Laster
AI 3-in-1: Agents, RAG, and Local Models - Brent LasterAI 3-in-1: Agents, RAG, and Local Models - Brent Laster
AI 3-in-1: Agents, RAG, and Local Models - Brent Laster
All Things Open
 
Com fer un pla de gestió de dades amb l'eiNa DMP (en anglès)
Com fer un pla de gestió de dades amb l'eiNa DMP (en anglès)Com fer un pla de gestió de dades amb l'eiNa DMP (en anglès)
Com fer un pla de gestió de dades amb l'eiNa DMP (en anglès)
CSUC - Consorci de Serveis Universitaris de Catalunya
 
Bepents tech services - a premier cybersecurity consulting firm
Bepents tech services - a premier cybersecurity consulting firmBepents tech services - a premier cybersecurity consulting firm
Bepents tech services - a premier cybersecurity consulting firm
Benard76
 
IT484 Cyber Forensics_Information Technology
IT484 Cyber Forensics_Information TechnologyIT484 Cyber Forensics_Information Technology
IT484 Cyber Forensics_Information Technology
SHEHABALYAMANI
 
Cybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and MitigationCybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and Mitigation
VICTOR MAESTRE RAMIREZ
 
fennec fox optimization algorithm for optimal solution
fennec fox optimization algorithm for optimal solutionfennec fox optimization algorithm for optimal solution
fennec fox optimization algorithm for optimal solution
shallal2
 
Build With AI - In Person Session Slides.pdf
Build With AI - In Person Session Slides.pdfBuild With AI - In Person Session Slides.pdf
Build With AI - In Person Session Slides.pdf
Google Developer Group - Harare
 
How to Install & Activate ListGrabber - eGrabber
How to Install & Activate ListGrabber - eGrabberHow to Install & Activate ListGrabber - eGrabber
How to Install & Activate ListGrabber - eGrabber
eGrabber
 
Config 2025 presentation recap covering both days
Config 2025 presentation recap covering both daysConfig 2025 presentation recap covering both days
Config 2025 presentation recap covering both days
TrishAntoni1
 
Smart Investments Leveraging Agentic AI for Real Estate Success.pptx
Smart Investments Leveraging Agentic AI for Real Estate Success.pptxSmart Investments Leveraging Agentic AI for Real Estate Success.pptx
Smart Investments Leveraging Agentic AI for Real Estate Success.pptx
Seasia Infotech
 
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Kit-Works Team Study_아직도 Dockefile.pdf_김성호
Wonjun Hwang
 
Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...
Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...
Integrating FME with Python: Tips, Demos, and Best Practices for Powerful Aut...
Safe Software
 
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Enterprise Integration Is Dead! Long Live AI-Driven Integration with Apache C...
Markus Eisele
 
Could Virtual Threads cast away the usage of Kotlin Coroutines - DevoxxUK2025
Could Virtual Threads cast away the usage of Kotlin Coroutines - DevoxxUK2025Could Virtual Threads cast away the usage of Kotlin Coroutines - DevoxxUK2025
Could Virtual Threads cast away the usage of Kotlin Coroutines - DevoxxUK2025
João Esperancinha
 
An Overview of Salesforce Health Cloud & How is it Transforming Patient Care
An Overview of Salesforce Health Cloud & How is it Transforming Patient CareAn Overview of Salesforce Health Cloud & How is it Transforming Patient Care
An Overview of Salesforce Health Cloud & How is it Transforming Patient Care
Cyntexa
 
AI-proof your career by Olivier Vroom and David WIlliamson
AI-proof your career by Olivier Vroom and David WIlliamsonAI-proof your career by Olivier Vroom and David WIlliamson
AI-proof your career by Olivier Vroom and David WIlliamson
UXPA Boston
 
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Challenges in Migrating Imperative Deep Learning Programs to Graph Execution:...
Raffi Khatchadourian
 
May Patch Tuesday
May Patch TuesdayMay Patch Tuesday
May Patch Tuesday
Ivanti
 
Unlocking Generative AI in your Web Apps
Unlocking Generative AI in your Web AppsUnlocking Generative AI in your Web Apps
Unlocking Generative AI in your Web Apps
Maximiliano Firtman
 
AI 3-in-1: Agents, RAG, and Local Models - Brent Laster
AI 3-in-1: Agents, RAG, and Local Models - Brent LasterAI 3-in-1: Agents, RAG, and Local Models - Brent Laster
AI 3-in-1: Agents, RAG, and Local Models - Brent Laster
All Things Open
 
Bepents tech services - a premier cybersecurity consulting firm
Bepents tech services - a premier cybersecurity consulting firmBepents tech services - a premier cybersecurity consulting firm
Bepents tech services - a premier cybersecurity consulting firm
Benard76
 
IT484 Cyber Forensics_Information Technology
IT484 Cyber Forensics_Information TechnologyIT484 Cyber Forensics_Information Technology
IT484 Cyber Forensics_Information Technology
SHEHABALYAMANI
 
Cybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and MitigationCybersecurity Threat Vectors and Mitigation
Cybersecurity Threat Vectors and Mitigation
VICTOR MAESTRE RAMIREZ
 
fennec fox optimization algorithm for optimal solution
fennec fox optimization algorithm for optimal solutionfennec fox optimization algorithm for optimal solution
fennec fox optimization algorithm for optimal solution
shallal2
 
How to Install & Activate ListGrabber - eGrabber
How to Install & Activate ListGrabber - eGrabberHow to Install & Activate ListGrabber - eGrabber
How to Install & Activate ListGrabber - eGrabber
eGrabber
 

Elasticsearch: Introducing the wildcard field

  • 1. Introducing the wildcard field. Mark Harwood, Developer, Elasticsearch
  • 2. Forward-Looking Statements. This presentation and the accompanying oral presentation contain forward-looking statements, including statements concerning plans for future offerings; the expected strength, performance or benefits of our offerings; and our future operations and expected performance. These forward-looking statements are subject to the safe harbor provisions under the Private Securities Litigation Reform Act of 1995. Our expectations and beliefs in light of currently available information regarding these matters may not materialize. Actual outcomes and results may differ materially from those contemplated by these forward-looking statements due to uncertainties, risks, and changes in circumstances, including, but not limited to those related to: the impact of the COVID-19 pandemic on our business and our customers and partners; our ability to continue to deliver and improve our offerings and successfully develop new offerings, including security-related product offerings and SaaS offerings; customer acceptance and purchase of our existing offerings and new offerings, including the expansion and adoption of our SaaS offerings; our ability to realize value from investments in the business, including R&D investments; our ability to maintain and expand our user and customer base; our international expansion strategy; our ability to successfully execute our go-to-market strategy and expand in our existing markets and into new markets, and our ability to forecast customer retention and expansion; and general market, political, economic and business conditions. Additional risks and uncertainties that could cause actual outcomes and results to differ materially are included in our filings with the Securities and Exchange Commission (the “SEC”), including our Annual Report on Form 10-K for the most recent fiscal year, our quarterly report on Form 10-Q for the most recent fiscal quarter, and any subsequent reports filed with the SEC. SEC filings are available on the Investor Relations section of Elastic’s website at ir.elastic.co and the SEC’s website at www.sec.gov. Any features or functions of services or products referenced in this presentation, or in any presentations, press releases or public statements, which are not currently available or not currently available as a general availability release, may not be delivered on time or at all. The development, release, and timing of any features or functionality described for our products remains at our sole discretion. Customers who purchase our products and services should make the purchase decisions based upon services and product features and functions that are currently available. All statements are made only as of the date of the presentation, and Elastic assumes no obligation to, and does not currently intend to, update any forward-looking statements or statements relating to features or functions of services or products, except as required by law.
  • 3. A better choice for log messages? For when your string content doesn’t work well with keyword or text fields.
  • 4. Roots: text search. Text is split into words. The indexer takes the document “The quick brown fox jumps over the lazy dog.” and builds a search index with these entries: brown 1; dog 1; fox 1; jump 1; lazy 1; over 1; quick 1; the 1,2.
  • 5. Roots: text search. User queries are split into words too. The searcher splits the user query “Jumping foxes” into words and looks them up in the same search index: brown 1; dog 1; fox 1; jump 1; lazy 1; over 1; quick 1; the 1,2.
  • 6. Roots: text search. This works because user and index both agree what words are: the | quick | brown | fox | jumps | over | the | lazy | dog.
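The word-based matching in slides 4 to 6 can be sketched in a few lines of Python. This is only an illustration, not how Elasticsearch implements analysis: the `analyze`, `stem`, `build_index` and `search` helpers and the toy suffix-stripping rules are assumptions made for this example.

```python
import re
from collections import defaultdict


def stem(word):
    # Toy suffix stripping (an assumption for this sketch, not a real stemmer).
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def analyze(text):
    # Lowercase, keep runs of letters as words, then stem each word.
    return [stem(word) for word in re.findall(r"[a-z]+", text.lower())]


def build_index(docs):
    # Map each analyzed term to the set of document ids that contain it.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    # The query goes through the same analysis as the documents,
    # so "Jumping foxes" becomes the terms "jump" and "fox".
    terms = analyze(query)
    if not terms:
        return set()
    return set.intersection(*(index.get(term, set()) for term in terms))


docs = {1: "The quick brown fox jumps over the lazy dog."}
index = build_index(docs)
print(search(index, "Jumping foxes"))
```

The match only works because documents and queries pass through the same `analyze` step, which is the agreement on "what words are" that slide 6 describes.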
  • 7. Newsflash! Not everyone is interested in finding quick brown foxes...
  • 8. The rise of machine-generated content. Log files have disrupted the conversation between user queries and the index. Q: Is this document one word, four words, nine words or ten words: C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe? A: The end user very rarely knows.
  • 9. Will it match? Matching machine-generated content is not straightforward. Q: Will this security analyst’s search for powershell.exe match the document C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe? A: It depends…
  • 10. 3 text indexing choices, 3 different query options. The choice of tokenizer dictates the type of query required to find powershell.exe. Index A holds the terms “exe” and “powershell” and requires a phrase query to match. Index B holds the term “powershell.exe” and requires a term query to match. Index C holds the single term “c:\windows\system32\windowspowershell\v1.0\powershell.exe” and requires a leading wildcard query to match. A “match” query can do the right thing for indices A and B, but it can’t help with index C.
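To make slide 10 concrete, here is a minimal Python sketch of three tokenization choices and the kind of query each one needs. The three `tokenize_*` functions are hypothetical stand-ins chosen to reproduce the terms shown for indices A, B and C; they are not the actual Elasticsearch tokenizers.

```python
import re
from fnmatch import fnmatchcase

PATH = r"c:\windows\system32\windowspowershell\v1.0\powershell.exe"


def tokenize_a(value):
    # Index A (assumed): split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", value.lower())


def tokenize_b(value):
    # Index B (assumed): split on path separators only.
    return [token for token in re.split(r"[\\/]", value.lower()) if token]


def tokenize_c(value):
    # Index C (assumed): keep the whole value as a single term.
    return [value.lower()]


def phrase_match(tokens, query_tokens):
    # True when the query tokens appear next to each other, in order.
    n = len(query_tokens)
    return any(tokens[i:i + n] == query_tokens for i in range(len(tokens) - n + 1))


def term_match(tokens, term):
    # True when the exact term is one of the indexed tokens.
    return term in tokens


def wildcard_match(tokens, pattern):
    # True when any indexed token matches the wildcard pattern.
    return any(fnmatchcase(token, pattern) for token in tokens)


query = "powershell.exe"
print("A, phrase query:", phrase_match(tokenize_a(PATH), tokenize_a(query)))
print("B, term query:", term_match(tokenize_b(PATH), query))
print("C, term query:", term_match(tokenize_c(PATH), query))
print("C, leading wildcard:", wildcard_match(tokenize_c(PATH), "*" + query))
```

The same user input succeeds or fails depending on how the index was tokenized, which is the "it depends" answer from slide 9.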
  • 11. Text field summary. Elasticsearch’s “text” field has some great features for dealing with words found in human language: ● Tokenisation (punctuation removal) ● Case normalisation ● Stemming ● Synonym expansion. But these features get in the way of searching machine-generated content such as: ● URLs ● File paths ● Stack traces.
  • 12. Is the keyword field any better for log data?
  • 13. Many in the security field today use keyword fields. Our own Elastic Common Schema (ECS) uses keyword fields. Keyword index: c:\documents\work\expenses.doc 1; c:\documents\work\presentation1.doc 3; c:\documents\work\presentation2.doc 2; c:\windows\system32\windowspowershell\v1.0\powershell.exe 5. Searchers run wildcard or regexp queries on untokenized strings, for example *powershell.exe.
  • 14. Two issues with keyword fields: speed and size limits. Running *powershell.exe against the keyword index (c:\documents\work\expenses.doc 1; c:\documents\work\presentation1.doc 3; c:\documents\work\presentation2.doc 2; c:\windows\system32\windowspowershell\v1.0\powershell.exe 5) means: !! Linear scans of all entries in the index. !! Limits on string lengths = blind spots.
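The two issues on slide 14 can be illustrated with a small Python sketch. It assumes a hypothetical `MAX_LEN` cutoff that is far smaller than any realistic setting, purely so the effect shows up with four short paths; `index_keywords` and `wildcard_scan` are names invented for this example.

```python
from fnmatch import fnmatchcase

MAX_LEN = 40  # hypothetical length limit, chosen small for demonstration only

VALUES = [
    r"c:\documents\work\expenses.doc",
    r"c:\documents\work\presentation1.doc",
    r"c:\documents\work\presentation2.doc",
    r"c:\windows\system32\windowspowershell\v1.0\powershell.exe",
]


def index_keywords(values, max_len):
    # Values longer than the limit are silently left out of the index.
    return sorted({value for value in values if len(value) <= max_len})


def wildcard_scan(terms, pattern):
    # A leading wildcard gives no prefix to seek to, so every term is tested.
    hits, comparisons = [], 0
    for term in terms:
        comparisons += 1
        if fnmatchcase(term, pattern):
            hits.append(term)
    return hits, comparisons


# Issue 1: the scan touches every entry in the index.
hits, comparisons = wildcard_scan(index_keywords(VALUES, max_len=1000), "*powershell.exe")
print(len(hits), "hit after", comparisons, "comparisons")

# Issue 2: with a length limit, the long path was never indexed (a blind spot).
hits, comparisons = wildcard_scan(index_keywords(VALUES, max_len=MAX_LEN), "*powershell.exe")
print(len(hits), "hits after", comparisons, "comparisons")
```

The cost of the scan grows with the number of unique values, and anything over the length limit cannot be found at all.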
  • 15. Summary: caught between ... ● Fast search of “words” that don’t exist (text field) ● Slow searches (keyword field) ● Blind spots (keyword field)
  • 16. Enter the wildcard field... An alternative to word-based matching or brute-force scans.
  • 17. Wildcard field. Identical requests and results to the keyword field, just faster. The same request works against a keyword field and a wildcard field: { "query": { "wildcard": { "Myfield": { "value": "*.exe" } } } }
  • 18. Wildcard field. Documents are stored in two data structures behind the scenes. The indexer writes C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe to an ngram index (.ex 1,133; 1.0 1; dow 1; ell 1; em3 1; ers 1; exe 1,133; ...) and to a compressed doc value store that holds the full value. (Compression icon by twist.glyph from the Noun Project.)
  • 19. Wildcard field. Searches can start at any position, using any characters. The searcher runs *.exe against the ngram index (.ex 1,133; 1.0 1; dow 1; ell 1; em3 1; ers 1; exe 1,133; ...).
  • 20. Wildcard field. The ngram index is fast, but can produce false positives. For *.exe the ngram query is “.ex AND exe”, which matches C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe but also /System/Library/Apple Logic/Instruments/vexed/default.exs.
  • 21. Wildcard field. A final check on match candidates is performed using the full values from the compressed doc value store. For *.exe the candidates are C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe and /System/Library/Apple Logic/Instruments/vexed/default.exs.
  • 22. Wildcard field. Only fully verified matches are returned. For *.exe that is C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe.
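The two-step search in slides 18 to 22 can be sketched as follows. This is a simplified model under stated assumptions, not the real implementation: the `WildcardFieldSketch` class is hypothetical, it uses plain 3-character ngrams, it stores full values uncompressed in a dict, and it only understands the `*` and `?` wildcards.

```python
from collections import defaultdict
from fnmatch import fnmatchcase


def ngrams(text, n=3):
    # All overlapping substrings of length n.
    return {text[i:i + n] for i in range(len(text) - n + 1)}


class WildcardFieldSketch:
    def __init__(self):
        self.ngram_index = defaultdict(set)  # ngram -> ids of docs containing it
        self.doc_values = {}                 # doc id -> full original value

    def add(self, doc_id, value):
        self.doc_values[doc_id] = value
        for gram in ngrams(value):
            self.ngram_index[gram].add(doc_id)

    def candidates(self, pattern):
        # Step 1: AND together the ngrams of every literal run in the pattern.
        required = set()
        for literal in pattern.replace("?", "*").split("*"):
            required |= ngrams(literal)
        if not required:
            return set(self.doc_values)  # nothing to filter on
        postings = [self.ngram_index.get(gram, set()) for gram in required]
        return set.intersection(*postings)

    def search(self, pattern):
        # Step 2: verify each candidate against its full stored value.
        return sorted(
            doc_id
            for doc_id in self.candidates(pattern)
            if fnmatchcase(self.doc_values[doc_id], pattern)
        )


field = WildcardFieldSketch()
field.add(1, r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe")
field.add(133, "/System/Library/Apple Logic/Instruments/vexed/default.exs")

print(sorted(field.candidates("*.exe")))  # candidates from ".ex AND exe"
print(field.search("*.exe"))              # verified matches only
```

Document 133 survives the ngram step because "default.exs" contains ".ex" and "vexed" contains "exe", which is exactly the false positive shown on slide 20; the verification step then drops it.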
  • 23. Wildcard field. The real smarts in the wildcard field... GOAL: minimise the number of blocks we decompress from the compressed doc value store for verification purposes.
  • 24. Wildcard field. Accelerating regular expression queries. Regular expressions are automatically parsed into the most selective equivalent ngram query we can safely make. Example: the regular expression .*\.(dll|exe) has the equivalent ngram query (.dl AND ll_) OR (.ex AND xe_). Stricter ngram queries = fewer false positives = fewer verification checks = quicker searches.
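As a rough illustration of slide 24, the sketch below evaluates the slide's ngram query as a boolean filter and then verifies candidates with the full regular expression. Two things are assumptions: that `_` in the slide's ngrams stands for an end-of-value marker, and the hand-written `NGRAM_QUERY` structure, which here replaces the automatic regex parsing the slide describes. Verification uses Python's `re`, not Lucene's regular expression engine.

```python
import re

END = "_"  # assumed end-of-value marker, following the slide's notation


def ngrams(value, n=3):
    # Ngrams of the value with the end marker appended.
    padded = value + END
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def evaluate(query, grams):
    # A query is either a plain ngram or a tuple ("AND" | "OR", [subqueries]).
    if isinstance(query, str):
        return query in grams
    operator, parts = query
    results = [evaluate(part, grams) for part in parts]
    return all(results) if operator == "AND" else any(results)


# Hand-written equivalent of the slide's query for .*\.(dll|exe)
NGRAM_QUERY = ("OR", [("AND", [".dl", "ll_"]), ("AND", [".ex", "xe_"])])
REGEX = re.compile(r".*\.(dll|exe)")


def candidates(values):
    # Cheap filter: keep values whose ngrams satisfy the boolean query.
    return [value for value in values if evaluate(NGRAM_QUERY, ngrams(value))]


def search(values):
    # Expensive check: run the real regular expression on candidates only.
    return [value for value in candidates(values) if REGEX.fullmatch(value)]


values = ["kernel32.dll", "powershell.exe", "default.exs", "a.dlx/ball", "notes.txt"]
print(candidates(values))
print(search(values))
```

Here "default.exs" is rejected by the ngram query alone because it lacks "xe_", while "a.dlx/ball" is a false positive that only the final regular expression check removes.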
  • 25. Wildcard field - testing. Accelerating regular expression queries safely. Millions of randomly-generated regular expressions (for example fHs^e([A|B]|w{1,5})AbA.*) are used in searches, and the thorough (but slow) keyword field results are compared with the wildcard field results.
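The comparison testing on slide 25 can be imitated at small scale. Note the swap: this sketch generates random wildcard patterns (only `*` and `?`) rather than random regular expressions, and both search functions are toy stand-ins written for this example, not the keyword or wildcard field implementations.

```python
import random
from fnmatch import fnmatchcase

ALPHABET = "abc."


def ngrams(text, n=3):
    # All overlapping substrings of length n.
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def brute_force(values, pattern):
    # Thorough but slow reference: test every value.
    return {value for value in values if fnmatchcase(value, pattern)}


def accelerated(values, pattern):
    # Ngram pre-filter followed by verification of the surviving candidates.
    required = set()
    for literal in pattern.replace("?", "*").split("*"):
        required |= ngrams(literal)
    survivors = [value for value in values if required <= ngrams(value)]
    return {value for value in survivors if fnmatchcase(value, pattern)}


def random_string(rng, alphabet, max_len):
    return "".join(rng.choice(alphabet) for _ in range(rng.randint(0, max_len)))


def differential_test(trials=500, seed=0):
    # Compare both implementations on many random patterns.
    rng = random.Random(seed)
    values = [random_string(rng, ALPHABET, 8) for _ in range(100)]
    for _ in range(trials):
        pattern = random_string(rng, ALPHABET + "*?", 6)
        expected = brute_force(values, pattern)
        actual = accelerated(values, pattern)
        assert expected == actual, f"mismatch for pattern {pattern!r}"
    return trials


print(differential_test(), "random patterns agreed")
```

A mismatch would mean the ngram filter discarded a true match, which is the class of bug this style of testing is designed to catch.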
  • 26. Wildcard field. Case insensitivity. Many security rules written today use regular expressions with this syntax: .*[Cc][Mm][Dd]\.[Ee][Xx][Ee].* (equivalent ngram query: cmd AND d.e AND exe). In 7.10 querying will be simpler: "regexp": { "my_wildcard_field": { "value": ".*cmd.exe.*", "case_insensitive": true } }
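The equivalence behind slide 26 can be checked with Python's `re` module, used here as a stand-in for Lucene regular expressions (the two dialects differ in other respects). The `expand_case` helper is hypothetical and simply produces the bracketed form that the slide says many security rules use today.

```python
import re


def expand_case(literal):
    # Rewrite a literal so each letter matches either case,
    # for example "cmd" becomes "[Cc][Mm][Dd]".
    parts = []
    for char in literal:
        if char.isalpha():
            parts.append(f"[{char.upper()}{char.lower()}]")
        else:
            parts.append(re.escape(char))
    return "".join(parts)


# Verbose rule style versus a single case-insensitive flag.
verbose = re.compile(".*" + expand_case("cmd.exe") + ".*")
simple = re.compile(r".*cmd\.exe.*", re.IGNORECASE)

samples = [
    r"C:\Windows\System32\CMD.EXE /c dir",
    r"c:\windows\system32\cmd.exe",
    "cmdxexe",
    "powershell.exe",
]
for sample in samples:
    print(bool(verbose.fullmatch(sample)), bool(simple.fullmatch(sample)), sample)
```

Both patterns accept and reject the same samples, so a case-insensitivity flag lets the rule author write the shorter form.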
  • 28. When to use the wildcard field? Which part of strings do you want to match? Whole value or beginning → use the keyword field. Anywhere → is it obvious where words begin and end? Yes, it’s everyday language → use the text field. No, a layman might call it “gobbledygook” → how big is the largest value? Large (>32k) → use the wildcard field. Under 32k → how many unique values are there? Thousands or less → use the keyword field. Millions or more → use the wildcard field.
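The decision flow on slide 28 can be written down as a small function. The numeric cutoffs are assumptions: the slide only says ">32k" and "millions or more", so `LARGE_VALUE` and `MANY_UNIQUE` below are illustrative guesses, and `choose_field` is a hypothetical helper rather than anything provided by Elasticsearch.

```python
LARGE_VALUE = 32 * 1024   # the slide says ">32k"; exact unit and cutoff assumed
MANY_UNIQUE = 1_000_000   # the slide says "millions or more"; cutoff assumed


def choose_field(match_anywhere, everyday_language, largest_value, unique_values):
    """Return the suggested field type, following the slide's questions in order."""
    if not match_anywhere:
        return "keyword"   # whole value or beginning
    if everyday_language:
        return "text"      # it is obvious where words begin and end
    if largest_value > LARGE_VALUE:
        return "wildcard"  # values too large for a keyword field
    if unique_values >= MANY_UNIQUE:
        return "wildcard"  # too many unique values to scan quickly
    return "keyword"       # few unique values, so scans stay cheap


print(choose_field(False, False, 200, 5_000))
print(choose_field(True, True, 200, 5_000))
print(choose_field(True, False, 100_000, 5_000))
print(choose_field(True, False, 200, 5_000))
print(choose_field(True, False, 200, 50_000_000))
```

The slide leaves the range between "thousands" and "millions" of unique values open, so the single cutoff used here is a simplification.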
  • 29. Try it out! Spin up a free trial at: https://cloud.elastic.co/registration