ElasticSearch for DevOps
What’s ElasticSearch?
• “flexible and powerful open source,
distributed real-time
search and analytics engine for the
cloud”
• http://www.elasticsearch.org/
What’s ElasticSearch?
• “flexible and powerful open source,
distributed real-time
search and analytics engine for the
cloud”
• JSON-oriented;
• RESTful API;
• Schema free.
MySQL               ElasticSearch
database            index
table               type
column              field
defined data type   auto-detected
What’s ElasticSearch?
• “flexible and powerful open source,
distributed real-time
search and analytics engine for the
cloud”
• Master nodes & data nodes;
• Auto-organize for replicas and shards;
• Asynchronous transport between nodes.
What’s ElasticSearch?
• “flexible and powerful open source,
distributed real-time
search and analytics engine for the
cloud”
• Near real-time: index refresh every 1 second by default.
What’s ElasticSearch?
• “flexible and powerful open source,
distributed real-time
search and analytics engine for the
cloud”
• Built on Apache Lucene.
• Also has facets, just like Solr.
What’s ElasticSearch?
• “flexible and powerful open source,
distributed real-time
search and analytics engine for the
cloud”
• Give nodes a cluster name and they auto-discover
each other by unicast/multicast ping or via the EC2 API.
• No ZooKeeper needed.
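As a sketch, that discovery story is a few lines in elasticsearch.yml (setting names as in the 0.x-era config; the EC2 variant needs the cloud-aws plugin; hosts here are made up):

```yaml
# Every node that shares this cluster name joins the same cluster.
cluster.name: es-devops

# LAN: multicast ping works out of the box; or list unicast seed hosts.
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["10.0.0.1", "10.0.0.2"]

# EC2: let nodes find each other through the AWS API instead.
# discovery.type: ec2
```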
Howto Curl
• Index
$ curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{
    "user" : "kimchy",
    "post_date" : "2009-11-15T14:12:12",
    "message" : "trying out Elastic Search"
}'
{"ok":true,"_index":"twitter","_type":"tweet","_id":"1","_version":1}
Howto Curl
• Get
$ curl -XGET 'http://localhost:9200/twitter/tweet/1'
{
"_index" : "twitter",
"_type" : "tweet",
"_id" : "1",
"_source" : {
"user" : "kimchy",
"postDate" : "2009-11-15T14:12:12",
"message" : "trying out Elastic Search"
}
}
Howto Curl
• Query
$ curl -XPOST 'http://localhost:9200/twitter/tweet/_search?pretty=1&size=1' -d '{
    "query" : {
        "term" : { "user" : "kimchy" }
    },
    "fields" : ["message"]
}'
Howto Curl
• Query
• Term => { exact term match (not analyzed) }
• Match => { full-text match (analyzed) }
• Prefix => { match field prefix (not analyzed) }
• Range => { from, to }
• Regexp => { .* }
• Query_string => { this AND that OR thus }
• Must/must_not => { query }
• Should => [ {query}, {} ]
• Bool => { must, must_not, should, … }
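These pieces compose as plain nested JSON, so a request body can be built in any language; a hedged Python sketch (field names borrowed from the examples above) that combines several of them in a bool query:

```python
import json

# bool combines must / must_not / should, each a list of sub-queries
query = {
    "query": {
        "bool": {
            "must": [
                {"term": {"user": "kimchy"}},  # exact term
                {"range": {"post_date": {"from": "2009-01-01", "to": "2010-01-01"}}},
            ],
            "must_not": [
                {"prefix": {"message": "spam"}},  # prefix, not analyzed
            ],
            "should": [
                {"query_string": {"query": "this AND that"}},
            ],
        }
    }
}

body = json.dumps(query)  # this string is what -d posts to _search
print(body)
```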
Howto Curl
• Filter
$ curl -XPOST 'http://localhost:9200/twitter/tweet/_search?pretty=1&size=1' -d '{
    "query" : {
        "match_all" : {}
    },
    "filter" : {
        "term" : { "user" : "kimchy" }
    }
}'
Much faster, because filters are cacheable and do not
calculate _score.
Howto Curl
• Filter
• And => [ {filter}, {filter} ]
• Not => {filter}
• Or => [ {filter}, {filter} ]
• Script => { "script": "doc['field'].value > 10" }
• The others mirror the query DSL
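Filters nest the same way as queries; a sketch of a filtered search body combining and/not/script (Python; field names are illustrative):

```python
import json

# match everything, then narrow with cacheable filters
search_body = {
    "query": {"match_all": {}},
    "filter": {
        "and": [
            {"term": {"user": "kimchy"}},
            {"not": {"prefix": {"message": "spam"}}},
            {"script": {"script": "doc['reqtime'].value > 10"}},
        ]
    },
}

print(json.dumps(search_body))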
Howto Curl
• Facets
$ curl -XPOST 'http://localhost:9200/twitter/tweet/_search?pretty=1&size=0' -d '{
    "query" : {
        "match_all" : {}
    },
    "filter" : {
        "prefix" : { "user" : "k" }
    },
    "facets" : {
        "usergroup" : {
            "terms" : { "field" : "user" }
        }
    }
}'
Howto Curl
• Facets
• terms => [ {"term":"kimchy","count":20}, {} ]
• Range <= [ {"from":10,"to":20}, ]
• Histogram <= {"field":"user","interval":10}
• Statistical <= {"field":"reqtime"}
  => [ {"min":,"max":,"avg":,"count":} ]
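On the way back, a facet is just another branch of the response JSON; a Python sketch that walks a terms facet (the response document here is made up to match the shapes above):

```python
import json

# shape of a terms facet in a _search response, as described above
response = json.loads("""
{
  "facets": {
    "usergroup": {
      "_type": "terms",
      "total": 25,
      "terms": [
        {"term": "kimchy", "count": 20},
        {"term": "kale",   "count": 5}
      ]
    }
  }
}
""")

for entry in response["facets"]["usergroup"]["terms"]:
    print(entry["term"], entry["count"])
```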
Howto Perl – ElasticSearch.pm
use ElasticSearch;
my $es = ElasticSearch->new(
    servers      => 'search.foo.com:9200', # default '127.0.0.1:9200'
    transport    => 'http'                 # default 'http'
                  | 'httplite'             # 30% faster, future default
                  | 'httptiny'             # ~1% faster still
                  | 'curl'
                  | 'aehttp'
                  | 'aecurl'
                  | 'thrift',              # generated code too slow
    max_requests => 10_000,                # default 10000
trace_calls => 'log_file',
no_refresh => 0 | 1,
);
Howto Perl – ElasticSearch.pm
use ElasticSearch;
my $es = ElasticSearch->new(
servers => 'search.foo.com:9200',
transport => 'httptiny',
max_requests => 10_000,
trace_calls => 'log_file',
no_refresh => 0 | 1,
);
• Fetches the node list from the $servers via the /_cluster API;
• Randomly switches to another node after $max_requests
requests.
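That sniff-and-rotate behaviour is easy to picture; a hedged Python sketch of the idea (not the module's actual code; node addresses invented):

```python
import random

class NodePool:
    """Mimics ElasticSearch.pm: sniff the node list, then hop to a
    random node after every max_requests requests."""

    def __init__(self, seed_nodes, max_requests=10_000):
        # in the real client the list is refreshed via the /_cluster API
        self.nodes = list(seed_nodes)
        self.max_requests = max_requests
        self.current = random.choice(self.nodes)
        self.count = 0

    def node_for_request(self):
        self.count += 1
        if self.count > self.max_requests:
            # rotate to spread load across the cluster
            self.current = random.choice(self.nodes)
            self.count = 1
        return self.current

pool = NodePool(["10.0.0.1:9200", "10.0.0.2:9200"], max_requests=2)
seen = [pool.node_for_request() for _ in range(6)]
print(seen)
```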
Howto Perl – ElasticSearch.pm
$es->index(
index => 'twitter',
type => 'tweet',
id => 1,
data => {
user => 'kimchy',
post_date => '2009-11-15T14:12:12',
message => 'trying out Elastic Search'
}
);
Howto Perl – ElasticSearch.pm
$es->search(
facets => {
wow_facet => {
query => { text => { content => 'wow' }},
facet_filter => { term => {status => 'active' }},
}
}
)
Howto Perl – ElasticSearch.pm
$es->search(
facets => {
wow_facet => {
queryb => { content => 'wow' },
facet_filterb => { status => 'active' },
}
}
)
ElasticSearch::SearchBuilder
More Perlish,
SQL::Abstract-like.
But I don't like ==!
Howto Perl – Elastic::Model
• Tie a Moose object to elasticsearch
package MyApp;
use Elastic::Model;
has_namespace 'myapp' => {
user => 'MyApp::User'
};
no Elastic::Model;
1;
Howto Perl – Elastic::Model
package MyApp::User;
use Elastic::Doc;
use DateTime;
has 'name' => (
is => 'rw',
isa => 'Str',
);
has 'email' => (
is => 'rw',
isa => 'Str',
);
has 'created' => (
is => 'ro',
isa => 'DateTime',
default => sub { DateTime->now }
);
no Elastic::Doc;
1;
Howto Perl – Elastic::Model
package MyApp::User;
use Moose;
use DateTime;
has 'name' => (
is => 'rw',
isa => 'Str',
);
has 'email' => (
is => 'rw',
isa => 'Str',
);
has 'created' => (
is => 'ro',
isa => 'DateTime',
default => sub { DateTime->now }
);
no Moose;
1;
Howto Perl – Elastic::Model
• Connect to db
my $es = ElasticSearch->new( servers => 'localhost:9200' );
my $model = MyApp->new( es => $es );
• Create database and table
$model->namespace('myapp')->index->create();
• CRUD
my $domain = $model->domain('myapp');
$domain->new_doc() | $domain->get();
• Search
my $search = $domain->view->type('user')->query(…)->filterb(…);
$results = $search->search;
say "Total results found: ".$results->total;
while (my $doc = $results->next_doc) {
say $doc->name;
}
ES for Dev -- Github
• 20 TB of data;
• 1.3 billion files;
• 130 billion lines of code.
• 26 Elasticsearch storage nodes (each with a 2 TB SSD),
managed by Puppet.
• 1 replica + 20 shards.
• https://github.com/blog/1381-a-whole-new-code-search
• https://github.com/blog/1397-recent-code-search-outages
ES for Dev – Git::Search
• Thank you, Mateu Hunter!
• https://github.com/mateu/Git-Search
cpanm --installdeps .
cp git-search.conf git-search-local.conf
edit git-search-local.conf
perl -Ilib bin/insert_docs.pl
plackup -Ilib
curl http://localhost:5000/text_you_want
ES for Perler -- Metacpan
• search.cpan.org => metacpan.org
• uses ElasticSearch as the API backend;
• uses Catalyst to build the website frontend.
• Learn API:
https://github.com/CPAN-API/cpan-api/wiki/API-docs
• Have a try:
http://explorer.metacpan.org/
ES for Perler – index-weekly
• A Perl script (55 lines) that indexes
DevOps Weekly into Elasticsearch.
• https://github.com/alcy/index-weekly
• We could do the same for Perl Weekly, right?
ES for logging - Logstash
• “logstash is a tool for managing events
and logs. You can use it to collect logs,
parse them, and store them for later use.”
• http://logstash.net/
ES for logging - Logstash
• “logstash is a tool for managing events
and logs. You can use it to collect logs,
parse them, and store them for later use.”
• A log is a stream, not a file!
• An event is not necessarily a single line!
ES for logging - Logstash
• “logstash is a tool for managing events
and logs. You can use it to collect logs,
parse them, and store them for later use.”
• file/*mq/stdin/tcp/udp/websocket…(34
input plugins now)
ES for logging - Logstash
• “logstash is a tool for managing events
and logs. You can use it to collect logs,
parse them, and store them for later use.”
• date/geoip/grok/multiline/mutate…(29
filter plugins now)
ES for logging - Logstash
• “logstash is a tool for managing events
and logs. You can use it to collect logs,
parse them, and store them for later use.”
• transfer:stdout/*mq/tcp/udp/file/websocket…
• alert:ganglia/nagios/opentsdb/graphite/irc/xmpp
/email…
• store:elasticsearch/mongodb/riak
• (47 output plugins now)
ES for logging - Logstash
ES for logging - Logstash
input {
    redis {
        host      => "127.0.0.1"
        type      => "redis-input"
        data_type => "list"
        key       => "logstash"
    }
}
filter {
    grok {
        type    => "redis-input"
        pattern => "%{COMBINEDAPACHELOG}"
    }
}
output {
    elasticsearch {
        host => "127.0.0.1"
    }
}
ES for logging - Logstash
• Grok (regexp capture):
%{IP:client:string}
%{NUMBER:bytes:int}
More default patterns in the source:
https://github.com/logstash/logstash/tree/master/patterns
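Under the hood a grok pattern is a named-capture regexp; a minimal Python sketch of the same idea, on a much simpler pattern than %{COMBINEDAPACHELOG} (log line invented):

```python
import re

# toy stand-in for %{IP:client} %{WORD:verb} %{URIPATH:request}
# %{NUMBER:response} %{NUMBER:bytes:int}
LINE = '10.2.21.130 GET /mediawiki/load.php 304 512'
PATTERN = re.compile(
    r'(?P<client>\d+\.\d+\.\d+\.\d+) '
    r'(?P<verb>\S+) (?P<request>\S+) '
    r'(?P<response>\d+) (?P<bytes>\d+)'
)

fields = PATTERN.match(LINE).groupdict()
fields["bytes"] = int(fields["bytes"])  # the :int suffix in grok
print(fields)
```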
ES for logging - Logstash
For example:
10.2.21.130 - - [08/Apr/2013:11:13:40 +0800] "GET
/mediawiki/load.php HTTP/1.1" 304 -
"http://som.d.xiaonei.com/mediawiki/index.php"
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3)
AppleWebKit/536.28.10 (KHTML, like Gecko) Version/6.0.3
Safari/536.28.10"
ES for logging - Logstash
{"@source":"file://chenryn-Lenovo/home/chenryn/test.txt",
"@tags":[],
"@fields":{
"clientip":["10.2.21.130"],
"ident":["-"],
"auth":["-"],
"timestamp":["08/Apr/2013:11:13:40 +0800"],
"verb":["GET"],
"request":["/mediawiki/load.php"],
"httpversion":["1.1"],
"response":["304"],
"referrer":[""http://som.d.xiaonei.com/mediawiki/index.php""],
"agent":[""Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/536.28.10 (KHTML, like
Gecko) Version/6.0.3 Safari/536.28.10""]
},
"@timestamp":"2013-04-08T03:34:37.959Z",
"@source_host":"chenryn-Lenovo",
"@source_path":"/home/chenryn/test.txt",
"@message":"10.2.21.130 - - [08/Apr/2013:11:13:40 +0800] "GET /mediawiki/load.php HTTP/1.1"
304 - "http://som.d.xiaonei.com/mediawiki/index.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X
10_8_3) AppleWebKit/536.28.10 (KHTML, like Gecko) Version/6.0.3 Safari/536.28.10"",
"@type":"apache"
}
ES for logging - Logstash
"properties" : {
    "@fields" : {
        "dynamic" : "true",
        "properties" : {
            "client" : {
                "type" : "string",
                "index" : "not_analyzed"
            },
            "size" : {
                "type" : "long",
                "index" : "not_analyzed"
            },
            "status" : {
                "type" : "string",
                "index" : "not_analyzed"
            },
            "upstreamtime" : {
                "type" : "double"
            }
        }
    },
ES for logging - Kibana
ES for logging – Message::Passing
• A Logstash port to Perl 5
• 17 CPAN modules
ES for logging – Message::Passing
use Message::Passing::DSL;
run_message_server message_chain {
output elasticsearch => (
class => 'ElasticSearch',
elasticsearch_servers => ['127.0.0.1:9200'],
);
filter regexp => (
class => 'Regexp',
format => ':nginxaccesslog',
capture => [qw( ts status remotehost url oh responsetime upstreamtime bytes )],
output_to => 'elasticsearch',
);
filter tologstash => (
class => 'ToLogstash',
output_to => 'regexp',
);
input file => (
class => 'FileTail',
output_to => 'tologstash',
);
};
Message::Passing vs Logstash
100_000 lines of nginx access log:

logstash::output::elasticsearch_http (default)                4m30.013s
logstash::output::elasticsearch_http (flush_size => 1000)     3m41.657s
message::passing::filter::regexp
  (v0.01: calls $self->_regex->regexp() on every line)        1m22.519s
message::passing::filter::regexp
  (v0.04: caches $self->_regex->regexp() in $self->_re)       0m44.606s
D::P::Elasticsearch & D::P::Ajax
Build Website using PerlDancer
get '/' => require_role SOM => sub {
my $indices = elsearch->cluster_state->{routing_table}->{indices};
template 'psa/map',
{
providers => [ sort keys %$default_provider ],
datasources =>
[ grep { /^$index_prefix/ && s/$index_prefix// } keys %$indices ],
inputfrom => strftime( "%FT%T", localtime( time() - 864000 ) ),
inputto => strftime( "%FT%T", localtime() ),
};
};
ajax '/api/area' => sub {
my $param = from_json( request->body );
my $index = $index_prefix . $param->{'datasource'};
my $limit = $param->{'limit'} || 50;
my $from = $param->{'from'} || 'now-10d';
my $to = $param->{'to'} || 'now';
my $res = pct_terms( $index, $limit, $from, $to );
return to_json($res);
};
use Dancer ':syntax';
get '/' => require_role SOM => sub {
my $indices = elsearch->cluster_state->{routing_table}->{indices};
template 'psa/map',
{
providers => [ sort keys %$default_provider ],
datasources =>
[ grep { /^$index_prefix/ && s/$index_prefix// } keys %$indices ],
inputfrom => strftime( "%FT%T", localtime( time() - 864000 ) ),
inputto => strftime( "%FT%T", localtime() ),
};
};
ajax '/api/area' => sub {
my $param = from_json( request->body );
my $index = $index_prefix . $param->{'datasource'};
my $limit = $param->{'limit'} || 50;
my $from = $param->{'from'} || 'now-10d';
my $to = $param->{'to'} || 'now';
my $res = pct_terms( $index, $limit, $from, $to );
return to_json($res);
};
use Dancer::Plugin::Auth::Extensible;
get '/' => require_role SOM => sub {
my $indices = elsearch->cluster_state->{routing_table}->{indices};
template 'psa/map',
{
providers => [ sort keys %$default_provider ],
datasources =>
[ grep { /^$index_prefix/ && s/$index_prefix// } keys %$indices ],
inputfrom => strftime( "%FT%T", localtime( time() - 864000 ) ),
inputto => strftime( "%FT%T", localtime() ),
};
};
ajax '/api/area' => sub {
my $param = from_json( request->body );
my $index = $index_prefix . $param->{'datasource'};
my $limit = $param->{'limit'} || 50;
my $from = $param->{'from'} || 'now-10d';
my $to = $param->{'to'} || 'now';
my $res = pct_terms( $index, $limit, $from, $to );
return to_json($res);
};
use Dancer::Plugin::Ajax;
get '/' => require_role SOM => sub {
my $indices = elsearch->cluster_state->{routing_table}->{indices};
template 'psa/map',
{
providers => [ sort keys %$default_provider ],
datasources =>
[ grep { /^$index_prefix/ && s/$index_prefix// } keys %$indices ],
inputfrom => strftime( "%FT%T", localtime( time() - 864000 ) ),
inputto => strftime( "%FT%T", localtime() ),
};
};
ajax '/api/area' => sub {
my $param = from_json( request->body );
my $index = $index_prefix . $param->{'datasource'};
my $limit = $param->{'limit'} || 50;
my $from = $param->{'from'} || 'now-10d';
my $to = $param->{'to'} || 'now';
my $res = pct_terms( $index, $limit, $from, $to );
return to_json($res);
};
use Dancer::Plugin::ElasticSearch;
get '/' => require_role SOM => sub {
my $indices = elsearch->cluster_state->{routing_table}->{indices};
template 'psa/map',
{
providers => [ sort keys %$default_provider ],
datasources =>
[ grep { /^$index_prefix/ && s/$index_prefix// } keys %$indices ],
inputfrom => strftime( "%FT%T", localtime( time() - 864000 ) ),
inputto => strftime( "%FT%T", localtime() ),
};
};
ajax '/api/area' => sub {
my $param = from_json( request->body );
my $index = $index_prefix . $param->{'datasource'};
my $limit = $param->{'limit'} || 50;
my $from = $param->{'from'} || 'now-10d';
my $to = $param->{'to'} || 'now';
my $res = pct_terms( $index, $limit, $from, $to );
return to_json($res);
};
use Dancer::Plugin::ElasticSearch;
sub area_terms {
my ( $index, $level, $limit, $from, $to ) = @_;
my $data = elsearch->search(
index => $index,
type => $type,
facets => {
area => {
facet_filter => {
and => [
{ range => { date => { from => $from, to => $to } } },
{ numeric_range => { timeCost => { gte => $level } } },
],
},
terms => {
field => "fromArea",
size => $limit,
}
}
}
);
return $data->{facets}->{area}->{terms};
}
ES for monitor – oculus(Etsy Kale)
• Kale detects anomalous metrics and sees
if any other metrics look similar.
• http://codeascraft.com/2013/06/11/introducing-kale/
ES for monitor – oculus(Etsy Kale)
• Kale detects anomalous metrics and sees
if any other metrics look similar.
• https://github.com/etsy/skyline
ES for monitor – oculus(Etsy Kale)
• Kale detects anomalous metrics and sees
if any other metrics look similar.
• https://github.com/etsy/oculus
ES for monitor – oculus(Etsy Kale)
• Imports monitoring data from redis/ganglia into
elasticsearch.
• Uses native scripts to calculate distance:
script.native:
  oculus_euclidian.type: com.etsy.oculus.tsscorers.EuclidianScriptFactory
  oculus_dtw.type: com.etsy.oculus.tsscorers.DTWScriptFactory
ES for monitor – oculus(Etsy Kale)
• https://speakerdeck.com/astanway/bring-the-noise-continuously-deploying-under-a-hailstorm-of-metrics
VBox example
• apt-get install -y git cpanminus virtualbox
• cpanm Rex
• git clone https://github.com/chenryn/esdevops
• cd esdevops
• rex init --name esdevops
How ElasticSearch lives in my DevOps life
More Related Content

PPTX
Elk stack
PDF
Machine Learning in a Twitter ETL using ELK
PDF
Logstash-Elasticsearch-Kibana
PPTX
Introduction to ELK
PPTX
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...
PDF
ElasticSearch
PPT
Logstash
PDF
Debugging and Testing ES Systems
Elk stack
Machine Learning in a Twitter ETL using ELK
Logstash-Elasticsearch-Kibana
Introduction to ELK
Elasticsearch, Logstash, Kibana. Cool search, analytics, data mining and more...
ElasticSearch
Logstash
Debugging and Testing ES Systems

What's hot (20)

PDF
Monitoring with Graylog - a modern approach to monitoring?
PDF
Logstash + Elasticsearch + Kibana Presentation on Startit Tech Meetup
PDF
Logstash: Get to know your logs
PDF
Journée DevOps : Des dashboards pour tous avec ElasticSearch, Logstash et Kibana
PPTX
Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...
PDF
elk_stack_alexander_szalonnas
PDF
Advanced troubleshooting linux performance
PPTX
ELK Stack
PDF
LogStash in action
PPTX
Logstash
PDF
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
PDF
Mobile Analytics mit Elasticsearch und Kibana
PDF
Logstash family introduction
PDF
Logging with Elasticsearch, Logstash & Kibana
PDF
Logging logs with Logstash - Devops MK 10-02-2016
PDF
Experiences in ELK with D3.js for Large Log Analysis and Visualization
PDF
Logs aggregation and analysis
ODP
Using Logstash, elasticsearch & kibana
PDF
Elk stack @inbot
Monitoring with Graylog - a modern approach to monitoring?
Logstash + Elasticsearch + Kibana Presentation on Startit Tech Meetup
Logstash: Get to know your logs
Journée DevOps : Des dashboards pour tous avec ElasticSearch, Logstash et Kibana
Customer Intelligence: Using the ELK Stack to Analyze ForgeRock OpenAM Audit ...
elk_stack_alexander_szalonnas
Advanced troubleshooting linux performance
ELK Stack
LogStash in action
Logstash
From Zero to Production Hero: Log Analysis with Elasticsearch (from Velocity ...
Mobile Analytics mit Elasticsearch und Kibana
Logstash family introduction
Logging with Elasticsearch, Logstash & Kibana
Logging logs with Logstash - Devops MK 10-02-2016
Experiences in ELK with D3.js for Large Log Analysis and Visualization
Logs aggregation and analysis
Using Logstash, elasticsearch & kibana
Elk stack @inbot
Ad

Similar to How ElasticSearch lives in my DevOps life (20)

PDF
ELK stack introduction
PDF
Null Bachaav - May 07 Attack Monitoring workshop.
PDF
Introduction to Elasticsearch
PPTX
Attack monitoring using ElasticSearch Logstash and Kibana
PPT
Craig Brown speaks on ElasticSearch
PPTX
Elastic pivorak
PDF
Log analysis with the elk stack
PPT
Elk presentation1#3
PPTX
ElasticSearch AJUG 2013
PDF
Workshop: Learning Elasticsearch
PDF
Elasticsearch Basics
PPTX
PDF
ELK-Stack-Essential-Concepts-TheELKStack-LunchandLearn.pdf
PPTX
Building a Unified Logging Layer with Fluentd, Elasticsearch and Kibana
PPTX
Centralized log-management-with-elastic-stack
PDF
Elasticsearch Introduction at BigData meetup
PPTX
The Elastic Stack as a SIEM
PDF
ElasticSearch: Distributed Multitenant NoSQL Datastore and Search Engine
PPTX
The ELK Stack - Launch and Learn presentation
PPTX
ElasticSearch - DevNexus Atlanta - 2014
ELK stack introduction
Null Bachaav - May 07 Attack Monitoring workshop.
Introduction to Elasticsearch
Attack monitoring using ElasticSearch Logstash and Kibana
Craig Brown speaks on ElasticSearch
Elastic pivorak
Log analysis with the elk stack
Elk presentation1#3
ElasticSearch AJUG 2013
Workshop: Learning Elasticsearch
Elasticsearch Basics
ELK-Stack-Essential-Concepts-TheELKStack-LunchandLearn.pdf
Building a Unified Logging Layer with Fluentd, Elasticsearch and Kibana
Centralized log-management-with-elastic-stack
Elasticsearch Introduction at BigData meetup
The Elastic Stack as a SIEM
ElasticSearch: Distributed Multitenant NoSQL Datastore and Search Engine
The ELK Stack - Launch and Learn presentation
ElasticSearch - DevNexus Atlanta - 2014
Ad

More from 琛琳 饶 (10)

PPT
{{more}} Kibana4
PPT
ELK stack at weibo.com
PPTX
More kibana
PPT
Monitor is all for ops
PPT
Perl调用微博API实现自动查询应答
PDF
Add mailinglist command to gitolite
PPT
Skyline 简介
PPT
DNS协议与应用简介
DOC
Mysql测试报告
PPT
Perl在nginx里的应用
{{more}} Kibana4
ELK stack at weibo.com
More kibana
Monitor is all for ops
Perl调用微博API实现自动查询应答
Add mailinglist command to gitolite
Skyline 简介
DNS协议与应用简介
Mysql测试报告
Perl在nginx里的应用

Recently uploaded (20)

PPTX
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
PDF
Electronic commerce courselecture one. Pdf
PDF
Approach and Philosophy of On baking technology
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PDF
Empathic Computing: Creating Shared Understanding
PDF
Reach Out and Touch Someone: Haptics and Empathic Computing
PDF
gpt5_lecture_notes_comprehensive_20250812015547.pdf
PDF
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
PPTX
A Presentation on Artificial Intelligence
PDF
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
PDF
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
PPTX
SOPHOS-XG Firewall Administrator PPT.pptx
PDF
Getting Started with Data Integration: FME Form 101
PDF
Encapsulation theory and applications.pdf
PDF
Encapsulation_ Review paper, used for researhc scholars
PPTX
1. Introduction to Computer Programming.pptx
PPTX
Tartificialntelligence_presentation.pptx
PPTX
Digital-Transformation-Roadmap-for-Companies.pptx
PDF
MIND Revenue Release Quarter 2 2025 Press Release
PDF
A comparative analysis of optical character recognition models for extracting...
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
Electronic commerce courselecture one. Pdf
Approach and Philosophy of On baking technology
Advanced methodologies resolving dimensionality complications for autism neur...
Empathic Computing: Creating Shared Understanding
Reach Out and Touch Someone: Haptics and Empathic Computing
gpt5_lecture_notes_comprehensive_20250812015547.pdf
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
A Presentation on Artificial Intelligence
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
SOPHOS-XG Firewall Administrator PPT.pptx
Getting Started with Data Integration: FME Form 101
Encapsulation theory and applications.pdf
Encapsulation_ Review paper, used for researhc scholars
1. Introduction to Computer Programming.pptx
Tartificialntelligence_presentation.pptx
Digital-Transformation-Roadmap-for-Companies.pptx
MIND Revenue Release Quarter 2 2025 Press Release
A comparative analysis of optical character recognition models for extracting...

How ElasticSearch lives in my DevOps life

  • 2. What’s ElasticSearch? • “flexible and powerful open source, distributed real-time search and analytics engine for the cloud” • http://www.elasticsearch.org/
  • 3. What’s ElasticSearch? • “flexible and powerful open source, distributed real-time search and analytics engine for the cloud” • JSON-oriented; • RESTful API; • Schema free. MySQL ElasticSearch database Index table Type column field Defined data type Auto detected
  • 4. What’s ElasticSearch? • “flexible and powerful open source, distributed real-time search and analytics engine for the cloud” • Master nodes & data nodes; • Auto-organize for replicas and shards; • Asynchronous transport between nodes.
  • 5. What’s ElasticSearch? • “flexible and powerful open source, distributed real-time search and analytics engine for the cloud” • Flush every 1 second.
  • 6. What’s ElasticSearch? • “flexible and powerful open source, distributed real-time search and analytics engine for the cloud” • Build on Apache lucene. • Also has facets just as solr.
  • 7. What’s ElasticSearch? • “flexible and powerful open source, distributed real-time search and analytics engine for the cloud” • Give a cluster name, auto-discovery by unicast/multicast ping or EC2 key. • No zookeeper needed.
  • 8. Howto Curl • Index $ curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{ "user" : "kimchy", "post_date" : "2009-11-15T14:12:12", "message" : "trying out Elastic Search" }‘ {"ok":true,"_index":“twitter","_type":“tweet","_id":"1","_v ersion":1}
  • 9. Howto Curl • Get $ curl -XGET 'http://localhost:9200/twitter/tweet/1' { "_index" : "twitter", "_type" : "tweet", "_id" : "1", "_source" : { "user" : "kimchy", "postDate" : "2009-11-15T14:12:12", "message" : "trying out Elastic Search" } }
  • 10. Howto Curl • Query $ curl -XPOST 'http://localhost:9200/twitter/tweet/_search? pretty=1&size=1' -d '{ "query" : { "term" : { "user" : "kimchy" } "fields": ["message"] } }'
  • 11. Howto Curl • Query • Term => { match some terms (after analyzed)} • Match => { match whole field (no analyzed)} • Prefix => { match field prefix (no analyzed)} • Range => { from, to} • Regexp => { .* } • Query_string => { this AND that OR thus } • Must/must_not => {query} • Shoud => [{query},{}] • Bool => {must,must_not,should,…}
  • 12. Howto Curl • Filter $ curl -XPOST 'http://localhost:9200/twitter/tweet/_search? pretty=1&size=1' -d '{ "query" : { “match_all" : {} }, "filter" : { "term" : { “user" : “kimchy" } } }' Much faster because filter is cacheable and do not calcute _score.
  • 13. Howto Curl • Filter • And => [{filter},{filter}] (only two) • Not => {filter} • Or => [{filter},{filter}](only two) • Script => {“script”:”doc[‘field’].value > 10”} • Other like the query DSL
  • 14. Howto Curl • Facets $ curl -XPOST 'http://localhost:9200/twitter/tweet/_search?pretty=1&size=0' -d '{ "query" : { “match_all" : {} }, "filter" : { “prefix" : { “user" : “k" } }, "facets" : { “usergroup" : { "terms" : { "field" : “user" } } } }'
  • 15. Howto Curl • Facets • terms => [{“term”:”kimchy”,”count”:20},{}] • Range <= [{“from”:10,”to”:20},] • Histogram <= {“field”:”user”,”interval”:10} • Statistical <= {“field”:”reqtime”} => [{“min”:,”max”:,”avg”:,”count”:}]
  • 16. Howto Perl – ElasticSearch.pm use ElasticSearch; my $es = ElasticSearch->new( servers => 'search.foo.com:9200', # default '127.0.0.1:9200' transport => 'http' # default 'http' | 'httplite ' # 30% faster, future default | 'httptiny ' # 1% more faster | 'curl' | 'aehttp' | 'aecurl' | 'thrift', # generated code too slow max_requests => 10_000, # default 10000 trace_calls => 'log_file', no_refresh => 0 | 1, );
  • 17. Howto Perl – ElasticSearch.pm use ElasticSearch; my $es = ElasticSearch->new( servers => 'search.foo.com:9200', transport => 'httptiny ‘, max_requests => 10_000, trace_calls => 'log_file', no_refresh => 0 | 1, ); • Get nodelist by /_cluster API from the $servers; • Rand change request to other node after $max_requests.
  • 18. Howto Perl – ElasticSearch.pm $es->index( index => 'twitter', type => 'tweet', id => 1, data => { user => 'kimchy', post_date => '2009-11-15T14:12:12', message => 'trying out Elastic Search' } );
  • 19. Howto Perl – ElasticSearch.pm $es->search( facets => { wow_facet => { query => { text => { content => 'wow' }}, facet_filter => { term => {status => 'active' }}, } } )
  • 20. Howto Perl – ElasticSearch.pm $es->search( facets => { wow_facet => { queryb => { content => 'wow' }, facet_filterb => { status => 'active' }, } } ) ElasticSearch::SearchBuilder More perlish SQL::Abstract-like But I don’t like ==!
  • 21. Howto Perl – Elastic::Model • Tie a Moose object to elasticsearch package MyApp; use Elastic::Model; has_namespace 'myapp' => { user => 'MyApp::User' }; no Elastic::Model; 1;
  • 22. Howto Perl – Elastic::Model package MyApp::User; use Elastic::Doc; use DateTime; has 'name' => ( is => 'rw', isa => 'Str', ); has 'email' => ( is => 'rw', isa => 'Str', ); has 'created' => ( is => 'ro', isa => 'DateTime', default => sub { DateTime->now } ); no Elastic::Doc; 1;
  • 23. Howto Perl – Elastic::Model package MyApp::User; use Moose; use DateTime; has 'name' => ( is => 'rw', isa => 'Str', ); has 'email' => ( is => 'rw', isa => 'Str', ); has 'created' => ( is => 'ro', isa => 'DateTime', default => sub { DateTime->now } ); no Moose; 1;
  • 24. Howto Perl – Elastic::Model • Connect to db my $es = ElasticSearch->new( servers => 'localhost:9200' ); my $model = MyApp->new( es => $es ); • Create database and table $model->namespace('myapp')->index->create(); • CRUD my $domain = $model->domain('myapp'); $domain->newdoc()|get(); • search my $search = $domain->view->type(‘user’)->query(…)->filterb(…); $results = $search->search; say "Total results found: ".$results->total; while (my $doc = $results->next_doc) { say $doc->name; }
  • 25. ES for Dev -- Github • 20TB data; • 1300000000 files; • 130000000000 code lines. • Using 26 Elasticsearch storage nodes(each has 2TB SSD) managed by puppet. • 1replica + 20 shards. • https://github.com/blog/1381-a-whole-new-code-search • https://github.com/blog/1397-recent-code-search-outages
  • 26. ES for Dev – Git::Search • Thank you, Mateu Hunter! • https://github.com/mateu/Git-Search cpanm --installdeps . cp git-search.conf git-search-local.conf edit git-search-local.conf perl -Ilib bin/insert_docs.pl plackup -Ilib curl http://localhost:5000/text_you_want
  • 27. ES for Perler -- Metacpan • search.cpan.org => metacpan.org • use ElasticSearch as API backend; • use Catalyst build website frontend. • Learn API: https://github.com/CPAN-API/cpan-api/wiki/API-docs • Have a try: http://explorer.metacpan.org/
  • 28. ES for Perler – index-weekly • A Perl script (55 lines) to index devopsweekly into elasticsearch. • https://github.com/alcy/index-weekly • We can do same thing to perlweekly,right?
  • 29. ES for logging - Logstash • “logstash is a tool for managing events and logs. You can use it to collect logs, parse them, and store them for later use.” • http://logstash.net/
  • 30. ES for logging - Logstash • “logstash is a tool for managing events and logs. You can use it to collect logs, parse them, and store them for later use.” • Log is stream, not file! • Event is something not only oneline!
  • 31. ES for logging - Logstash • “logstash is a tool for managing events and logs. You can use it to collect logs, parse them, and store them for later use.” • file/*mq/stdin/tcp/udp/websocket…(34 input plugins now)
  • 32. ES for logging - Logstash • “logstash is a tool for managing events and logs. You can use it to collect logs, parse them, and store them for later use.” • date/geoip/grok/multiline/mutate…(29 filter plugins now)
  • 33. ES for logging - Logstash • “logstash is a tool for managing events and logs. You can use it to collect logs, parse them, and store them for later use.” • transfer:stdout/*mq/tcp/udp/file/websocket… • alert:ganglia/nagios/opentsdb/graphite/irc/xmpp /email… • store:elasticsearch/mongodb/riak • (47 output plugins now)
  • 34. ES for logging - Logstash
  • 35. ES for logging - Logstash input { redis { host => "127.0.0.1“ type => "redis-input“ data_type => "list“ key => "logstash“ } } filter { grok { type => “redis-input“ pattern => "%{COMBINEDAPACHELOG}" } } output { elasticsearch { host => "127.0.0.1“ } }
  • 36. ES for logging - Logstash • Grok(Regexp capture): %{IP:client:string} %{NUMBER:bytes:int} More default patterns at source: https://github.com/logstash/logstash/tree/master/patterns
  • 37. ES for logging - Logstash For example: 10.2.21.130 - - [08/Apr/2013:11:13:40 +0800] "GET /mediawiki/load.php HTTP/1.1" 304 - "http://som.d.xiaonei.com/mediawiki/index.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/536.28.10 (KHTML, like Gecko) Version/6.0.3 Safari/536.28.10"
  • 38. ES for logging - Logstash
{
  "@source":"file://chenryn-Lenovo/home/chenryn/test.txt",
  "@tags":[],
  "@fields":{
    "clientip":["10.2.21.130"],
    "ident":["-"],
    "auth":["-"],
    "timestamp":["08/Apr/2013:11:13:40 +0800"],
    "verb":["GET"],
    "request":["/mediawiki/load.php"],
    "httpversion":["1.1"],
    "response":["304"],
    "referrer":["\"http://som.d.xiaonei.com/mediawiki/index.php\""],
    "agent":["\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/536.28.10 (KHTML, like Gecko) Version/6.0.3 Safari/536.28.10\""]
  },
  "@timestamp":"2013-04-08T03:34:37.959Z",
  "@source_host":"chenryn-Lenovo",
  "@source_path":"/home/chenryn/test.txt",
  "@message":"10.2.21.130 - - [08/Apr/2013:11:13:40 +0800] \"GET /mediawiki/load.php HTTP/1.1\" 304 - \"http://som.d.xiaonei.com/mediawiki/index.php\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/536.28.10 (KHTML, like Gecko) Version/6.0.3 Safari/536.28.10\"",
  "@type":"apache"
}
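To make the capture concrete, here is a rough sketch of what grok-style extraction does to the example line above. This is not the real grok engine — just a plain POSIX ERE via sed emulating a few of the %{COMBINEDAPACHELOG} fields; the field names (clientip, verb, request, response) mirror the Logstash output:

```shell
# Hedged sketch: emulate a handful of %{COMBINEDAPACHELOG} captures with
# a plain sed regex. The real grok pattern is far richer; the user-agent
# here is shortened for readability.
line='10.2.21.130 - - [08/Apr/2013:11:13:40 +0800] "GET /mediawiki/load.php HTTP/1.1" 304 - "http://som.d.xiaonei.com/mediawiki/index.php" "Mozilla/5.0"'
echo "$line" | sed -E \
  's|^([0-9.]+) [^ ]+ [^ ]+ \[([^]]+)\] "([^ ]+) ([^ ]+) [^"]+" ([0-9]+).*|clientip=\1 timestamp=\2 verb=\3 request=\4 response=\5|'
# -> clientip=10.2.21.130 timestamp=08/Apr/2013:11:13:40 +0800 verb=GET request=/mediawiki/load.php response=304
```

Grok's value over raw regexes is exactly this naming step: each capture lands in the event as a named field instead of a positional group.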
  • 39. ES for logging - Logstash
"properties" : {
  "@fields" : {
    "dynamic" : "true",
    "properties" : {
      "client" : { "type" : "string", "index" : "not_analyzed" },
      "size" : { "type" : "long", "index" : "not_analyzed" },
      "status" : { "type" : "string", "index" : "not_analyzed" },
      "upstreamtime" : { "type" : "double" }
    }
  }
},
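A mapping fragment like the one above can also be applied automatically to each new daily index through an index template. A minimal sketch for a 0.90-era ElasticSearch — the `logstash-*` pattern and the type name `apache` are assumptions for illustration, not from the deck:

```json
{
  "template": "logstash-*",
  "mappings": {
    "apache": {
      "properties": {
        "@fields": {
          "dynamic": "true",
          "properties": {
            "client": { "type": "string", "index": "not_analyzed" },
            "size":   { "type": "long" },
            "status": { "type": "string", "index": "not_analyzed" },
            "upstreamtime": { "type": "double" }
          }
        }
      }
    }
  }
}
```

Registered once (e.g. with a PUT to /_template/logstash), the mapping then applies to every index whose name matches the pattern.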
  • 40. ES for logging - Kibana
  • 41. ES for logging – Message::Passing • Logstash port to Perl5 • 17 CPAN modules
  • 42. ES for logging – Message::Passing
use Message::Passing::DSL;
run_message_server message_chain {
    output elasticsearch => (
        class                 => 'ElasticSearch',
        elasticsearch_servers => ['127.0.0.1:9200'],
    );
    filter regexp => (
        class     => 'Regexp',
        format    => ':nginxaccesslog',
        capture   => [qw( ts status remotehost url oh responsetime upstreamtime bytes )],
        output_to => 'elasticsearch',
    );
    filter tologstash => (
        class     => 'ToLogstash',
        output_to => 'regexp',
    );
    input file => (
        class     => 'FileTail',
        output_to => 'tologstash',
    );
};
  • 43. Message::Passing vs Logstash • 100_000 lines of nginx access log:
logstash::output::elasticsearch_http (default)                                          4m30.013s
logstash::output::elasticsearch_http (flush_size => 1000)                               3m41.657s
message::passing::filter::regexp (v0.01, calls $self->_regex->regexp() on every line)   1m22.519s
message::passing::filter::regexp (v0.04, caches $self->_regex->regexp() in $self->_re)  0m44.606s
  • 45. Build Website using PerlDancer
get '/' => require_role SOM => sub {
    my $indices = elsearch->cluster_state->{routing_table}->{indices};
    template 'psa/map', {
        providers   => [ sort keys %$default_provider ],
        datasources => [ grep { /^$index_prefix/ && s/$index_prefix// } keys %$indices ],
        inputfrom   => strftime( "%FT%T", localtime( time() - 864000 ) ),
        inputto     => strftime( "%FT%T", localtime() ),
    };
};
ajax '/api/area' => sub {
    my $param = from_json( request->body );
    my $index = $index_prefix . $param->{'datasource'};
    my $limit = $param->{'limit'} || 50;
    my $from  = $param->{'from'}  || 'now-10d';
    my $to    = $param->{'to'}    || 'now';
    my $res   = pct_terms( $index, $limit, $from, $to );
    return to_json($res);
};
  • 46. use Dancer ':syntax'; (same handler code as slide 45)
  • 47. use Dancer::Plugin::Auth::Extensible; (same handler code as slide 45)
  • 48. use Dancer::Plugin::Ajax; (same handler code as slide 45)
  • 49. use Dancer::Plugin::ElasticSearch; (same handler code as slide 45)
  • 50. use Dancer::Plugin::ElasticSearch;
sub area_terms {
    my ( $index, $level, $limit, $from, $to ) = @_;
    my $data = elsearch->search(
        index  => $index,
        type   => $type,
        facets => {
            area => {
                facet_filter => {
                    and => [
                        { range         => { date     => { from => $from, to => $to } } },
                        { numeric_range => { timeCost => { gte => $level } } },
                    ],
                },
                terms => {
                    field => "fromArea",
                    size  => $limit,
                },
            },
        },
    );
    return $data->{facets}->{area}->{terms};
}
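For comparison with the curl examples earlier in the deck, the same terms facet expressed as a raw request body — the date range, gte threshold, and sizes are placeholder values standing in for the sub's parameters; it would be POSTed to the index's _search endpoint:

```json
{
  "size": 0,
  "facets": {
    "area": {
      "facet_filter": {
        "and": [
          { "range":         { "date":     { "from": "now-10d", "to": "now" } } },
          { "numeric_range": { "timeCost": { "gte": 1000 } } }
        ]
      },
      "terms": { "field": "fromArea", "size": 50 }
    }
  }
}
```

Setting "size": 0 skips returning hits, since only the facet counts are needed for the chart.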
  • 51. ES for monitor – oculus (Etsy Kale) • Kale detects anomalous metrics and finds other metrics that look similar. • http://codeascraft.com/2013/06/11/introducing-kale/
  • 52. ES for monitor – oculus (Etsy Kale) • Kale detects anomalous metrics and finds other metrics that look similar. • https://github.com/etsy/skyline
  • 53. ES for monitor – oculus (Etsy Kale) • Kale detects anomalous metrics and finds other metrics that look similar. • https://github.com/etsy/oculus
  • 54. ES for monitor – oculus (Etsy Kale) • Imports monitoring data from redis/ganglia into elasticsearch • Uses native scripts to calculate distance:
script.native:
    oculus_euclidian.type: com.etsy.oculus.tsscorers.EuclidianScriptFactory
    oculus_dtw.type: com.etsy.oculus.tsscorers.DTWScriptFactory
  • 55. ES for monitor – oculus(Etsy Kale) • https://speakerdeck.com/astanway/bring-the-noise- continuously-deploying-under-a-hailstorm-of-metrics
  • 56. VBox example
$ apt-get install -y git cpanminus virtualbox
$ cpanm Rex
$ git clone https://github.com/chenryn/esdevops
$ cd esdevops
$ rex init --name esdevops

Editor's Notes

  • #39: Using LogStash::Outputs::STDOUT with `debug => true`
  • #40: Schema free, but please define schema using /_mapping or template.json for performance.
  • #41: http://demo.kibana.org http://demo.logstash.net