ElasticSearch + python
Getting started with ElasticSearch
Valeria Chemtai
Software Developer, Andela
@valeriachemtai
Search…
What is search?
What is the main objective of search?
How search works
Technologies involved - web crawlers, inverted index, scoring, search
Inverted Index
Some image from stackoverflow
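To make the inverted index idea concrete, here is a minimal sketch in plain Python. It is not how Lucene stores its index; the sample documents, the regex tokenizer and the function names are assumptions made for illustration only.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "Getting started with Elasticsearch",
    2: "Elasticsearch and Python",
    3: "Python for beginners",
}
index = build_inverted_index(docs)
print(sorted(search(index, "python elasticsearch")))  # -> [2]
```

A lookup touches only the postings of the query terms, which is why search stays fast as the number of documents grows. A real engine also scores the matches by relevance, which this sketch leaves out.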
Expectations:
1. Understand Elasticsearch and its basic
concepts
2. Install and set up a node server
3. Index, update and delete documents
4. Incorporate Elasticsearch into a simple Python
application.
Prerequisites
1. Command line knowledge
2. Familiarity with RESTful APIs
Presentation materials
GitHub https://github.com/valeria-chemtai/python_meetup
Elasticsearch is a highly scalable open-source
full-text search and analytics engine.
It can be considered a distributed NoSQL full-text
database.
Elasticsearch is built on top of Apache Lucene,
which is free and open source.
Other Technologies similar to Elasticsearch
1. Apache Solr
2. Nutch
3. CrateDB - open source SQL distributed DB
Why Use Elasticsearch
● Speed - Operates in near real time
● Easy to use with REST API calls
● Scalable
● Robust search - Flexible Query DSL
● Offers statistical analysis tools
● Many extensions - cloud services, client libraries in
many languages
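As an illustration of the Query DSL, the sketch below builds a search request body as a Python dict. The `bool`, `match` and `range` query types are part of the DSL; the field names `title` and `year` and the helper `build_query` are assumptions for this example.

```python
import json


def build_query(text, field="title", min_year=None):
    """Build a search request body from bool, match and range queries."""
    bool_query = {"must": [{"match": {field: text}}]}
    if min_year is not None:
        # Filters narrow the result set without affecting the relevance score.
        bool_query["filter"] = [{"range": {"year": {"gte": min_year}}}]
    return {"query": {"bool": bool_query}}


# Hypothetical fields: "title" (full text) and "year" (numeric).
body = build_query("python meetup", min_year=2017)
print(json.dumps(body, indent=2))
```

The resulting JSON is what would be sent as the body of a `_search` request.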
Limitations of Elasticsearch
1. Very poor documentation
2. Not good for use with relational data
3. Only supports JSON data
Where can I use Elasticsearch
Elasticsearch is mainly used for non-relational
data:
● Blogs
● Data analytics
● Documents with schemaless structure
Basic Concepts
1. Document - Information to be indexed, in JSON form.
2. Type - Logical grouping of documents within an index (deprecated since Elasticsearch 6).
3. Index - Collection of documents that share some similarities.
4. Node - Single server that stores data and takes part in indexing and search.
5. Shards - Pieces that an index is split into and that are distributed across nodes.
6. Replicas - Copies of shards.
7. Cluster - A collection of nodes.
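How documents end up on shards can be sketched with hash-modulo routing, in the spirit of the formula shard = hash(routing) % number_of_primary_shards. This is a simplified illustration: Elasticsearch uses a Murmur3 hash internally, while `zlib.crc32` below is only a stand-in, and the function names are hypothetical.

```python
import zlib


def pick_shard(doc_id, num_primary_shards):
    """Pick a primary shard for a document id with hash-modulo routing."""
    # Stand-in hash: crc32 is deterministic across runs, unlike built-in hash().
    return zlib.crc32(str(doc_id).encode("utf-8")) % num_primary_shards


def distribute(doc_ids, num_primary_shards):
    """Group document ids by the shard they would be routed to."""
    shards = {n: [] for n in range(num_primary_shards)}
    for doc_id in doc_ids:
        shards[pick_shard(doc_id, num_primary_shards)].append(doc_id)
    return shards


# Ten hypothetical document ids spread over three primary shards.
layout = distribute(range(1, 11), num_primary_shards=3)
for shard, ids in layout.items():
    print(f"shard {shard}: {ids}")
```

Because the shard is derived from the number of primary shards, that number is fixed when the index is created; replicas can be added later without changing the routing.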
SQL Database vs Elasticsearch
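The contrast can be sketched by taking one row from a SQL table and expressing it as a JSON document. The table `talks`, its columns and the extra `tags` field are hypothetical; SQLite stands in for any relational database.

```python
import json
import sqlite3

# SQL side: a table with a fixed schema (hypothetical "talks" table).
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets rows be read by column name
conn.execute(
    "CREATE TABLE talks (id INTEGER PRIMARY KEY, title TEXT, speaker TEXT)"
)
conn.execute(
    "INSERT INTO talks (title, speaker) VALUES (?, ?)",
    ("Getting started with Elasticsearch", "Valeria Chemtai"),
)
row = conn.execute("SELECT * FROM talks WHERE id = 1").fetchone()

# Document side: the same record as JSON. A field can be added to one
# document without altering a table definition.
document = dict(row)
document["tags"] = ["search", "python"]
print(json.dumps(document, indent=2))
```

Roughly, a row corresponds to a document and a column to a field, while the index plays the role of the database.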
Installation and Setup
1. Install Java version 8 or later (required by Elasticsearch 5) from:
https://www.java.com/en/
2. Install Elasticsearch version 5 or later:
https://www.elastic.co/downloads/elasticsearch
3. Then configure system variables (for example JAVA_HOME).
Exploring Elasticsearch with curl
1. Check cluster health
2. Create an index
3. Add documents to an index
4. Retrieve documents
5. Update documents
6. Delete documents and an entire index
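The REST calls behind these steps can be sketched in Python with the standard library. The requests are only built here, not sent, so no running node is needed. The index name `meetup`, the type name `talk`, the document fields and the address `http://localhost:9200` are assumptions; the paths follow the 5.x-era `/index/type/id` layout that the slides target.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: default local node address


def make_request(method, path, body=None):
    """Build (but do not send) an HTTP request for the REST API."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


# Hypothetical index "meetup" and type "talk". Later versions drop custom
# types, and documents live under /index/_doc/id instead.
steps = [
    make_request("GET", "/_cluster/health"),                     # 1. health
    make_request("PUT", "/meetup"),                              # 2. create index
    make_request("PUT", "/meetup/talk/1",
                 {"title": "Elasticsearch + Python"}),           # 3. add document
    make_request("GET", "/meetup/talk/1"),                       # 4. retrieve
    make_request("POST", "/meetup/talk/1/_update",
                 {"doc": {"level": "beginner"}}),                # 5. update
    make_request("DELETE", "/meetup/talk/1"),                    # 6. delete document
    make_request("DELETE", "/meetup"),                           # 6. delete index
]

for req in steps:
    print(req.get_method(), req.full_url)
    # To actually send a request: urllib.request.urlopen(req).read()
```

Each entry corresponds to one curl invocation with the same HTTP method, URL and JSON body.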
Code Implementation
Exploring Elasticsearch with Python
Create a virtual environment
Install elasticsearch in the environment:
● $ pip install elasticsearch
Invoke the Python shell:
● $ python
Exploring Elasticsearch with Python
1. Create an index
2. Add documents to an index
3. Retrieve documents
4. Update documents
5. Delete documents and an entire index
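A rough sketch of these steps with the `elasticsearch` client follows. It is not the code from the talk. The keyword arguments (`doc_type`, `body`) follow the 5.x/6.x-era client that matches the Elasticsearch version in the slides; newer client releases remove `doc_type` and rename `body`, so treat the exact signatures as an assumption to check against the installed version. The index name `meetup`, type `talk` and the document fields are hypothetical.

```python
def run_crud(es, index="meetup", doc_type="talk"):
    """Create an index, then add, retrieve, update and delete one document."""
    es.indices.create(index=index)                       # 1. create an index
    es.index(index=index, doc_type=doc_type, id=1,
             body={"title": "Elasticsearch + Python"})   # 2. add a document
    found = es.get(index=index, doc_type=doc_type, id=1)  # 3. retrieve it
    es.update(index=index, doc_type=doc_type, id=1,
              body={"doc": {"level": "beginner"}})       # 4. partial update
    es.delete(index=index, doc_type=doc_type, id=1)      # 5. delete the document
    es.indices.delete(index=index)                       # 5. delete the index
    return found["_source"]


# With a node running locally (assumption: default port 9200):
#   from elasticsearch import Elasticsearch
#   print(run_crud(Elasticsearch(["http://localhost:9200"])))
```

Passing the client in as an argument keeps the function easy to try in the Python shell and to exercise with a stand-in object when no node is available.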
Code Implementation
Code Implementation
Wrapping it all up into a simple
command line app
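One possible shape for such a command line app is sketched below with `argparse`. This is not the app from the presentation materials: the program name `notes`, the `add` and `search` subcommands and the `title`/`text` fields are all hypothetical. The sketch stops at building the request bodies, so it runs without a node.

```python
import argparse


def build_parser():
    """Create a parser with 'add' and 'search' subcommands (hypothetical CLI)."""
    parser = argparse.ArgumentParser(prog="notes")
    sub = parser.add_subparsers(dest="command")
    sub.required = True
    add = sub.add_parser("add", help="index a new note")
    add.add_argument("title")
    add.add_argument("text")
    find = sub.add_parser("search", help="full-text search over notes")
    find.add_argument("query")
    return parser


def to_request(args):
    """Translate parsed arguments into the body a client call would send."""
    if args.command == "add":
        return {"title": args.title, "text": args.text}
    # A match query runs a full-text search on the hypothetical "text" field.
    return {"query": {"match": {"text": args.query}}}


args = build_parser().parse_args(["search", "inverted index"])
print(to_request(args))
```

In a full app, the body returned for `add` would be passed to the client's index call and the body for `search` to its search call.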
Questions?
More on Elasticsearch
Getting started with ElasticSearch-Python :: Part One
Getting started with ElasticSearch-Python :: Part Two
Elastic Website
More
Questions?
valeriachemtai28@gmail.com
gitter.im/valeria-chemtai
medium.com/@valeriachemtai28

Editor's Notes

  • #3: The goal is to find the documents most relevant to our search terms. A search engine has to know that a document exists, index it, determine its relevance, and present the retrieved documents ordered by relevance. Text is tokenized into individual terms (from text to words), which are stored in a data structure called the inverted index.
  • #4: The inverted index is the heart of every search engine.
  • #8: NoSQL: document based. It has no tables or columns, no schemas, and no structured query language for searching. Apache Lucene: a high-performance indexing and full-text search library.
  • #10: Easy to scale horizontally.
  • #13: An index is like a database. Clustering is what makes Elasticsearch easy to scale horizontally.
  • #14: SQL: tables with a schema. Elasticsearch: documents in JSON format.
  • #16: Scaling Elasticsearch.