Using Sphinx
for Search
Mike Lively
Slickdeals, LLC
What is Sphinx?
• A full-text search engine
• Quickly get high quality (relevant) results
• Designed to integrate well with SQL RDBMS
• Can work with any data source
• Can be queried using either an API or SQL
How do I know anything
about Sphinx?
• Manager of Software Architecture for
Slickdeals.net
• Alexa top 150 site (in the US)
• Have been working at improving our Sphinx
search engine for the last 2 months or so.
• Over 7 Million searches a month directly through
the interface, lots more happen indirectly.
When should I use Sphinx?
• Site / Product / Document searches
• Auto-suggest / Auto-Correct functionality
• Finding relevant and related items
Simple Architecture
• Often, search is offloaded
straight to the database
• Search goes to the backend
which performs queries on the
database
• Obviously very easy to
implement
Simple Architecture
• Simple “starts with” searches
on indexed fields can
sometimes work: `city` LIKE
‘Las%’
• Anything else has to scan the
table, and with MyISAM that read
can block writes while it runs.
• MySQL is not a great or
flexible full text engine
• It can sometimes be adequate
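Why a "starts with" search can use an index while a "contains" search cannot is easier to see in a small sketch. The Python below is only an illustration, not how MySQL is implemented: a sorted list stands in for an index on `city`, and the function names and sample cities are made up for the example.

```python
import bisect

def prefix_search(sorted_values, prefix):
    """Return values starting with `prefix`, using binary search on a sorted list.

    This mirrors what an index allows for LIKE 'Las%': jump to the first
    candidate, then read forward while the prefix still matches.
    """
    start = bisect.bisect_left(sorted_values, prefix)
    matches = []
    for value in sorted_values[start:]:
        if not value.startswith(prefix):
            break
        matches.append(value)
    return matches

def substring_search(values, needle):
    """Return values containing `needle`; every value has to be inspected."""
    return [value for value in values if needle in value]

# Hypothetical sample data standing in for an indexed `city` column.
cities = sorted(["Boston", "Las Cruces", "Las Vegas", "Los Angeles", "Salas"])
print(prefix_search(cities, "Las"))    # ['Las Cruces', 'Las Vegas']
print(substring_search(cities, "as"))  # ['Las Cruces', 'Las Vegas', 'Salas']
```

The prefix search stops as soon as the prefix no longer matches, while the substring search has no ordering to exploit and touches every row.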
Sphinx Architecture
• Searchd is responsible for
receiving requests from
clients and executing the
searches against the sphinx
index.
• Indexer is responsible for
getting data into the sphinx
index.
• This separation allows
indexing and searching to be
scaled separately.
Sphinx Architecture
• Searchd has a binary protocol
for which there are several
clients available in multiple
languages.
• Searchd also speaks MySQL’s
client/server protocol (the version
in use since MySQL 4.1)
• Searchd is a daemon that
runs on your search servers
Sphinx Architecture
• Indexer is a shell program that
you can execute to build any
number of indexes.
• Can handle index rotation for
live indexing
Not So Quick Side Note
MySQL IS SLOWWWWWWWWWWWWW
(at text matches)
Still Not Quick Side Note
Indexes won’t help you…
Quicker Side Note
Full Text Search isn’t so bad
IF….
Sphinx Concepts
• Sphinx Indexes “Documents”
• Each document has a unique, unsigned, non-zero
integer ID (either 32 bit or 64 bit space)
• Each document has one or more fields
• Each document has zero or more attributes
Indexes / Sources
• Sphinx indexes are created from one or more
sources.
• The source can be a database, XML, or TSV
stream.
• You can use multiple sources
• This is useful for maintaining updated indexes
• Also used to implement a sphinx cluster
Sphinx Fields
• Fields are what the full text index is made up of.
• When searching you can search against any number
of fields.
• You can assign different relevancy weights to different
fields.
• The original value of a field is never stored by Sphinx.
• You should always have at least one.
Sphinx Attributes
• data that helps further describe the item being
indexed
• Can be returned as a part of the search
• Useful for filtering and sorting results
• These are not a part of the full text index.
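The split between fields and attributes can be sketched with a toy model. This is a minimal Python illustration, not Sphinx's actual storage format: the `ToyIndex` class and the `views` attribute are hypothetical. Field text is broken into keywords and then discarded, while attributes are kept as values for filtering and sorting.

```python
from collections import defaultdict

class ToyIndex:
    """Toy model of documents, fields and attributes (not Sphinx's real format)."""

    def __init__(self):
        self.postings = defaultdict(set)  # (field name, keyword) -> document ids
        self.attributes = {}              # document id -> attribute values

    def add(self, doc_id, fields, attributes):
        if doc_id <= 0:
            raise ValueError("document ids must be non-zero and unsigned")
        for name, text in fields.items():
            for word in text.lower().split():
                self.postings[(name, word)].add(doc_id)
        # Only attributes are kept as values; the original field text is not stored.
        self.attributes[doc_id] = dict(attributes)

    def search(self, field, word, min_views=0):
        """Match a keyword in one field, filter on `views`, sort by `views` descending."""
        ids = self.postings.get((field, word.lower()), set())
        hits = [(doc_id, self.attributes[doc_id]) for doc_id in ids
                if self.attributes[doc_id]["views"] >= min_views]
        return sorted(hits, key=lambda hit: hit[1]["views"], reverse=True)

index = ToyIndex()
index.add(1, {"title": "Unit testing in PHP"}, {"views": 120})
index.add(2, {"title": "Load testing basics"}, {"views": 450})
index.add(3, {"title": "PHP sessions"}, {"views": 80})
print(index.search("title", "testing"))                 # documents 2 and 1
print(index.search("title", "testing", min_views=200))  # document 2 only
```

The search results carry ids and attributes only, which is why the original text has to come from somewhere else if it needs to be displayed.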
MySQL Full Text Search
• You can get away with MyISAM tables or, as of
version 5.6, InnoDB.
• You don’t care about morphology (think plurals)
• You don’t need anything but the most basic of
search operators
Creating An Index
• We are going to add an index that sources a
MySQL database.
• The data being sourced is a list of the titles of
Wikipedia posts.
Creating An Index
Indexer Configuration
• We are going to be peeking into a sphinx
configuration file now.
• You can rebuild the config file by concatenating
each section into a single file.
• On my VM this file is located in /usr/local/etc/
sphinx.conf
Source Definition
Source Definition
Defines the connection information
Connection information
• Ideally, you should create a
separate account for sphinx
• You can also connect via unix
socket
• I didn’t specify it here, but you
can also add a port.
Source Definition
The query that pulls data to populate the index
Source Index
• The index query MUST return
the id field as the first column
• Remember, the id needs to be
a unique, unsigned integer of
64 bits or fewer
• The query must be on a single
line unless you escape newlines
with backslashes.
• Notice that we converted the
timestamp into a unix
timestamp. That is important.
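The two constraints on each row (a valid id first, and timestamps as integer seconds) can be sketched as follows. This is Python for illustration only. In the source query itself the conversion would normally be done in SQL, for example with MySQL's UNIX_TIMESTAMP(). The helper names and the sample row are hypothetical.

```python
import calendar
from datetime import datetime

MAX_DOC_ID = 2**64 - 1  # largest unsigned 64-bit value

def is_valid_doc_id(doc_id):
    """True if `doc_id` is a non-zero unsigned integer that fits in 64 bits."""
    return isinstance(doc_id, int) and 0 < doc_id <= MAX_DOC_ID

def to_unix_timestamp(dt):
    """Convert a naive datetime, assumed to be UTC, to whole seconds since the epoch."""
    return calendar.timegm(dt.timetuple())

# A hypothetical row in the order the index query returns it: id first,
# then the text field, then the timestamp as an integer.
row = (42, "Sphinx (search engine)", to_unix_timestamp(datetime(2014, 5, 8, 12, 0, 0)))
print(is_valid_doc_id(row[0]), isinstance(row[2], int))  # True True
```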
Source Definition
How data is stored in the index
Source Fields
• The first column in the query is
always the ID.
• You specify any columns that
are attributes.
• Remember, attributes are
values stored with the index
that can be used to filter and
sort.
• Any column besides the id that is
not specified as an attribute is
assumed to be a text field (title)
Index Definition
Index Definition
• An Index includes one or
more sources.
• Each source gets its own
“source” line
• Multiple sources must all
define the same fields and
attributes.
• The ids need to be unique
across sources
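Since ids have to be unique across every source feeding an index, a quick check before combining sources can catch collisions early. The sketch below is a hypothetical Python helper, not part of Sphinx. It only assumes each source can be reduced to a list of ids.

```python
from collections import Counter

def duplicate_ids(*sources):
    """Return the sorted ids that appear more than once across all sources."""
    counts = Counter(doc_id for source in sources for doc_id in source)
    return sorted(doc_id for doc_id, seen in counts.items() if seen > 1)

# Hypothetical id lists from two sources that feed the same index.
main_ids = [1, 2, 3, 4]
delta_ids = [4, 5, 6]
print(duplicate_ids(main_ids, delta_ids))  # [4]
```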
Index Definition
• path is not actually a path, it’s
a filename with no extension.
• docinfo dictates if attributes
are stored in the index or
outside of the index.
• dict is not really important
now. It used to be either crc or
keywords; crc is now
deprecated.
• min_word_len is the minimum
length of words to index
Rest of the Index Configuration
It’s time to build the index
indexer <index name>
Searching the Index
• searchd is the daemon that searches the index
• Binary Protocol



OR
• MySQL Compatible too!
searchd config
Included in the same config file as the rest
Spinning up searchd
“I know MySQL”
–Sphinx
MySQL Compatible
MySQL Compatible
• Tables == Indexes
• SHOW TABLES…Shows indexes.
• SELECT * FROM <index> works too.
Selecting from an index
Querying Indexes
• Default limit of 20 rows
• Notice the text fields are not
returned…
• They would be if we made
them attributes
(sql_field_string)
Querying Indexes
• The magic function in
SphinxQL is match()
• match() performs a full text
search against the entire
index…usually
• The ‘@field’ operator can
isolate which field is searched
on.
Querying Indexes
• You can query against
attributes
• You can sort results
• You can use the weight()
function to determine
relevancy.
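Putting match(), the ‘@field’ operator, sorting and weight() together, a query could be assembled along these lines. This is a hedged Python sketch, not the code from the slides. The index name `wiki_titles`, the `%s` placeholder style (as used by common Python MySQL drivers) and the list of characters to escape are assumptions; check them against the Sphinx version in use.

```python
# Characters assumed to have special meaning in the extended query syntax.
SPECIAL_CHARS = '\\()|-!@~"&/^$=<'

def escape_match(text):
    """Backslash-escape characters that would otherwise act as query operators."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)

def build_search(index, user_query, field=None, limit=20):
    """Return (sql, params) for a relevancy-sorted full-text search.

    `index` must come from trusted code, not user input, because
    identifiers cannot be bound as parameters.
    """
    expression = escape_match(user_query)
    if field is not None:
        # Restrict the search to a single field with the @field operator.
        expression = "@{} {}".format(field, expression)
    sql = ("SELECT id, WEIGHT() AS relevance FROM {} "
           "WHERE MATCH(%s) ORDER BY relevance DESC LIMIT {}").format(index, int(limit))
    return sql, (expression,)

sql, params = build_search("wiki_titles", "unit-testing", field="title", limit=100)
print(sql)
print(params)
```

Escaping the user's text before adding the `@title` prefix keeps operators typed by the user from changing which field is searched.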
Querying Indexes
• The 25387283 title was more
relevant because it matched
on the term “testing”
Getting PHP into the mix
• All we need? PDO.
• We will build a basic search page
• Accepts a query, displays up to 100 matching
results by relevancy with the matching keywords
highlighted.
Using Sphinx for Search in PHP
Pulling data from Sphinx
Fetching the data from MySQL
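Because the text fields are not returned by the search, the matching ids have to be looked up in the database, and the relevancy order has to be re-applied afterwards. The Python sketch below shows that second step. It is an illustration under assumptions: sqlite3 stands in for MySQL so the example is self-contained, and the `pages` table and its rows are made up.

```python
import sqlite3

def fetch_in_order(connection, ids):
    """Fetch titles for `ids` and return them in the same order as `ids`."""
    if not ids:
        return []
    placeholders = ", ".join("?" for _ in ids)
    rows = connection.execute(
        "SELECT id, title FROM pages WHERE id IN ({})".format(placeholders),
        list(ids),
    ).fetchall()
    by_id = {row[0]: row[1] for row in rows}
    # IN (...) gives no ordering guarantee, so re-apply the search ranking.
    return [(doc_id, by_id[doc_id]) for doc_id in ids if doc_id in by_id]

# In-memory stand-in for the real database (hypothetical table and data).
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT)")
connection.executemany(
    "INSERT INTO pages VALUES (?, ?)",
    [(1, "PHP sessions"), (2, "Load testing"), (3, "Unit testing")],
)
ranked_ids = [3, 1, 99]  # ids as a search might return them; 99 no longer exists
print(fetch_in_order(connection, ranked_ids))  # [(3, 'Unit testing'), (1, 'PHP sessions')]
```

Ids that are still in the search index but have since been deleted from the database are skipped rather than raising an error.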
Adding the fancy yellow highlighting
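One simple way to highlight matching keywords is to wrap them in `<mark>` tags, which browsers render with a yellow background by default. The Python below is a generic sketch and not the code from the slide. It matches whole words case-insensitively and does not account for stemming, so it may highlight fewer words than the search actually matched.

```python
import html
import re

def highlight(text, keywords):
    """Escape `text` for HTML and wrap whole-word keyword matches in <mark> tags."""
    words = [re.escape(word) for word in keywords if word]
    if not words:
        return html.escape(text)
    pattern = re.compile(r"\b(" + "|".join(words) + r")\b", re.IGNORECASE)
    parts = pattern.split(text)
    # With one capturing group, split() alternates: plain text, match, plain text, ...
    return "".join(
        "<mark>{}</mark>".format(html.escape(part)) if i % 2 else html.escape(part)
        for i, part in enumerate(parts)
    )

print(highlight("Unit testing <in> PHP", ["testing", "php"]))
# Unit <mark>testing</mark> &lt;in&gt; <mark>PHP</mark>
```

Escaping each piece separately keeps user-supplied titles from injecting markup while still letting the `<mark>` tags through.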
The rest is pretty basic…
Cool things we would talk about
if I had like…3 more hours
• Auto-suggest, Auto-correct
• More on lemmatization and stemming
• Distributed Sphinx Clustering
• Delta indexes
• Real Time Indexes
• The plethora of operators you can use
• Ranged Queries
• ………
Additional Information
• The sphinx documentation is actually pretty
great
• http://sphinxsearch.com/docs/
• Slides are already on Slideshare
• Will link them to the meet up shortly
Questions?

