PROCESSING TEXT
WITH REGEX
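WHY IS REGEX NECESSARY?
• Question: What does the following script do?
def isPhoneNumber(text):
    # A valid number is exactly 12 characters long: ddd-ddd-dddd
    if len(text) != 12:
        return False
    # Characters 0-2 must be digits
    for i in range(0, 3):
        if not text[i].isdecimal():
            return False
    # Character 3 must be a dash
    if text[3] != '-':
        return False
    # Characters 4-6 must be digits
    for i in range(4, 7):
        if not text[i].isdecimal():
            return False
    # Character 7 must be a dash
    if text[7] != '-':
        return False
    # Characters 8-11 must be digits
    for i in range(8, 12):
        if not text[i].isdecimal():
            return False
    return True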
WHY IS REGEX NECESSARY?
• Question: What about this one?
message = input("Enter a string: ")
# Slide a 12-character window across the message
for i in range(len(message)):
    chunk = message[i:i + 12]
    if isPhoneNumber(chunk):
        print("Phone number found: " + chunk)
print("Done")
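For comparison, the function and the scanning loop above can be collapsed into a few lines with a regular expression. This is only a sketch, assuming Python 3; the sample message and the name phone_pattern are made up for illustration and are not part of the original scripts.

```python
import re

# Three digits, a dash, three digits, a dash, four digits
phone_pattern = re.compile(r"\d\d\d-\d\d\d-\d\d\d\d")

# Hypothetical sample text standing in for the user's input
message = "Call 415-555-1011 tomorrow or 415-555-9999 after five."

# findall() does the scanning that the for loop did by hand
found = phone_pattern.findall(message)
for number in found:
    print("Phone number found: " + number)
print("Done")
```

The pattern states the shape of a phone number once, so no length check, index arithmetic, or window slicing is needed.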
ARE THEY THAT IMPORTANT?
• Regular expressions, as we have previously discussed, are dynamic,
descriptive patterns designed for searching (pattern recognition).
• Ex.
• Without regular expressions, you are hard-coding fixed values to
search for
• = vs. like (matching one exact value vs. matching anything that fits a pattern)
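The difference between a fixed value and a pattern can be sketched as follows. The sample lines and variable names are hypothetical, chosen only to illustrate the point.

```python
import re

# Hypothetical sample data
lines = ["Call 444-343-3243", "Call 555-867-5309", "No number here"]

# Fixed value: finds only the one number that was hard-coded
fixed_hits = [line for line in lines if "444-343-3243" in line]

# Pattern: finds any text shaped like ddd-ddd-dddd
pattern = re.compile(r"\d\d\d-\d\d\d-\d\d\d\d")
pattern_hits = [line for line in lines if pattern.search(line)]

print(fixed_hits)    # only the first line
print(pattern_hits)  # the first two lines
```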
INCORPORATING REGEX IN
PYTHON
• Python once again makes life simpler with a built-in module for
using regular expressions in your scripts.
• Enter the re module
• Code: import re
• There are 2 benefits to using the re module:
1) Predefined functions: compile(), search(), findall()
2) The RegEx syntax is almost identical to Perl's
PYTHON’S REGEX CHEAT SHEET
COMPILING A REGEX EXPRESSION
• A RegEx pattern written as a plain string has to be read and
interpreted before Python can match with it.
• Thus, if you search through an entire document line by line, the
pattern may have to be reinterpreted for each line.
• This can increase execution time.
• The re module has a compile() function that interprets the pattern once
and returns a pattern object for easy reuse.
• Code: varName = re.compile(REGEX EXPRESSION)
• Ex. phoneNumRegEx = re.compile(r"\d\d\d-\d\d\d-\d\d\d\d")
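A minimal sketch of compiling once and reusing the pattern object on every line. The document list is a hypothetical stand-in for lines read from a file.

```python
import re

# Compile once, outside the loop
phoneNumRegEx = re.compile(r"\d\d\d-\d\d\d-\d\d\d\d")

# Hypothetical document, one string per line
document = [
    "Front desk: 444-343-3243",
    "Office hours are 9 to 5",
    "After hours: 555-867-5309",
]

# Reuse the same compiled pattern for each line
lines_with_numbers = []
for line in document:
    if phoneNumRegEx.search(line):
        lines_with_numbers.append(line)

print(lines_with_numbers)
```

The raw-string prefix r keeps Python from treating the backslashes in \d as string escapes.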
THE SEARCH FUNCTION
• The search() function searches a string for the first occurrence of
the pattern.
• It returns a match object if the pattern is found and None if it is not,
so the result can be tested like a True or False value.
• Code: compExpVar.search(TEXT)
• Ex. phNumRegEx = re.compile(r"\d\d\d-\d\d\d-\d\d\d\d")
mo = phNumRegEx.search("Here is 444-343-3243")
print(mo)          # the match object
print(mo.group())  # the matched text: 444-343-3243
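Because search() returns None when nothing matches, calling group() on the result without checking it first raises an AttributeError. One way to guard against that is sketched below; the helper name first_phone_number is hypothetical.

```python
import re

phNumRegEx = re.compile(r"\d\d\d-\d\d\d-\d\d\d\d")

def first_phone_number(text):
    """Return the first phone number in text, or None if there is none."""
    mo = phNumRegEx.search(text)
    if mo is None:
        # No match: there is no match object to call group() on
        return None
    return mo.group()

print(first_phone_number("Here is 444-343-3243"))
print(first_phone_number("No number in this sentence"))
```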
LET’S FIND EVERYTHING
• In addition to the search() function, the re module also has a findall()
function.
• findall() returns a list of all the non-overlapping strings that match the pattern.
• Code: compExpVar.findall(TEXT)
• Ex. phNumRegEx = re.compile(r"\d\d\d")
mo = phNumRegEx.findall("Here is 444-343-3243")
print(mo)  # ['444', '343', '324']
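A short sketch of findall() with the full phone-number pattern. It also shows one behavior worth knowing: when the pattern contains parenthesized groups, findall() returns tuples of the groups instead of whole matches. The sample text and variable names are made up for illustration.

```python
import re

# Hypothetical sample text with two phone numbers
text = "Home: 444-343-3243, work: 555-867-5309"

# No groups: each result is the whole matched string
whole = re.compile(r"\d\d\d-\d\d\d-\d\d\d\d").findall(text)

# With groups: each result is a tuple of the grouped parts
parts = re.compile(r"(\d\d\d)-(\d\d\d)-(\d\d\d\d)").findall(text)

print(whole)  # ['444-343-3243', '555-867-5309']
print(parts)  # [('444', '343', '3243'), ('555', '867', '5309')]
```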
