MODULE 3 – PART 4
REGULAR EXPRESSIONS
By,
Ravi Kumar B N
Assistant professor, Dept. of CSE
BMSIT & M
➢ A regular expression is a sequence of characters that defines a search pattern.
➢ Such patterns are used by string-searching algorithms for "find" or "find and
replace" operations on strings, or for input validation.
➢ The regular expression library “re” must be imported into a program before
it can be used.
INTRODUCTION
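The three uses named above (find, find and replace, input validation) can be sketched in a few lines. This is only an illustrative sketch: the sample text and the PIN string are made up, and re.sub() and re.fullmatch() are functions of Python's re module that these slides do not otherwise cover.

```python
import re

text = "Contact csev@umich.edu for details"  # hypothetical sample text

# "Find": search() returns a match object for the first hit, or None.
found = re.search("umich", text)
print(found is not None)

# "Find and replace": sub() returns a new string with every match replaced.
replaced = re.sub("details", "more information", text)
print(replaced)

# Input validation: fullmatch() succeeds only if the whole string fits the pattern.
is_valid_pin = re.fullmatch("[0-9]+", "4821") is not None
print(is_valid_pin)
```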
➢ search() function: searches a string for a pattern. It returns only the first occurrence that
matches the specified pattern (as a match object), or None if nothing matches.
This function is available in the “re” library.
➢ The caret character (^): matches the beginning of a line (in Python, the beginning of the string unless the re.MULTILINE flag is used).
➢ The dollar character ($): matches the end of a line.
Example: Program to match only lines where “From:” is at the beginning of the line
import re
hand = open('mbox1.txt')
for line in hand:
    line = line.rstrip()
    if re.search('^From:', line):
        print(line)
#Output
From:stephen Sat Jan 5 09:14:16 2008
From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008
From:zqian@umich.edu Fri Jan 4 16:10:39 2008
mbox1.txt
From:stephen Sat Jan 5 09:14:16 2008
Return-Path: <postmaster@collab.sakaiproject.org>
From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008
Subject: [sakai] svn commit:
From:zqian@umich.edu Fri Jan 4 16:10:39 2008
Return-Path: <postmaster@collab.sakaiproject.org>
✓ The test re.search('^From:', line) is equivalent to line.startswith('From:'), which uses the
string method startswith().
SEARCH() FUNCTION:
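As a quick check of the equivalence stated above, the sketch below compares re.search('^From:', line) with line.startswith('From:') on three lines copied from the mbox1.txt listing, and then uses '$' to anchor a pattern at the end of a line. The list name lines and the pattern '2008$' are chosen here purely for illustration.

```python
import re

# Sample lines copied from the mbox1.txt listing
lines = [
    "From:stephen Sat Jan 5 09:14:16 2008",
    "Return-Path: <postmaster@collab.sakaiproject.org>",
    "From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008",
]

for line in lines:
    by_regex = re.search('^From:', line) is not None
    by_method = line.startswith('From:')
    print(by_regex, by_method, line)   # the two tests always agree

# '$' anchors the pattern at the end of the line instead of the beginning
ends_2008 = [line for line in lines if re.search('2008$', line)]
print(len(ends_2008))
```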
➢ The dot character (.): The most commonly used special character is the period (“dot”, or full
stop), which matches any single character except a newline.
The regular expression “F..m:” would match any of the following strings, since each period
in the regular expression matches any one character:
“From:”, “Fxxm:”, “F12m:”, or “F!@m:”
➢ The program from the previous slide is rewritten here using the dot character; it gives the same output.
CHARACTER MATCHING IN REGULAR EXPRESSIONS
import re
hand = open('mbox1.txt')
for line in hand:
    line = line.rstrip()
    if re.search('^F..m:', line):
        print(line)
#Output
From:stephen Sat Jan 5 09:14:16 2008
From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008
From:zqian@umich.edu Fri Jan 4 16:10:39 2008
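The behaviour of the dot can also be checked without a file. The sketch below tries '^F..m:' on the four strings listed on the slide, plus two extra strings ("Fm:" and "Froom:") that are added here as counter-examples because they do not have exactly two characters between "F" and "m:".

```python
import re

candidates = ["From:", "Fxxm:", "F12m:", "F!@m:", "Fm:", "Froom:"]

# Each '.' must consume exactly one character, so only strings with
# exactly two characters between 'F' and 'm:' are kept.
matched = [c for c in candidates if re.search('^F..m:', c)]
print(matched)   # ['From:', 'Fxxm:', 'F12m:', 'F!@m:']
```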
A character can be repeated any number of times using the “*” or “+” characters in a
regular expression.
➢ The asterisk character (*): matches zero or more repetitions of the preceding character
➢ The plus character (+): matches one or more repetitions of the preceding character
Example: Program to match lines that start with “From:”, followed by a mail ID
import re
hand = open('mbox1.txt')
for line in hand:
    line = line.rstrip()
    if re.search('^From:.+@', line):
        print(line)
#Output
From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008
From:zqian@umich.edu Fri Jan 4 16:10:39 2008
✓ The search string “^From:.+@” will successfully match lines that start with “From:”, followed by one
or more characters (“.+”), followed by an at-sign. The “.+” wildcard matches all the characters
between the colon character and the at-sign.
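The difference between “+” and “*” shows up only when nothing stands between the colon and the at-sign. The following sketch runs both variants on three sample strings; the second string ("From:@umich.edu") is an artificial case made up for this comparison.

```python
import re

samples = ["From:zqian@umich.edu", "From:@umich.edu", "From:stephen Sat Jan 5"]

results = []
for s in samples:
    plus = re.search('^From:.+@', s) is not None   # needs at least 1 character before '@'
    star = re.search('^From:.*@', s) is not None   # allows 0 characters before '@'
    results.append((plus, star))
    print(s, plus, star)
```

Only the middle string separates the two patterns: “.*” accepts it, “.+” does not. The last string has no at-sign, so neither pattern matches it.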
➢ Non-whitespace character (\S): matches one non-whitespace character.
➢ findall() function: searches for all occurrences that match a given pattern and returns them as a list.
In contrast, the search() function returns only the first occurrence that matches the specified pattern.
Example1: Program that returns a list of all the strings that look like email addresses in a given line.
import re
s = 'Hello from csev@umich.edu to cwen@iupui.edu about the meeting @2PM'
lst = re.findall(r'\S+@\S+', s)
print(lst)
#output
['csev@umich.edu', 'cwen@iupui.edu']
# The same program using search() displays only the first mail ID,
# i.e. the first matching string
import re
s = 'Hello from csev@umich.edu to cwen@iupui.edu about the meeting @2PM'
lst = re.search(r'\S+@\S+', s)
print(lst)
#output
<re.Match object; span=(11, 25), match='csev@umich.edu'>
The regular expression '\S+@\S+' matches substrings that have at least one
non-whitespace character, followed by an at-sign, followed by at least one more
non-whitespace character.
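Since search() returns a match object rather than a string, the matched text has to be pulled out of it. The sketch below uses the match-object methods group(), start() and end() from Python's re module; the pattern is written with the \S escape inside a raw string (r'...') so that the backslash reaches the regular expression engine unchanged. The second test string is made up.

```python
import re

s = 'Hello from csev@umich.edu to cwen@iupui.edu about the meeting @2PM'

m = re.search(r'\S+@\S+', s)
if m:                          # search() returns None when nothing matches
    print(m.group())           # csev@umich.edu
    print(m.start(), m.end())  # 11 25

no_match = re.search(r'\S+@\S+', 'no address here')
print(no_match)                # None
```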
Example2: Program that returns a list of all the strings that look like email addresses in a given file.
import re
hand = open('mbox1.txt')
for line in hand:
    line = line.rstrip()
    x = re.findall(r'\S+@\S+', line)
    if len(x) > 0:
        print(x)
#Output
['<postmaster@collab.sakaiproject.org>']
['louis@media.berkeley.edu']
['From:zqian@umich.edu']
['<postmaster@collab.sakaiproject.org>']
➢ Square brackets “[]”: square brackets indicate a set of acceptable characters, any one of which
may match at that position.
Example: [a-z] matches a single lowercase letter
[A-Z] matches a single uppercase letter
[a-zA-Z] matches a single lowercase or uppercase letter
[a-zA-Z0-9] matches a single lowercase letter, uppercase letter, or digit
[amk] matches 'a', 'm', or 'k'
[(+*)] matches any of the literal characters '(', '+', '*', or ')'
[0-5][0-9] matches all the two-digit numbers from 00 to 59
➢ Characters that are not within a range can be matched by complementing the set.
If the first character of the set is '^', any character that is not in the set will be matched.
For example,
[^5] will match any character except '5'
Some of our email addresses have incorrect characters like
“<” or “;” at the beginning or end. We are only interested in
the portion of the string that starts and ends with a letter or
a number. To get the proper output, we use a character set at each end of the pattern.
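As a quick check of the set notation, before it is applied to the email addresses, the sketch below runs four of the sets listed on this slide through findall(). The input strings are made up for illustration.

```python
import re

set_amk = re.findall('[amk]', 'mask')                  # 's' is not in the set
set_literals = re.findall('[(+*)]', '(1+2)*3')         # metacharacters are literal inside []
two_digits = re.findall('[0-5][0-9]', '07 42 59 60 99')
not_five = re.findall('[^5]', '15253')                 # '^' first in the set negates it

print(set_amk)       # ['m', 'a', 'k']
print(set_literals)  # ['(', '+', ')', '*']
print(two_digits)    # ['07', '42', '59']
print(not_five)      # ['1', '2', '3']
```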
Example: Program that returns a list of all email addresses in the proper format.
import re
hand = open('mbox.txt')
for line in hand:
    line = line.rstrip()
    x = re.findall(r'[a-zA-Z0-9]\S*@\S*[a-zA-Z]', line)
    if len(x) > 0:
        print(x)
#output
['postmaster@collab.sakaiproject.org']
['louis@media.berkeley.edu']
['zqian@umich.edu']
['postmaster@collab.sakaiproject.org']
[a-zA-Z0-9]\S*@\S*[a-zA-Z]: matches substrings that start with a
single lowercase letter, uppercase letter, or digit “[a-zA-Z0-9]”,
followed by zero or more non-blank characters “\S*”,
followed by an at-sign, followed by zero or more non-blank
characters “\S*”, followed by an uppercase or lowercase
letter “[a-zA-Z]”.
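The effect of the two character sets can be seen by running the loose pattern and the stricter pattern on the same line. In this sketch the sample line is modelled on the Return-Path header from mbox1.txt, with a trailing “;” added as an assumption to show trimming at both ends; the variable names are chosen here for illustration.

```python
import re

line = 'Return-Path: <postmaster@collab.sakaiproject.org>;'  # ';' added for the demo

loose = re.findall(r'\S+@\S+', line)
strict = re.findall(r'[a-zA-Z0-9]\S*@\S*[a-zA-Z]', line)
print(loose)   # ['<postmaster@collab.sakaiproject.org>;']
print(strict)  # ['postmaster@collab.sakaiproject.org']

# Caveat: only non-alphanumeric characters at the ends are trimmed. A prefix
# glued to the address without a space is still part of the match.
attached = re.findall(r'[a-zA-Z0-9]\S*@\S*[a-zA-Z]', 'From:zqian@umich.edu Fri')
print(attached)  # ['From:zqian@umich.edu']
```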
SEARCH AND EXTRACT
Example1: Find lines that start with the string “X-” and have a number after the colon,
such as: X-DSPAM-Confidence: 0.8475
import re
hand = open('mbox2.txt')
for line in hand:
    line = line.rstrip()
    if re.search(r'^X\S*: [0-9.]+', line):
        print(line)
#Output
X-DSPAM-Confidence: 0.8475
X-DSPAM-Probability: 0.9245
➢ Parentheses “()” in a regular expression: used to extract a portion of the substring that
matches the regular expression.
import re
hand = open('mbox2.txt')
for line in hand:
    line = line.rstrip()
    x = re.findall(r'^X\S*: ([0-9.]+)', line)
    if len(x) > 0:
        print(x)
#Output
['0.8475']
['0.9245']
mbox2.txt
From: stephen.marquard@uct.ac.za
Subject: [sakai] svn commit: r39772 - content/branches/sakai_2-5-x/conten
impl/impl/src/java/org
X-Content-Type-Outer-Envelope: text/plain; charset=UTF-8
X-Content-Type-Message-Body: text/plain; charset=UTF-8
Content-Type: text/plain; charset=UTF-8
X-DSPAM-Result: Innocent
X-DSPAM-Processed: Sat Jan 5 09:14:16 2008
X-DSPAM-Confidence: 0.8475
X-DSPAM-Probability: 0.9245
The output of Example1 contains the entire line. With parentheses, we extract only the
numbers from lines that have this syntax.
Example2: Program to print the hour of the day at which each mail was received
import re
hand = open('mbox1.txt')
for line in hand:
    line = line.rstrip()
    x = re.findall('^From.* ([0-3][0-9]):', line)
    if len(x) > 0:
        print(x)
#Output
['09']
['16']
['16']
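To see why this pattern picks out the two digits in front of the first colon of the time stamp, it helps to look at the whole match next to the parenthesised part. The sketch below uses search() with group(0) and group(1) on one line copied from mbox1.txt; the variable names are chosen here for illustration.

```python
import re

line = 'From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008'

m = re.search('^From.* ([0-3][0-9]):', line)
whole = m.group(0)   # everything the pattern matched
hour = m.group(1)    # only the part inside the parentheses
print(whole)         # From: louis@media.berkeley.edu Mon Jan 4 16:
print(hour)          # 16
```

The “.*” runs from “From” up to the only place in the line where a space, two digits and a colon follow one another, which is the start of the time stamp “16:10:39”.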
RANDOM EXECUTION
>>> import re
>>> s = " 0.9 .90 1.0 1. 138 pqr"
>>> re.findall('[0-9.]+', s)
['0.9', '.90', '1.0', '1.', '138']
>>> re.findall('[0-9]+[.][0-9]', s)
['0.9', '1.0']
>>> re.findall('[0-9]+[.][0-9]+', s)
['0.9', '1.0']
>>> re.findall('[0-9]*[.][0-9]+', s)
['0.9', '.90', '1.0']
>>> usn = "1bycs123, 1byec249, 1bycs009, 1byme209, 1byis112, 1byee190"
>>> re.findall('1bycs...', usn)
['1bycs123', '1bycs009']
>>> re.findall('[a-zA-Z0-9]+cs[0-9]+', usn)
['1bycs123', '1bycs009']
>>> usn = "1bycs123, 1byec249, 1bycs009, 1byme209, 1vecs112, 1svcs190"
>>> re.findall('[a-zA-Z0-9]+cs[0-9]+', usn)
['1bycs123', '1bycs009', '1vecs112', '1svcs190']
>>> re.findall('[0-9]+cs[0-9]+', usn)
[]
>>> re.findall('[a-zA-Z0-9]+cs([0-9]+)', usn)
['123', '009', '112', '190']
ESCAPE CHARACTER
➢ The escape character (backslash "\") is a metacharacter in regular expressions. It allows special
characters to be used without invoking their special meaning.
If you want to match 1+1=2, the correct regex is 1\+1=2. Otherwise, the plus sign has a
special meaning.
For example, we can find money amounts with the following regular expression.
>>> import re
>>> x = 'We just received $10.00 for cookies.'
>>> y = re.findall(r'\$[0-9.]+', x)
>>> y
['$10.00']
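The 1+1=2 case can be tried directly. In the sketch below the sample sentence is made up; it contains both "1+1=2" and "11=2" so that the unescaped and the escaped pattern give visibly different results. re.escape() is a helper from Python's re module, not covered on the slides, that inserts the needed backslashes automatically.

```python
import re

text = 'We know that 1+1=2 and 11=2 is wrong.'  # hypothetical sample sentence

unescaped = re.findall('1+1=2', text)   # '+' means "one or more of the preceding 1"
escaped = re.findall(r'1\+1=2', text)   # '\+' is a literal plus sign
print(unescaped)  # ['11=2']
print(escaped)    # ['1+1=2']

# re.escape() builds an escaped pattern from a literal string
auto = re.findall(re.escape('1+1=2'), text)
print(auto)       # ['1+1=2']
```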
SUMMARY
Character   Meaning
^           Matches the beginning of the line
$           Matches the end of the line
.           Matches any character (a wildcard)
\s          Matches a whitespace character
\S          Matches a non-whitespace character (opposite of \s)
*           Applies to the immediately preceding character and matches zero or more occurrences of it
*?          Applies to the immediately preceding character and matches zero or more occurrences of it in “non-greedy mode”
+           Applies to the immediately preceding character and matches one or more occurrences of it
+?          Applies to the immediately preceding character and matches one or more occurrences of it in “non-greedy mode”
[aeiou]     Matches a single character as long as that character is in the specified set. In this example, it would
            match “a”, “e”, “i”, “o”, or “u”, but no other characters.
[a-z0-9]    You can specify ranges of characters using the minus sign. This example is a single character that
            must be a lowercase letter or a digit.
[^A-Za-z]   When the first character in the set notation is a caret, it inverts the logic. This example matches
            a single character that is anything other than an uppercase or lowercase letter.
( )         When parentheses are added to a regular expression, they are ignored for the purpose of
            matching, but allow you to extract a particular subset of the matched string rather than the
            whole string when using findall()
\b          Matches the empty string, but only at the start or end of a word
\B          Matches the empty string, but not at the start or end of a word
\d          Matches any decimal digit; for ASCII digits, equivalent to the set [0-9]
\D          Matches any non-digit character; for ASCII digits, equivalent to the set [^0-9]
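A few rows of the summary are easier to remember after seeing them run. The sketch below contrasts greedy and non-greedy matching and tries \d and \b; all three input strings are made up for illustration.

```python
import re

s = 'From: csev@umich.edu: sent 3 mails on 2008-01-05'

greedy = re.findall('F.+:', s)    # '.+' expands as far as possible, up to the last ':'
lazy = re.findall('F.+?:', s)     # '.+?' stops at the first ':'
print(greedy)  # ['From: csev@umich.edu:']
print(lazy)    # ['From:']

digits = re.findall(r'\d+', s)    # runs of decimal digits
print(digits)  # ['3', '2008', '01', '05']

phrase = 'on a ton of onions'
anywhere = re.findall('on', phrase)          # also matches inside other words
whole_word = re.findall(r'\bon\b', phrase)   # only "on" standing as a word
print(len(anywhere), whole_word)  # 4 ['on']
```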
ASSIGNMENT
1) Write a Python program to check the validity of a password. The program takes a
password made up of alphanumeric characters and special characters, and checks whether
the password is valid against a few conditions.
Primary conditions for password validation:
1. Minimum 8 characters.
2. The letters must be in the range [a-z].
3. At least one letter must be upper case [A-Z].
4. At least 1 digit in [0-9].
5. At least 1 character from [ _ or @ or $ ].
2) Write a pattern for each of the following:
➢ Extract lines starting with the word From (or from) and ending with edu.
➢ Extract lines ending with any digit.
➢ Extract lines that start with upper-case letters and end with digits.
➢ Search for the first white-space character in the string and display its position.
➢ Replace every white-space character with the number 9; consider the sample text txt = "The rain in Spain".
THANK
YOU
Python Regular Expressions

  • 1. MODULE 3 – PART 4 REGULAR EXPRESSIONS By, Ravi Kumar B N Assistant professor, Dept. of CSE BMSIT & M
  • 2. INTRODUCTION
    ➢ A regular expression is a sequence of characters that defines a search pattern.
    ➢ Such patterns are used by string-searching algorithms for "find" or "find and replace" operations on strings, or for input validation.
    ➢ The regular expression library “re” must be imported into our program before we can use it.
  • 3. SEARCH() FUNCTION
    ➢ search() function: searches a string for a pattern and returns only the first occurrence that matches. This function is available in the “re” library.
    ➢ The caret character (^): used in regular expressions to match the beginning of a line.
    ➢ The dollar character ($): used in regular expressions to match the end of a line.
    Example: program to match only lines where “From:” is at the beginning of the line
        import re
        hand = open('mbox1.txt')
        for line in hand:
            line = line.rstrip()
            if re.search('^From:', line):
                print(line)
    #Output
        From:stephen Sat Jan 5 09:14:16 2008
        From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008
        From:zqian@umich.edu Fri Jan 4 16:10:39 2008
    mbox1.txt
        From:stephen Sat Jan 5 09:14:16 2008
        Return-Path: <postmaster@collab.sakaiproject.org>
        From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008
        Subject: [sakai] svn commit:
        From:zqian@umich.edu Fri Jan 4 16:10:39 2008
        Return-Path: <postmaster@collab.sakaiproject.org>
    ✓ The test re.search('^From:', line) is equivalent to line.startswith('From:'), which uses the startswith() string method.
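As a small illustration of the two anchors described above, the sketch below compares `^From:` with `startswith()` and shows `$` matching at the end of a line. The sample lines are invented for this sketch and are not taken from mbox1.txt.

```python
import re

# Hypothetical sample lines, made up for this sketch
lines = [
    'From: louis@media.berkeley.edu',
    'Reply-From: nobody@example.com',
    'Subject: hello 2008',
]

# '^From:' matches only when the line begins with "From:"
starts = [line for line in lines if re.search('^From:', line)]

# The same test written with the string method
starts_plain = [line for line in lines if line.startswith('From:')]

# '2008$' matches only when the line ends with "2008"
ends = [line for line in lines if re.search('2008$', line)]

print(starts)        # ['From: louis@media.berkeley.edu']
print(starts_plain)  # ['From: louis@media.berkeley.edu']
print(ends)          # ['Subject: hello 2008']
```

For a fixed prefix like this, startswith() is usually the simpler choice; the regular expression becomes useful once the prefix itself has to vary.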
  • 4. CHARACTER MATCHING IN REGULAR EXPRESSIONS
    ➢ The dot character (.): The most commonly used special character is the period (“dot”) or full stop, which matches any single character.
    The regular expression “F..m:” would match any of the following strings, since each period in the regular expression matches any character:
        “From:”, “Fxxm:”, “F12m:”, or “F!@m:”
    ➢ The program from the previous slide is rewritten using the dot character and gives the same output.
        import re
        hand = open('mbox1.txt')
        for line in hand:
            line = line.rstrip()
            if re.search('^F..m:', line):
                print(line)
    #Output
        From:stephen Sat Jan 5 09:14:16 2008
        From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008
        From:zqian@umich.edu Fri Jan 4 16:10:39 2008
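A quick way to check the claim about “F..m:” is to try it on the four strings listed in the slide. The last two candidates below are extra strings added for this sketch, to show that each dot stands for exactly one character.

```python
import re

# The first four strings come from the slide; 'Frm:' and 'Froom:' are
# extra, hypothetical candidates added to show what does NOT match.
candidates = ['From:', 'Fxxm:', 'F12m:', 'F!@m:', 'Frm:', 'Froom:']

# Each '.' must consume exactly one character between 'F' and 'm:'
matched = [c for c in candidates if re.search('^F..m:', c)]

print(matched)  # ['From:', 'Fxxm:', 'F12m:', 'F!@m:']
```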
  • 5. A character can be repeated any number of times using the “*” or “+” characters in a regular expression.
    ➢ The asterisk character (*): matches zero or more repetitions of the preceding character
    ➢ The plus character (+): matches one or more repetitions of the preceding character
    Example: Program to match lines that start with “From:”, followed by a mail id
        import re
        hand = open('mbox1.txt')
        for line in hand:
            line = line.rstrip()
            if re.search('^From:.+@', line):
                print(line)
    #Output
        From: louis@media.berkeley.edu Mon Jan 4 16:10:39 2008
        From:zqian@umich.edu Fri Jan 4 16:10:39 2008
    ✓ The search string “^From:.+@” will successfully match lines that start with “From:”, followed by one or more characters (“.+”), followed by an at-sign. The “.+” wildcard matches all the characters between the colon character and the at-sign.
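The difference between “*” and “+” shows up only when there is nothing to repeat. The sketch below uses three invented strings (not from mbox1.txt) in which the at-sign follows the colon after zero, one, and three characters.

```python
import re

# Hypothetical inputs: 0, 1 and 3 characters between ':' and '@'
samples = ['From:@host', 'From:a@host', 'From:abc@host']

# '.*' accepts zero or more characters before the at-sign
star = [s for s in samples if re.search('^From:.*@', s)]

# '.+' insists on at least one character before the at-sign
plus = [s for s in samples if re.search('^From:.+@', s)]

print(star)  # ['From:@host', 'From:a@host', 'From:abc@host']
print(plus)  # ['From:a@host', 'From:abc@host']
```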
  • 6. ➢ Non-whitespace character (\S): matches one non-whitespace character
    ➢ findall() function: used to search for “all” occurrences that match a given pattern. In contrast, the search() function returns only the first occurrence that matches the specified pattern.
    Example1: Program returns a list of all of the strings that look like email addresses in a given line.
        import re
        s = 'Hello from csev@umich.edu to cwen@iupui.edu about the meeting @2PM'
        # r'...' is a raw string, so the backslashes reach the re module unchanged
        lst = re.findall(r'\S+@\S+', s)
        print(lst)
    #Output
        ['csev@umich.edu', 'cwen@iupui.edu']
    The regular expression '\S+@\S+' matches substrings that have at least one non-whitespace character, followed by an at-sign, followed by at least one more non-whitespace character.
    The same program using search() reports only the first mail id, that is, the first matching string:
        import re
        s = 'Hello from csev@umich.edu to cwen@iupui.edu about the meeting @2PM'
        lst = re.search(r'\S+@\S+', s)
        print(lst)
    #Output
        <re.Match object; span=(11, 25), match='csev@umich.edu'>
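search() returns a match object rather than a string, so the matched text has to be pulled out of it. A minimal sketch using the same sample string, with the standard `group()` and `span()` methods of the match object:

```python
import re

s = 'Hello from csev@umich.edu to cwen@iupui.edu about the meeting @2PM'

# search() gives a match object for the first hit, or None if nothing matches
m = re.search(r'\S+@\S+', s)
first = m.group() if m else None   # the matched text
span = m.span() if m else None     # (start, end) positions in s

# findall() gives every hit as a plain list of strings
all_matches = re.findall(r'\S+@\S+', s)

print(first)        # csev@umich.edu
print(span)         # (11, 25)
print(all_matches)  # ['csev@umich.edu', 'cwen@iupui.edu']
```

Checking for None before calling group() avoids an AttributeError when the pattern is absent from the string.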
  • 7. Example2: Program returns a list of all of the strings that look like email addresses in a given file.
        import re
        hand = open('mbox1.txt')
        for line in hand:
            line = line.rstrip()
            x = re.findall(r'\S+@\S+', line)
            if len(x) > 0:
                print(x)
    #Output
        ['<postmaster@collab.sakaiproject.org>']
        ['louis@media.berkeley.edu']
        ['zqian@umich.edu']
        ['<postmaster@collab.sakaiproject.org>']
    Some of our email addresses have incorrect characters like “<” or “;” at the beginning or end. We are only interested in the portion of the string that starts and ends with a letter or a number. To get the proper output we have to use the following notation.
    ➢ Square brackets “[]”: used to indicate a set of multiple acceptable characters we are willing to consider matching.
    Example:
        [a-z] matches a single lowercase letter
        [A-Z] matches a single uppercase letter
        [a-zA-Z] matches a single lowercase or uppercase letter
        [a-zA-Z0-9] matches a single lowercase letter, uppercase letter, or digit
  • 8. [amk] matches 'a', 'm', or 'k'
    [(+*)] matches any of the literal characters '(', '+', '*', or ')'
    [0-5][0-9] matches all the two-digit numbers from 00 to 59
    ➢ Characters that are not within a range can be matched by complementing the set. If the first character of the set is '^', all the characters that are not in the set will be matched. For example, [^5] will match any character except '5'.
    Ex: Program returns a list of all email addresses in proper format.
        import re
        hand = open('mbox.txt')
        for line in hand:
            line = line.rstrip()
            x = re.findall(r'[a-zA-Z0-9]\S*@\S*[a-zA-Z]', line)
            if len(x) > 0:
                print(x)
    #Output
        ['postmaster@collab.sakaiproject.org']
        ['louis@media.berkeley.edu']
        ['zqian@umich.edu']
        ['postmaster@collab.sakaiproject.org']
    [a-zA-Z0-9]\S*@\S*[a-zA-Z]: matches substrings that start with a single lowercase letter, uppercase letter, or number “[a-zA-Z0-9]”, followed by zero or more non-blank characters “\S*”, followed by an at-sign, followed by zero or more non-blank characters “\S*”, followed by an uppercase or lowercase letter “[a-zA-Z]”.
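The set notations listed above can be tried one at a time. The input strings in this sketch are invented purely for illustration.

```python
import re

# [amk]: any one of the three listed letters
letters = re.findall('[amk]', 'make')

# [(+*)]: inside a set, '(', '+', '*' and ')' are ordinary characters
symbols = re.findall('[(+*)]', '(2+3)*4')

# [0-5][0-9]: two digits, the first limited to 0-5
minutes = re.findall('[0-5][0-9]', '07 45 60 99')

# [^5]: complemented set, anything except '5'
not_five = re.findall('[^5]', '15a5')

print(letters)   # ['m', 'a', 'k']
print(symbols)   # ['(', '+', ')', '*']
print(minutes)   # ['07', '45']
print(not_five)  # ['1', 'a']
```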
  • 9. SEARCH AND EXTRACT
    Example1: Find numbers on lines that start with the string “X-”, such as:
        X-DSPAM-Confidence: 0.8475
    mbox2.txt
        From: stephen.marquard@uct.ac.za
        Subject: [sakai] svn commit: r39772 - content/branches/sakai_2-5-x/conten impl/impl/src/java/org
        X-Content-Type-Outer-Envelope: text/plain; charset=UTF-8
        X-Content-Type-Message-Body: text/plain; charset=UTF-8
        Content-Type: text/plain; charset=UTF-8
        X-DSPAM-Result: Innocent
        X-DSPAM-Processed: Sat Jan 5 09:14:16 2008
        X-DSPAM-Confidence: 0.8475
        X-DSPAM-Probability: 0.9245
    Search
        import re
        hand = open('mbox2.txt')
        for line in hand:
            line = line.rstrip()
            if re.search(r'^X\S*: [0-9.]+', line):
                print(line)
    #Output
        X-DSPAM-Confidence: 0.8475
        X-DSPAM-Probability: 0.9245
    The above output contains the entire line; we only want to extract the numbers from lines that have the above syntax.
    ➢ Parentheses “()” in a regular expression: used to extract a portion of the substring that matches the regular expression.
    Extract
        import re
        hand = open('mbox2.txt')
        for line in hand:
            line = line.rstrip()
            x = re.findall(r'^X\S*: ([0-9.]+)', line)
            if len(x) > 0:
                print(x)
    #Output
        ['0.8475']
        ['0.9245']
  • 10. Example2: Program to print the hour of the day at which each mail was received
        import re
        hand = open('mbox1.txt')
        for line in hand:
            line = line.rstrip()
            # the group captures the two digits just before the first colon of the time
            x = re.findall('^From.* ([0-3][0-9]):', line)
            if len(x) > 0:
                print(x)
    #Output
        ['09']
        ['16']
        ['16']
  • 11. RANDOM EXECUTION
        >>> import re
        >>> s = " 0.9 .90 1.0 1. 138 pqr"
        >>> re.findall('[0-9.]+', s)
        ['0.9', '.90', '1.0', '1.', '138']
        >>> re.findall('[0-9]+[.][0-9]', s)
        ['0.9', '1.0']
        >>> re.findall('[0-9]+[.][0-9]+', s)
        ['0.9', '1.0']
        >>> re.findall('[0-9]*[.][0-9]+', s)
        ['0.9', '.90', '1.0']
        >>> usn = "1bycs123, 1byec249, 1bycs009, 1byme209, 1byis112, 1byee190"
        >>> re.findall('1bycs...', usn)
        ['1bycs123', '1bycs009']
        >>> re.findall('[a-zA-Z0-9]+cs[0-9]+', usn)
        ['1bycs123', '1bycs009']
        >>> usn = "1bycs123, 1byec249, 1bycs009, 1byme209, 1vecs112, 1svcs190"
        >>> re.findall('[a-zA-Z0-9]+cs[0-9]+', usn)
        ['1bycs123', '1bycs009', '1vecs112', '1svcs190']
        >>> re.findall('[0-9]+cs[0-9]+', usn)
        []
        >>> re.findall('[a-zA-Z0-9]+cs([0-9]+)', usn)
        ['123', '009', '112', '190']
  • 12. ESCAPE CHARACTER
    ➢ The escape character (backslash “\”) is a metacharacter in regular expressions. It allows special characters to be used without invoking their special meaning.
    If you want to match 1+1=2, the correct regex is 1\+1=2. Otherwise, the plus sign has its special meaning.
    For example, we can find money amounts with the following regular expression.
        >>> import re
        >>> x = 'We just received $10.00 for cookies.'
        >>> y = re.findall(r'\$[0-9.]+', x)
        >>> y
        ['$10.00']
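The 1+1=2 case mentioned above can be checked directly. The sample sentence below is invented for this sketch; `re.escape()` is the standard-library helper that adds the backslashes for you.

```python
import re

# Hypothetical sample text containing both "1+1=2" and "11=2"
text = 'Is 1+1=2 or 11=2?'

# Unescaped: '1+' means "one or more 1s", so this finds "11=2" instead
unescaped = re.findall('1+1=2', text)

# Escaped: '\+' is a literal plus sign
escaped = re.findall(r'1\+1=2', text)

# re.escape() builds the escaped pattern from the literal text
auto = re.findall(re.escape('1+1=2'), text)

print(unescaped)  # ['11=2']
print(escaped)    # ['1+1=2']
print(auto)       # ['1+1=2']
```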
  • 13. SUMMARY
    Character   Meaning
    ^           Matches the beginning of the line
    $           Matches the end of the line
    .           Matches any character (a wildcard)
    \s          Matches a whitespace character
    \S          Matches a non-whitespace character (opposite of \s)
    *           Applies to the immediately preceding character and matches zero or more occurrences of it
    *?          Applies to the immediately preceding character and matches zero or more occurrences of it in “non-greedy mode”
    +           Applies to the immediately preceding character and matches one or more occurrences of it
    +?          Applies to the immediately preceding character and matches one or more occurrences of it in “non-greedy mode”
    [aeiou]     Matches a single character as long as that character is in the specified set. In this example, it would match “a”, “e”, “i”, “o”, or “u”, but no other characters.
    [a-z0-9]    You can specify ranges of characters using the minus sign. This example is a single character that must be a lowercase letter or a digit.
  • 14. Character   Meaning
    [^A-Za-z]   When the first character in the set notation is a caret, it inverts the logic. This example matches a single character that is anything other than an uppercase or lowercase letter.
    ( )         When parentheses are added to a regular expression, they are ignored for the purpose of matching, but allow you to extract a particular subset of the matched string rather than the whole string when using findall()
    \b          Matches the empty string, but only at the start or end of a word
    \B          Matches the empty string, but not at the start or end of a word
    \d          Matches any decimal digit; equivalent to the set [0-9]
    \D          Matches any non-digit character; equivalent to the set [^0-9]
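Several entries in the summary table (non-greedy mode, \s, \b, \d) are not exercised by the earlier slides. The following sketch tries them on invented sample strings; these inputs are assumptions made for illustration only.

```python
import re

# Hypothetical sample line with two colons
line = 'From: csev@umich.edu: sent 3 files on 12 May'

# Greedy '.+' runs to the LAST colon it can reach
greedy = re.findall('^F.+:', line)

# Non-greedy '.+?' stops at the FIRST colon
non_greedy = re.findall('^F.+?:', line)

# \d is shorthand for [0-9]
digits = re.findall(r'\d+', line)

# \s matches whitespace such as spaces and tabs
spaces = re.findall(r'\s', 'a b\tc')

# \b keeps "cat" from matching inside "concat"
whole_words = re.findall(r'\bcat\b', 'cat concat cat.')
anywhere = re.findall('cat', 'cat concat cat.')

print(greedy)       # ['From: csev@umich.edu:']
print(non_greedy)   # ['From:']
print(digits)       # ['3', '12']
print(spaces)       # [' ', '\t']
print(len(whole_words), len(anywhere))  # 2 3
```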
  • 15. ASSIGNMENT
    1) Write a Python program to check the validity of a password.
    In this program, we take a password as a combination of alphanumeric characters along with special characters, and check whether the password is valid with the help of a few conditions.
    Primary conditions for password validation:
        1. Minimum 8 characters.
        2. Lowercase letters must be in the range [a-z].
        3. At least one letter must be upper case [A-Z].
        4. At least 1 number or digit in the range [0-9].
        5. At least 1 character from [ _ or @ or $ ].
    2) Write a pattern for each of the following:
        Pattern to extract lines starting with the word From (or from) and ending with edu.
        Pattern to extract lines ending with any digit.
        Pattern to extract lines that start with upper case letters and end with digits.
        Search for the first white-space character in the string and display its position.
        Replace every white-space character with the number 9.
    Consider the sample text txt = "The rain in Spain"