What Are Regular Expressions?
A regular expression is a pattern template you define
that a Linux utility uses to filter text. A Linux utility (such as the sed
editor or the awk program) matches the regular expression pattern against data
as that data flows into the utility. If the data matches the pattern, it's
accepted for processing. If the data doesn't match the pattern, it's rejected.
The regular expression pattern makes use of wildcard characters to represent
one or more characters in the data stream.
Types of regular expressions:
There are two popular regular expression engines:
The POSIX Basic Regular Expression (BRE) engine
The POSIX Extended Regular Expression (ERE) engine
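The practical difference between the two engines shows up in how they treat symbols such as the plus sign. The following is a minimal sketch, assuming sed in its default mode (BRE) and awk (ERE); the sample string is made up for illustration:

```shell
# Hypothetical sample text containing a literal plus sign.
text='a+b'

# BRE (default sed): the plus sign is an ordinary character, so the line matches.
bre_match=$(echo "$text" | sed -n '/a+b/p')

# ERE (awk): the plus sign means "one or more of the preceding character",
# so the pattern looks for one or more a's followed directly by b and finds none.
ere_match=$(echo "$text" | awk '/a+b/{print $0}')

echo "BRE: [$bre_match]"
echo "ERE: [$ere_match]"
```

The same pattern text can therefore pass in one utility and fail in another, which is why it helps to know which engine a tool uses.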
A. Defining BRE Patterns:
Example 1: Plain text
The most basic BRE pattern is matching text characters in
a data stream.
$ echo "This is a test" | sed -n '/test/p'
This is a test
$ echo "This is a test" | sed -n '/trial/p'
$
$ echo "This is a test" | awk '/test/{print $0}'
This is a test
$ echo "This is a test" | awk '/trial/{print $0}'
$
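Two details of plain-text matching are easy to miss: patterns are case sensitive, and a pattern can match anywhere in the line, even inside a longer word. A small sketch of both, using the same sample line as above (the variable names are only for illustration):

```shell
line='This is a test'

# Patterns are case sensitive: "this" does not match "This".
lower=$(echo "$line" | sed -n '/this/p')

# A pattern can match anywhere in the line, including part of a word.
inside=$(echo "$line" | sed -n '/tes/p')

echo "lowercase pattern: [$lower]"
echo "partial word:      [$inside]"
```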
Example 2: Special characters
The special characters recognized by regular expressions
are:
. * [ ] ^ $ { } \ + ? | ( )
For example, if you want to search for a dollar sign in
your text, just precede it with a backslash character:
$ cat data2
The cost is $4.00
$ sed -n '/\$/p' data2
The cost is $4.00
$
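The same backslash rule applies to the other special characters. As a rough sketch, consider the dot, which as a special character stands for any single character. The second sample line below is invented for illustration:

```shell
# Hypothetical sample lines; only the first contains the literal text "4.00".
prices=$(printf '%s\n' 'The cost is $4.00' 'The code is 41000')

# Unescaped, the dot matches any single character, so both lines pass
# ("4100" in the second line satisfies the pattern 4.00).
unescaped=$(echo "$prices" | sed -n '/4.00/p')

# Escaped with a backslash, the dot matches only a literal period.
escaped=$(echo "$prices" | sed -n '/4\.00/p')

echo "$unescaped"
echo "---"
echo "$escaped"
```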
Example 3: Looking for the ending
The dollar sign ($) special character defines the end
anchor: the pattern matches only if it appears at the end of the line.
$ echo "This is a good book" | sed -n '/book$/p'
This is a good book
$ echo "This book is good" | sed -n '/book$/p'
$
Example 4: Using ranges
You can use a range of characters within a character
class by using the dash symbol.
For example, a pattern that matches five-digit zip codes can specify a
range of digits for each position:
$ sed -n '/^[0-9][0-9][0-9][0-9][0-9]$/p' data8
60633
46201
45902
$
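The contents of data8 are not shown here, so the sketch below feeds the same pattern an assumed set of sample lines instead. The caret and dollar sign anchor the pattern to the start and end of the line, so entries with more or fewer than five digits are rejected. A range of letters works the same way:

```shell
# Hypothetical input standing in for data8; two entries are not five digits long.
zips=$(printf '%s\n' 60633 46201 223001 4353 45902 |
    sed -n '/^[0-9][0-9][0-9][0-9][0-9]$/p')
echo "$zips"

# A letter range: the first character must be a, b, or c.
words=$(printf '%s\n' cat bat hat | sed -n '/^[a-c]at$/p')
echo "$words"
```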
B. Extended Regular Expressions:
The POSIX ERE patterns include a few additional symbols
that are used by some Linux applications and utilities. The awk program
recognizes the ERE patterns, but the sed editor doesn't by default; GNU sed
accepts them only when run with the -E option.
Example 1: The question mark
The question mark indicates that the preceding character
can appear zero or one time, but that's all. It doesn't match repeating
occurrences of the character:
$ echo "bt" | awk '/be?t/{print $0}'
bt
$ echo "bet" | awk '/be?t/{print $0}'
bet
$ echo "beet" | awk '/be?t/{print $0}'
$
$ echo "beeet" | awk '/be?t/{print $0}'
$
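To see how the sed editor handles the same pattern, here is a short sketch. It assumes a sed that supports the -E option for extended patterns (GNU sed and BSD sed both do); without that option, sed uses BRE and treats the question mark as a literal character:

```shell
# Default sed (BRE): the question mark is a literal character, so "bet" fails;
# only text that really contains "be?t" would match.
bre=$(echo "bet" | sed -n '/be?t/p')

# sed -E switches to ERE, where "e?" means zero or one "e".
ere=$(echo "bet" | sed -E -n '/be?t/p')
ere_bt=$(echo "bt" | sed -E -n '/be?t/p')
ere_beet=$(echo "beet" | sed -E -n '/be?t/p')

echo "BRE bet:  [$bre]"
echo "ERE bet:  [$ere]"
echo "ERE bt:   [$ere_bt]"
echo "ERE beet: [$ere_beet]"
```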
Example 2: The plus sign
The plus sign indicates that the preceding character can
appear one or more times, but must be present at least once. The pattern doesn't
match if the character is not present:
$ echo "beeet" | awk '/be+t/{print $0}'
beeet
$ echo "beet" | awk '/be+t/{print $0}'
beet
$ echo "bet" | awk '/be+t/{print $0}'
bet
$ echo "bt" | awk '/be+t/{print $0}'
$
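The plus sign also works after a character class, in which case each repeated character may be any member of the class. A brief sketch with an invented word list:

```shell
# Hypothetical sample words.
words=$(printf '%s\n' bt bat bet beat beeat bit)

# [ae]+ requires one or more characters, each of which is "a" or "e".
matched=$(echo "$words" | awk '/b[ae]+t/{print $0}')
echo "$matched"
```

Here "bt" fails because nothing sits between the b and the t, and "bit" fails because "i" is not in the class.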
Example 3: The pipe symbol
The pipe symbol allows you to specify two or more
patterns that the regular expression engine uses in a logical OR formula when
examining the data stream. If any of the patterns match the data stream text,
the text passes. If none of the patterns match, the data stream text fails.
The format for using the pipe symbol is:
expr1|expr2|...
Here's an example of this:
$ echo "The cat is asleep" | awk '/cat|dog/{print $0}'
The cat is asleep
$ echo "The dog is asleep" | awk '/cat|dog/{print $0}'
The dog is asleep
$ echo "The sheep is asleep" | awk '/cat|dog/{print $0}'
$
Example 4: Grouping expressions
When you group a regular expression pattern, the group is
treated like a standard character. You can apply a special character to the
group just as you would to a regular character.
For example:
$ echo "Sat" | awk '/Sat(urday)?/{print $0}'
Sat
$ echo "Saturday" | awk '/Sat(urday)?/{print $0}'
Saturday
$
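Grouping combines naturally with the pipe symbol and the anchors. The sketch below is one possible way to accept either the short or the long form of two day names and nothing else; the input list is made up for illustration:

```shell
# Hypothetical list of candidate strings.
days=$(printf '%s\n' Sat Satur Saturday Sun Sunday Monday)

# Outer group: choose between two alternatives with the pipe symbol.
# Inner groups: the optional suffix of each day name.
# ^ and $ force the whole line to match, so "Satur" is rejected.
weekend=$(echo "$days" | awk '/^(Sat(urday)?|Sun(day)?)$/{print $0}')
echo "$weekend"
```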