command line, linux, tools

Regex cheat sheet

Regular expressions, commonly called regex or regexp, are a tool that simplifies working with strings. They help with finding, filtering, replacing and matching text. They are essential for working efficiently with log files and for IT work in general. Sooner or later you will need them!

^\w+\.pdf$

matches file names made up of word characters followed by the .pdf extension, e.g. report.pdf
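As a minimal sketch, the pattern above could be tried out in Python with the standard re module. The file names below are hypothetical and only serve as illustration.

```python
import re

# Hypothetical file names, used only for illustration.
names = ["report.pdf", "notes.txt", "my file.pdf", "scan_01.pdf", "archive.pdf.bak"]

# ^ and $ anchor the pattern to the whole name; \w+ allows letters, digits and _.
pdf_pattern = re.compile(r"^\w+\.pdf$")

pdf_files = [name for name in names if pdf_pattern.search(name)]
print(pdf_files)  # ['report.pdf', 'scan_01.pdf']
```

Note that my file.pdf is rejected because a space is not a word character.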

Character sets #1

b[aeiou]r

will match bar ber bir bor bur

Character sets #2

b[^eo]r

matches any single character except e or o between b and r, so bar bir bur match, but ber and bor do not

Letter range

[g-m]

matches any single letter from the specified range, endpoints included: g h i j k l m

Number range

[4-7]

same as above, but for digits: 4 5 6 7
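The character sets and ranges above can be checked with a short Python sketch. It assumes the standard re module; the sample strings are made up for illustration.

```python
import re

words = "bar ber bir bor bur b1r"

# A set matches exactly one character out of the listed ones.
vowel_set = re.findall(r"b[aeiou]r", words)
print(vowel_set)  # ['bar', 'ber', 'bir', 'bor', 'bur']

# A negated set matches any one character that is NOT listed, digits included.
negated_set = re.findall(r"b[^eo]r", words)
print(negated_set)  # ['bar', 'bir', 'bur', 'b1r']

# Ranges include both endpoints.
letters = re.findall(r"[g-m]", "abcdefghijklmnopqrstuvwxyz")
digits = re.findall(r"[4-7]", "0123456789")
print("".join(letters), "".join(digits))  # ghijklm 4567
```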

Repetitions

Asterisk

be*r

letter e can occur 0 or more times, matches: br ber beer beeeer

Plus

be+r

letter e can occur one or more times, matches: ber beer

Question mark

colou?r

indicates that character u is optional, matches: colour color

Curly braces

be{2}r

e must occur exactly 2 times, matches: beer

Curly braces #2

be{3,}r

e must appear at least 3 times, matches: beeer beeeer

Curly braces #3

be{1,3}r

e must appear between 1 and 3 times, matches: ber beer beeer
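The repetition operators above can be compared side by side. This is a small Python sketch, assuming the standard re module; the helper matching() and the candidate list are hypothetical names introduced only for this example.

```python
import re

candidates = ["br", "ber", "beer", "beeer", "beeeer"]

def matching(pattern):
    """Return the candidates that the pattern matches completely."""
    return [word for word in candidates if re.fullmatch(pattern, word)]

star = matching(r"be*r")               # zero or more e
plus = matching(r"be+r")               # one or more e
exactly_two = matching(r"be{2}r")      # exactly two e
at_least_three = matching(r"be{3,}r")  # three or more e
one_to_three = matching(r"be{1,3}r")   # between one and three e

# The question mark makes the preceding character optional.
optional_u = [w for w in ["color", "colour", "colouur"] if re.fullmatch(r"colou?r", w)]

print(star)            # ['br', 'ber', 'beer', 'beeer', 'beeeer']
print(plus)            # ['ber', 'beer', 'beeer', 'beeeer']
print(exactly_two)     # ['beer']
print(at_least_three)  # ['beeer', 'beeeer']
print(one_to_three)    # ['ber', 'beer', 'beeer']
print(optional_u)      # ['color', 'colour']
```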

Grouping

Parentheses

(haha)

matches the string haha, also as a substring of a longer string that contains the whole group

Pipe

(C|c)at

alternation inside a group, matches either option: cat Cat

Caret

^[0-9]

matches a digit only at the beginning of a line, e.g. in 1. Title it matches 1

Dollar

ending$

$ indicates the end of a line: matches ending in This story has an ending, but not in This story has an ending and start
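Grouping, alternation and the two anchors can be tried together in a short Python sketch. It assumes the standard re module; the string bwahaha and the other sample inputs are made up for illustration.

```python
import re

# Parentheses: the group matches anywhere inside a longer string.
group_match = re.search(r"(haha)", "bwahaha")
print(group_match.group(1))  # haha

# Pipe: either alternative inside the group is accepted.
# finditer + group(0) returns the whole match, not just the captured letter.
cats = [m.group(0) for m in re.finditer(r"(C|c)at", "Cat, cat, bat")]
print(cats)  # ['Cat', 'cat']

# Caret: the digit must be the first character of the line.
leading_digit = re.search(r"^[0-9]", "1. Title")
no_leading_digit = re.search(r"^[0-9]", "Title 1")
print(leading_digit.group(0), no_leading_digit)  # 1 None

# Dollar: the word must be the last thing on the line.
at_end = re.search(r"ending$", "This story has an ending")
not_at_end = re.search(r"ending$", "This story has an ending and start")
print(bool(at_end), bool(not_at_end))  # True False
```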

Alphanumeric

Word character

\w

\w matches a single letter, digit or underscore character

\W

find characters other than letters, numbers, and underscores

Digits

\d

\d matches a single digit character; its opposite is \D, which matches any character that is not a digit

Space

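\s \S

\s matches a whitespace character (space, tab, newline); its opposite is \S, which matches any non-whitespace character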
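The shorthand classes above can be compared on one sample line. This is a minimal Python sketch, assuming the standard re module; the log-like sample string is invented for the example.

```python
import re

sample = "user_42 logged in at 10:30"

word_runs = re.findall(r"\w+", sample)      # letters, digits and underscores
non_word = re.findall(r"\W", sample)        # everything else, one char at a time
digit_runs = re.findall(r"\d+", sample)     # digits only
non_digit_runs = re.findall(r"\D+", sample) # everything that is not a digit
tokens = re.split(r"\s+", sample)           # split on whitespace
non_space_runs = re.findall(r"\S+", sample) # runs of non-whitespace

print(word_runs)       # ['user_42', 'logged', 'in', 'at', '10', '30']
print(non_word)        # [' ', ' ', ' ', ' ', ':']
print(digit_runs)      # ['42', '10', '30']
print(non_digit_runs)  # ['user_', ' logged in at ', ':']
print(tokens)          # ['user_42', 'logged', 'in', 'at', '10:30']
```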

Lookarounds

Positive lookahead

\d+(?=PM)

will match 3 in Date: 4 Aug 3PM, which means: find the numerical values \d+ that have PM after them (?=PM); PM itself is not part of the match

Negative lookahead

\d+(?!PM)

selects only digits that do not have PM after them, matches 4 in Date: 4 Aug 3PM

Positive lookbehind

(?<=\$)\d+

will match 5 in Code 123, Price: $5, because only digits with $ directly before them qualify; $ itself is not part of the match

Negative lookbehind

(?<!\$)\d+

will match 123 in Code 123, Price: $5, because only digits without $ directly before them qualify
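All four lookarounds can be run against the two sample lines used above. This is a minimal Python sketch, assuming the standard re module, which supports lookaheads and fixed-width lookbehinds.

```python
import re

date_line = "Date: 4 Aug 3PM"
price_line = "Code 123, Price: $5"

# Lookarounds check the surrounding text without consuming it.
followed_by_pm = re.findall(r"\d+(?=PM)", date_line)
not_followed_by_pm = re.findall(r"\d+(?!PM)", date_line)
preceded_by_dollar = re.findall(r"(?<=\$)\d+", price_line)
not_preceded_by_dollar = re.findall(r"(?<!\$)\d+", price_line)

print(followed_by_pm)          # ['3']
print(not_followed_by_pm)      # ['4']
print(preceded_by_dollar)      # ['5']
print(not_preceded_by_dollar)  # ['123']
```

The negative variants can give partial matches on longer numbers: on 12PM the pattern \d+(?!PM) still matches 1, because that digit is followed by 2 and not by PM.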

Flags

Global

/expression/g

by default, the expression finds only the first match; add g at the end to find all occurrences

Multiline

/expression/m

handles each line separately: ^ and $ match at the start and end of every line, not only of the whole text

Case insensitive

/expression/i

ignores letter case, so /cat/i matches cat, Cat and CAT
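The /expression/flags notation above comes from languages such as JavaScript and Perl. As a rough equivalent, and only as a sketch, Python's re module expresses the same ideas differently: re.search returns the first match and re.findall returns all of them (similar to g), while re.MULTILINE and re.IGNORECASE stand in for m and i. The log text below is made up for illustration.

```python
import re

log = "error: disk full\nWarning: low memory\nERROR: timeout"

# "Global": search stops at the first hit, findall collects every hit.
first_hit = re.search(r"error", log, re.IGNORECASE).group(0)
all_hits = re.findall(r"error", log, re.IGNORECASE)

# Case insensitive: without the flag only the lowercase spelling is found.
case_sensitive_hits = re.findall(r"error", log)

# Multiline: ^ matches at the start of every line instead of only the first.
first_words_default = re.findall(r"^\w+", log)
first_words_multiline = re.findall(r"^\w+", log, re.MULTILINE)

print(first_hit)              # error
print(all_hits)               # ['error', 'ERROR']
print(case_sensitive_hits)    # ['error']
print(first_words_default)    # ['error']
print(first_words_multiline)  # ['error', 'Warning', 'ERROR']
```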

Examples