Slicing and dicing data with regular expressions

Most computer systems have an assortment of tools for filtering and processing data. A virus scanner, a spam fighter, a web search engine, a spell checker: each is a filter that sifts through data to isolate the information you really need. Your shell provides a filter, too. For example, ls *.jpg lists only the files whose names end in .jpg, which are normally JPEG images.
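To see what the shell's wildcard filter is doing, the following minimal Python sketch applies the same *.jpg pattern to a list of names. The file names and the match_glob helper are hypothetical, invented for this illustration; fnmatchcase comes from Python's standard fnmatch module and compares case-sensitively, as a shell on Linux normally does.

```python
from fnmatch import fnmatchcase

# Hypothetical file names, invented for this illustration.
FILES = ["beach.jpg", "notes.txt", "cat.jpg", "photo.JPG", "jpg_list.md"]


def match_glob(names, pattern):
    """Return the names that match a shell-style wildcard pattern."""
    # fnmatchcase() is always case-sensitive, so "photo.JPG" is not
    # matched by "*.jpg".
    return [name for name in names if fnmatchcase(name, pattern)]


if __name__ == "__main__":
    print(match_glob(FILES, "*.jpg"))
```

Note that in a real shell, the shell itself expands *.jpg into the matching names before ls ever runs; ls only receives the filtered list.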

Because so much of Linux depends on interpreting and processing plain text files, an entire shorthand exists for creating filters. The shorthand is called regular expressions, or regex. A regex applied to text can find, dissect, and extract virtually any pattern you seek. Table 1 shows some common regex operators, which you can string together and use in combination to build arbitrarily complex filters.
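As a rough illustration of finding, dissecting, and extracting a pattern, here is a minimal sketch using Python's standard re module. The log lines, the FAILED_LOGIN pattern, and the failed_logins helper are all assumptions made up for this example; they are not taken from the article or from any particular log format.

```python
import re

# Hypothetical log lines, invented for this illustration.
LOG_LINES = [
    "2017-03-14 09:15:02 sshd: Failed password for root from 192.168.1.20",
    "2017-03-14 09:15:07 sshd: Accepted password for pi from 192.168.1.31",
    "2017-03-14 09:16:45 cron: job started",
    "2017-03-14 09:17:10 sshd: Failed password for admin from 10.0.0.5",
]

# ^ and $ anchor the pattern to the whole line, \d{n} matches exactly
# n digits, \w+ matches one or more word characters, and each
# (?P<name>...) group captures one piece of the line under a name.
FAILED_LOGIN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}) "
    r"sshd: Failed password for (?P<user>\w+) "
    r"from (?P<ip>\d{1,3}(?:\.\d{1,3}){3})$"
)


def failed_logins(lines):
    """Return (user, ip) pairs for every line that matches FAILED_LOGIN."""
    results = []
    for line in lines:
        match = FAILED_LOGIN.search(line)  # find
        if match:
            # dissect and extract: pull out only the captured pieces
            results.append((match.group("user"), match.group("ip")))
    return results


if __name__ == "__main__":
    for user, ip in failed_logins(LOG_LINES):
        print(user, ip)
```

The single pattern does all three jobs at once: lines that do not match are filtered out, and the named groups split each matching line into the fields of interest.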

The origin of regex dates back some 60 years to research in theoretical computer science, a branch of study that includes the design and analysis of algorithms and the semantics of programming languages. That early work described models of computation in a shorthand notation called a "regular expression." The shorthand was first co-opted for use in the QED editor, carried over into the editors and tools of the original Unix operating system, and has since expanded into a POSIX standard for pattern matching. Today, one of the most popular implementations of regex is the Perl-Compatible Regular Expressions library, or PCRE. You will find PCRE, or regex syntax closely modeled on Perl's, in Apache, PHP, Ruby, and many other languages and tools.

[...]
