How Regexes Work
How do Perl's regexes work on the inside? Suppose you were going to write a language like Perl, which has regexes, in a language like C, which doesn't. How might you do that?
Note: Although this article purports to describe "How Regexes Work", it describes only one possible technique. In particular, the implementation described does not closely resemble the way Perl's regex implementation works. I apologize for any confusion this has caused.
Special note: If you find this interesting, you might want to consider attending the three-hour Regular Expression Mastery class that I teach at the O'Reilly Perl conference and elsewhere. Full details are available.
Bonus update: The complete slides for my Regular Expression Mastery tutorial are now available on my web site. They are somewhat telegraphic in style, because they are designed to accompany a lecture, and not to stand on their own. However, unlike this article, the slides do describe how the Perl implementation of regexes works.
- This article explains how it works.
- A sidebar explains the academic terminology that I didn't use in the article.
- This module, Regex.pm, implements the scheme described in the article.
- This program, grep.pl, implements a version of Unix grep in Perl without using Perl's built-in regexes.
- This program, demo.pl, demonstrates that for some tasks, the Regex.pm library is a whole lot faster than Perl's built-in regexes.
Notes and Errata
- There was an error in one of the illustrations in The Perl Journal.
- Eight lines of code suffice to add the . metacharacter.
- Why does Perl take so long to match some patterns?
Found an error, or have a remark? Send me mail.