Oakland.pm

Reviews

Review of "Regular Expression Pocket Reference"

reviewed by George Woolley


Regular Expression Pocket Reference
By Tony Stubblebine
August 2003
Series: Pocket References
0-596-00415-X, Order Number: 415X
100 pages, $12.95 US, $20.95 CA

Short Review

Excellent.

I recommend getting this book if you feel comfortable with regular expressions and regular expressions are important to you.

As one expects from an O'Reilly Pocket Reference, this book is compact but still covers a lot of ground. For a whole bunch of applications, it provides

  • tables of various groupings of regex metacharacters, summarizing their syntax and meaning
  • summaries of other regex-related features, not in tabular form
  • examples
  • a few references in case you need to go deeper

The information is concise and well chosen.

This is a reference, but in applications where you use regular expressions less, it may also be useful for expanding your knowledge significantly. It was for me.

If you wish, take a look at my more detailed review.

George Woolley of Oakland.pm

Miscellaneous

Section Titles

  1. About This Book
  2. Introduction to Regexes and Pattern Matching
  3. Perl 5.8
  4. Java (java.util.regex)
  5. .NET and C#
  6. Python
  7. PCRE Lib
  8. PHP
  9. vi Editor [includes vim]
  10. JavaScript
  11. Shell Tools

Notes

  • There's just one chapter with many sections.
  • PCRE Lib stands for Perl Compatible Regular Expression Library. It's an open source library included in PHP, Apache, KDE, etc.
  • The shell tools are GNU egrep, GNU sed and GNU awk.

Version Numbers

For most of the applications covered, the book presupposes that you have at least a certain version of the application. The versions indicated in the book are:

  • Perl 5.8
  • Java 1.4
  • Python 2.2
  • PCRE Lib 4.0
  • PHP 4.3
  • JavaScript 1.5
  • GNU egrep 2.4.2
  • GNU sed 3.02
  • GNU awk 3.1

I checked the version for the applications that I care about on my Linux system. Here's how:

If you have a Unix/Linux system of some kind, likely you'll also be able to easily determine what versions you are using in some similar way.
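As a rough illustration of that kind of check, here is a minimal Python sketch that asks each tool for its version. The command names and flags in VERSION_COMMANDS (perl -v, --version for the others) are the conventional ones and are assumptions here, not something taken from the book; the helper name tool_version is made up for this example. Tools that are missing simply report "not found".

```python
import shutil
import subprocess

# Conventional version flags; assumed for illustration only.
VERSION_COMMANDS = {
    "perl": ["perl", "-v"],
    "python": ["python", "--version"],
    "egrep": ["grep", "-E", "--version"],
    "sed": ["sed", "--version"],
    "awk": ["awk", "--version"],
}


def tool_version(command):
    """Return the first non-empty line the command prints, or None."""
    if shutil.which(command[0]) is None:
        return None  # tool is not installed or not on PATH
    try:
        result = subprocess.run(
            command,
            stdin=subprocess.DEVNULL,   # never wait for input
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,   # some tools print the version on stderr
            universal_newlines=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    for line in result.stdout.splitlines():
        if line.strip():
            return line.strip()
    return None


if __name__ == "__main__":
    for name, command in VERSION_COMMANDS.items():
        print(name, "->", tool_version(command) or "not found")
```

The first non-empty line is used because some tools, perl among them, start their version output with a blank line.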

Notes:

  • No version was specified for .NET, C# or vi.
  • Unless otherwise indicated, what I've given above is a shell command to determine the version.

Going Deeper

If you want an in-depth treatment of regular expressions, you would do well to consider "Mastering Regular Expressions" by Jeffrey Friedl.

"The world of regular expressions is complex and filled with nuance. Jeffrey Friedl has written the definitive work on the subject, Mastering Regular Expressions (O'Reilly), a work on which I relied heavily when writing this book. As a convenience, this book provides page references to Mastering Regular Expressions, Second Edition (MRE) for expanded discussion of regular expression syntax and concepts." -- Tony Stubblebine in "Regular Expression Pocket Reference".

More Detailed Review

Notes:

  • My initial reading of this book was on-line using Safari. I didn't have a hard copy of this book until after I had completed the first full draft of this review.
  • I've been using Safari since February and have found it easy to use and quite useful. If you wish, take a look at my Safari Review.
  • I've used Safari for seven more months since writing the review. I still find it very useful.

The Title

Regular Expression: The second section of the book begins with this characterization of regular expressions: "Regular expressions (known as regexps or regexes) are a way to describe text through pattern matching. You might want to use regular expressions to validate data, to pull pieces of text out of larger blocks, or to substitute new text for old text."
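To make those three uses concrete, here is a small sketch in Python (one of the languages the book covers). The patterns and sample strings are made up for illustration and do not come from the book.

```python
import re

# 1. Validate data: does the whole string have the shape of this ISBN?
isbn_pattern = re.compile(r"\d-\d{3}-\d{5}-[\dX]")
is_valid = isbn_pattern.fullmatch("0-596-00415-X") is not None

# 2. Pull pieces of text out of a larger block.
text = "Perl 5.8, Java 1.4, Python 2.2"
versions = re.findall(r"(\w+) (\d+\.\d+)", text)

# 3. Substitute new text for old text.
renamed = re.sub(r"\bregexps\b", "regexes", "known as regexps")

print(is_valid)   # True
print(versions)   # [('Perl', '5.8'), ('Java', '1.4'), ('Python', '2.2')]
print(renamed)    # known as regexes
```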

If you don't have a grasp of what a regular expression is, likely this is not a suitable book for you.

Pocket Reference: My understanding is that O'Reilly Pocket References

  • are even more abbreviated than Nutshell books
  • contain syntax that an experienced user might wish to look up
  • assume you are a competent user

Does the book fit the title? Yes.

  • The book is a concise reference of only 100 pages.
  • The book has valuable regex feature summaries and syntax.
  • Part of the reason the book is short is that it does assume a competent reader.

About the Reviewer

I've been big into string manipulation for years. I've been using regular expressions since around 1989 and feel very comfortable with them.

I'm a user of the following applications that are relevant here

  • Perl - is my favorite language. I've been a user and advocate of it since 1994.
  • Java - use a little, mostly for applets. Likely I will use it more now that I have a start on regexes in Java.
  • vi Editor - is my primary editor. It has been since 1989.
  • JavaScript - use a little. Likely I will use it more now that I have a start on regexes in JavaScript.
  • Shell Tools - use GNU egrep a lot. Earlier, I used sed a moderate amount.

I have little or no experience with the other applications addressed in the book.

What did I expect from the book?

Well, I was hoping for two things from the book

  • to learn at least a few things I didn't know about regexes in particular applications
  • to then be able to use the book as a reference to look up syntax I'd forgotten.

In the cases of Java and JavaScript, I was hoping to move from not using regular expressions at all to feeling I could use them.

What did I learn?

general: The second section is an "Introduction to Regexes and Pattern Matching". Even though I use regular expressions a lot, I learned some things from this section including:

  • quoting a span of metacharacters with \Q ... \E
  • that there are a number of POSIX character classes that may be useful to me
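The idea behind \Q ... \E is that a span of metacharacters is treated as literal text. Python's re module does not accept \Q ... \E itself, so the sketch below uses re.escape, which is the closest Python analog rather than the Perl feature; the sample strings are made up. POSIX character classes are left out because Python's re module does not provide them.

```python
import re

literal = "1.11 (a+b)*"   # text full of regex metacharacters

# Unquoted, "." matches any character and "(a+b)*" may match nothing,
# so the pattern accepts text that is not the literal at all.
loose = re.search(literal, "1x11 ")

# Quoted, every metacharacter is matched literally.
quoted = re.compile(re.escape(literal))
exact = quoted.search("section 1.11 (a+b)* explained")
miss = quoted.search("section 1x11 (a+b)* explained")

print(loose is not None)   # True
print(exact is not None)   # True
print(miss is None)        # True
```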

Perl: I didn't learn anything about Perl from the section that focuses on Perl, though both the things I mentioned above (under general) apply to Perl. I was, however, reminded of a number of features of regexes I don't use and may wish to use at some future time.

Java: Using the section on Java, I was able to use some regexes in an applet. I did have an annoying problem for a bit. My applet would work under the applet viewer but not under my browser. But I soon had it working in both places.

vim: As a result of reading this book, I discovered that I am using vim. Two useful things that vim handles which I wasn't previously using (in vim) are

  • quantifiers, e.g. \+ and \{3,}
  • groupings, e.g. \([aeiou]\+\)
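For comparison, here is roughly how those two vim constructs look in Python, where the quantifiers and grouping parentheses are written without backslashes. This is only a sketch of the syntax difference; the sample words are made up.

```python
import re

# vim: \([aeiou]\+\)   Python: ([aeiou]+)   one or more vowels, captured
vowel_runs = re.findall(r"([aeiou]+)", "queueing theory")

# vim: o\{3,}          Python: o{3,}        three or more of the letter o
long_run = re.search(r"o{3,}", "zoooom")
short_run = re.search(r"o{3,}", "zoom")

print(vowel_runs)         # ['ueuei', 'eo']
print(long_run.group())   # oooo
print(short_run)          # None
```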

JavaScript: Using the section on JavaScript, at first, I was unable to get even very simple substitutions to work in JavaScript. But knowing the capability was likely there, I did a Google search and was able to quickly determine what I was doing wrong. Now I'm ready to use regular expressions in JavaScript, if I need to.

summary: Unless you know regular expressions really well and in a wide variety of applications, you can likely learn something from this book too. What you'll learn will doubtless be different from what I did.

The point I'm making is that a moderately sophisticated user can use this book as a learning device as well as for a reference. But keep in mind, this usage goes beyond the stated purpose of the book.

Is the book a good reference?

Well, I suppose the real test is whether I use it much for the next year or so. But I'm pretty sure it will serve well as a reference. I have two reasons for saying that

  • I like the layout.
  • I've already learned things from it.

layout: The book begins with

  • a very general introduction to what the book is about
  • a general introduction to regexes and pattern matching.

And then there are nine sections each covering regexes in a particular application or group of applications.

A typical section on a particular application (or group of applications) includes a very brief introduction and subsections regarding

  • metacharacters, grouped in tables that give the syntax and the meaning
  • relevant constructs that are not metacharacters (e.g. operators, functions, classes, etc.)
  • Unicode support (if relevant)
  • examples
  • references to other resources (books, sections of books, links, or the like)

learning: Earlier I listed some of the things I've already learned from the book. Generally, the book seems easy to follow. While the book is certainly not a tutorial and is not suitable for beginners, neither is it just a dry syntax summary. Some of the things that go beyond that are:

  • examples of the use of regexes in each of the applications covered
  • brief verbal explanations where that might be useful to the intended audience

Gripes

Basically, I don't have any.

I did find the section numbers amusing. They go from 1.1 through 1.11. Personally, I would have numbered them 1 through 11. But who cares? Besides, in a way it makes sense. This book is supposed to be abbreviated. So why have more than one chapter?

Another questionable nit: I was glad to see split included. I often use it in conjunction with regular expressions. But when I do, I generally use join later to put things back together. So I would have been inclined to include join, even though it doesn't involve a regex.
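The split-then-join pattern can be sketched as follows. Python is used here, so the calls are re.split and str.join rather than Perl's split and join; the sample data is made up.

```python
import re

raw = "Perl ,Java,  Python"

# Split on a comma with any surrounding whitespace (the regex part).
fields = re.split(r"\s*,\s*", raw)

# Work on the pieces, then join them back together (no regex involved).
rejoined = ", ".join(field.upper() for field in fields)

print(fields)     # ['Perl', 'Java', 'Python']
print(rejoined)   # PERL, JAVA, PYTHON
```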

Who do I recommend the book for?

Well, I would definitely recommend this book for someone

  • who uses regular expressions a lot
  • and feels generally comfortable with regular expressions

If you are such a person, you may also (as I did) be able to get a handle on what's available in some applications where you didn't previously use regular expressions.

I would not recommend this book for someone who is trying to learn the basics of regular expressions. And that's not who it's intended for, anyway.

Enjoy.

Last Updated: 2003-10-10