Posted on Jul 1, 2017
In the 1970s, regular expressions were shown to be astonishingly useful. A compact notation and smart, fast implementations made regular expressions the de facto text search technique for programmers, system administrators, and others. And they remain so to this day!
In recent decades, the modern regex was born. This creature has more power than the classic original, although it has come with costs:
- Most implementations backtrack arbitrarily and can take exponential time
- The regex syntax and semantics vary across implementations
- All the syntactic add-ons make regex quite cryptic and obtuse
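To make the first cost concrete, here is a toy backtracking matcher in Python for the pattern (a+)+b. It is a minimal sketch, not the implementation of any real regex engine, and count_steps is a hypothetical name chosen for illustration. On an input of n a's with no b, the matcher tries every way of splitting the a's into groups before giving up, so the number of calls doubles with each extra character.

```python
def count_steps(n: int) -> int:
    """Count calls made by a naive backtracking matcher for (a+)+b on 'a' * n."""
    text = "a" * n  # no 'b' anywhere, so every attempt must eventually fail
    steps = 0

    def match_groups(pos: int) -> bool:
        # Try to match one or more a+ groups starting at pos, followed by 'b'.
        nonlocal steps
        steps += 1
        # Find how far the run of 'a' characters extends from pos.
        end = pos
        while end < len(text) and text[end] == "a":
            end += 1
        # Try every possible length for the next group, longest first.
        for stop in range(end, pos, -1):
            if stop < len(text) and text[stop] == "b":
                return True  # the group is followed by the final 'b'
            if match_groups(stop):
                return True  # another group, then eventually 'b'
        return False

    match_groups(0)
    return steps


if __name__ == "__main__":
    for n in (4, 8, 12, 16):
        print(n, count_steps(n))
```

Real backtracking engines differ in detail, but patterns with nested repetition can trigger the same kind of blowup.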
The Rosie Pattern Language is an attempt to address these costs. The language and its implementation are based on parsing expression grammars (PEGs) and are strictly more powerful than the classic regular expressions of the 1970s, while achieving:
- Linear time matching in the size of the input
- A single pattern syntax, RPL, which honorably borrows from regex but is more rational (and is usable from many programming languages)
- A much improved syntax that, while not as compact, is more easily understood
Most importantly, Rosie sports an extensible library of named patterns. In the
example below, the pattern net.any is defined as the disjunction of several
network-related patterns:
any = ip / fqdn / email / url / http_command
(In Rosie Pattern Language, a forward slash is the ordered choice operator.)
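Ordered choice differs from regex alternation in one important way: once an alternative succeeds, a PEG commits to it and does not come back to try the others. The sketch below illustrates that rule with a few tiny Python combinators. The names lit, choice, and seq are hypothetical helpers written for this example. They are not part of Rosie or RPL.

```python
import re
from typing import Callable, Optional

# A matcher takes (text, pos) and returns the end position of the match, or None.
Matcher = Callable[[str, int], Optional[int]]


def lit(s: str) -> Matcher:
    """Match the literal string s at the current position."""
    return lambda text, pos: pos + len(s) if text.startswith(s, pos) else None


def choice(*alternatives: Matcher) -> Matcher:
    """Ordered choice: try alternatives left to right, commit to the first match."""
    def run(text: str, pos: int) -> Optional[int]:
        for alt in alternatives:
            end = alt(text, pos)
            if end is not None:
                return end  # committed: later alternatives are never tried
        return None
    return run


def seq(*parts: Matcher) -> Matcher:
    """Match each part in turn; fail as soon as any part fails."""
    def run(text: str, pos: int) -> Optional[int]:
        current: Optional[int] = pos
        for part in parts:
            current = part(text, current)
            if current is None:
                return None
        return current
    return run


if __name__ == "__main__":
    # The order of alternatives decides which one wins.
    print(choice(lit("a"), lit("ab"))("ab", 0))
    print(choice(lit("ab"), lit("a"))("ab", 0))
    # The choice commits to "a", so the following "c" fails against "b".
    print(seq(choice(lit("a"), lit("ab")), lit("c"))("abc", 0))
    # A backtracking regex retries the second alternative and succeeds.
    print(re.fullmatch(r"(a|ab)c", "abc") is not None)
```

The practical consequence is that alternatives should be ordered with care. When one alternative is a prefix of another, the longer one usually needs to come first.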
The definition above is in the net package, so we can refer to net.any, net.ip,
etc. And net.ip is defined as:
ip = ipv4 / ipv6
It’s much easier to remember net.any than to write a pattern from scratch.
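The idea of looking patterns up by a qualified name can be sketched in Python with ordinary regular expressions. Everything here is an assumption made for illustration: LIBRARY, lookup, and matches are hypothetical names, and the ipv4 and ipv6 expressions are deliberately simplified stand-ins rather than Rosie's actual definitions.

```python
import re

# Hypothetical, simplified stand-ins; real IP address patterns are more careful
# (this ipv4, for example, accepts out-of-range octets such as 999).
LIBRARY = {
    "net": {
        "ipv4": r"(?:\d{1,3}\.){3}\d{1,3}",
        "ipv6": r"(?:[0-9A-Fa-f]{0,4}:){2,7}[0-9A-Fa-f]{0,4}",
    }
}

# Build ip from the two named patterns, mirroring: ip = ipv4 / ipv6
net = LIBRARY["net"]
net["ip"] = f"(?:{net['ipv4']}|{net['ipv6']})"


def lookup(qualified_name: str) -> "re.Pattern[str]":
    """Resolve a name like 'net.ip' to a compiled pattern."""
    package, _, name = qualified_name.partition(".")
    return re.compile(LIBRARY[package][name])


def matches(qualified_name: str, text: str) -> bool:
    """Return True if the whole text matches the named pattern."""
    return lookup(qualified_name).fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ("192.168.0.1", "::1", "hello"):
        print(candidate, matches("net.ip", candidate))
```

Callers only need to know the name net.ip. The definitions behind it can be corrected or tightened in one place without touching the code that uses them.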
When everyone on a team uses Rosie, you won’t see half a dozen
different patterns for matching IP addresses in your source repository. You
won’t have to wonder if they are all equivalent, or (more likely) how they
differ in what they match.
Pattern libraries like those in Rosie are how #modernpatternmatching is done.
Follow us on Twitter for announcements. We expect v1.0.0 to be released later this summer.