I recently gave an IBM Tech Talk on Rosie v1.0 (replay available here), and there is a subreddit for discussion and questions.

The talk covers:

  • How regex fails to scale “in the large”, i.e. to complex patterns, large collections of patterns, teams of developers, and big data.
  • How RPL extends the core concepts of regex, resulting in a pattern language that is familiar to developers yet strictly more powerful than regular expressions.
  • How RPL is designed like a programming language, with attention to readability, maintainability, modularity (packages), and (devops) tools.

The last part of that final point, tool support, is important. Inspired in some ways by how the designers of Google’s Go paid attention to software engineering practice, Rosie was designed to work well with software tools like Git, Travis, Jenkins, and others.

Plays well with git and friends

RPL was designed with the kind of syntax found in programming languages. In particular, you can have whitespace and comments scattered around your (pattern) code. This makes RPL much easier to read and maintain than regex, and it also makes tools like Git more useful, because they are built on diff, which is line-oriented. Most regexes are embedded in programs as literal strings dense with magic characters and escape sequences, which makes changes difficult to spot in diff output:

Figure: Output of diff comparing one grok file to another. The definitions of QUOTEDSTRING and IPV6, which are named patterns, differ between the two files, but the regex syntax is so cryptic that the changes are very hard to spot visually.
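
To make the line-oriented point concrete, here is a minimal Python sketch using the standard library's difflib. The pattern strings are illustrative only: the dense regex was written for this example, and the multi-line definitions are loosely RPL-styled text that has not been checked against Rosie. Neither comes from the grok files described above.

```python
import difflib

# A dense one-line regex, before and after a small edit (illustrative strings).
dense_old = [r'IPV4 (?<![0-9])(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.]){3}(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})(?![0-9])']
dense_new = [r'IPV4 (?<![0-9])(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9]{1,2})[.]){3}(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})(?![0-9])']

# The same kind of edit in a pattern spread over several commented lines.
# Loosely RPL-styled text, used here only as input to diff (an assumption).
split_old = [
    '-- an octet is one to three digits',
    'octet = { [0-9]{1,3} }',
    '-- four octets separated by dots',
    'ipv4 = { octet "." octet "." octet "." octet }',
]
split_new = list(split_old)
split_new[1] = 'octet = { [0-9]{1,2} }'


def changed_lines(old, new):
    """Return only the removed ("- ") and added ("+ ") lines of a line diff."""
    return [line for line in difflib.ndiff(old, new)
            if line.startswith(("- ", "+ "))]


def chars(lines):
    """Total number of characters a reader must scan in the changed lines."""
    return sum(len(line) for line in lines)


dense_changes = changed_lines(dense_old, dense_new)
split_changes = changed_lines(split_old, split_new)

print("dense:", len(dense_changes), "changed lines,", chars(dense_changes), "characters")
print("split:", len(split_changes), "changed lines,", chars(split_changes), "characters")
```

Both diffs report one removed and one added line, but in the dense case each of those lines is the entire pattern, so the reader still has to hunt for the edit by eye.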

Also, RPL patterns can be combined into packages. A package can import other packages, so that a group of interdependent files of RPL patterns can be stored in an online repository (e.g. on GitHub). That makes it easy to use such a package:

  • Do a git clone of the repo with the RPL package into a local directory
  • Add the local directory to Rosie’s libpath
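
The two steps above could be scripted roughly as follows. This is a sketch under stated assumptions: the repository URL is a hypothetical placeholder, and the use of a colon-separated `ROSIE_LIBPATH` environment variable is an assumption to verify against the Rosie documentation, including whether setting it replaces or extends the default search path.

```python
import os
import subprocess

# Hypothetical repository holding an RPL package; substitute a real one.
REPO_URL = "https://example.com/your-org/rpl-patterns.git"


def clone_command(repo_url, dest):
    """Build the git command that clones the package into a local directory."""
    return ["git", "clone", repo_url, str(dest)]


def env_with_libpath(package_dir, base_env=None):
    """Return a copy of the environment with package_dir first on the libpath.

    Assumption: Rosie reads a colon-separated ROSIE_LIBPATH variable.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("ROSIE_LIBPATH", "")
    parts = [str(package_dir)] + ([existing] if existing else [])
    env["ROSIE_LIBPATH"] = ":".join(parts)
    return env


def install_package(repo_url, dest):
    """Clone the package and return an environment that can find it."""
    subprocess.run(clone_command(repo_url, dest), check=True)
    return env_with_libpath(dest)
```

Calling `install_package(REPO_URL, "vendor/rpl-patterns")` would clone the repository and hand back an environment to pass to later `rosie` invocations via `subprocess.run(..., env=env)`.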

Plays well with Travis, Jenkins and friends

Rosie also plays well with build and test automation. RPL files can contain unit tests, which are executed with the rosie test command. The tests can be run during development to (1) ensure that changes to the pattern definitions did not cause regressions, and (2) capture new requirements (where a pattern must newly accept or reject certain input).
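
As a rough illustration, the Python sketch below writes a small pattern file with embedded tests and hands it to rosie test. The RPL text, including the `-- test ... accepts/rejects` comment form and the `demo` package and `year` pattern names, is an assumption made for this example and should be checked against the RPL reference before use.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Illustrative pattern file with one "accepts" and one "rejects" test.
# The syntax shown is an assumption, not verified against Rosie.
RPL_SOURCE = '''\
-- Illustrative pattern file for this example
package demo

year = [0-9]{4}
-- test year accepts "2018", "1999"
-- test year rejects "18", "20x8"
'''


def write_rpl(directory, name="demo.rpl"):
    """Write the example pattern file into directory and return its path."""
    path = Path(directory) / name
    path.write_text(RPL_SOURCE, encoding="utf-8")
    return path


def run_rosie_test(path):
    """Run `rosie test` on one file.

    Returns the process exit code, or None when no rosie executable is on PATH.
    """
    if shutil.which("rosie") is None:
        return None
    return subprocess.run(["rosie", "test", str(path)]).returncode


def demo():
    """Write the file to a temporary directory and test it."""
    with tempfile.TemporaryDirectory() as directory:
        return run_rosie_test(write_rpl(directory))
```

Because the tests live in the same file as the patterns, a change to `year` and the matching change to its tests show up together in one diff.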

Finally, the compilation and unit test actions invoked by rosie test also ensure that all pattern dependencies (packages) are available.

Today, Rosie does not support separate compilation and execution steps. It is therefore possible to pass unit tests but fail in deployment if the deployment environment is missing a needed dependency. (The same is true of JavaScript, Python, and other languages.) However, the rosie test command can be used to “smoke test” a deployment, to ensure that all RPL dependencies are installed, that the RPL compiles without errors, and that the patterns pass all of their unit tests.
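
One way to wire such a smoke test into a deployment or CI step is sketched below in Python. It assumes that rosie test exits with a nonzero status when a file fails to compile, a dependency is missing, or a unit test fails; that behavior should be confirmed for the installed Rosie version. The function names are hypothetical.

```python
import subprocess
from pathlib import Path


def rosie_test(path):
    """Run `rosie test` on one file and return its exit code.

    Raises FileNotFoundError if no rosie executable is installed.
    """
    return subprocess.run(["rosie", "test", str(path)]).returncode


def smoke_test(rpl_dir, run=rosie_test):
    """Test every .rpl file under rpl_dir and return the paths that failed.

    The runner is a parameter so the gate logic can be exercised without Rosie.
    """
    failures = []
    for path in sorted(Path(rpl_dir).rglob("*.rpl")):
        if run(path) != 0:
            failures.append(path)
    return failures
```

A deployment script could call `smoke_test("rpl/")` right after installation and abort the rollout if the returned list is not empty.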

Discussion on reddit

A Rosie subreddit has been created for discussion of these posts and for questions about Rosie and RPL. See you there!

Follow us on Twitter for announcements about the RPL approach to #modernpatternmatching.