Edited Sat Nov 25 17:53:46 EST 2017: Examples were updated to Rosie v1.0.0-alpha-6.

In 5 minutes, you’ll have Rosie installed in a local directory like /tmp or /home/whomever. In 10 minutes, you’ll be using the standard pattern library to extract from your own data a variety of common patterns like network addresses, dates, times, and more.

In 15 minutes, you’ll be writing your own RPL patterns on the command line or at the REPL.

Install Rosie

(1) Download Rosie by visiting the Rosie repository and clicking Clone or download. Or:

git clone http://github.com/jamiejennings/rosie-pattern-language

(2) Build Rosie by running make in the directory containing Rosie:

cd rosie-pattern-language
make

You can now use Rosie by running the executable bin/rosie:

rosie-pattern-language$ bin/rosie --version
1.0.0-alpha-6
rosie-pattern-language$ 

(3) Optionally, install Rosie by running make install. The default installation directory is /usr/local, and the installation will consist of:

/usr/local/bin/rosie           executable
/usr/local/lib/rosie           directory with additional rosie files
/usr/local/lib/librosie.so     shared library, e.g. for Python and other languages

If you want to call Rosie from Python at some point, you’ll need to copy src/librosie/python/rosie.py to wherever you keep your Python libraries. More on this in a future post, but meanwhile there’s a test program that illustrates the basics.

Getting help

The Rosie CLI takes a command (like match) and optional switches. One of the commands is help:

rosie-pattern-language$ bin/rosie help
Usage: rosie [--version] [--verbose] [--rpl <rpl>] [-f <file>]
       [--libpath <libpath>] [-o <output>] [<command>] ...

Rosie 1.0.0-alpha-6

Options:
   --version             Print rosie version
   --verbose             Output additional messages
   --rpl <rpl>           Inline RPL statements
   -f <file>, --file <file>
                         Load an RPL file
   --libpath <libpath>   Directories to search for rpl modules
   -o <output>, --output <output>
                         Output style, one of: none, subs, line, byte, json, matches, default, color

Commands:
   help                  Print this help message
   config                Print rosie configuration information
   list                  List patterns, packages, and macros
   grep                  In the style of Unix grep, match the pattern anywhere in each input line
   match                 Match the given RPL pattern against the input
   repl                  Start the read-eval-print loop for interactive pattern development and debugging
   test                  Execute pattern tests written within the target rpl file(s)
   expand                Expand an rpl expression to see the input to the rpl compiler
   trace                 Match while tracing all steps (generates MUCH output)

The RPL 'import' statement will search these directories in order (this is the libpath):
        /Users/jennings/Projects/rosie-pattern-language/rpl
rosie-pattern-language$ 

The RPL language reference is in the code repository at doc/rpl.md.

Match all the things!

There’s a useful pattern in the ‘all’ package called ‘things’ that matches a few dozen common items. Try it out with some sample data from the rosie test directory…

rosie-pattern-language$ bin/rosie match all.things test/logfile 
Apr  8 09:42:24 Js-MacBook-Pro com.apple.xpc.launchd[1] (homebrew.mxcl.kafka[68878]): Service exited with abnormal code: 1
Apr  8 09:42:24 Js-MacBook-Pro com.apple.xpc.launchd[1] (homebrew.mxcl.kafka): Service only ran for 8 seconds. Pushing respawn out by 2 seconds.
Apr  8 10:10:18 Js-MacBook-Pro.local MUpdate[69707]: Endpoint at '/Applications/Meeting.app' is latest version (4732), skipping.
Apr  8 10:10:18 Js-MacBook-Pro.local MUpdate[69707]: Next Update Check at 2016-04-09 02:22:03 +0000
rosie-pattern-language$ 

A few things to notice:

  • The CLI automatically executes import all upon seeing use of the pattern all.things. Files of RPL code must explicitly include the import X statement to use patterns from package X.
  • The output style is color, which is the default for the match command. The default output style for the grep command is to output every line that matches, like the Unix grep does.
  • Pattern names from the standard library are assigned default color and font styles. Soon these will be customizable.

The rosie list command will show the patterns loaded, and what color, if any, has been assigned to each. To see the patterns in the net package, you have to tell rosie to import that package:

rosie-pattern-language$ bin/rosie --rpl 'import net' list net.*
Rosie 1.0.0-alpha-6

Name                           Cap? Type       Color           Source
------------------------------ ---- ---------- --------------- ------------------------------
$                                   pattern    red (default)   
.                                   pattern    red (default)   
MAC                            Yes  pattern    underline;green ...attern-language/rpl/net.rpl
MAC_cisco                      Yes  pattern    red (default)   ...attern-language/rpl/net.rpl
MAC_common                     Yes  pattern    red (default)   ...attern-language/rpl/net.rpl
MAC_windows                    Yes  pattern    red (default)   ...attern-language/rpl/net.rpl
^                                   pattern    red (default)   
any                            Yes  pattern    red (default)   ...attern-language/rpl/net.rpl
authority                      Yes  pattern    red (default)   ...attern-language/rpl/net.rpl
authpath                       Yes  pattern    red (default)   ...attern-language/rpl/net.rpl
ci                                  macro                      
email                          Yes  pattern    red (default)   ...attern-language/rpl/net.rpl
error                               function                   
find                                macro                      
findall                             macro                      
first                               macro                      
fqdn                           Yes  pattern    red             ...attern-language/rpl/net.rpl
fqdn_strict                    Yes  pattern    red (default)   ...attern-language/rpl/net.rpl
fqdn_strict_alias                   pattern    red (default)   ...attern-language/rpl/net.rpl
halt                                pattern    red (default)   
host                           Yes  pattern    red             ...attern-language/rpl/net.rpl
http_command                   Yes  pattern    red (default)   ...attern-language/rpl/net.rpl
http_command_name              Yes  pattern    red (default)   ...attern-language/rpl/net.rpl
http_version                   Yes  pattern    red (default)   ...attern-language/rpl/net.rpl
ip                             Yes  pattern    red (default)   ...attern-language/rpl/net.rpl
ip_literal                     Yes  pattern    red (default)   ...attern-language/rpl/net.rpl
ipv4                           Yes  pattern    red (default)   ...attern-language/rpl/net.rpl
ipv6                           Yes  pattern    red;underline   ...attern-language/rpl/net.rpl
ipv6_mixed                          pattern    red (default)   ...attern-language/rpl/net.rpl
keepto                              macro                      
last                                macro                      
message                             function                   
name                           Yes  pattern    red (default)   ...attern-language/rpl/net.rpl
path                           Yes  pattern    green           ...attern-language/rpl/net.rpl
port                           Yes  pattern    red (default)   ...attern-language/rpl/net.rpl
registered_name                Yes  pattern    red (default)   ...attern-language/rpl/net.rpl
scheme                         Yes  pattern    red (default)   ...attern-language/rpl/net.rpl
uri                            Yes  pattern    red (default)   ...attern-language/rpl/net.rpl
url                            Yes  pattern    red (default)   ...attern-language/rpl/net.rpl
userinfo                       Yes  pattern    red (default)   ...attern-language/rpl/net.rpl
~                                   pattern    red (default)   

41/41 names shown
rosie-pattern-language$ 

Another way to explore the RPL standard library is to examine the files in the rpl directory. In each file, you’ll find comments and test cases that show what kinds of input each pattern is expected to accept and reject.

Remember to start at the beginning!

There are a small number of important differences between Rosie expressions (PEGs, generally) and regex. The one that most often trips up people familiar with regex is that a PEG starts matching at the first character of the input. With the match command, only the lines that begin with “brown” are reported:

rosie-pattern-language$ bin/rosie -o line match '"brown"' test/quick.txt 
brown fox in field wants to sleep
brown fox in brush wants to sleep
rosie-pattern-language$ 

To find all the lines in test/quick.txt that contain the word “brown” anywhere in the line, Rosie has a grep command:

rosie-pattern-language$ bin/rosie grep '"brown"' test/quick.txt 
the quick brown
the quick brown fox
the quick brown fox jumped over the lazy (but adorable) dog
brown fox in field wants to sleep
brown fox in brush wants to sleep
rosie-pattern-language$ 
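
If you know regex from Python, a rough analogy may help here. The sketch below uses Python's standard re module only as a comparison (it says nothing about how Rosie works internally), and the two sample lines are copied from the output above. re.match is anchored at the first character, like rosie match, while re.search looks anywhere in the line, like rosie grep.

```python
import re

# Two sample lines copied from the test/quick.txt output shown above.
lines = [
    "the quick brown fox",
    "brown fox in field wants to sleep",
]

pattern = re.compile("brown")

# re.match succeeds only if the pattern matches at the first character,
# which is how the rosie match command treats each input line.
anchored = [ln for ln in lines if pattern.match(ln)]

# re.search scans forward and accepts a match anywhere in the line,
# which is the behavior of the rosie grep command.
anywhere = [ln for ln in lines if pattern.search(ln)]

print(anchored)
print(anywhere)
```

Only the second line survives the anchored test, while both survive the search.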

Aside

In case you are curious about how Rosie's `grep` command is implemented, it is equivalent to applying the `findall` macro to the pattern argument and using the `match` command with the `line` output format, which is the default for `grep`.

rosie-pattern-language$ bin/rosie -o line match 'findall:"brown"' test/quick.txt 
the quick brown
the quick brown fox
the quick brown fox jumped over the lazy (but adorable) dog
brown fox in field wants to sleep
brown fox in brush wants to sleep
rosie-pattern-language$ 

Peeling away one more layer, the `findall` macro is a repetitive form of the `find` macro, which takes a pattern argument and does essentially this: while not looking at the target pattern, consume a character and repeat. Finally, match the target pattern.

rosie-pattern-language$ bin/rosie -o line match '{!"brown" .}* "brown"' test/quick.txt 
the quick brown
the quick brown fox
the quick brown fox jumped over the lazy (but adorable) dog
brown fox in field wants to sleep
brown fox in brush wants to sleep
rosie-pattern-language$ 
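
The same loop can be written out by hand. Here is a minimal Python sketch of the idea behind {!"brown" .}* "brown" for a literal string target. The function names find and findall are borrowed from the macros for readability, but this is an illustration of the description above, not Rosie's implementation, and it handles only literal targets.

```python
def find(text, target, pos=0):
    """Mimic {!target .}* target; return the index just past the match, or None."""
    # {!target .}* : while not looking at the target, consume one character.
    while pos < len(text) and not text.startswith(target, pos):
        pos += 1
    # Finally, match the target itself. This fails if the input ran out first.
    if text.startswith(target, pos):
        return pos + len(target)
    return None


def findall(text, target):
    """Apply find repeatedly and collect the end index of each match."""
    if not target:
        raise ValueError("target must be a non-empty string")
    ends = []
    pos = 0
    while True:
        end = find(text, target, pos)
        if end is None:
            return ends
        ends.append(end)
        pos = end  # continue searching just past the previous match


print(find("the quick brown fox", "brown"))
print(findall("brown brown", "brown"))
```

A line with no occurrence of the target makes find return None, which corresponds to a line that rosie grep does not print.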

Experiment at the CLI or the REPL

The Rosie CLI

Here are some suggestions for experimenting on your own data using the Rosie CLI.

  • Use match all.things to see which items within your data are already recognized by Rosie.
  • Switch to grep <pat> to find specific items, e.g. use date.any or net.any for <pat>.
  • Add -o color to your command to make the output easier to read. (The default for Rosie grep is to simply echo the matching lines, like Unix grep does.)
  • Compose a pattern on the command line. Don’t forget to enclose the pattern in single quotes to shield it from interpretation by the shell!
  • Change the output option to -o json to see the structure in the matches. Pipe the output into a json pretty-printer to increase readability.

rosie-pattern-language$ bin/rosie grep ts.any test/logfile
Apr  8 09:42:24 Js-MacBook-Pro com.apple.xpc.launchd[1] (homebrew.mxcl.kafka[68878]): Service exited with abnormal code: 1
Apr  8 09:42:24 Js-MacBook-Pro com.apple.xpc.launchd[1] (homebrew.mxcl.kafka): Service only ran for 8 seconds. Pushing respawn out by 2 seconds.
Apr  8 10:10:18 Js-MacBook-Pro.local MUpdate[69707]: Endpoint at '/Applications/Meeting.app' is latest version (4732), skipping.
Apr  8 10:10:18 Js-MacBook-Pro.local MUpdate[69707]: Next Update Check at 2016-04-09 02:22:03 +0000
rosie-pattern-language$ bin/rosie -o color grep 'ts.any id.any' test/logfile
Apr  8 09:42:24 Js-MacBook-Pro com.apple.xpc.launchd[1] (homebrew.mxcl.kafka[68878]): Service exited with abnormal code: 1
Apr  8 09:42:24 Js-MacBook-Pro com.apple.xpc.launchd[1] (homebrew.mxcl.kafka): Service only ran for 8 seconds. Pushing respawn out by 2 seconds.
Apr  8 10:10:18 Js-MacBook-Pro.local MUpdate[69707]: Endpoint at '/Applications/Meeting.app' is latest version (4732), skipping.
Apr  8 10:10:18 Js-MacBook-Pro.local MUpdate[69707]: Next Update Check at 2016-04-09 02:22:03 +0000
rosie-pattern-language$ bin/rosie -o color grep 'ts.any id.any find:ts.any' test/logfile
Apr  8 10:10:18 Js-MacBook-Pro.local MUpdate[69707]: Next Update Check at 2016-04-09 02:22:03 +0000
rosie-pattern-language$ 
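
For the JSON suggestion, any pretty-printer will do. As a minimal sketch, here is one in Python. It assumes that -o json writes one JSON object per line, which I have not shown in this post, so treat that as an assumption. The function name pretty_lines and the sample input are made up for illustration.

```python
import io
import json


def pretty_lines(stream):
    """Yield an indented rendering of each non-blank JSON line in stream."""
    for line in stream:
        if line.strip():
            # Assumption: every non-blank line is one complete JSON object.
            yield json.dumps(json.loads(line), indent=2)


# Stand-in for piped input; pass sys.stdin instead to use this as a filter.
sample = io.StringIO('{"type": "d", "s": 1, "e": 2, "data": "4"}\n')
for block in pretty_lines(sample):
    print(block)
```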

The Read-Eval-Print Loop (REPL)

If you have developed in Lisp or Scheme, you have seen the power of the REPL as a development tool. Even Python supports a REPL these days to enable incremental code development. And so does Rosie.

There are three things you can enter at the Rosie> REPL prompt:

  • Commands, like .match, .trace, and .load;
  • RPL statements, e.g. definitions like d = [:digit:]; and
  • RPL identifiers (to see their definitions).

rosie-pattern-language$ bin/rosie repl
Rosie 1.0.0-alpha-6
Rosie> d
Repl: undefined identifier d
Rosie> d = [:digit:]
Rosie> d
[:digit:]
Rosie> .match d "4"
{"data": "4", 
 "e": 2, 
 "s": 1, 
 "type": "d"}
Rosie> .match d+ "4321"
{"data": "4321", 
 "e": 5, 
 "s": 1, 
 "subs": 
   [{"data": "4", 
     "e": 2, 
     "s": 1, 
     "type": "d"}, 
    {"data": "3", 
     "e": 3, 
     "s": 2, 
     "type": "d"}, 
    {"data": "2", 
     "e": 4, 
     "s": 3, 
     "type": "d"}, 
    {"data": "1", 
     "e": 5, 
     "s": 4, 
     "type": "d"}], 
 "type": "*"}
Rosie> import net
Rosie> net
<environment: 0x7fa00a7b54c0>
Rosie> net.ipv4
{ipv4_component {{"." ipv4_component} {"." ipv4_component} {"." ipv4_component}}}
Rosie> .match net.ipv4 "192.67.1.100"
{"data": "192.67.1.100", 
 "e": 13, 
 "s": 1, 
 "type": "net.ipv4"}
Rosie> .match findall:net.ipv4 "Hello 192.67.1.100"
{"data": "Hello 192.67.1.100", 
 "e": 19, 
 "s": 1, 
 "subs": 
   [{"data": "192.67.1.100", 
     "e": 19, 
     "s": 7, 
     "type": "net.ipv4"}], 
 "type": "*"}
Rosie> 
Exiting
rosie-pattern-language$ 

Note that sample data for the .match and .trace commands must be enclosed in double quotes.
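
The JSON printed by .match is a tree: each node has a type, the matched data, and s and e positions, and may carry a list of subs. Judging from the REPL output above, s is the 1-based start position and e is one past the end. The Python sketch below pastes in the findall:net.ipv4 result from the session, flattens the tree, and checks that reading of the positions. The function names are my own for illustration and are not part of Rosie or rosie.py.

```python
import json

# The findall:net.ipv4 result from the REPL session above, pasted in as text.
MATCH = json.loads("""
{"data": "Hello 192.67.1.100", "e": 19, "s": 1,
 "subs": [{"data": "192.67.1.100", "e": 19, "s": 7, "type": "net.ipv4"}],
 "type": "*"}
""")


def flatten(node, depth=0):
    """Yield (depth, type, data) for a match node and all of its subs."""
    yield depth, node["type"], node["data"]
    for sub in node.get("subs", []):
        yield from flatten(sub, depth + 1)


def check_positions(node, text):
    """Return True if every node's data equals text[s-1:e-1]."""
    ok = node["data"] == text[node["s"] - 1 : node["e"] - 1]
    return ok and all(check_positions(sub, text) for sub in node.get("subs", []))


for depth, kind, data in flatten(MATCH):
    print("  " * depth + f"{kind}: {data}")
```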

Using the REPL is a good way to develop RPL patterns. Because Rosie is happy to match just a portion of the input data (starting at the first character), you can begin with a pattern that matches just the first item in the data, and then extend the pattern incrementally to match more and more of the sample input.

Coming up: Rosie and Python

In a forthcoming post, I’ll show how to call Rosie from Python using rosie.py, which uses librosie.so.

Discussion on reddit

A Rosie subreddit has been created for discussion of these posts and for questions about Rosie and RPL. See you there!
