A Cloudflare global outage occurred on July 2, 2019 due to a regex that spiked CPU usage on a key component in their service. This regex behavior is sometimes called catastrophic backtracking, and can happen with backtracking regex libraries. Only a few implementations are smart enough to avoid this behavior. In this post, we look at RPL as a regex alternative. If the problematic regex that Cloudflare used had been developed in RPL, it would have produced a linear time solution.[more]
Edited Sunday, July 21, 2019: With permission of the author, the complete article is available here in PDF form.[more]
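To make the term concrete, here is a minimal sketch of catastrophic backtracking using Python's built-in `re` module, which is a backtracking implementation. The two patterns are hypothetical ones chosen for illustration; neither is the regex Cloudflare used, and nothing here is RPL. Both accept the same strings, but the nested quantifier in the first gives the matcher exponentially many ways to split a run of `a` characters before it can conclude that the input does not match.

```python
import re
import time

# Illustrative patterns only (assumptions for this sketch, not the Cloudflare regex).
NESTED = re.compile(r"^(a+)+$")  # nested quantifiers: many ways to split a run of 'a'
FLAT = re.compile(r"^a+$")       # accepts the same strings with a single quantifier


def time_match(pattern, text):
    """Return (matched, elapsed_seconds) for one match attempt."""
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    # A trailing 'b' forces failure, so the backtracking matcher must try
    # every way of splitting the run of 'a' characters before giving up.
    for n in (10, 14, 18, 20):
        text = "a" * n + "b"
        _, nested_seconds = time_match(NESTED, text)
        _, flat_seconds = time_match(FLAT, text)
        print(f"n={n:2d}  nested={nested_seconds:.5f}s  flat={flat_seconds:.5f}s")
```

On a typical machine the time for the nested pattern grows quickly with each added character, while the flat pattern stays roughly constant; exact timings will vary.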
The first release candidate for Rosie v1.1.0 is in the
dev branch of the
Rosie repository on GitLab. Among other changes, we have moved the code that interfaces between
librosie and languages like Python, Go, C, and Haskell.
Those interface libraries are now in the
Clients subgroup of the
Rosie Community group.
Our last post described a release planned for the end of 2018, which has been delayed. To avoid unnecessary churn, we are taking this opportunity to move the Rosie/RPL interface libraries (in Python, Go, C, and Haskell, with more forthcoming) to their own repositories. This move will facilitate development of those interfaces as well as new ones, independent of the Rosie/RPL release plan.[more]
In August of 2018, I left IBM and joined the faculty of Computer Science at NCSU. I was a professor before my IBM career, and it’s wonderful to be a professor again. This change has some positive implications for the Rosie project: We are building a small group of researchers and developers who will use and contribute to the project.[more]
Is your data structured for humans, not for easy processing? Do you have data like
CSC316 from which you want to extract the department (CSC) and
the course number (316)? But you have other data in geo-coordinates like
(35.7692755,-78.6786137). And then there are also lists of items usually
separated by commas, but sometimes by semi-colons. A single Rosie pattern can
destructure all of these and more.
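For readers who want to see what "destructuring" means for these three kinds of data, here is a rough sketch in Python using the standard `re` module rather than Rosie. The function name `destructure`, the group names, and the exact shapes accepted (for example, two to four capital letters followed by three digits for a course code) are assumptions made for this illustration, not part of Rosie or RPL.

```python
import re

# Assumed input shapes for this sketch: a course code such as CSC316,
# or a parenthesized pair of decimal coordinates.
ITEM = re.compile(
    r"""
      (?P<dept>[A-Z]{2,4})(?P<num>\d{3})
    | \(\s*(?P<lat>-?\d+(?:\.\d+)?)\s*,\s*(?P<lon>-?\d+(?:\.\d+)?)\s*\)
    """,
    re.VERBOSE,
)


def destructure(text):
    """Split one field into named parts, or into a list of items."""
    match = ITEM.fullmatch(text.strip())
    if match:
        # Keep only the groups that took part in the match.
        return {k: v for k, v in match.groupdict().items() if v is not None}
    # Otherwise treat the field as a list separated by commas or semicolons.
    return {"items": [part.strip() for part in re.split(r"[;,]", text) if part.strip()]}


if __name__ == "__main__":
    for sample in ("CSC316", "(35.7692755,-78.6786137)", "apples, pears; plums"):
        print(sample, "->", destructure(sample))
```

Note that the list case needs a separate fallback here, because the coordinate pair also contains a comma; ordering the alternatives is part of the work of writing such a pattern.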
Rosie v1.0 is out on gitlab.com! Please visit us there, open issues to report bugs or request information, and leave a star. 😄[more]
We are about to release version 1.0.0-beta-11 of Rosie Pattern Language, and this may be the last beta release before Rosie version 1.0. Our expectation is that version 1.0 will be released before the end of next month (June, 2018). In this post, we will review the project goals, some of Rosie’s current capabilities, and what we have planned for the coming year.[more]
For some time, Rosie has had a Python module, but it was undocumented. Until
now, you had to read the code to understand how to use it. In this post, we’ll
rosie.py, which exposes the Rosie Pattern Language functionality to
Regex syntax has been extended over the years to allow matching of characters based on their Unicode properties. While there is considerable variation in the syntax and the behavior across implementations, the Perl syntax may be familiar.[more]
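As one example of that variation, Python's built-in `re` module does not accept the Perl-style `\p{...}` syntax. The sketch below approximates matching by Unicode general category with the standard `unicodedata` module instead. The helper names `has_category` and `find_runs` are hypothetical, introduced only for this illustration, and general category is just one of the Unicode properties that regex syntax can expose.

```python
import unicodedata


def has_category(ch, prefix):
    """True if the general category of ch starts with prefix, e.g. 'Lu' or 'L'."""
    return unicodedata.category(ch).startswith(prefix)


def find_runs(text, prefix):
    """Return maximal runs of characters whose category starts with prefix."""
    runs, current = [], []
    for ch in text:
        if has_category(ch, prefix):
            current.append(ch)
        elif current:
            runs.append("".join(current))
            current = []
    if current:
        runs.append("".join(current))
    return runs


if __name__ == "__main__":
    sample = "\u00dcber \u03a9mega 42"  # U+00DC and U+03A9 are uppercase letters
    print(find_runs(sample, "Lu"))  # uppercase letters only
    print(find_runs(sample, "L"))   # any letter
    print(find_runs(sample, "Nd"))  # decimal digits
```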
A few months ago, I was writing a Go module for using
I ran into a problem
that appeared to be related to Go/cgo stack allocation. That turned out to be a
red herring (no offense to herrings of any color), and we now have a working Go
library for Rosie!
I’ve said before that I really enjoy reading the posts about Oil Shell. I have a lot of enthusiasm for the goals of that project, and it’s great to be able to read about a project as it evolves. A recent post covered the raison d’être for Oil Shell and the Oil Language, and inspired this post. Go read that post now if you want — we will be here when you get back. 😁[more]
The character set syntax in Rosie Pattern Language fixes some of the usability issues with regex character sets, and then goes beyond what some (but not all) regex solutions offer today. In this post, we cover RPL character sets as implemented in Rosie v1.0.0-alpha-8.[more]
Rosie v1.0.0 is in alpha release now. Our intention is to release a beta version in early 2018, with a frozen feature set, API, CLI, and REPL interfaces. The beta will be a release candidate for the proper version 1.0.0.[more]
Edited Sat Nov 25 17:53:46 EST 2017: Examples were updated to Rosie v1.0.0-alpha-6.
Edited Tue Jul 31 08:42:24 EDT 2018: Added example of command help using `-h`, and fixed broken links.
Rosie version 1.0.0-alpha-2 has been released, and the Python module is back![more]
Rosie version 1.0.0-alpha released![more]
RPL has many concepts in common with regex, and the syntax of RPL reflects this. So if you know regex, you know a lot of RPL already![more]
Regex are hard to debug when they fail to match what you think they should match (and vice versa). That’s why there are so many websites offering regex debugging tools. Rosie expressions can likewise be hard to debug at times, and I think for the same reason: Pattern matchers (parsers, generally) are algorithms with a very large number of states, essentially all of which influence the next step to be taken. There are many ways that a human being’s mental model of the algorithm’s state can be wrong.[more]
It’s easy to make a mistake when entering a regular expression on the command line. And, sometimes, we make a hard-to-spot error in a regular expression that is part of a program. Usually, those errors are not caught at compile time — but of course we want to catch as many errors as we can at compile time.[more]
Think of your favorite regex tool. How flexible is it when it comes to producing output?[more]
In the 1970s, regular expressions were shown to be astonishingly useful. A compact notation and smart, fast implementations made regular expressions the de facto text search technique for programmers, system administrators, and others. And they remain so to this day![more]
We are working hard developing the version 1.0.0 release of Rosie Pattern Language.[more]
The current version is Rosie v0.99k. Rosie is more powerful than regex, easier to use, and faster![more]