libuReg is a small regexp matching library based on Ken Thompson's NFA method. It can run any valid regexp in time linear in the length of the input and with constant stack size, including patterns that are pathological for backtracking-based matching engines.
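To make the linear-time claim concrete, here is a minimal sketch of a Thompson-style NFA simulation in C89. It is not libuReg's code or API: the instruction set, `add_thread`, `nfa_match`, `build_pathological` and `matches_run_of_a` are all hypothetical names invented for this illustration, and the match is anchored to the whole input for brevity (libuReg itself is unanchored). The program it builds corresponds to the classic pathological pattern `a?` repeated n times followed by `a` repeated n times.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical instruction set for this sketch; not libuReg's internals. */
enum { OP_CHAR, OP_SPLIT, OP_MATCH };

struct inst { int op; char c; int x, y; };

#define MAXN    64                 /* largest n accepted by the demo */
#define MAXPROG (3 * MAXN + 1)     /* n splits + 2n chars + 1 match */

struct list { int pc[MAXPROG]; int n; };

/* Add pc to the thread list, following SPLITs without consuming input.
   mark[] ensures each instruction is added at most once per input
   position; a bounded explicit stack replaces recursion. */
static void add_thread(const struct inst *prog, struct list *l,
                       int *mark, int gen, int pc)
{
    int stack[2 * MAXPROG + 1];
    int sp = 0;

    stack[sp++] = pc;
    while (sp > 0) {
        pc = stack[--sp];
        if (mark[pc] == gen)
            continue;
        mark[pc] = gen;
        if (prog[pc].op == OP_SPLIT) {
            stack[sp++] = prog[pc].y;
            stack[sp++] = prog[pc].x;
        } else {
            l->pc[l->n++] = pc;
        }
    }
}

/* Return 1 if the whole string s is accepted by prog, 0 otherwise.
   Work per input character is bounded by the program size. */
static int nfa_match(const struct inst *prog, int nprog, const char *s)
{
    struct list a, b, *clist = &a, *nlist = &b, *tmp;
    int mark[MAXPROG];
    int gen = 1, i;

    for (i = 0; i < nprog; i++)
        mark[i] = 0;
    clist->n = 0;
    add_thread(prog, clist, mark, gen, 0);
    for (; *s != '\0'; s++) {
        gen++;
        nlist->n = 0;
        for (i = 0; i < clist->n; i++) {
            const struct inst *in = &prog[clist->pc[i]];
            if (in->op == OP_CHAR && in->c == *s)
                add_thread(prog, nlist, mark, gen, clist->pc[i] + 1);
        }
        tmp = clist; clist = nlist; nlist = tmp;
    }
    for (i = 0; i < clist->n; i++)
        if (prog[clist->pc[i]].op == OP_MATCH)
            return 1;
    return 0;
}

/* Hand-assemble the program for (a?){n}a{n}; returns its length. */
static int build_pathological(struct inst *prog, int n)
{
    int pc = 0, i;

    memset(prog, 0, sizeof(struct inst) * MAXPROG);
    for (i = 0; i < n; i++) {           /* a? => SPLIT pc+1, pc+2; CHAR a */
        prog[pc].op = OP_SPLIT; prog[pc].x = pc + 1; prog[pc].y = pc + 2; pc++;
        prog[pc].op = OP_CHAR;  prog[pc].c = 'a'; pc++;
    }
    for (i = 0; i < n; i++) {           /* mandatory a */
        prog[pc].op = OP_CHAR;  prog[pc].c = 'a'; pc++;
    }
    prog[pc].op = OP_MATCH; pc++;
    return pc;
}

/* Match a run of len 'a' characters against (a?){n}a{n}. */
int matches_run_of_a(int n, int len)
{
    struct inst prog[MAXPROG];
    char input[2 * MAXN + 2];
    int nprog;

    assert(n >= 1 && n <= MAXN);
    assert(len >= 0 && len <= 2 * MAXN + 1);
    nprog = build_pathological(prog, n);
    memset(input, 'a', (size_t)len);
    input[len] = '\0';
    return nfa_match(prog, nprog, input);
}
```

With n = 25, a backtracking engine may try on the order of 2^25 paths on an input of 25 `a` characters, while the simulation above visits each instruction at most once per input character. In this sketch `matches_run_of_a(25, 25)` returns 1 and `matches_run_of_a(25, 24)` returns 0.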
libuReg requires a C89-compliant C compiler (gcc is fine) and CMake.
The library has been tested on Mac OS X 10.6, FreeBSD 8.1 and Debian squeeze, but it should work on any modern POSIX-compliant operating system. It might also work under MinGW or Cygwin, but you're pretty much on your own.
Right now, libuReg is designed for static linking and in-tree shipping.
Drop the libuReg sources in a subdirectory below your project root and add a reference in CMakeLists.txt:
ADD_SUBDIRECTORY(libureg)
TARGET_LINK_LIBRARIES(your-target ureg)
Usage documentation is still TBD; for the moment you're on your own.
libuReg has mostly the same syntax as POSIX EREs, with a few caveats:
- backreferences are not supported (sorry folks, they're NP-complete);
- capturing groups are not supported: they behave just like non-capturing groups;
- POSIX named character classes are not supported and never will be;
- bracket expressions do not yet support negative matching;
- no assertions or anchors (I didn't need them): all patterns are strictly unanchored;
- non-greedy operators are supported, although they are mostly useless.
Please keep in mind this is experimental code.
When the compiler hits an unknown AST node, it will call exit() instead of
relying on a user-defined error callback.
Issue tracking, wiki and git repository can be found at the project's page on GitHub.
Author: Matteo Panella.
Heavily inspired by and based on RE1 by Russ Cox.