RE/flex scanner generator replacement for Flex/Lex. More...
#include "reflex.h"
Macros | |
| #define | WITH_BOOST_PARTIAL_MATCH_BUG |
| Work around the Boost.Regex partial_match bug by forcing the generated scanner to buffer all input. More... | |
Functions | |
| int | fopen_s (FILE **file, const char *name, const char *mode) |
| Safer fopen_s() More... | |
| char | char_tolower (char c) |
| Convert to lower case. More... | |
| static std::string | file_ext (std::string &name, const char *ext) |
| Add file extension if not present, modifies the string argument and returns a copy. More... | |
| int | main (int argc, char **argv) |
Main program instantiates Reflex class and runs Reflex::main(argc, argv) More... | |
Variables | |
| static const char * | options_table [] |
| Table with command-line reflex options and lex specification %options. More... | |
| static const Reflex::Library | library_table [] |
| Table with regex library properties. More... | |
RE/flex scanner generator replacement for Flex/Lex.
| #define WITH_BOOST_PARTIAL_MATCH_BUG |
Work around the Boost.Regex partial_match bug by forcing the generated scanner to buffer all input.
|
inline |
Convert to lower case.
|
static |
Add file extension if not present, modifies the string argument and returns a copy.
name string with extension ext
|
inline |
Safer fopen_s()
| int main | ( | int | argc, |
| char ** | argv | ||
| ) |
Main program instantiates Reflex class and runs Reflex::main(argc, argv)
|
static |
Table with regex library properties.
This table is extensible and new regex libraries may be added. Each regex library is described by:
matcher=NAME optionA regex library signature is a string of the form "decls:escapes?+.", see reflex::convert.
The optional "decls:" part specifies which modifiers and other special (?...) constructs are supported:
(?:...) is supported(?i...) case-insensitive matching is supported(?m...) multiline mode is supported for the ^ and $ anchors(?s...) dotall mode is supported(?x...) freespace mode is supported# specifies that (?#...) comments are supported= specifies that (?=...) lookahead is supported< specifies that (?<...) lookbehind is supported! specifies that (?!=...) and (?!<...) are supported^ specifies that (?^...) negative (reflex) patterns are supportedThe "escapes" characters specify which standard escapes are supported:
a for \a (BEL U+0007)b for \b (BS U+0008) in brackets [\b] only AND the \b word boundaryc for \cX control character specified by X modulo 32d for \d ASCII digit [0-9]e for \e ESC U+001Bf for \f FF U+000Ch for \h ASCII blank [ \t] (SP U+0020 or TAB U+0009)i for \i reflex indent anchorj for \j reflex dedent anchorj for \k reflex undent anchorl for \l ASCII lower case letter [a-z]n for \n LF U+000Ap for \p{C} Unicode character classes, also implies Unicode {X}, , , , , r for \r CR U+000Ds for \s space (SP, TAB, LF, VT, FF, or CR)t for \t TAB U+0009u for \u ASCII upper case letter [A-Z] (when not followed by {XXXX})v for \v VT U+000Bw for \w ASCII word-like character [0-9A-Z_a-z]x for \xXX 8-bit character encoding in hexadecimaly for \y word boundaryz for \z end of input anchorfor `\ begin of input anchor' for \' end of input anchor< for \< left word boundary> for \> right word boundaryA for \A begin of input anchorB for \B non-word boundaryD for \D ASCII non-digit [^0-9]H for \H ASCII non-blank [^ \t]L for \L ASCII non-lower case letter [^a-z]N for \N not a newlineP for \P{C} Unicode inverse character classes, see 'p'Q for \Q...\E quotationsR for \R Unicode line breakS for \S ASCII non-space (no SP, TAB, LF, VT, FF, or CR)U for \U ASCII non-upper case letter [^A-Z]W for \W ASCII non-word-like character [^0-9A-Z_a-z]X for \X any Unicode characterZ for \Z end of input anchor, before the final line break0 for \0nnn 8-bit character encoding in octal requires a leading 0Note that 'p' is a special case to support Unicode-based matchers that natively support UTF8 patterns and Unicode classes {C}, {C}, , , , , , , , , , and {X}. Basically, 'p' prevents conversion of Unicode patterns to UTF8. This special case does not support {NAME} expansions in bracket lists such as [a-z||{upper}] and {lower}{+}{upper} used in lexer specifications.
The optional "?+" specify lazy and possessive support:
? lazy quantifiers for repeats are supported+ possessive quantifiers for repeats are supportedThe optional "." (dot) specifies that dot matches any character except newline. A dot is implied by the presence of the 's' modifier, and can be omitted in that case.
|
static |
Table with command-line reflex options and lex specification %options.
The table consists of option names with hyphens replaced by underscores.