Provides support for regular expression pattern matching and text processing.
The Regex module provides ECMAScript-compatible regex functionality with Unicode support. It offers efficient pattern compilation, flexible matching modes, and text manipulation capabilities including search, replace, and split operations.
Key features include:
Note: Fluid scripts should use Fluid's built-in regex functions rather than this module, as they offer better integration.
Compile | GetCaptureIndex | Replace | Search | Split
Compiles a regex pattern and returns a regex object.
Parameter | Description |
---|---|
Pattern | A regex pattern string. |
Flags | Optional flags. |
ErrorMsg | Optional reference for storing custom error messages. |
Result | Pointer to store the created regex object. |
Use Compile() to compile a regex pattern into a regex object that can be used for matching and searching. Reusing the compiled object across multiple match or search operations avoids the cost of recompiling the pattern. The object must be released with FreeResource() from the Core module when it is no longer needed, otherwise its memory will leak.
Error | Description |
---|---|
Okay | Operation successful. |
AllocMemory | AllocMemory() failed to create a new memory block. |
Syntax | Invalid syntax detected. |
NullArgs | Function call missing argument value(s). |
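The compile-once, reuse-many pattern described for Compile() can be sketched with the C++ standard library, whose std::regex also uses ECMAScript grammar by default. This is a stand-in to show the idea and is not this module's API: try_compile and count_matches are hypothetical helper names, and treating a std::regex_error as the equivalent of the Syntax error code is an assumption. C++17 is assumed.

```cpp
#include <cstddef>
#include <iterator>
#include <optional>
#include <regex>
#include <string>

// Hypothetical helper: compile a pattern once, reporting failure instead of
// throwing. On failure, error_msg (if provided) receives the engine's message,
// loosely mirroring the role of the ErrorMsg parameter.
std::optional<std::regex> try_compile(const std::string &pattern,
                                      bool ignore_case = false,
                                      std::string *error_msg = nullptr)
{
   auto flags = std::regex::ECMAScript;
   if (ignore_case) flags |= std::regex::icase;
   try {
      return std::regex(pattern, flags);
   }
   catch (const std::regex_error &e) {
      if (error_msg) *error_msg = e.what();
      return std::nullopt;
   }
}

// Reuse one compiled object across any number of inputs.
std::size_t count_matches(const std::regex &re, const std::string &text)
{
   return static_cast<std::size_t>(std::distance(
      std::sregex_iterator(text.begin(), text.end(), re),
      std::sregex_iterator()));
}
```

The point of the sketch is that the cost of parsing the pattern is paid once in try_compile, while count_matches can then be called on as many inputs as needed.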
Retrieves capture indices for a named group.
Parameter | Description |
---|---|
Regex | The compiled regex object. |
Name | The capture group name to resolve. |
Indices | Receives the resulting capture indices. |
Use GetCaptureIndex() to resolve the numeric capture indices associated with a named capture group. ECMAScript allows multiple groups to share the same name; this function therefore returns every index that matches the provided name. If no capture groups match the provided name, ERR::Search is returned.
Error | Description |
---|---|
Okay | The name was resolved and Indices populated. |
Search | The provided name does not exist within the regex. |
NullArgs | One or more required arguments were null. |
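To illustrate how one name can resolve to several capture indices, the following is a minimal sketch that scans a pattern string for named groups. It is not how the module resolves names internally: named_group_indices is a hypothetical function, and the scanner handles only escapes, character classes, and the common `(?...)` group forms.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical sketch: record the capture index of every named group
// "(?<name>...)" in a pattern. Index 0 is reserved for the full match, so the
// first capturing group has index 1.
std::map<std::string, std::vector<int>> named_group_indices(std::string_view pattern)
{
   std::map<std::string, std::vector<int>> result;
   int next_index = 1;
   bool in_class = false; // true while inside a [...] character class

   for (std::size_t i = 0; i < pattern.size(); i++) {
      const char c = pattern[i];
      if (c == '\\') { i++; continue; } // skip the escaped character
      if (in_class) { if (c == ']') in_class = false; continue; }
      if (c == '[') { in_class = true; continue; }
      if (c != '(') continue;

      if ((i + 1 < pattern.size()) && (pattern[i + 1] == '?')) {
         // "(?<name>" is a named capture; "(?<=" and "(?<!" are look-behinds,
         // and every other "(?..." form is non-capturing.
         const bool named = (i + 3 < pattern.size()) && (pattern[i + 2] == '<')
            && (pattern[i + 3] != '=') && (pattern[i + 3] != '!');
         if (!named) continue;
         const std::size_t close = pattern.find('>', i + 3);
         if (close == std::string_view::npos) continue; // malformed, ignore
         result[std::string(pattern.substr(i + 3, close - (i + 3)))].push_back(next_index);
      }
      next_index++; // plain "(...)" and named groups both consume an index
   }
   return result;
}
```

A name that is absent from the returned map corresponds to the case in which GetCaptureIndex() reports ERR::Search.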
Replaces occurrences of the regex pattern in the input text with a specified replacement string.
Parameter | Description |
---|---|
Regex | The compiled regex object. |
Text | The input text to perform replacements on. |
Replacement | The replacement string, which can include back-references like \1, \2, etc. |
Output | Receives the resulting string after replacements. |
Flags | Optional flags to modify the replacement behavior. |
Call Replace() to perform regex-based replacements in a given text. The function takes a compiled regex object, the input text, a replacement string, and optional flags to modify the replacement behavior. The replacement string can include back-references like \1, \2, etc., to refer to captured groups from the regex match.
Error | Description |
---|---|
Okay | Successful execution; this does not necessarily mean that replacements were made. |
NullArgs | One or more required input arguments were null. |
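The back-reference behaviour can be tried out with std::regex as a stand-in. Note the assumption involved: std::regex uses `$1` by default, so the sketch selects sed-style formatting to get the `\1` syntax described above. replace_all is a hypothetical helper, not part of this module.

```cpp
#include <regex>
#include <string>

// Stand-in using std::regex. format_sed selects sed-style replacement syntax,
// in which \1, \2, ... refer to capture groups and & refers to the whole match.
std::string replace_all(const std::string &text, const std::regex &re,
                        const std::string &replacement)
{
   return std::regex_replace(text, re, replacement, std::regex_constants::format_sed);
}
```

For example, with the pattern `(\w+)@(\w+)` and the replacement `\2:\1`, the input `alice@example bob@test` becomes `example:alice test:bob`. An input with no matches is returned unchanged, which parallels Replace() returning Okay even when nothing was replaced.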
Performs regex matching.
Parameter | Description |
---|---|
Regex | The compiled regex object. |
Text | The input text to perform matching on. |
Flags | Optional flags to modify the matching behavior. |
Callback | Receives the match results. |
Call Search() to search for a regex pattern in a given text. The function takes a compiled regex object, the input text, optional flags to modify the matching behavior, and a callback function to process the match results. For each match that is found, the callback function is invoked with details about the match.
The C++ prototype for the Callback function is:
ERR callback(int Index, std::vector<std::string_view> &Capture, size_t MatchStart, size_t MatchEnd, APTR Meta);
Note the inclusion of the Index parameter, which indicates the match number (starting from 0). The MatchStart and MatchEnd parameters provide explicit byte offsets into the input text for the matched region.
The Capture vector is always normalised so that its size matches the total number of capturing groups defined by the pattern (including the full match at index 0). Optional groups that did not match are provided as empty std::string_view instances, ensuring consistent indexing across matches.
Error | Description |
---|---|
Okay | At least one match was found and processed. |
Search | No matches were found. |
NullArgs | One or more required input arguments were null. |
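The callback contract described for Search() (match index, normalised captures, byte offsets) can be sketched on top of std::regex. This is an illustration under stated assumptions rather than the module's implementation: search_all and MatchCallback are hypothetical names, the Meta pointer is omitted, and a bool stands in for the ERR return value.

```cpp
#include <cstddef>
#include <functional>
#include <regex>
#include <string_view>
#include <vector>

// Simplified stand-in for the documented callback: no Meta pointer, and a bool
// replaces the ERR return value (true = continue searching).
using MatchCallback = std::function<bool(int Index,
   std::vector<std::string_view> &Capture, std::size_t MatchStart, std::size_t MatchEnd)>;

// Hypothetical driver built on std::regex. Returns true if at least one match
// was delivered to the callback, false if none were found.
bool search_all(const std::regex &re, std::string_view text, const MatchCallback &callback)
{
   using Iter = std::regex_iterator<std::string_view::const_iterator>;
   int index = 0;
   for (Iter it(text.begin(), text.end(), re), end; it != end; ++it, ++index) {
      const auto &m = *it;
      // One entry per group (plus the full match at 0); groups that did not
      // participate in the match stay as empty views so indexing is stable.
      std::vector<std::string_view> capture(m.size());
      for (std::size_t g = 0; g < m.size(); g++) {
         if (m[g].matched) {
            capture[g] = text.substr(static_cast<std::size_t>(m.position(g)),
                                     static_cast<std::size_t>(m.length(g)));
         }
      }
      const auto start = static_cast<std::size_t>(m.position(0));
      const auto length = static_cast<std::size_t>(m.length(0));
      if (!callback(index, capture, start, start + length)) return true;
   }
   return index > 0;
}
```

With the pattern `(\d+)(px)?` and the input `10px 20`, the callback runs twice. Both invocations receive a Capture vector of size 3, and in the second the entry for the optional `(px)` group is an empty view.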
Split a string into tokens, using a regex pattern to denote the delimiter.
Parameter | Description |
---|---|
Regex | The compiled regex object. |
Text | The input text to split. |
Output | Receives the resulting string tokens. |
Flags | Optional flags to modify the splitting behavior. |
Call Split() to divide a string into multiple tokens based on a regex pattern that defines the delimiters. The function takes a compiled regex object, the input text, and optional flags to modify the splitting behavior.
The resulting tokens are stored in the provided output array.
If no matches are found, the entire input text is returned as a single token.
Error | Description |
---|---|
Okay | The string was successfully split into tokens. If no matches are found, the entire input text is returned as a single token. |
NullArgs | One or more required input arguments were null. |
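Delimiter-based splitting, including the single-token result when nothing matches, can be demonstrated with std::regex as a stand-in. The split helper below is hypothetical and is not this module's Split() function.

```cpp
#include <regex>
#include <string>
#include <vector>

// Stand-in using std::regex: submatch index -1 asks the token iterator for the
// text between matches, i.e. the pieces separated by the delimiter pattern.
std::vector<std::string> split(const std::string &text, const std::regex &delimiter)
{
   std::sregex_token_iterator first(text.begin(), text.end(), delimiter, -1), last;
   return std::vector<std::string>(first, last);
}
```

Splitting `a, b;c` on the pattern `\s*[,;]\s*` yields the tokens `a`, `b` and `c`, while an input that contains no delimiter comes back as one token holding the whole text.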
Optional flags for the Regex functions.
Name | Description |
---|---|
REGEX::DOT_ALL | . matches newlines. |
REGEX::ICASE | Ignore case. |
REGEX::MULTILINE | ^ and $ match line boundaries. |
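The effect of the dot-all and ignore-case options can be seen in a small std::regex sketch. std::regex has no dot-all option, so the sketch approximates it with the `[\s\S]` idiom; this is a stand-in to show what the options change and does not use the REGEX flags themselves. The function names are hypothetical.

```cpp
#include <regex>
#include <string>

// Standard ECMAScript behaviour: '.' does not match a line terminator.
bool dot_matches(const std::string &text)
{
   return std::regex_match(text, std::regex("a.b"));
}

// Approximates dot-all behaviour by spelling out "any character" as [\s\S].
bool dot_all_matches(const std::string &text)
{
   return std::regex_match(text, std::regex("a[\\s\\S]b"));
}

// Case-insensitive compilation, the std::regex counterpart of an ignore-case flag.
bool icase_matches(const std::string &text)
{
   return std::regex_match(text, std::regex("hello", std::regex::icase));
}
```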
Name | Description |
---|---|
RMATCH::CONTINUOUS | Requires the match to start at the beginning of the sequence (anchored matching). |
RMATCH::NOT_BEGIN_OF_LINE | Treats the first character in the sequence as NOT being at the beginning of a line, preventing ^ from matching at that position. |
RMATCH::NOT_BEGIN_OF_WORD | Treats the first character in the sequence as NOT being at the beginning of a word, affecting \b word boundary matching. |
RMATCH::NOT_END_OF_LINE | Treats the last character in the sequence as NOT being at the end of a line, preventing $ from matching at that position. |
RMATCH::NOT_END_OF_WORD | Treats the last character in the sequence as NOT being at the end of a word, affecting \b word boundary matching. |
RMATCH::NOT_NULL | Prevents the regex engine from matching zero-length (empty) sequences. |
RMATCH::PREV_AVAILABLE | Indicates that a valid character exists before the first position in the sequence, enabling proper look-behind and boundary assertions. |
RMATCH::REPLACE_FIRST_ONLY | In Replace(), replaces only the first match and leaves subsequent matches unchanged. |
RMATCH::REPLACE_NO_COPY | In Replace(), prevents copying non-matched portions of the input to the output. |
RMATCH::WHOLE | Requires the pattern to match the entire input, as if it were wrapped in ^...$. |
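Several of these flags have close namesakes in std::regex_constants (match_continuous, match_not_null, format_first_only, format_no_copy). Assuming the RMATCH flags behave like those namesakes, which the text above does not confirm, the sketch below shows the effect of four of them. The function names are hypothetical.

```cpp
#include <regex>
#include <string>

namespace rc = std::regex_constants;

// match_continuous: the match must begin at the first character searched.
bool starts_with_digits(const std::string &text)
{
   return std::regex_search(text, std::regex("\\d+"), rc::match_continuous);
}

// match_not_null: empty matches are rejected, so "x*" must consume something.
bool has_nonempty_x_run(const std::string &text)
{
   return std::regex_search(text, std::regex("x*"), rc::match_not_null);
}

// format_first_only: replace the first match and leave later ones untouched.
std::string replace_first(const std::string &text)
{
   return std::regex_replace(text, std::regex("\\d+"), "#", rc::format_first_only);
}

// format_no_copy: emit only the replacements, dropping unmatched text.
std::string replacements_only(const std::string &text)
{
   return std::regex_replace(text, std::regex("\\d+"), "#", rc::format_no_copy);
}
```

For the input `a1b22c333`, replace_first gives `a#b22c333` and replacements_only gives `###`.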
Compiled regex structure.
Field | Type | Description |
---|---|---|
Pattern | std::string | Original pattern string |
Flags | REGEX | Compilation flags |