pcre2test.txt   pcre2test.txt 
skipping to change at line 622 skipping to change at line 622
convert_length set convert buffer length convert_length set convert buffer length
debug same as info,fullbincode debug same as info,fullbincode
framesize show matching frame size framesize show matching frame size
fullbincode show binary code with lengths fullbincode show binary code with lengths
/I info show info about compiled pattern /I info show info about compiled pattern
hex unquoted characters are hexadecimal hex unquoted characters are hexadecimal
jit[=<number>] use JIT jit[=<number>] use JIT
jitfast use JIT fast path jitfast use JIT fast path
jitverify verify JIT use jitverify verify JIT use
locale=<name> use this locale locale=<name> use this locale
max_pattern_length=<n> set maximum pattern length max_pattern_compiled_length=<n> set maximum compiled pattern length (bytes)
max_pattern_length=<n> set maximum pattern length (code units)
max_varlookbehind=<n> set maximum variable lookbehind length max_varlookbehind=<n> set maximum variable lookbehind length
memory show memory used memory show memory used
newline=<type> set newline type newline=<type> set newline type
null_context compile with a NULL context null_context compile with a NULL context
null_pattern pass pattern as NULL null_pattern pass pattern as NULL
parens_nest_limit=<n> set maximum parentheses depth parens_nest_limit=<n> set maximum parentheses depth
posix use the POSIX API posix use the POSIX API
posix_nosub use the POSIX API with REG_NOSUB posix_nosub use the POSIX API with REG_NOSUB
push push compiled pattern onto the stack push push compiled pattern onto the stack
pushcopy push a copy onto the stack pushcopy push a copy onto the stack
skipping to change at line 904 skipping to change at line 906
pcre2test sets its own default of 220, which is required for running pcre2test sets its own default of 220, which is required for running
the standard test suite. the standard test suite.
Limiting the pattern length Limiting the pattern length
The max_pattern_length modifier sets a limit, in code units, to the The max_pattern_length modifier sets a limit, in code units, to the
length of pattern that pcre2_compile() will accept. Breaching the limit length of pattern that pcre2_compile() will accept. Breaching the limit
causes a compilation error. The default is the largest number a causes a compilation error. The default is the largest number a
PCRE2_SIZE variable can hold (essentially unlimited). PCRE2_SIZE variable can hold (essentially unlimited).
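Because the limit is counted in code units rather than characters, the same pattern can have a different length depending on which library width is in use. The Python sketch below only illustrates that arithmetic. The code_units helper is hypothetical (it is not part of PCRE2 or pcre2test), and it assumes the pattern is encoded as UTF-8, UTF-16, or UTF-32 for the 8-bit, 16-bit, and 32-bit libraries respectively.

```python
def code_units(pattern: str, width: int) -> int:
    """Count the code units a pattern occupies at a given code unit width.

    Hypothetical helper for illustration only; assumes UTF-8/16/32 encoding.
    """
    encodings = {8: "utf-8", 16: "utf-16-le", 32: "utf-32-le"}
    encoded = pattern.encode(encodings[width])
    # Each code unit is width/8 bytes wide.
    return len(encoded) // (width // 8)


# Six characters, two of them outside ASCII.
pattern = "caf\u00e9 \U0001F600"
for width in (8, 16, 32):
    print(width, code_units(pattern, width))
```

A pattern that fits under a given max_pattern_length in one library width may therefore exceed it in another.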
Limiting the size of a compiled pattern
The max_pattern_compiled_length modifier sets a limit, in bytes, to the
amount of memory used by a compiled pattern. Breaching the limit causes
a compilation error. The default is the largest number a PCRE2_SIZE
variable can hold (essentially unlimited).
Using the POSIX wrapper API Using the POSIX wrapper API
The posix and posix_nosub modifiers cause pcre2test to call PCRE2 via The posix and posix_nosub modifiers cause pcre2test to call PCRE2 via
the POSIX wrapper API rather than its native API. When posix_nosub is the POSIX wrapper API rather than its native API. When posix_nosub is
used, the POSIX option REG_NOSUB is passed to regcomp(). The POSIX used, the POSIX option REG_NOSUB is passed to regcomp(). The POSIX
wrapper supports only the 8-bit library. Note that it does not imply wrapper supports only the 8-bit library. Note that it does not imply
POSIX matching semantics; for more detail see the pcre2posix documentation. POSIX matching semantics; for more detail see the pcre2posix documentation.
The following pattern modifiers set options for the regcomp() The following pattern modifiers set options for the regcomp()
function: function:
caseless REG_ICASE caseless REG_ICASE
multiline REG_NEWLINE multiline REG_NEWLINE
dotall REG_DOTALL ) dotall REG_DOTALL )
ungreedy REG_UNGREEDY ) These options are not part of ungreedy REG_UNGREEDY ) These options are not part of
ucp REG_UCP ) the POSIX standard ucp REG_UCP ) the POSIX standard
utf REG_UTF8 ) utf REG_UTF8 )
The regerror_buffsize modifier specifies a size for the error buffer The regerror_buffsize modifier specifies a size for the error buffer
that is passed to regerror() in the event of a compilation error. For that is passed to regerror() in the event of a compilation error. For
example: example:
/abc/posix,regerror_buffsize=20 /abc/posix,regerror_buffsize=20
This provides a means of testing the behaviour of regerror() when the This provides a means of testing the behaviour of regerror() when the
buffer is too small for the error message. If this modifier has not buffer is too small for the error message. If this modifier has not
been set, a large buffer is used. been set, a large buffer is used.
The aftertext and allaftertext subject modifiers work as described be- The aftertext and allaftertext subject modifiers work as described be-
low. All other modifiers are either ignored, with a warning message, or low. All other modifiers are either ignored, with a warning message, or
cause an error. cause an error.
The pattern is passed to regcomp() as a zero-terminated string by de- The pattern is passed to regcomp() as a zero-terminated string by de-
fault, but if the use_length or hex modifiers are set, the REG_PEND ex- fault, but if the use_length or hex modifiers are set, the REG_PEND ex-
tension is used to pass it by length. tension is used to pass it by length.
Testing the stack guard feature Testing the stack guard feature
The stackguard modifier is used to test the use of The stackguard modifier is used to test the use of
pcre2_set_compile_recursion_guard(), a function that is provided to enable stack pcre2_set_compile_recursion_guard(), a function that is provided to enable stack
availability to be checked during compilation (see the pcre2api documentation availability to be checked during compilation (see the pcre2api documentation
for details). If the number specified by the modifier is for details). If the number specified by the modifier is
greater than zero, pcre2_set_compile_recursion_guard() is called to set greater than zero, pcre2_set_compile_recursion_guard() is called to set
up a callback from pcre2_compile() to a local function. The argument it up a callback from pcre2_compile() to a local function. The argument it
receives is the current nesting parenthesis depth; if this is greater receives is the current nesting parenthesis depth; if this is greater
than the value given by the modifier, non-zero is returned, causing the than the value given by the modifier, non-zero is returned, causing the
compilation to be aborted. compilation to be aborted.
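The guard logic described above can be modelled in a few lines of Python. This is a toy simulation, not the PCRE2 API: make_guard and scan_nesting are hypothetical names, the scan ignores escapes and character classes, and the exact points at which pcre2_compile() invokes the guard are not modelled.

```python
from typing import Callable


def make_guard(limit: int) -> Callable[[int], int]:
    """Return a guard that reports non-zero once the depth exceeds limit."""
    def guard(depth: int) -> int:
        return 1 if depth > limit else 0
    return guard


def scan_nesting(pattern: str, guard: Callable[[int], int]) -> bool:
    """Toy scan: call the guard at each '(' and stop if it returns non-zero.

    Returns True if the scan completes, False if the guard aborted it.
    """
    depth = 0
    for ch in pattern:
        if ch == "(":
            depth += 1
            if guard(depth) != 0:
                return False  # corresponds to the compilation being aborted
        elif ch == ")":
            depth = max(depth - 1, 0)
    return True


print(scan_nesting("((a)(b))", make_guard(2)))
print(scan_nesting("(((a)))", make_guard(2)))
```

With a limit of 2, the first pattern never nests deeper than two levels and completes, whereas the second reaches depth three and is aborted.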
Using alternative character tables Using alternative character tables
The value specified for the tables modifier must be one of the digits The value specified for the tables modifier must be one of the digits
0, 1, 2, or 3. It causes a specific set of built-in character tables to 0, 1, 2, or 3. It causes a specific set of built-in character tables to
be passed to pcre2_compile(). This is used in the PCRE2 tests to check be passed to pcre2_compile(). This is used in the PCRE2 tests to check
behaviour with different character tables. The digit specifies the tables behaviour with different character tables. The digit specifies the tables
as follows: as follows:
0 do not pass any special character tables 0 do not pass any special character tables
1 the default ASCII tables, as distributed in 1 the default ASCII tables, as distributed in
pcre2_chartables.c.dist pcre2_chartables.c.dist
2 a set of tables defining ISO 8859 characters 2 a set of tables defining ISO 8859 characters
3 a set of tables loaded by the #loadtables command 3 a set of tables loaded by the #loadtables command
In tables 2, some characters whose codes are greater than 128 are identified In tables 2, some characters whose codes are greater than 128 are identified
as letters, digits, spaces, etc. Tables 3 can be used only after as letters, digits, spaces, etc. Tables 3 can be used only after
a #loadtables command has loaded them from a binary file. Setting alternate a #loadtables command has loaded them from a binary file. Setting alternate
character tables and a locale are mutually exclusive. character tables and a locale are mutually exclusive.
Setting certain match controls Setting certain match controls
The following modifiers are really subject modifiers, and are described The following modifiers are really subject modifiers, and are described
under "Subject Modifiers" below. However, they may be included in a under "Subject Modifiers" below. However, they may be included in a
pattern's modifier list, in which case they are applied to every subject pattern's modifier list, in which case they are applied to every subject
line that is processed with that pattern. These modifiers do not line that is processed with that pattern. These modifiers do not
affect the compilation process. affect the compilation process.
aftertext show text after match aftertext show text after match
allaftertext show text after captures allaftertext show text after captures
allcaptures show all captures allcaptures show all captures
allvector show the entire ovector allvector show the entire ovector
allusedtext show all consulted text allusedtext show all consulted text
altglobal alternative global matching altglobal alternative global matching
/g global global matching /g global global matching
heapframes_size show match data heapframes size heapframes_size show match data heapframes size
skipping to change at line 1001 skipping to change at line 1010
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
substitute_literal use PCRE2_SUBSTITUTE_LITERAL substitute_literal use PCRE2_SUBSTITUTE_LITERAL
substitute_matched use PCRE2_SUBSTITUTE_MATCHED substitute_matched use PCRE2_SUBSTITUTE_MATCHED
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
substitute_replacement_only use PCRE2_SUBSTITUTE_REPLACEMENT_ONLY substitute_replacement_only use PCRE2_SUBSTITUTE_REPLACEMENT_ONLY
substitute_skip=<n> skip substitution <n> substitute_skip=<n> skip substitution <n>
substitute_stop=<n> skip substitution <n> and following substitute_stop=<n> skip substitution <n> and following
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
These modifiers may not appear in a #pattern command. If you want them These modifiers may not appear in a #pattern command. If you want them
as defaults, set them in a #subject command. as defaults, set them in a #subject command.
Specifying literal subject lines Specifying literal subject lines
If the subject_literal modifier is present on a pattern, all the subject If the subject_literal modifier is present on a pattern, all the subject
lines that it matches are taken as literal strings, with no interpretation lines that it matches are taken as literal strings, with no interpretation
of backslashes. It is not possible to set subject modifiers of backslashes. It is not possible to set subject modifiers
on such lines, but any that are set as defaults by a #subject command on such lines, but any that are set as defaults by a #subject command
are recognized. are recognized.
Saving a compiled pattern Saving a compiled pattern
When a pattern with the push modifier is successfully compiled, it is When a pattern with the push modifier is successfully compiled, it is
pushed onto a stack of compiled patterns, and pcre2test expects the pushed onto a stack of compiled patterns, and pcre2test expects the
next line to contain a new pattern (or a command) instead of a subject next line to contain a new pattern (or a command) instead of a subject
line. This facility is used when saving compiled patterns to a file, as line. This facility is used when saving compiled patterns to a file, as
described in the section entitled "Saving and restoring compiled patterns" described in the section entitled "Saving and restoring compiled patterns"
below. If pushcopy is used instead of push, a copy of the compiled below. If pushcopy is used instead of push, a copy of the compiled
pattern is stacked, leaving the original as current, ready to pattern is stacked, leaving the original as current, ready to
match the following input lines. This provides a way of testing the match the following input lines. This provides a way of testing the
pcre2_code_copy() function. The push and pushcopy modifiers are incompatible pcre2_code_copy() function. The push and pushcopy modifiers are incompatible
with compilation modifiers such as global that act at match with compilation modifiers such as global that act at match
time. Any that are specified are ignored (for the stacked copy), with a time. Any that are specified are ignored (for the stacked copy), with a
warning message, except for replace, which causes an error. Note that warning message, except for replace, which causes an error. Note that
jitverify, which is allowed, does not carry through to any subsequent jitverify, which is allowed, does not carry through to any subsequent
matching that uses a stacked pattern. matching that uses a stacked pattern.
Testing foreign pattern conversion Testing foreign pattern conversion
The experimental foreign pattern conversion functions in PCRE2 can be The experimental foreign pattern conversion functions in PCRE2 can be
tested by setting the convert modifier. Its argument is a colon-separated tested by setting the convert modifier. Its argument is a colon-separated
list of options, which set the equivalent option for the list of options, which set the equivalent option for the
pcre2_pattern_convert() function: pcre2_pattern_convert() function:
glob PCRE2_CONVERT_GLOB glob PCRE2_CONVERT_GLOB
glob_no_starstar PCRE2_CONVERT_GLOB_NO_STARSTAR glob_no_starstar PCRE2_CONVERT_GLOB_NO_STARSTAR
glob_no_wild_separator PCRE2_CONVERT_GLOB_NO_WILD_SEPARATOR glob_no_wild_separator PCRE2_CONVERT_GLOB_NO_WILD_SEPARATOR
posix_basic PCRE2_CONVERT_POSIX_BASIC posix_basic PCRE2_CONVERT_POSIX_BASIC
posix_extended PCRE2_CONVERT_POSIX_EXTENDED posix_extended PCRE2_CONVERT_POSIX_EXTENDED
unset Unset all options unset Unset all options
The "unset" value is useful for turning off a default that has been set The "unset" value is useful for turning off a default that has been set
by a #pattern command. When one of these options is set, the input pattern by a #pattern command. When one of these options is set, the input pattern
is passed to pcre2_pattern_convert(). If the conversion is successful, is passed to pcre2_pattern_convert(). If the conversion is successful,
the result is reflected in the output and then passed to the result is reflected in the output and then passed to
pcre2_compile(). The normal utf and no_utf_check options, if set, cause pcre2_compile(). The normal utf and no_utf_check options, if set, cause
the PCRE2_CONVERT_UTF and PCRE2_CONVERT_NO_UTF_CHECK options to be the PCRE2_CONVERT_UTF and PCRE2_CONVERT_NO_UTF_CHECK options to be
passed to pcre2_pattern_convert(). passed to pcre2_pattern_convert().
By default, the conversion function is allowed to allocate a buffer for By default, the conversion function is allowed to allocate a buffer for
its output. However, if the convert_length modifier is set to a value its output. However, if the convert_length modifier is set to a value
greater than zero, pcre2test passes a buffer of the given length. This greater than zero, pcre2test passes a buffer of the given length. This
makes it possible to test the length check. makes it possible to test the length check.
The convert_glob_escape and convert_glob_separator modifiers can be The convert_glob_escape and convert_glob_separator modifiers can be
used to specify the escape and separator characters for glob processing, used to specify the escape and separator characters for glob processing,
overriding the defaults, which are operating-system dependent. overriding the defaults, which are operating-system dependent.
SUBJECT MODIFIERS SUBJECT MODIFIERS
The modifiers that can appear in subject lines and the #subject command The modifiers that can appear in subject lines and the #subject command
are of two types. are of two types.
Setting match options Setting match options
The following modifiers set options for pcre2_match() or The following modifiers set options for pcre2_match() or
pcre2_dfa_match(). See pcre2api for a description of their effects. pcre2_dfa_match(). See pcre2api for a description of their effects.
anchored set PCRE2_ANCHORED anchored set PCRE2_ANCHORED
endanchored set PCRE2_ENDANCHORED endanchored set PCRE2_ENDANCHORED
dfa_restart set PCRE2_DFA_RESTART dfa_restart set PCRE2_DFA_RESTART
dfa_shortest set PCRE2_DFA_SHORTEST dfa_shortest set PCRE2_DFA_SHORTEST
disable_recurseloop_check set PCRE2_DISABLE_RECURSELOOP_CHECK disable_recurseloop_check set PCRE2_DISABLE_RECURSELOOP_CHECK
no_jit set PCRE2_NO_JIT no_jit set PCRE2_NO_JIT
no_utf_check set PCRE2_NO_UTF_CHECK no_utf_check set PCRE2_NO_UTF_CHECK
notbol set PCRE2_NOTBOL notbol set PCRE2_NOTBOL
notempty set PCRE2_NOTEMPTY notempty set PCRE2_NOTEMPTY
notempty_atstart set PCRE2_NOTEMPTY_ATSTART notempty_atstart set PCRE2_NOTEMPTY_ATSTART
noteol set PCRE2_NOTEOL noteol set PCRE2_NOTEOL
partial_hard (or ph) set PCRE2_PARTIAL_HARD partial_hard (or ph) set PCRE2_PARTIAL_HARD
partial_soft (or ps) set PCRE2_PARTIAL_SOFT partial_soft (or ps) set PCRE2_PARTIAL_SOFT
The partial matching modifiers are provided with abbreviations because The partial matching modifiers are provided with abbreviations because
they appear frequently in tests. they appear frequently in tests.
If the posix or posix_nosub modifier was present on the pattern, causing If the posix or posix_nosub modifier was present on the pattern, causing
the POSIX wrapper API to be used, the only option-setting modifiers the POSIX wrapper API to be used, the only option-setting modifiers
that have any effect are notbol, notempty, and noteol, causing REG_NOTBOL, that have any effect are notbol, notempty, and noteol, causing REG_NOTBOL,
REG_NOTEMPTY, and REG_NOTEOL, respectively, to be passed to REG_NOTEMPTY, and REG_NOTEOL, respectively, to be passed to
regexec(). The other modifiers are ignored, with a warning message. regexec(). The other modifiers are ignored, with a warning message.
There is one additional modifier that can be used with the POSIX wrapper. There is one additional modifier that can be used with the POSIX wrapper.
It is ignored (with a warning) if used for non-POSIX matching. It is ignored (with a warning) if used for non-POSIX matching.
posix_startend=<n>[:<m>] posix_startend=<n>[:<m>]
This causes the subject string to be passed to regexec() using the This causes the subject string to be passed to regexec() using the
REG_STARTEND option, which uses offsets to specify which part of the REG_STARTEND option, which uses offsets to specify which part of the
string is searched. If only one number is given, the end offset is string is searched. If only one number is given, the end offset is
passed as the end of the subject string. For more detail of REG_STARTEND, passed as the end of the subject string. For more detail of REG_STARTEND,
see the pcre2posix documentation. If the subject string contains see the pcre2posix documentation. If the subject string contains
binary zeros (coded as escapes such as \x{00} because pcre2test does binary zeros (coded as escapes such as \x{00} because pcre2test does
not support actual binary zeros in its input), you must use posix_startend not support actual binary zeros in its input), you must use posix_startend
to specify its length. to specify its length.
Setting match controls Setting match controls
The following modifiers affect the matching process or request additional The following modifiers affect the matching process or request additional
information. Some of them may also be specified on a pattern information. Some of them may also be specified on a pattern
line (see above), in which case they apply to every subject line that line (see above), in which case they apply to every subject line that
is matched against that pattern, but can be overridden by modifiers on is matched against that pattern, but can be overridden by modifiers on
the subject. the subject.
aftertext show text after match aftertext show text after match
allaftertext show text after captures allaftertext show text after captures
allcaptures show all captures allcaptures show all captures
allvector show the entire ovector allvector show the entire ovector
allusedtext show all consulted text (non-JIT only) allusedtext show all consulted text (non-JIT only)
altglobal alternative global matching altglobal alternative global matching
callout_capture show captures at callout time callout_capture show captures at callout time
callout_data=<n> set a value to pass via callouts callout_data=<n> set a value to pass via callouts
skipping to change at line 1165 skipping to change at line 1174
substitute_matched use PCRE2_SUBSTITUTE_MATCHED substitute_matched use PCRE2_SUBSTITUTE_MATCHED
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
substitute_replacement_only use PCRE2_SUBSTITUTE_REPLACEMENT_ONLY substitute_replacement_only use PCRE2_SUBSTITUTE_REPLACEMENT_ONLY
substitute_skip=<n> skip substitution number n substitute_skip=<n> skip substitution number n
substitute_stop=<n> skip substitution number n and greater substitute_stop=<n> skip substitution number n and greater
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
zero_terminate pass the subject as zero-terminated zero_terminate pass the subject as zero-terminated
The effects of these modifiers are described in the following sections. The effects of these modifiers are described in the following sections.
When matching via the POSIX wrapper API, the aftertext, allaftertext, When matching via the POSIX wrapper API, the aftertext, allaftertext,
and ovector subject modifiers work as described below. All other modifiers and ovector subject modifiers work as described below. All other modifiers
are either ignored, with a warning message, or cause an error. are either ignored, with a warning message, or cause an error.
Showing more text Showing more text
The aftertext modifier requests that as well as outputting the part of The aftertext modifier requests that as well as outputting the part of
the subject string that matched the entire pattern, pcre2test should in the subject string that matched the entire pattern, pcre2test should in
addition output the remainder of the subject string. This is useful for addition output the remainder of the subject string. This is useful for
tests where the subject contains multiple copies of the same substring. tests where the subject contains multiple copies of the same substring.
The allaftertext modifier requests the same action for captured substrings The allaftertext modifier requests the same action for captured substrings
as well as the main matched substring. In each case the remainder as well as the main matched substring. In each case the remainder
is output on the following line with a plus character following the is output on the following line with a plus character following the
capture number. capture number.
The allusedtext modifier requests that all the text that was consulted The allusedtext modifier requests that all the text that was consulted
during a successful pattern match by the interpreter should be shown, during a successful pattern match by the interpreter should be shown,
for both full and partial matches. This feature is not supported for for both full and partial matches. This feature is not supported for
JIT matching, and if requested with JIT it is ignored (with a warning JIT matching, and if requested with JIT it is ignored (with a warning
message). Setting this modifier affects the output if there is a lookbehind message). Setting this modifier affects the output if there is a lookbehind
at the start of a match, or, for a complete match, a lookahead at the start of a match, or, for a complete match, a lookahead
at the end, or if \K is used in the pattern. Characters that precede or at the end, or if \K is used in the pattern. Characters that precede or
follow the start and end of the actual match are indicated in the output follow the start and end of the actual match are indicated in the output
by '<' or '>' characters underneath them. Here is an example: by '<' or '>' characters underneath them. Here is an example:
re> /(?<=pqr)abc(?=xyz)/ re> /(?<=pqr)abc(?=xyz)/
data> 123pqrabcxyz456\=allusedtext data> 123pqrabcxyz456\=allusedtext
0: pqrabcxyz 0: pqrabcxyz
<<< >>> <<< >>>
data> 123pqrabcxy\=ph,allusedtext data> 123pqrabcxy\=ph,allusedtext
Partial match: pqrabcxy Partial match: pqrabcxy
<<< <<<
The first, complete match shows that the matched string is "abc", with The first, complete match shows that the matched string is "abc", with
the preceding and following strings "pqr" and "xyz" having been consulted the preceding and following strings "pqr" and "xyz" having been consulted
during the match (when processing the assertions). The partial during the match (when processing the assertions). The partial
match can indicate only the preceding string. match can indicate only the preceding string.
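The marker line in the example can be derived from four offsets: where the consulted text starts and ends, and where the actual match starts and ends. The Python sketch below shows that derivation only. The used_text_markers function is hypothetical, it does not reproduce pcre2test's exact output layout, and the offsets are read off the example subject by hand.

```python
def used_text_markers(match_start: int, match_end: int,
                      used_start: int, used_end: int) -> str:
    """Build the marker line for text consulted outside the actual match."""
    before = "<" * (match_start - used_start)  # consulted before the match
    inside = " " * (match_end - match_start)   # the matched text itself
    after = ">" * (used_end - match_end)       # consulted after the match
    return (before + inside + after).rstrip()


subject = "123pqrabcxyz456"

# Complete match: "abc" at offsets 6..9, with "pqr" and "xyz" consulted.
print(subject[3:12])
print(used_text_markers(6, 9, 3, 12))

# Partial match: text from offset 6 to the end of "123pqrabcxy" (offset 11),
# with only the preceding "pqr" marked.
print("123pqrabcxy"[3:11])
print(used_text_markers(6, 11, 3, 11))
```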
The startchar modifier requests that the starting character for the The startchar modifier requests that the starting character for the
match be indicated, if it is different to the start of the matched match be indicated, if it is different to the start of the matched
string. The only time when this occurs is when \K has been processed as string. The only time when this occurs is when \K has been processed as
part of the match. In this situation, the output for the matched string part of the match. In this situation, the output for the matched string
is displayed from the starting character instead of from the match is displayed from the starting character instead of from the match
point, with circumflex characters under the earlier characters. For example: point, with circumflex characters under the earlier characters. For example:
re> /abc\Kxyz/ re> /abc\Kxyz/
data> abcxyz\=startchar data> abcxyz\=startchar
0: abcxyz 0: abcxyz
^^^ ^^^
Unlike allusedtext, the startchar modifier can be used with JIT. However, Unlike allusedtext, the startchar modifier can be used with JIT. However,
these two modifiers are mutually exclusive. these two modifiers are mutually exclusive.
Showing the value of all capture groups Showing the value of all capture groups
The allcaptures modifier requests that the values of all potential captured The allcaptures modifier requests that the values of all potential captured
parentheses be output after a match. By default, only those up to parentheses be output after a match. By default, only those up to
the highest one actually used in the match are output (corresponding to the highest one actually used in the match are output (corresponding to
the return code from pcre2_match()). Groups that did not take part in the return code from pcre2_match()). Groups that did not take part in
the match are output as "<unset>". This modifier is not relevant for the match are output as "<unset>". This modifier is not relevant for
DFA matching (which does no capturing) and does not apply when replace DFA matching (which does no capturing) and does not apply when replace
is specified; it is ignored, with a warning message, if present. is specified; it is ignored, with a warning message, if present.
Showing the entire ovector, for all outcomes Showing the entire ovector, for all outcomes
The allvector modifier requests that the entire ovector be shown, whatever The allvector modifier requests that the entire ovector be shown, whatever
the outcome of the match. Compare allcaptures, which shows only up the outcome of the match. Compare allcaptures, which shows only up
to the maximum number of capture groups for the pattern, and then only to the maximum number of capture groups for the pattern, and then only
for a successful complete non-DFA match. This modifier, which acts after for a successful complete non-DFA match. This modifier, which acts after
any match result, and also for DFA matching, provides a means of any match result, and also for DFA matching, provides a means of
checking that there are no unexpected modifications to ovector fields. checking that there are no unexpected modifications to ovector fields.
Before each match attempt, the ovector is filled with a special value, Before each match attempt, the ovector is filled with a special value,
and if this is found in both elements of a capturing pair, "<unchanged>" and if this is found in both elements of a capturing pair, "<unchanged>"
is output. After a successful match, this applies to all is output. After a successful match, this applies to all
groups after the maximum capture group for the pattern. In other cases groups after the maximum capture group for the pattern. In other cases
it applies to the entire ovector. After a partial match, the first two it applies to the entire ovector. After a partial match, the first two
elements are the only ones that should be set. After a DFA match, the elements are the only ones that should be set. After a DFA match, the
amount of ovector that is used depends on the number of matches that amount of ovector that is used depends on the number of matches that
were found. were found.
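The sentinel technique described above can be sketched in Python as follows. Everything here is illustrative: UNSET_SENTINEL is a made-up stand-in for whatever special value pcre2test actually uses, describe_ovector is a hypothetical helper, and the printed layout is not pcre2test's.

```python
UNSET_SENTINEL = -999  # hypothetical stand-in for the special fill value


def describe_ovector(ovector):
    """Report each offset pair, flagging pairs still holding the sentinel."""
    lines = []
    for i in range(0, len(ovector), 2):
        start, end = ovector[i], ovector[i + 1]
        if start == UNSET_SENTINEL and end == UNSET_SENTINEL:
            lines.append(f"{i // 2}: <unchanged>")
        else:
            lines.append(f"{i // 2}: {start},{end}")
    return lines


# Fill the whole vector before the match attempt.
ovector = [UNSET_SENTINEL] * 8
# Pretend a match wrote offsets for group 0 and group 1 only.
ovector[0:4] = [3, 6, 4, 5]

for line in describe_ovector(ovector):
    print(line)
```

Any pair beyond the ones a match is expected to set that does not still hold the sentinel would indicate an unexpected write to the ovector.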
Testing pattern callouts Testing pattern callouts
A callout function is supplied when pcre2test calls the library matching A callout function is supplied when pcre2test calls the library matching
functions, unless callout_none is specified. Its behaviour can be functions, unless callout_none is specified. Its behaviour can be
controlled by various modifiers listed above whose names begin with controlled by various modifiers listed above whose names begin with
callout_. Details are given in the section entitled "Callouts" below. callout_. Details are given in the section entitled "Callouts" below.
Testing callouts from pcre2_substitute() is described separately in Testing callouts from pcre2_substitute() is described separately in
"Testing the substitution function" below. "Testing the substitution function" below.
Finding all matches in a string Finding all matches in a string
Searching for all possible matches within a subject can be requested by Searching for all possible matches within a subject can be requested by
the global or altglobal modifier. After finding a match, the matching the global or altglobal modifier. After finding a match, the matching
function is called again to search the remainder of the subject. The function is called again to search the remainder of the subject. The
difference between global and altglobal is that the former uses the difference between global and altglobal is that the former uses the
start_offset argument to pcre2_match() or pcre2_dfa_match() to start start_offset argument to pcre2_match() or pcre2_dfa_match() to start
searching at a new point within the entire string (which is what Perl searching at a new point within the entire string (which is what Perl
does), whereas the latter passes over a shortened subject. This makes a does), whereas the latter passes over a shortened subject. This makes a
difference to the matching process if the pattern begins with a lookbehind difference to the matching process if the pattern begins with a lookbehind
assertion (including \b or \B). assertion (including \b or \B).
If an empty string is matched, the next match is done with the If an empty string is matched, the next match is done with the
PCRE2_NOTEMPTY_ATSTART and PCRE2_ANCHORED flags set, in order to sea rch PCRE2_NOTEMPTY_ATSTART and PCRE2_ANCHORED flags set, in order to sea rch
for another, non-empty, match at the same point in the subject. If t his for another, non-empty, match at the same point in the subject. If t his
match fails, the start offset is advanced, and the normal match is match fails, the start offset is advanced, and the normal match is
re- re-
tried. This imitates the way Perl handles such cases when using the tried. This imitates the way Perl handles such cases when using the
/g /g
modifier or the split() function. Normally, the start offset is modifier or the split() function. Normally, the start offset is
ad- ad-
vanced by one character, but if the newline convention recognizes C vanced by one character, but if the newline convention recognizes C
RLF RLF
as a newline, and the current character is CR followed by LF, an as a newline, and the current character is CR followed by LF, an
ad- ad-
vance of two characters occurs. vance of two characters occurs.
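The advance rule described above can be sketched in a few lines. This is only an illustration in Python, not pcre2test's code: the function name and the crlf_is_newline flag are hypothetical, and the sketch counts characters of a Python string, whereas PCRE2 offsets are in code units.

```python
def next_start_offset(subject: str, offset: int, crlf_is_newline: bool) -> int:
    """Return the offset at which the normal match is retried.

    Hypothetical helper: advances by one character, or by two when the
    newline convention recognizes CRLF and the current character is CR
    followed by LF.
    """
    if crlf_is_newline and subject[offset:offset + 2] == "\r\n":
        return offset + 2
    return offset + 1
```

With the subject "a\r\nb" and offset 1, the sketch advances to offset 3 when CRLF is recognized as a newline, and to offset 2 otherwise.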
Testing substring extraction functions Testing substring extraction functions
The copy and get modifiers can be used to test the pcre2_s ub- The copy and get modifiers can be used to test the pcre2_s ub-
string_copy_xxx() and pcre2_substring_get_xxx() functions. They can be string_copy_xxx() and pcre2_substring_get_xxx() functions. They can be
given more than once, and each can specify a capture group name or n um- given more than once, and each can specify a capture group name or n um-
ber, for example: ber, for example:
abcd\=copy=1,copy=3,get=G1 abcd\=copy=1,copy=3,get=G1
If the #subject command is used to set default copy and/or get lis If the #subject command is used to set default copy and/or get lis
ts, ts,
these can be unset by specifying a negative number to cancel all n these can be unset by specifying a negative number to cancel all n
um- um-
bered groups and an empty name to cancel all named groups. bered groups and an empty name to cancel all named groups.
The getall modifier tests pcre2_substring_list_get(), which extra cts The getall modifier tests pcre2_substring_list_get(), which extra cts
all captured substrings. all captured substrings.
If the subject line is successfully matched, the substrings extrac If the subject line is successfully matched, the substrings extrac
ted ted
by the convenience functions are output with C, G, or L after by the convenience functions are output with C, G, or L after
the the
string number instead of a colon. This is in addition to the nor string number instead of a colon. This is in addition to the nor
mal mal
full list. The string length (that is, the return from the extract full list. The string length (that is, the return from the extract
ion ion
function) is given in parentheses after each substring, followed by the function) is given in parentheses after each substring, followed by the
name when the extraction was by name. name when the extraction was by name.
Testing the substitution function Testing the substitution function
If the replace modifier is set, the pcre2_substitute() function If the replace modifier is set, the pcre2_substitute() function
is is
called instead of one of the matching functions (or after one call called instead of one of the matching functions (or after one call
of of
pcre2_match() in the case of PCRE2_SUBSTITUTE_MATCHED). Note that pcre2_match() in the case of PCRE2_SUBSTITUTE_MATCHED). Note that
re- re-
placement strings cannot contain commas, because a comma signifies placement strings cannot contain commas, because a comma signifies
the the
end of a modifier. This is not thought to be an issue in a test p end of a modifier. This is not thought to be an issue in a test p
ro- ro-
gram. gram.
Specifying a completely empty replacement string disables this mo Specifying a completely empty replacement string disables this mo
di- di-
fier. However, it is possible to specify an empty replacement by p fier. However, it is possible to specify an empty replacement by p
ro- ro-
viding a buffer length, as described below, for an otherwise empty viding a buffer length, as described below, for an otherwise empty
re- re-
placement. placement.
Unlike subject strings, pcre2test does not process replacement stri Unlike subject strings, pcre2test does not process replacement stri
ngs ngs
for escape sequences. In UTF mode, a replacement string is checked for escape sequences. In UTF mode, a replacement string is checked
to to
see if it is a valid UTF-8 string. If so, it is correctly converted see if it is a valid UTF-8 string. If so, it is correctly converted
to to
a UTF string of the appropriate code unit width. If it is not a va a UTF string of the appropriate code unit width. If it is not a va
lid lid
UTF-8 string, the individual code units are copied directly. This p UTF-8 string, the individual code units are copied directly. This p
ro- ro-
vides a means of passing an invalid UTF-8 string for testing purpose s. vides a means of passing an invalid UTF-8 string for testing purpose s.
The following modifiers set options (in addition to the normal match The following modifiers set options (in addition to the normal match
options) for pcre2_substitute(): options) for pcre2_substitute():
global PCRE2_SUBSTITUTE_GLOBAL global PCRE2_SUBSTITUTE_GLOBAL
substitute_extended PCRE2_SUBSTITUTE_EXTENDED substitute_extended PCRE2_SUBSTITUTE_EXTENDED
substitute_literal PCRE2_SUBSTITUTE_LITERAL substitute_literal PCRE2_SUBSTITUTE_LITERAL
substitute_matched PCRE2_SUBSTITUTE_MATCHED substitute_matched PCRE2_SUBSTITUTE_MATCHED
substitute_overflow_length PCRE2_SUBSTITUTE_OVERFLOW_LENGTH substitute_overflow_length PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
substitute_replacement_only PCRE2_SUBSTITUTE_REPLACEMENT_ONLY substitute_replacement_only PCRE2_SUBSTITUTE_REPLACEMENT_ONLY
substitute_unknown_unset PCRE2_SUBSTITUTE_UNKNOWN_UNSET substitute_unknown_unset PCRE2_SUBSTITUTE_UNKNOWN_UNSET
substitute_unset_empty PCRE2_SUBSTITUTE_UNSET_EMPTY substitute_unset_empty PCRE2_SUBSTITUTE_UNSET_EMPTY
See the pcre2api documentation for details of these options. See the pcre2api documentation for details of these options.
After a successful substitution, the modified string is output, p After a successful substitution, the modified string is output, p
re- re-
ceded by the number of replacements. This may be zero if there were ceded by the number of replacements. This may be zero if there were
no no
matches. Here is a simple example of a substitution test: matches. Here is a simple example of a substitution test:
/abc/replace=xxx /abc/replace=xxx
=abc=abc= =abc=abc=
1: =xxx=abc= 1: =xxx=abc=
=abc=abc=\=global =abc=abc=\=global
2: =xxx=xxx= 2: =xxx=xxx=
Subject and replacement strings should be kept relatively short (fe Subject and replacement strings should be kept relatively short (fe
wer wer
than 256 characters) for substitution tests, as fixed-size buffers than 256 characters) for substitution tests, as fixed-size buffers
are are
used. To make it easy to test for buffer overflow, if the replacem used. To make it easy to test for buffer overflow, if the replacem
ent ent
string starts with a number in square brackets, that number is pas string starts with a number in square brackets, that number is pas
sed sed
to pcre2_substitute() as the size of the output buffer, with the to pcre2_substitute() as the size of the output buffer, with the
re- re-
placement string starting at the next character. Here is an exam placement string starting at the next character. Here is an exam
ple ple
that tests the edge case: that tests the edge case:
/abc/ /abc/
123abc123\=replace=[10]XYZ 123abc123\=replace=[10]XYZ
1: 123XYZ123 1: 123XYZ123
123abc123\=replace=[9]XYZ 123abc123\=replace=[9]XYZ
Failed: error -47: no more memory Failed: error -47: no more memory
The default action of pcre2_substitute() is to return PCRE2_ ER- The default action of pcre2_substitute() is to return PCRE2_ ER-
ROR_NOMEMORY when the output buffer is too small. However, if ROR_NOMEMORY when the output buffer is too small. However, if
the the
PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option is set (by using the subs PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option is set (by using the subs
ti- ti-
tute_overflow_length modifier), pcre2_substitute() continues to go tute_overflow_length modifier), pcre2_substitute() continues to go
through the motions of matching and substituting (but not doing through the motions of matching and substituting (but not doing
any any
callouts), in order to compute the size of buffer that is requir callouts), in order to compute the size of buffer that is requir
ed. ed.
When this happens, pcre2test shows the required buffer length (wh When this happens, pcre2test shows the required buffer length (wh
ich ich
includes space for the trailing zero) as part of the error message. For includes space for the trailing zero) as part of the error message. For
example: example:
/abc/substitute_overflow_length /abc/substitute_overflow_length
123abc123\=replace=[9]XYZ 123abc123\=replace=[9]XYZ
Failed: error -47: no more memory: 10 code units are needed Failed: error -47: no more memory: 10 code units are needed
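The arithmetic behind the [10] and [9] examples can be checked with a small sketch. It uses Python's re module as a stand-in for pcre2_substitute(), treats the replacement as a literal string, and assumes one code unit per character (true for the ASCII data in the examples). The function name is hypothetical.

```python
import re

def required_code_units(pattern: str, subject: str, replacement: str,
                        global_: bool = False) -> int:
    """Length of the substituted string plus one for the trailing zero."""
    # count=1 replaces only the first match; count=0 replaces all matches.
    result, _ = re.subn(pattern, lambda m: replacement, subject,
                        count=0 if global_ else 1)
    return len(result) + 1

needed = required_code_units("abc", "123abc123", "XYZ")
print(needed)        # 10, as in the overflow message above
print(needed <= 10)  # True: a buffer of 10 code units is large enough
print(needed <= 9)   # False: a buffer of 9 code units is too small
```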
A replacement string is ignored with POSIX and DFA matching. Specify ing A replacement string is ignored with POSIX and DFA matching. Specify ing
partial matching provokes an error return ("bad option value") f rom partial matching provokes an error return ("bad option value") f rom
pcre2_substitute(). pcre2_substitute().
Testing substitute callouts Testing substitute callouts
If the substitute_callout modifier is set, a substitution callout fu nc- If the substitute_callout modifier is set, a substitution callout fu nc-
tion is set up. The null_context modifier must not be set, because tion is set up. The null_context modifier must not be set, because
the the
address of the callout function is passed in a match context. When address of the callout function is passed in a match context. When
the the
callout function is called (after each substitution), details of callout function is called (after each substitution), details of
the the
input and output strings are output. For example: input and output strings are output. For example:
/abc/g,replace=<$0>,substitute_callout /abc/g,replace=<$0>,substitute_callout
abcdefabcpqr abcdefabcpqr
1(1) Old 0 3 "abc" New 0 5 "<abc>" 1(1) Old 0 3 "abc" New 0 5 "<abc>"
2(1) Old 6 9 "abc" New 8 13 "<abc>" 2(1) Old 6 9 "abc" New 8 13 "<abc>"
2: <abc>def<abc>pqr 2: <abc>def<abc>pqr
The first number on each callout line is the count of matches. The The first number on each callout line is the count of matches. The
parenthesized number is the number of pairs that are set in the ovec tor parenthesized number is the number of pairs that are set in the ovec tor
(that is, one more than the number of capturing groups that were se t). (that is, one more than the number of capturing groups that were se t).
These are followed by the offsets of the old substring, its contents, and the These are followed by the offsets of the old substring, its contents, and the
same for the replacement. same for the replacement.
By default, the substitution callout function returns zero, which By default, the substitution callout function returns zero, which
ac- ac-
cepts the replacement and causes matching to continue if /g was us cepts the replacement and causes matching to continue if /g was us
ed. ed.
Two further modifiers can be used to test other return values. If s Two further modifiers can be used to test other return values. If s
ub- ub-
stitute_skip is set to a value greater than zero the callout funct stitute_skip is set to a value greater than zero the callout funct
ion ion
returns +1 for the match of that number, and similarly substitute_s returns +1 for the match of that number, and similarly substitute_s
top top
returns -1. These cause the replacement to be rejected, and -1 cau returns -1. These cause the replacement to be rejected, and -1 cau
ses ses
no further matching to take place. If either of them is set, no further matching to take place. If either of them is set,
substitute_callout is assumed. For example: substitute_callout is assumed. For example:
/abc/g,replace=<$0>,substitute_skip=1 /abc/g,replace=<$0>,substitute_skip=1
abcdefabcpqr abcdefabcpqr
1(1) Old 0 3 "abc" New 0 5 "<abc> SKIPPED" 1(1) Old 0 3 "abc" New 0 5 "<abc> SKIPPED"
2(1) Old 6 9 "abc" New 6 11 "<abc>" 2(1) Old 6 9 "abc" New 6 11 "<abc>"
2: abcdef<abc>pqr 2: abcdef<abc>pqr
abcdefabcpqr\=substitute_stop=1 abcdefabcpqr\=substitute_stop=1
1(1) Old 0 3 "abc" New 0 5 "<abc> STOPPED" 1(1) Old 0 3 "abc" New 0 5 "<abc> STOPPED"
1: abcdefabcpqr 1: abcdefabcpqr
If both are set for the same number, stop takes precedence. Only a s in- If both are set for the same number, stop takes precedence. Only a s in-
gle skip or stop is supported, which is sufficient for testing that the gle skip or stop is supported, which is sufficient for testing that the
feature works. feature works.
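The effect of the three callout return values can be modelled with a short sketch. It is written in Python with the re module and is not the PCRE2 implementation: the function names are hypothetical, the replacement <$0> is expressed as a Python function, and the way the match count is reported (a rejected match is still counted) is inferred from the example output above.

```python
import re

def make_callout(skip: int = 0, stop: int = 0):
    """Build a callout that returns +1 for match number skip, -1 for stop."""
    def callout(match_number: int) -> int:
        if match_number == stop:   # stop takes precedence over skip
            return -1
        if match_number == skip:
            return 1
        return 0
    return callout

def substitute_with_callout(pattern, subject, make_replacement, callout):
    """Global substitution in which a callout may reject replacements."""
    pieces, pos, count = [], 0, 0
    for m in re.finditer(pattern, subject):
        count += 1
        pieces.append(subject[pos:m.start()])
        verdict = callout(count)
        # Zero accepts the replacement; non-zero keeps the original text.
        pieces.append(make_replacement(m) if verdict == 0 else m.group(0))
        pos = m.end()
        if verdict < 0:            # -1 also ends the matching loop
            break
    pieces.append(subject[pos:])
    return count, "".join(pieces)

def wrap(m):
    return "<" + m.group(0) + ">"

print(substitute_with_callout("abc", "abcdefabcpqr", wrap, make_callout(skip=1)))
print(substitute_with_callout("abc", "abcdefabcpqr", wrap, make_callout(stop=1)))
```

The two calls reproduce the counts and strings of the substitute_skip=1 and substitute_stop=1 examples above.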
Setting the JIT stack size Setting the JIT stack size
The jitstack modifier provides a way of setting the maximum stack s The jitstack modifier provides a way of setting the maximum stack s
ize ize
that is used by the just-in-time optimization code. It is ignored that is used by the just-in-time optimization code. It is ignored
if if
JIT optimization is not being used. The value is a number of kibiby JIT optimization is not being used. The value is a number of kibiby
tes tes
(units of 1024 bytes). Setting zero reverts to the default of 32K (units of 1024 bytes). Setting zero reverts to the default of 32K
iB. iB.
Providing a stack that is larger than the default is necessary only for Providing a stack that is larger than the default is necessary only for
very complicated patterns. If jitstack is set non-zero on a subj ect very complicated patterns. If jitstack is set non-zero on a subj ect
line it overrides any value that was set on the pattern. line it overrides any value that was set on the pattern.
Setting heap, match, and depth limits Setting heap, match, and depth limits
The heap_limit, match_limit, and depth_limit modifiers set the app The heap_limit, match_limit, and depth_limit modifiers set the app
ro- ro-
priate limits in the match context. These values are ignored when priate limits in the match context. These values are ignored when
the the
find_limits or find_limits_noheap modifier is specified. find_limits or find_limits_noheap modifier is specified.
Finding minimum limits Finding minimum limits
If the find_limits modifier is present on a subject line, pcre2t If the find_limits modifier is present on a subject line, pcre2t
est est
calls the relevant matching function several times, setting differ calls the relevant matching function several times, setting differ
ent ent
values in the match context via pcre2_set_heap_limit values in the match context via pcre2_set_heap_limit
(), (),
pcre2_set_match_limit(), or pcre2_set_depth_limit() until it finds pcre2_set_match_limit(), or pcre2_set_depth_limit() until it finds
the the
smallest value for each parameter that allows the match to compl smallest value for each parameter that allows the match to compl
ete ete
without a "limit exceeded" error. The match itself may succeed or fa il. without a "limit exceeded" error. The match itself may succeed or fa il.
An alternative modifier, find_limits_noheap, omits the heap limit. T his An alternative modifier, find_limits_noheap, omits the heap limit. T his
is used in the standard tests, because the minimum heap limit var is used in the standard tests, because the minimum heap limit var
ies ies
between systems. If JIT is being used, only the match limit is re between systems. If JIT is being used, only the match limit is re
le- le-
vant, and the other two are automatically omitted. vant, and the other two are automatically omitted.
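Finding the smallest value that lets a match complete is a search over a monotone condition. The sketch below shows one way to do it, by doubling a probe value and then bisecting. It is an assumption for illustration only: the text above does not say which search strategy pcre2test uses, and completes() is a hypothetical callback standing for "run the match with this limit and report whether it finished without a limit-exceeded error".

```python
from typing import Callable

def find_minimum_limit(completes: Callable[[int], bool],
                       upper: int = 2**32 - 1) -> int:
    """Return the smallest limit in [0, upper] for which completes() is True.

    Assumes completes() is monotone: if one limit is large enough,
    every larger limit is large enough too.
    """
    if not completes(upper):
        raise ValueError("match does not complete even at the upper bound")
    # Phase 1: double the probe until the match completes.
    low, probe = 0, 1
    while probe < upper and not completes(probe):
        low = probe + 1            # every value up to probe is too small
        probe *= 2
    high = min(probe, upper)
    # Phase 2: bisect between the last failure and the first success.
    while low < high:
        mid = (low + high) // 2
        if completes(mid):
            high = mid
        else:
            low = mid + 1
    return low

# Stand-in for a real match that needs a limit of at least 37.
print(find_minimum_limit(lambda limit: limit >= 37))  # 37
```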
When using this modifier, the pattern should not contain any limit s et- When using this modifier, the pattern should not contain any limit s et-
tings such as (*LIMIT_MATCH=...) within it. If such a setting is tings such as (*LIMIT_MATCH=...) within it. If such a setting is
present and is lower than the minimum matching value, the minimum va lue present and is lower than the minimum matching value, the minimum va lue
cannot be found because pcre2_set_match_limit() etc. are only able to cannot be found because pcre2_set_match_limit() etc. are only able to
reduce the value of an in-pattern limit; they cannot increase it. reduce the value of an in-pattern limit; they cannot increase it.
For non-DFA matching, the minimum depth_limit number is a measure of For non-DFA matching, the minimum depth_limit number is a measure of
how much nested backtracking happens (that is, how deeply the patter n's how much nested backtracking happens (that is, how deeply the patter n's
tree is searched). In the case of DFA matching, depth_limit contr tree is searched). In the case of DFA matching, depth_limit contr
ols ols
the depth of recursive calls of the internal function that is used the depth of recursive calls of the internal function that is used
for for
handling pattern recursion, lookaround assertions, and atomic groups . handling pattern recursion, lookaround assertions, and atomic groups .
For non-DFA matching, the match_limit number is a measure of the amo unt For non-DFA matching, the match_limit number is a measure of the amo unt
of backtracking that takes place, and learning the minimum value can be of backtracking that takes place, and learning the minimum value can be
instructive. For most simple matches, the number is quite small, instructive. For most simple matches, the number is quite small,
but but
for patterns with very large numbers of matching possibilities, it for patterns with very large numbers of matching possibilities, it
can can
become large very quickly with increasing length of subject string. become large very quickly with increasing length of subject string.
In In
the case of DFA matching, match_limit controls the total number the case of DFA matching, match_limit controls the total number
of of
calls, both recursive and non-recursive, to the internal matching fu nc- calls, both recursive and non-recursive, to the internal matching fu nc-
tion, thus controlling the overall amount of computing resource that is tion, thus controlling the overall amount of computing resource that is
used. used.
For both kinds of matching, the heap_limit number, which is For both kinds of matching, the heap_limit number, which is
in in
kibibytes (units of 1024 bytes), limits the amount of heap memory u kibibytes (units of 1024 bytes), limits the amount of heap memory u
sed sed
for matching. for matching.
Showing MARK names Showing MARK names
The mark modifier causes the names from backtracking control verbs t hat The mark modifier causes the names from backtracking control verbs t hat
are returned from calls to pcre2_match() to be displayed. If a mark are returned from calls to pcre2_match() to be displayed. If a mark
is is
returned for a match, non-match, or partial match, pcre2test shows returned for a match, non-match, or partial match, pcre2test shows
it. it.
For a match, it is on a line by itself, tagged with "MK:". Otherwi For a match, it is on a line by itself, tagged with "MK:". Otherwi
se, se,
it is added to the non-match message. it is added to the non-match message.
Showing memory usage Showing memory usage
The memory modifier causes pcre2test to log the sizes of all heap m The memory modifier causes pcre2test to log the sizes of all heap m
em- em-
ory allocation and freeing calls that occur during a call ory allocation and freeing calls that occur during a call
to to
pcre2_match() or pcre2_dfa_match(). In the latter case, heap memory pcre2_match() or pcre2_dfa_match(). In the latter case, heap memory
is is
used only when a match requires more internal workspace than the used only when a match requires more internal workspace than the
de- de-
fault allocation on the stack, so in many cases there will be no o fault allocation on the stack, so in many cases there will be no o
ut- ut-
put. No heap memory is allocated during matching with JIT. For t put. No heap memory is allocated during matching with JIT. For t
his his
modifier to work, the null_context modifier must not be set on both the modifier to work, the null_context modifier must not be set on both the
pattern and the subject, though it can be set on one or the other. pattern and the subject, though it can be set on one or the other.
Showing the heap frame overall vector size Showing the heap frame overall vector size
The heapframes_size modifier is relevant for matches us ing The heapframes_size modifier is relevant for matches us ing
pcre2_match() without JIT. After a match has run (whether successful or pcre2_match() without JIT. After a match has run (whether successful or
not) the size, in bytes, of the allocated heap frames vector that not) the size, in bytes, of the allocated heap frames vector that
is is
left attached to the match data block is shown. If the matching act left attached to the match data block is shown. If the matching act
ion ion
involved several calls to pcre2_match() (for example, global match involved several calls to pcre2_match() (for example, global match
ing ing
or for timing) only the final value is shown. or for timing) only the final value is shown.
This modifier is ignored, with a warning, for POSIX or DFA matchi ng. This modifier is ignored, with a warning, for POSIX or DFA matchi ng.
JIT matching does not use the heap frames vector, so the size is alw ays JIT matching does not use the heap frames vector, so the size is alw ays
zero, unless there was a previous non-JIT match. Note that specifying a zero, unless there was a previous non-JIT match. Note that specifying a
size of zero for the output vector (see below) causes pcre2test to f ree size of zero for the output vector (see below) causes pcre2test to f ree
its match data block (and associated heap frames vector) and allocat e a its match data block (and associated heap frames vector) and allocat e a
new one. new one.
Setting a starting offset Setting a starting offset
The offset modifier sets an offset in the subject string at wh ich The offset modifier sets an offset in the subject string at wh ich
matching starts. Its value is a number of code units, not characters . matching starts. Its value is a number of code units, not characters .
Setting an offset limit Setting an offset limit
The offset_limit modifier sets a limit for unanchored matches. If a The offset_limit modifier sets a limit for unanchored matches. I f a
match cannot be found starting at or before this offset in the subje ct, match cannot be found starting at or before this offset in the subje ct,
a "no match" return is given. The data value is a number of code uni ts, a "no match" return is given. The data value is a number of code uni ts,
not characters. When this modifier is used, the use_offset_limit mo di- not characters. When this modifier is used, the use_offset_limit mo di-
fier must have been set for the pattern; if not, an error is generat ed. fier must have been set for the pattern; if not, an error is generat ed.
Setting the size of the output vector Setting the size of the output vector
The ovector modifier applies only to the subject line in which it ap- The ovector modifier applies only to the subject line in which it ap-
pears, though of course it can also be used to set a default in a #s ub- pears, though of course it can also be used to set a default in a #s ub-
ject command. It specifies the number of pairs of offsets that are ject command. It specifies the number of pairs of offsets that are
available for storing matching information. The default is 15. available for storing matching information. The default is 15.
A value of zero is useful when testing the POSIX API because it cau ses A value of zero is useful when testing the POSIX API because it cau ses
regexec() to be called with a NULL capture vector. When not testing the regexec() to be called with a NULL capture vector. When not testing the
POSIX API, a value of zero is used to cause pcre2_match_data_c POSIX API, a value of zero is used to cause pcre2_match_data_c
re- re-
ate_from_pattern() to be called, in order to create a new match bl ate_from_pattern() to be called, in order to create a new match bl
ock ock
of exactly the right size for the pattern. (It is not possible to c of exactly the right size for the pattern. (It is not possible to c
re- re-
ate a match block with a zero-length ovector; there is always at le ate a match block with a zero-length ovector; there is always at le
ast ast
one pair of offsets.) The old match data block is freed. one pair of offsets.) The old match data block is freed.
Passing the subject as zero-terminated Passing the subject as zero-terminated
By default, the subject string is passed to a native API matching fu nc- By default, the subject string is passed to a native API matching fu nc-
tion with its correct length. In order to test the facility for pass ing tion with its correct length. In order to test the facility for pass ing
a zero-terminated string, the zero_terminate modifier is provided. a zero-terminated string, the zero_terminate modifier is provided.
It It
causes the length to be passed as PCRE2_ZERO_TERMINATED. When match causes the length to be passed as PCRE2_ZERO_TERMINATED. When match
ing ing
via the POSIX interface, this modifier is ignored, with a warning. via the POSIX interface, this modifier is ignored, with a warning.
When testing pcre2_substitute(), this modifier also has the effect of When testing pcre2_substitute(), this modifier also has the effect of
passing the replacement string as zero-terminated. passing the replacement string as zero-terminated.
Passing a NULL context, subject, or replacement Passing a NULL context, subject, or replacement
Normally, pcre2test passes a context block to pcre2_match Normally, pcre2test passes a context block to pcre2_match
(), (),
pcre2_dfa_match(), pcre2_jit_match() or pcre2_substitute(). If pcre2_dfa_match(), pcre2_jit_match() or pcre2_substitute(). If
the the
null_context modifier is set, however, NULL is passed. This is null_context modifier is set, however, NULL is passed. This is
for for
testing that the matching and substitution functions behave correc testing that the matching and substitution functions behave correc
tly tly
in this case (they use default values). This modifier cannot be u in this case (they use default values). This modifier cannot be u
sed sed
with the find_limits, find_limits_noheap, or substitute_callout mo with the find_limits, find_limits_noheap, or substitute_callout mo
di- di-
fiers. fiers.
Similarly, for testing purposes, if the null_subject or null_repla Similarly, for testing purposes, if the null_subject or null_repla
ce- ce-
ment modifier is set, the subject or replacement string pointers ment modifier is set, the subject or replacement string pointers
are are
passed as NULL, respectively, to the relevant functions. passed as NULL, respectively, to the relevant functions.
THE ALTERNATIVE MATCHING FUNCTION THE ALTERNATIVE MATCHING FUNCTION
By default, pcre2test uses the standard PCRE2 matching functi on, By default, pcre2test uses the standard PCRE2 matching functi on,
pcre2_match() to match each subject line. PCRE2 also supports an alt er- pcre2_match() to match each subject line. PCRE2 also supports an alt er-
native matching function, pcre2_dfa_match(), which operates in a d native matching function, pcre2_dfa_match(), which operates in a d
if- if-
ferent way, and has some restrictions. The differences between the ferent way, and has some restrictions. The differences between the
two two
functions are described in the pcre2matching documentation. functions are described in the pcre2matching documentation.
If the dfa modifier is set, the alternative matching function is us If the dfa modifier is set, the alternative matching function is us
ed. ed.
This function finds all possible matches at a given point in the s This function finds all possible matches at a given point in the s
ub- ub-
ject. If, however, the dfa_shortest modifier is set, processing st ject. If, however, the dfa_shortest modifier is set, processing st
ops ops
after the first match is found. This is always the shortest possi after the first match is found. This is always the shortest possi
ble ble
match. match.
DEFAULT OUTPUT FROM pcre2test DEFAULT OUTPUT FROM pcre2test
This section describes the output when the normal matching functi on, This section describes the output when the normal matching functi on,
pcre2_match(), is being used. pcre2_match(), is being used.
When a match succeeds, pcre2test outputs the list of captured s When a match succeeds, pcre2test outputs the list of captured s
ub- ub-
strings, starting with number 0 for the string that matched the wh strings, starting with number 0 for the string that matched the wh
ole ole
pattern. Otherwise, it outputs "No match" when the return is PCRE2_ ER- pattern. Otherwise, it outputs "No match" when the return is PCRE2_ ER-
ROR_NOMATCH, or "Partial match:" followed by the partially match ROR_NOMATCH, or "Partial match:" followed by the partially match
ing ing
substring when the return is PCRE2_ERROR_PARTIAL. (Note that this substring when the return is PCRE2_ERROR_PARTIAL. (Note that this
is is
the entire substring that was inspected during the partial match; the entire substring that was inspected during the partial match;
it it
may include characters before the actual match start if a lookbeh may include characters before the actual match start if a lookbeh
ind ind
assertion, \K, \b, or \B was involved.) assertion, \K, \b, or \B was involved.)
For any other return, pcre2test outputs the PCRE2 negative error num ber For any other return, pcre2test outputs the PCRE2 negative error num ber
and a short descriptive phrase. If the error is a failed UTF str and a short descriptive phrase. If the error is a failed UTF str
ing ing
check, the code unit offset of the start of the failing character check, the code unit offset of the start of the failing character
is is
also output. Here is an example of an interactive pcre2test run. also output. Here is an example of an interactive pcre2test run.
$ pcre2test $ pcre2test
PCRE2 version 10.22 2016-07-29 PCRE2 version 10.22 2016-07-29
re> /^abc(\d+)/ re> /^abc(\d+)/
data> abc123 data> abc123
0: abc123 0: abc123
1: 123 1: 123
data> xyz data> xyz
No match No match
Unset capturing substrings that are not followed by one that is set are Unset capturing substrings that are not followed by one that is set are
not shown by pcre2test unless the allcaptures modifier is specified. In not shown by pcre2test unless the allcaptures modifier is specified. In
the following example, there are two capturing substrings, but when the the following example, there are two capturing substrings, but when the
first data line is matched, the second, unset substring is not sho first data line is matched, the second, unset substring is not sho
wn. wn.
An "internal" unset substring is shown as "<unset>", as for the sec An "internal" unset substring is shown as "<unset>", as for the sec
ond ond
data line. data line.
re> /(a)|(b)/ re> /(a)|(b)/
data> a data> a
0: a 0: a
1: a 1: a
data> b data> b
0: b 0: b
1: <unset> 1: <unset>
2: b 2: b
If the strings contain any non-printing characters, they are output If the strings contain any non-printing characters, they are output
as as
\xhh escapes if the value is less than 256 and UTF mode is not s \xhh escapes if the value is less than 256 and UTF mode is not s
et. et.
Otherwise they are output as \x{hh...} escapes. See below for the de fi- Otherwise they are output as \x{hh...} escapes. See below for the de fi-
nition of non-printing characters. If the aftertext modifier is s nition of non-printing characters. If the aftertext modifier is s
et, et,
the output for substring 0 is followed by the rest of the subj the output for substring 0 is followed by the rest of the subj
ect ect
string, identified by "0+" like this: string, identified by "0+" like this:
re> /cat/aftertext re> /cat/aftertext
data> cataract data> cataract
0: cat 0: cat
0+ aract 0+ aract
If global matching is requested, the results of successive matching at- If global matching is requested, the results of successive matching at-
tempts are output in sequence, like this: tempts are output in sequence, like this:
re> /\Bi(\w\w)/g re> /\Bi(\w\w)/g
data> Mississippi data> Mississippi
0: iss 0: iss
1: ss 1: ss
0: iss 0: iss
1: ss 1: ss
0: ipp 0: ipp
1: pp 1: pp
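The sequence of results for a global match can be approximated as follows. This is a sketch that assumes Python's re.finditer behaves like PCRE2 for this simple pattern (the two engines differ in other cases, notably empty matches); global_matches is a hypothetical helper, not part of pcre2test.

```python
import re


def global_matches(pattern, subject):
    """List successive matches in a pcre2test-like layout."""
    lines = []
    for match in re.finditer(pattern, subject):
        lines.append(f" 0: {match.group(0)}")
        for number, text in enumerate(match.groups(), start=1):
            shown = "<unset>" if text is None else text
            lines.append(f"{number:2d}: {shown}")
    # "No match" is reported only when the very first attempt fails.
    return lines or ["No match"]


print("\n".join(global_matches(r"\Bi(\w\w)", "Mississippi")))
```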
"No match" is output only if the first match attempt fails. Here is "No match" is output only if the first match attempt fails. Here is
an an
example of a failure message (the offset 4 that is specified by example of a failure message (the offset 4 that is specified by
the the
offset modifier is past the end of the subject string): offset modifier is past the end of the subject string):
re> /xyz/ re> /xyz/
data> xyz\=offset=4 data> xyz\=offset=4
Error -24 (bad offset value) Error -24 (bad offset value)
Note that whereas patterns can be continued over several lines (a pl ain Note that whereas patterns can be continued over several lines (a pl ain
">" prompt is used for continuations), subject lines may not. Howe ver ">" prompt is used for continuations), subject lines may not. Howe ver
newlines can be included in a subject by means of the \n escape (or \r, newlines can be included in a subject by means of the \n escape (or \r,
\r\n, etc., depending on the newline sequence setting). \r\n, etc., depending on the newline sequence setting).
OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION
When the alternative matching function, pcre2_dfa_match(), is used, the When the alternative matching function, pcre2_dfa_match(), is used, the
output consists of a list of all the matches that start at the fi rst output consists of a list of all the matches that start at the fi rst
point in the subject where there is at least one match. For example: point in the subject where there is at least one match. For example:
re> /(tang|tangerine|tan)/ re> /(tang|tangerine|tan)/
data> yellow tangerine\=dfa data> yellow tangerine\=dfa
0: tangerine 0: tangerine
1: tang 1: tang
2: tan 2: tan
Using the normal matching function on this data finds only "tang". Using the normal matching function on this data finds only "tang".
The The
longest matching string is always given first (and numbered zero). longest matching string is always given first (and numbered zero).
Af- Af-
ter a PCRE2_ERROR_PARTIAL return, the output is "Partial match:", f ter a PCRE2_ERROR_PARTIAL return, the output is "Partial match:", f
ol- ol-
lowed by the partially matching substring. Note that this is the ent ire lowed by the partially matching substring. Note that this is the ent ire
substring that was inspected during the partial match; it may incl ude substring that was inspected during the partial match; it may incl ude
characters before the actual match start if a lookbehind assertion, \b, characters before the actual match start if a lookbehind assertion, \b,
or \B was involved. (\K is not supported for DFA matching.) or \B was involved. (\K is not supported for DFA matching.)
If global matching is requested, the search for further matches resu mes If global matching is requested, the search for further matches resu mes
at the end of the longest match. For example: at the end of the longest match. For example:
re> /(tang|tangerine|tan)/g re> /(tang|tangerine|tan)/g
data> yellow tangerine and tangy sultana\=dfa data> yellow tangerine and tangy sultana\=dfa
0: tangerine 0: tangerine
1: tang 1: tang
2: tan 2: tan
0: tang 0: tang
1: tan 1: tan
0: tan 0: tan
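The listing behaviour described above can be imitated for a pattern that is a plain alternation of literal strings. The sketch below does not call pcre2_dfa_match(); it is a simplified model in Python, with the hypothetical helper dfa_like_matches, that reports every alternative matching at the first possible position, longest first, and resumes after the longest match when a global search is requested.

```python
def dfa_like_matches(alternatives, subject, global_search=False):
    """Model of the DFA-style listing for literal alternatives only."""
    results = []
    position = 0
    while position <= len(subject):
        found = [alt for alt in alternatives
                 if subject.startswith(alt, position)]
        if not found:
            position += 1
            continue
        # The longest match is listed first (it is the one numbered zero).
        found.sort(key=len, reverse=True)
        results.append(found)
        if not global_search:
            break
        # Resume at the end of the longest match; step at least one
        # character so that an empty alternative cannot loop forever.
        position += max(len(found[0]), 1)
    return results


words = ["tang", "tangerine", "tan"]
print(dfa_like_matches(words, "yellow tangerine"))
print(dfa_like_matches(words, "yellow tangerine and tangy sultana", True))
```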
The alternative matching function does not support substring captu The alternative matching function does not support substring captu
re, re,
so the modifiers that are concerned with captured substrings are so the modifiers that are concerned with captured substrings are
not not
relevant. relevant.
RESTARTING AFTER A PARTIAL MATCH RESTARTING AFTER A PARTIAL MATCH
When the alternative matching function has given the PCRE2_ERROR_P AR- When the alternative matching function has given the PCRE2_ERROR_P AR-
TIAL return, indicating that the subject partially matched the patte rn, TIAL return, indicating that the subject partially matched the patte rn,
you can restart the match with additional subject data by means of the you can restart the match with additional subject data by means of the
dfa_restart modifier. For example: dfa_restart modifier. For example:
re> /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d $/ re> /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d $/
data> 23ja\=ps,dfa data> 23ja\=ps,dfa
Partial match: 23ja Partial match: 23ja
data> n05\=dfa,dfa_restart data> n05\=dfa,dfa_restart
0: n05 0: n05
For further information about partial matching, see the pcre2part ial For further information about partial matching, see the pcre2part ial
documentation. documentation.
CALLOUTS CALLOUTS
If the pattern contains any callout requests, pcre2test's callout fu nc- If the pattern contains any callout requests, pcre2test's callout fu nc-
tion is called during matching unless callout_none is specified. T his tion is called during matching unless callout_none is specified. T his
works with both matching functions, and with JIT, though there are s ome works with both matching functions, and with JIT, though there are s ome
differences in behaviour. The output for callouts with numerical ar gu- differences in behaviour. The output for callouts with numerical ar gu-
ments and those with string arguments is slightly different. ments and those with string arguments is slightly different.
Callouts with numerical arguments Callouts with numerical arguments
By default, the callout function displays the callout number, the st art By default, the callout function displays the callout number, the st art
and current positions in the subject text at the callout time, and the and current positions in the subject text at the callout time, and the
next pattern item to be tested. For example: next pattern item to be tested. For example:
--->pqrabcdef --->pqrabcdef
0 ^ ^ \d 0 ^ ^ \d
This output indicates that callout number 0 occurred for a match This output indicates that callout number 0 occurred for a match
at- at-
tempt starting at the fourth character of the subject string, when tempt starting at the fourth character of the subject string, when
the the
pointer was at the seventh character, and when the next pattern i pointer was at the seventh character, and when the next pattern i
tem tem
was \d. Just one circumflex is output if the start and current po was \d. Just one circumflex is output if the start and current po
si- si-
tions are the same, or if the current position precedes the start po si- tions are the same, or if the current position precedes the start po si-
tion, which can happen if the callout is in a lookbehind assertion. tion, which can happen if the callout is in a lookbehind assertion.
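The placement of the circumflexes can be sketched as below. The helper position_indicator is hypothetical; it builds only the marker part of the line (the callout number column and the next pattern item are left out), and it assumes that a current position preceding the start is drawn at the start position.

```python
def position_indicator(subject, start, current):
    """Return the marker line that sits under '--->' + subject."""
    # Assumption: a current position before the start is clamped to the
    # start, which yields the single circumflex described above.
    current = max(current, start)
    marks = [" "] * (len(subject) + 1)
    marks[start] = "^"
    marks[current] = "^"
    # Four leading spaces align the markers with the "--->" prefix.
    return " " * 4 + "".join(marks).rstrip()


subject = "pqrabcdef"
print("--->" + subject)
print(position_indicator(subject, 3, 6))
```

Offsets here are zero-based, so the fourth and seventh characters of the subject are positions 3 and 6.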
Callouts numbered 255 are assumed to be automatic callouts, inserted as Callouts numbered 255 are assumed to be automatic callouts, inserted as
a result of the auto_callout pattern modifier. In this case, instead of a result of the auto_callout pattern modifier. In this case, instead of
showing the callout number, the offset in the pattern, preceded by a showing the callout number, the offset in the pattern, preceded b y a
plus, is output. For example: plus, is output. For example:
re> /\d?[A-E]\*/auto_callout re> /\d?[A-E]\*/auto_callout
data> E* data> E*
--->E* --->E*
+0 ^ \d? +0 ^ \d?
+3 ^ [A-E] +3 ^ [A-E]
+8 ^^ \* +8 ^^ \*
+10 ^ ^ +10 ^ ^
0: E* 0: E*
skipping to change at line 1763 skipping to change at line 1772
data> abc data> abc
--->abc --->abc
+0 ^ a +0 ^ a
+1 ^^ (*MARK:X) +1 ^^ (*MARK:X)
+10 ^^ b +10 ^^ b
Latest Mark: X Latest Mark: X
+11 ^ ^ c +11 ^ ^ c
+12 ^ ^ +12 ^ ^
0: abc 0: abc
The mark changes between matching "a" and "b", but stays the same The mark changes between matching "a" and "b", but stays the same
for for
the rest of the match, so nothing more is output. If, as a result the rest of the match, so nothing more is output. If, as a result
of of
backtracking, the mark reverts to being unset, the text "<unset>" backtracking, the mark reverts to being unset, the text "<unset>"
is is
output. output.
Callouts with string arguments Callouts with string arguments
The output for a callout with a string argument is similar, except t hat The output for a callout with a string argument is similar, except t hat
instead of outputting a callout number before the position indicato instead of outputting a callout number before the position indicato
rs, rs,
the callout string and its offset in the pattern string are output the callout string and its offset in the pattern string are output
be- be-
fore the reflection of the subject string, and the subject string fore the reflection of the subject string, and the subject string
is is
reflected for each callout. For example: reflected for each callout. For example:
re> /^ab(?C'first')cd(?C"second")ef/ re> /^ab(?C'first')cd(?C"second")ef/
data> abcdefg data> abcdefg
Callout (7): 'first' Callout (7): 'first'
--->abcdefg --->abcdefg
^ ^ c ^ ^ c
Callout (20): "second" Callout (20): "second"
--->abcdefg --->abcdefg
^ ^ e ^ ^ e
0: abcdef 0: abcdef
Callout modifiers Callout modifiers
The callout function in pcre2test returns zero (carry on matching) The callout function in pcre2test returns zero (carry on matching)
by by
default, but you can use a callout_fail modifier in a subject line default, but you can use a callout_fail modifier in a subject line
to to
change this and other parameters of the callout (see below). change this and other parameters of the callout (see below).
If the callout_capture modifier is set, the current captured groups are If the callout_capture modifier is set, the current captured groups are
output when a callout occurs. This is useful only for non-DFA matchi ng, output when a callout occurs. This is useful only for non-DFA matchi ng,
as pcre2_dfa_match() does not support capturing, so no captures are as pcre2_dfa_match() does not support capturing, so no captures are
ever shown. ever shown.
The normal callout output, showing the callout number or pattern off set The normal callout output, showing the callout number or pattern off set
(as described above) is suppressed if the callout_no_where modifier is (as described above) is suppressed if the callout_no_where modifier is
set. set.
When using the interpretive matching function pcre2_match() with When using the interpretive matching function pcre2_match() with
out out
JIT, setting the callout_extra modifier causes additional output f JIT, setting the callout_extra modifier causes additional output f
rom rom
pcre2test's callout function to be generated. For the first callout pcre2test's callout function to be generated. For the first callout
in in
a match attempt at a new starting position in the subject, "New ma a match attempt at a new starting position in the subject, "New ma
tch tch
attempt" is output. If there has been a backtrack since the last ca attempt" is output. If there has been a backtrack since the last ca
ll- ll-
out (or start of matching if this is the first callout), "Backtrack" is out (or start of matching if this is the first callout), "Backtrack" is
output, followed by "No other matching paths" if the backtrack en ded output, followed by "No other matching paths" if the backtrack en ded
the previous match attempt. For example: the previous match attempt. For example:
re> /(a+)b/auto_callout,no_start_optimize,no_auto_possess re> /(a+)b/auto_callout,no_start_optimize,no_auto_possess
data> aac\=callout_extra data> aac\=callout_extra
New match attempt New match attempt
--->aac --->aac
+0 ^ ( +0 ^ (
+1 ^ a+ +1 ^ a+
+3 ^ ^ ) +3 ^ ^ )
+4 ^ ^ b +4 ^ ^ b
skipping to change at line 1844 skipping to change at line 1853
+0 ^ ( +0 ^ (
+1 ^ a+ +1 ^ a+
Backtrack Backtrack
No other matching paths No other matching paths
New match attempt New match attempt
--->aac --->aac
+0 ^ ( +0 ^ (
+1 ^ a+ +1 ^ a+
No match No match
Notice that various optimizations must be turned off if you want Notice that various optimizations must be turned off if you want
all all
possible matching paths to be scanned. If no_start_optimize is possible matching paths to be scanned. If no_start_optimize is
not not
used, there is an immediate "no match", without any callouts, beca used, there is an immediate "no match", without any callouts, beca
use use
the starting optimization fails to find "b" in the subject, which the starting optimization fails to find "b" in the subject, which
it it
knows must be present for any match. If no_auto_possess is not us knows must be present for any match. If no_auto_possess is not us
ed, ed,
the "a+" item is turned into "a++", which reduces the number of ba the "a+" item is turned into "a++", which reduces the number of ba
ck- ck-
tracks. tracks.
The callout_extra modifier has no effect if used with the DFA match ing The callout_extra modifier has no effect if used with the DFA match ing
function, or with JIT. function, or with JIT.
Return values from callouts Return values from callouts
The default return from the callout function is zero, which all ows The default return from the callout function is zero, which all ows
matching to continue. The callout_fail modifier can be given one or two matching to continue. The callout_fail modifier can be given one or two
numbers. If there is only one number, 1 is returned instead of 0 (ca us- numbers. If there is only one number, 1 is returned instead of 0 (ca us-
ing matching to backtrack) when a callout of that number is reached. If ing matching to backtrack) when a callout of that number is reached. If
two numbers (<n>:<m>) are given, 1 is returned when callout <n> two numbers (<n>:<m>) are given, 1 is returned when callout <n>
is is
reached and there have been at least <m> callouts. The callout_er reached and there have been at least <m> callouts. The callout_er
ror ror
modifier is similar, except that PCRE2_ERROR_CALLOUT is returned, ca us- modifier is similar, except that PCRE2_ERROR_CALLOUT is returned, ca us-
ing the entire matching process to be aborted. If both these modifi ing the entire matching process to be aborted. If both these modifi
ers ers
are set for the same callout number, callout_error takes preceden are set for the same callout number, callout_error takes preceden
ce. ce.
Note that callouts with string arguments are always given the num Note that callouts with string arguments are always given the num
ber ber
zero. zero.
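The decision described in this paragraph can be modelled as a small function. This is a sketch rather than pcre2test's actual code: callout_return and its parameters are hypothetical, PCRE2_ERROR_CALLOUT is represented by a symbolic string instead of its numeric value, and the callout count is assumed to include the callout being processed.

```python
CONTINUE = 0                    # carry on matching
BACKTRACK = 1                   # force a backtrack
ABORT = "PCRE2_ERROR_CALLOUT"   # symbolic stand-in for the error code


def callout_return(number, count, fail=None, error=None):
    """Pick the callout return value.

    fail and error are (n, m) pairs modelling callout_fail=<n>:<m> and
    callout_error=<n>:<m>; use m = 0 when only one number is given.
    """
    def triggered(setting):
        if setting is None:
            return False
        n, m = setting
        return number == n and count >= m

    # callout_error takes precedence when both apply to the same callout.
    if triggered(error):
        return ABORT
    if triggered(fail):
        return BACKTRACK
    return CONTINUE


print(callout_return(number=2, count=1, fail=(2, 0)))
print(callout_return(number=2, count=1, fail=(2, 3)))
print(callout_return(number=2, count=5, fail=(2, 3), error=(2, 0)))
```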
The callout_data modifier can be given an unsigned or a negative n The callout_data modifier can be given an unsigned or a negative n
um- um-
ber. This is set as the "user data" that is passed to the match ber. This is set as the "user data" that is passed to the match
ing ing
function, and passed back when the callout function is invoked. function, and passed back when the callout function is invoked.
Any Any
value other than zero is used as a return from pcre2test's call value other than zero is used as a return from pcre2test's call
out out
function. function.
Inserting callouts can be helpful when using pcre2test to check comp li- Inserting callouts can be helpful when using pcre2test to check comp li-
cated regular expressions. For further information about callouts, see cated regular expressions. For further information about callouts, see
the pcre2callout documentation. the pcre2callout documentation.
NON-PRINTING CHARACTERS NON-PRINTING CHARACTERS
When pcre2test is outputting text in the compiled version of a patte rn, When pcre2test is outputting text in the compiled version of a patte rn,
bytes other than 32-126 are always treated as non-printing charact ers bytes other than 32-126 are always treated as non-printing charact ers
and are therefore shown as hex escapes. and are therefore shown as hex escapes.
When pcre2test is outputting text that is a matched part of a subj When pcre2test is outputting text that is a matched part of a subj
ect ect
string, it behaves in the same way, unless a different locale has b string, it behaves in the same way, unless a different locale has b
een een
set for the pattern (using the locale modifier). In this case, the set for the pattern (using the locale modifier). In this case, the
is- is-
print() function is used to distinguish printing and non-printing ch ar- print() function is used to distinguish printing and non-printing ch ar-
acters. acters.
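The default rule (no locale set) combined with the escape forms described earlier can be sketched as follows. The helper show_char is hypothetical, and the two-digit minimum width and lower-case hex digits are assumptions about the exact formatting.

```python
def show_char(code_point, utf=False):
    """Return the text shown for one character value."""
    # Values 32-126 are treated as printing characters.
    if 32 <= code_point <= 126:
        return chr(code_point)
    # Below 256 and not in UTF mode: \xhh form.
    if code_point < 256 and not utf:
        return f"\\x{code_point:02x}"
    # Otherwise: \x{hh...} form.
    return f"\\x{{{code_point:02x}}}"


print(show_char(65))
print(show_char(10))
print(show_char(10, utf=True))
print(show_char(0x100))
```

When a locale is set for the pattern, the 32-126 test would be replaced by a call equivalent to isprint(), which this sketch does not model.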
SAVING AND RESTORING COMPILED PATTERNS SAVING AND RESTORING COMPILED PATTERNS
It is possible to save compiled patterns on disc or elsewhere, and It is possible to save compiled patterns on disc or elsewhere, and
re- re-
load them later, subject to a number of restrictions. JIT data can load them later, subject to a number of restrictions. JIT data can
not not
be saved. The host on which the patterns are reloaded must be runn be saved. The host on which the patterns are reloaded must be runn
ing ing
the same version of PCRE2, with the same code unit width, and must a lso the same version of PCRE2, with the same code unit width, and must a lso
have the same endianness, pointer width and PCRE2_SIZE type. Bef have the same endianness, pointer width and PCRE2_SIZE type. Bef
ore ore
compiled patterns can be saved they must be serialized, that is, c compiled patterns can be saved they must be serialized, that is, c
on- on-
verted to a stream of bytes. A single byte stream may contain any n verted to a stream of bytes. A single byte stream may contain any n
um- um-
ber of compiled patterns, but they must all use the same character ber of compiled patterns, but they must all use the same character
ta- ta-
bles. A single copy of the tables is included in the byte stream ( bles. A single copy of the tables is included in the byte stream (
its its
size is 1088 bytes). size is 1088 bytes).
The functions whose names begin with pcre2_serialize_ are used for The functions whose names begin with pcre2_serialize_ are used for
se- se-
rializing and de-serializing. They are described in the pcre2serial rializing and de-serializing. They are described in the pcre2serial
ize ize
documentation. In this section we describe the features of pcre2t documentation. In this section we describe the features of pcre2t
est est
that can be used to test these functions. that can be used to test these functions.
Note that "serialization" in PCRE2 does not convert compiled patte Note that "serialization" in PCRE2 does not convert compiled patte
rns rns
to an abstract format like Java or .NET. It just makes a reloada to an abstract format like Java or .NET. It just makes a reloada
ble ble
byte code stream. Hence the restrictions on reloading mentioned abo ve. byte code stream. Hence the restrictions on reloading mentioned abo ve.
In pcre2test, when a pattern with push modifier is successfully c In pcre2test, when a pattern with push modifier is successfully c
om- om-
piled, it is pushed onto a stack of compiled patterns, and pcre2t piled, it is pushed onto a stack of compiled patterns, and pcre2t
est est
expects the next line to contain a new pattern (or command) instead expects the next line to contain a new pattern (or command) instead
of of
a subject line. By contrast, the pushcopy modifier causes a copy of the a subject line. By contrast, the pushcopy modifier causes a copy of the
compiled pattern to be stacked, leaving the original available for compiled pattern to be stacked, leaving the original available for
im- im-
mediate matching. By using push and/or pushcopy, a number of patte mediate matching. By using push and/or pushcopy, a number of patte
rns rns
can be compiled and retained. These modifiers are incompatible w can be compiled and retained. These modifiers are incompatible w
ith ith
posix, and control modifiers that act at match time are ignored (wit h a posix, and control modifiers that act at match time are ignored (wit h a
message) for the stacked patterns. The jitverify modifier applies o nly message) for the stacked patterns. The jitverify modifier applies o nly
at compile time. at compile time.
The command The command
#save <filename> #save <filename>
causes all the stacked patterns to be serialized and the result writ ten causes all the stacked patterns to be serialized and the result writ ten
to the named file. Afterwards, all the stacked patterns are freed. The to the named file. Afterwards, all the stacked patterns are freed. The
command command
#load <filename> #load <filename>
reads the data in the file, and then arranges for it to be de-seri reads the data in the file, and then arranges for it to be de-seri
al- al-
ized, with the resulting compiled patterns added to the pattern sta ized, with the resulting compiled patterns added to the pattern sta
ck. ck.
The pattern on the top of the stack can be retrieved by the #pop c The pattern on the top of the stack can be retrieved by the #pop c
om- om-
mand, which must be followed by lines of subjects that are to mand, which must be followed by lines of subjects that are to
be be
matched with the pattern, terminated as usual by an empty line or matched with the pattern, terminated as usual by an empty line or
end end
of file. This command may be followed by a modifier list contain of file. This command may be followed by a modifier list contain
ing ing
only control modifiers that act after a pattern has been compiled. only control modifiers that act after a pattern has been compiled.
In In
particular, hex, posix, posix_nosub, push, and pushcopy are not particular, hex, posix, posix_nosub, push, and pushcopy are not
al- al-
lowed, nor are any option-setting modifiers. The JIT modifiers a lowed, nor are any option-setting modifiers. The JIT modifiers a
re, re,
however, permitted. Here is an example that saves and reloads two p however, permitted. Here is an example that saves and reloads two p
at- at-
terns. terns.
/abc/push /abc/push
/xyz/push /xyz/push
#save tempfile #save tempfile
#load tempfile #load tempfile
#pop info #pop info
xyz xyz
#pop jit,bincode #pop jit,bincode
abc abc
If jitverify is used with #pop, it does not automatically imply j it, If jitverify is used with #pop, it does not automatically imply j it,
which is different behaviour from when it is used on a pattern. which is different behaviour from when it is used on a pattern.
The #popcopy command is analogous to the pushcopy modifier in that it The #popcopy command is analogous to the pushcopy modifier in that it
makes current a copy of the topmost stack pattern, leaving the origi nal makes current a copy of the topmost stack pattern, leaving the origi nal
still on the stack. still on the stack.
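The interplay of push, pushcopy, #save, #load, #pop, and #popcopy can be summarised with a toy model. The class below is purely illustrative: PatternStack and its methods are hypothetical names, compiled patterns are represented by plain strings, and an in-memory dictionary stands in for files on disc. It does not call any pcre2_serialize_ function.

```python
import copy


class PatternStack:
    """Toy model of pcre2test's stack of compiled patterns."""

    def __init__(self):
        self.stack = []
        self.files = {}  # hypothetical stand-in for saved files

    def push(self, pattern):
        # push: the pattern goes on the stack and is not matched now.
        self.stack.append(pattern)

    def pushcopy(self, pattern):
        # pushcopy: a copy is stacked; the original stays usable.
        self.stack.append(copy.deepcopy(pattern))
        return pattern

    def save(self, filename):
        # #save: serialize every stacked pattern, then free the stack.
        self.files[filename] = list(self.stack)
        self.stack.clear()

    def load(self, filename):
        # #load: de-serialize and add the patterns to the stack.
        self.stack.extend(copy.deepcopy(self.files[filename]))

    def pop(self):
        # #pop: retrieve and remove the topmost pattern.
        return self.stack.pop()

    def popcopy(self):
        # #popcopy: use a copy of the topmost pattern, leaving it stacked.
        return copy.deepcopy(self.stack[-1])


patterns = PatternStack()
patterns.push("abc")
patterns.push("xyz")
patterns.save("tempfile")
patterns.load("tempfile")
print(patterns.pop())
print(patterns.pop())
```

As in the example above, the pattern pushed last ("xyz") is the first one retrieved after the reload.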
SEE ALSO SEE ALSO
pcre2(3), pcre2api(3), pcre2callout(3), pcre2jit, pcre2matching( 3), pcre2(3), pcre2api(3), pcre2callout(3), pcre2jit, pcre2matching( 3),
pcre2partial(3), pcre2pattern(3), pcre2serialize(3). pcre2partial(3), pcre2pattern(3), pcre2serialize(3).
AUTHOR AUTHOR
Philip Hazel Philip Hazel
Retired from University Computing Service Retired from University Computing Service
Cambridge, England. Cambridge, England.
REVISION REVISION
Last updated: 27 January 2024 Last updated: 24 April 2024
Copyright (c) 1997-2024 University of Cambridge. Copyright (c) 1997-2024 University of Cambridge.
PCRE 10.43 27 January 2024 PCRE2TEST (1) PCRE 10.44 24 April 2024 PCRE2TEST (1)
 End of changes. 145 change blocks. 
610 lines changed or deleted 623 lines changed or added
