pcre2test.1   pcre2test.1 
skipping to change at line 570 skipping to change at line 570
convert_length set convert buffer length convert_length set convert buffer length
debug same as info,fullbincode debug same as info,fullbincode
framesize show matching frame size framesize show matching frame size
fullbincode show binary code with lengths fullbincode show binary code with lengths
/I info show info about compiled pattern /I info show info about compiled pattern
hex unquoted characters are hexadecimal hex unquoted characters are hexadecimal
jit[=<number>] use JIT jit[=<number>] use JIT
jitfast use JIT fast path jitfast use JIT fast path
jitverify verify JIT use jitverify verify JIT use
locale=<name> use this locale locale=<name> use this locale
max_pattern_length=<n> set maximum pattern length max_pattern_compiled ) set maximum compiled pattern
_length=<n> ) length (bytes)
max_pattern_length=<n> set maximum pattern length (code units)
max_varlookbehind=<n> set maximum variable lookbehind length max_varlookbehind=<n> set maximum variable lookbehind length
memory show memory used memory show memory used
newline=<type> set newline type newline=<type> set newline type
null_context compile with a NULL context null_context compile with a NULL context
null_pattern pass pattern as NULL null_pattern pass pattern as NULL
parens_nest_limit=<n> set maximum parentheses depth parens_nest_limit=<n> set maximum parentheses depth
posix use the POSIX API posix use the POSIX API
posix_nosub use the POSIX API with REG_NOSUB posix_nosub use the POSIX API with REG_NOSUB
push push compiled pattern onto the stack push push compiled pattern onto the stack
pushcopy push a copy onto the stack pushcopy push a copy onto the stack
skipping to change at line 826 skipping to change at line 828
brary is set when PCRE2 is built, but pcre2test sets its own default of 220, which brary is set when PCRE2 is built, but pcre2test sets its own default of 220, which
is required for running the standard test suite. is required for running the standard test suite.
Limiting the pattern length Limiting the pattern length
The max_pattern_length modifier sets a limit, in code units, to the length of pat‐ The max_pattern_length modifier sets a limit, in code units, to the length of pat‐
tern that pcre2_compile() will accept. Breaching the limit causes a compilation er‐ tern that pcre2_compile() will accept. Breaching the limit causes a compilation er‐
ror. The default is the largest number a PCRE2_SIZE variable can hold (essentially ror. The default is the largest number a PCRE2_SIZE variable can hold (essentially
unlimited). unlimited).
Limiting the size of a compiled pattern
The max_pattern_compiled_length modifier sets a limit, in bytes, to the amount of
memory used by a compiled pattern. Breaching the limit causes a compilation error.
The default is the largest number a PCRE2_SIZE variable can hold (essentially un‐
limited).
Using the POSIX wrapper API Using the POSIX wrapper API
The posix and posix_nosub modifiers cause pcre2test to call PCRE2 via the POSIX The posix and posix_nosub modifiers cause pcre2test to call PCRE2 via the POSIX
wrapper API rather than its native API. When posix_nosub is used, the POSIX option wrapper API rather than its native API. When posix_nosub is used, the POSIX option
REG_NOSUB is passed to regcomp(). The POSIX wrapper supports only the 8-bit li‐ REG_NOSUB is passed to regcomp(). The POSIX wrapper supports only the 8-bit li‐
brary. Note that it does not imply POSIX matching semantics; for more detail see brary. Note that it does not imply POSIX matching semantics; for more detail see
the pcre2posix documentation. The following pattern modifiers set options for the the pcre2posix documentation. The following pattern modifiers set options for the
regcomp() function: regcomp() function:
caseless REG_ICASE caseless REG_ICASE
multiline REG_NEWLINE multiline REG_NEWLINE
dotall REG_DOTALL ) dotall REG_DOTALL )
ungreedy REG_UNGREEDY ) These options are not part of ungreedy REG_UNGREEDY ) These options are not part of
ucp REG_UCP ) the POSIX standard ucp REG_UCP ) the POSIX standard
utf REG_UTF8 ) utf REG_UTF8 )
The regerror_buffsize modifier specifies a size for the error buffer that is passed The regerror_buffsize modifier specifies a size for the error buffer that is passed
to regerror() in the event of a compilation error. For example: to regerror() in the event of a compilation error. For example:
/abc/posix,regerror_buffsize=20 /abc/posix,regerror_buffsize=20
This provides a means of testing the behaviour of regerror() when the buffer is too This provides a means of testing the behaviour of regerror() when the buffer is too
small for the error message. If this modifier has not been set, a large buffer is small for the error message. If this modifier has not been set, a large buffer is
used. used.
The aftertext and allaftertext subject modifiers work as described below. All other The aftertext and allaftertext subject modifiers work as described below. All other
modifiers are either ignored, with a warning message, or cause an error. modifiers are either ignored, with a warning message, or cause an error.
The pattern is passed to regcomp() as a zero-terminated string by default, but if The pattern is passed to regcomp() as a zero-terminated string by default, but if
the use_length or hex modifiers are set, the REG_PEND extension is used to pass it the use_length or hex modifiers are set, the REG_PEND extension is used to pass it
by length. by length.
Testing the stack guard feature Testing the stack guard feature
The stackguard modifier is used to test the use of pcre2_set_compile_recur‐ The stackguard modifier is used to test the use of pcre2_set_compile_recur‐
sion_guard(), a function that is provided to enable stack availability to be sion_guard(), a function that is provided to enable stack availability to be
checked during compilation (see the pcre2api documentation for details). If the checked during compilation (see the pcre2api documentation for details). If the
number specified by the modifier is greater than zero, pcre2_set_compile_recur‐ number specified by the modifier is greater than zero, pcre2_set_compile_recur‐
sion_guard() is called to set up a callback from pcre2_compile() to a local function. sion_guard() is called to set up a callback from pcre2_compile() to a local function.
The argument it receives is the current nesting parenthesis depth; if this is The argument it receives is the current nesting parenthesis depth; if this is
greater than the value given by the modifier, non-zero is returned, causing the greater than the value given by the modifier, non-zero is returned, causing the
compilation to be aborted. compilation to be aborted.
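The guard behaviour described above can be sketched in Python as a rough illustration. This is not PCRE2 code: make_guard and compile_depth_check are hypothetical names, and the toy "compiler" only counts parentheses (it ignores escapes and character classes) so that the callback has something to be called from.

```python
def make_guard(limit):
    """Return a callback that reports non-zero once depth exceeds limit."""
    def guard(depth):
        return 1 if depth > limit else 0
    return guard


def compile_depth_check(pattern, guard=None):
    """Toy stand-in for a compiler: walk the pattern, track nesting depth,
    and consult the guard each time a group is opened.

    Returns True if the walk completes, False if the guard aborts it.
    """
    depth = 0
    for ch in pattern:
        if ch == "(":
            depth += 1
            if guard is not None and guard(depth) != 0:
                return False  # guard returned non-zero: abort
        elif ch == ")":
            depth -= 1
    return True
```

With a limit of 2, a pattern nested two deep passes and one nested three deep is rejected. Passing no guard corresponds to the case where the modifier value is not greater than zero and no callback is installed.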
Using alternative character tables Using alternative character tables
The value specified for the tables modifier must be one of the digits 0, 1, 2, or The value specified for the tables modifier must be one of the digits 0, 1, 2, or
3. It causes a specific set of built-in character tables to be passed to pcre2_com‐ 3. It causes a specific set of built-in character tables to be passed to pcre2_com‐
pile(). This is used in the PCRE2 tests to check behaviour with different character pile(). This is used in the PCRE2 tests to check behaviour with different character
tables. The digit specifies the tables as follows: tables. The digit specifies the tables as follows:
0 do not pass any special character tables 0 do not pass any special character tables
1 the default ASCII tables, as distributed in 1 the default ASCII tables, as distributed in
pcre2_chartables.c.dist pcre2_chartables.c.dist
2 a set of tables defining ISO 8859 characters 2 a set of tables defining ISO 8859 characters
3 a set of tables loaded by the #loadtables command 3 a set of tables loaded by the #loadtables command
In tables 2, some characters whose codes are greater than 128 are identified as In tables 2, some characters whose codes are greater than 128 are identified as
letters, digits, spaces, etc. Tables 3 can be used only after a #loadtables command letters, digits, spaces, etc. Tables 3 can be used only after a #loadtables command
has loaded them from a binary file. Setting alternate character tables and a locale has loaded them from a binary file. Setting alternate character tables and a locale
are mutually exclusive. are mutually exclusive.
Setting certain match controls Setting certain match controls
The following modifiers are really subject modifiers, and are described under "Sub‐ The following modifiers are really subject modifiers, and are described under "Sub‐
ject Modifiers" below. However, they may be included in a pattern's modifier list, ject Modifiers" below. However, they may be included in a pattern's modifier list,
in which case they are applied to every subject line that is processed with that in which case they are applied to every subject line that is processed with that
pattern. These modifiers do not affect the compilation process. pattern. These modifiers do not affect the compilation process.
aftertext show text after match aftertext show text after match
allaftertext show text after captures allaftertext show text after captures
allcaptures show all captures allcaptures show all captures
allvector show the entire ovector allvector show the entire ovector
allusedtext show all consulted text allusedtext show all consulted text
altglobal alternative global matching altglobal alternative global matching
/g global global matching /g global global matching
heapframes_size show match data heapframes size heapframes_size show match data heapframes size
skipping to change at line 923 skipping to change at line 932
substitute_stop=<n> skip substitution <n> and following substitute_stop=<n> skip substitution <n> and following
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
These modifiers may not appear in a #pattern command. If you want them as defaults, These modifiers may not appear in a #pattern command. If you want them as defaults,
set them in a #subject command. set them in a #subject command.
Specifying literal subject lines Specifying literal subject lines
If the subject_literal modifier is present on a pattern, all the subject lines that If the subject_literal modifier is present on a pattern, all the subject lines that
it matches are taken as literal strings, with no interpretation of backslashes. It it matches are taken as literal strings, with no interpretation of backslashes. It
is not possible to set subject modifiers on such lines, but any that are set as de‐ is not possible to set subject modifiers on such lines, but any that are set as de‐
faults by a #subject command are recognized. faults by a #subject command are recognized.
Saving a compiled pattern Saving a compiled pattern
When a pattern with the push modifier is successfully compiled, it is pushed onto a When a pattern with the push modifier is successfully compiled, it is pushed onto a
stack of compiled patterns, and pcre2test expects the next line to contain a new stack of compiled patterns, and pcre2test expects the next line to contain a new
pattern (or a command) instead of a subject line. This facility is used when saving pattern (or a command) instead of a subject line. This facility is used when saving
compiled patterns to a file, as described in the section entitled "Saving and compiled patterns to a file, as described in the section entitled "Saving and
restoring compiled patterns" below. If pushcopy is used instead of push, a copy of restoring compiled patterns" below. If pushcopy is used instead of push, a copy of
the compiled pattern is stacked, leaving the original as current, ready to match the compiled pattern is stacked, leaving the original as current, ready to match
the following input lines. This provides a way of testing the pcre2_code_copy() the following input lines. This provides a way of testing the pcre2_code_copy()
function. The push and pushcopy modifiers are incompatible with compilation modi‐ function. The push and pushcopy modifiers are incompatible with compilation modi‐
fiers such as global that act at match time. Any that are specified are ignored fiers such as global that act at match time. Any that are specified are ignored
(for the stacked copy), with a warning message, except for replace, which causes an (for the stacked copy), with a warning message, except for replace, which causes an
error. Note that jitverify, which is allowed, does not carry through to any subse‐ error. Note that jitverify, which is allowed, does not carry through to any subse‐
quent matching that uses a stacked pattern. quent matching that uses a stacked pattern.
Testing foreign pattern conversion Testing foreign pattern conversion
The experimental foreign pattern conversion functions in PCRE2 can be tested by The experimental foreign pattern conversion functions in PCRE2 can be tested by
setting the convert modifier. Its argument is a colon-separated list of options, setting the convert modifier. Its argument is a colon-separated list of options,
which set the equivalent option for the pcre2_pattern_convert() function: which set the equivalent option for the pcre2_pattern_convert() function:
glob PCRE2_CONVERT_GLOB glob PCRE2_CONVERT_GLOB
glob_no_starstar PCRE2_CONVERT_GLOB_NO_STARSTAR glob_no_starstar PCRE2_CONVERT_GLOB_NO_STARSTAR
glob_no_wild_separator PCRE2_CONVERT_GLOB_NO_WILD_SEPARATOR glob_no_wild_separator PCRE2_CONVERT_GLOB_NO_WILD_SEPARATOR
posix_basic PCRE2_CONVERT_POSIX_BASIC posix_basic PCRE2_CONVERT_POSIX_BASIC
posix_extended PCRE2_CONVERT_POSIX_EXTENDED posix_extended PCRE2_CONVERT_POSIX_EXTENDED
unset Unset all options unset Unset all options
The "unset" value is useful for turning off a default that has been set by a #pat‐ The "unset" value is useful for turning off a default that has been set by a #pat‐
tern command. When one of these options is set, the input pattern is passed to tern command. When one of these options is set, the input pattern is passed to
pcre2_pattern_convert(). If the conversion is successful, the result is reflected pcre2_pattern_convert(). If the conversion is successful, the result is reflected
in the output and then passed to pcre2_compile(). The normal utf and no_utf_check in the output and then passed to pcre2_compile(). The normal utf and no_utf_check
options, if set, cause the PCRE2_CONVERT_UTF and PCRE2_CONVERT_NO_UTF_CHECK options options, if set, cause the PCRE2_CONVERT_UTF and PCRE2_CONVERT_NO_UTF_CHECK options
to be passed to pcre2_pattern_convert(). to be passed to pcre2_pattern_convert().
By default, the conversion function is allowed to allocate a buffer for its output. By default, the conversion function is allowed to allocate a buffer for its output.
However, if the convert_length modifier is set to a value greater than zero, However, if the convert_length modifier is set to a value greater than zero,
pcre2test passes a buffer of the given length. This makes it possible to test the pcre2test passes a buffer of the given length. This makes it possible to test the
length check. length check.
The convert_glob_escape and convert_glob_separator modifiers can be used to specify The convert_glob_escape and convert_glob_separator modifiers can be used to specify
the escape and separator characters for glob processing, overriding the defaults, the escape and separator characters for glob processing, overriding the defaults,
which are operating-system dependent. which are operating-system dependent.
SUBJECT MODIFIERS SUBJECT MODIFIERS
The modifiers that can appear in subject lines and the #subject command are of two The modifiers that can appear in subject lines and the #subject command are of two
types. types.
Setting match options Setting match options
The following modifiers set options for pcre2_match() or pcre2_dfa_match(). See The following modifiers set options for pcre2_match() or pcre2_dfa_match(). See
pcre2api for a description of their effects. pcre2api for a description of their effects.
anchored set PCRE2_ANCHORED anchored set PCRE2_ANCHORED
endanchored set PCRE2_ENDANCHORED endanchored set PCRE2_ENDANCHORED
dfa_restart set PCRE2_DFA_RESTART dfa_restart set PCRE2_DFA_RESTART
dfa_shortest set PCRE2_DFA_SHORTEST dfa_shortest set PCRE2_DFA_SHORTEST
disable_recurseloop_check set PCRE2_DISABLE_RECURSELOOP_CHECK disable_recurseloop_check set PCRE2_DISABLE_RECURSELOOP_CHECK
no_jit set PCRE2_NO_JIT no_jit set PCRE2_NO_JIT
no_utf_check set PCRE2_NO_UTF_CHECK no_utf_check set PCRE2_NO_UTF_CHECK
notbol set PCRE2_NOTBOL notbol set PCRE2_NOTBOL
notempty set PCRE2_NOTEMPTY notempty set PCRE2_NOTEMPTY
notempty_atstart set PCRE2_NOTEMPTY_ATSTART notempty_atstart set PCRE2_NOTEMPTY_ATSTART
noteol set PCRE2_NOTEOL noteol set PCRE2_NOTEOL
partial_hard (or ph) set PCRE2_PARTIAL_HARD partial_hard (or ph) set PCRE2_PARTIAL_HARD
partial_soft (or ps) set PCRE2_PARTIAL_SOFT partial_soft (or ps) set PCRE2_PARTIAL_SOFT
The partial matching modifiers are provided with abbreviations because they appear The partial matching modifiers are provided with abbreviations because they appear
frequently in tests. frequently in tests.
If the posix or posix_nosub modifier was present on the pattern, causing the POSIX If the posix or posix_nosub modifier was present on the pattern, causing the POSIX
wrapper API to be used, the only option-setting modifiers that have any effect are wrapper API to be used, the only option-setting modifiers that have any effect are
notbol, notempty, and noteol, causing REG_NOTBOL, REG_NOTEMPTY, and REG_NOTEOL, re‐ notbol, notempty, and noteol, causing REG_NOTBOL, REG_NOTEMPTY, and REG_NOTEOL, re‐
spectively, to be passed to regexec(). The other modifiers are ignored, with a spectively, to be passed to regexec(). The other modifiers are ignored, with a
warning message. warning message.
There is one additional modifier that can be used with the POSIX wrapper. It is ig‐ There is one additional modifier that can be used with the POSIX wrapper. It is ig‐
nored (with a warning) if used for non-POSIX matching. nored (with a warning) if used for non-POSIX matching.
posix_startend=<n>[:<m>] posix_startend=<n>[:<m>]
This causes the subject string to be passed to regexec() using the REG_STARTEND op‐ This causes the subject string to be passed to regexec() using the REG_STARTEND op‐
tion, which uses offsets to specify which part of the string is searched. If only tion, which uses offsets to specify which part of the string is searched. If only
one number is given, the end offset is passed as the end of the subject string. For one number is given, the end offset is passed as the end of the subject string. For
more detail of REG_STARTEND, see the pcre2posix documentation. If the subject more detail of REG_STARTEND, see the pcre2posix documentation. If the subject
string contains binary zeros (coded as escapes such as \x{00} because pcre2test string contains binary zeros (coded as escapes such as \x{00} because pcre2test
does not support actual binary zeros in its input), you must use posix_startend to does not support actual binary zeros in its input), you must use posix_startend to
specify its length. specify its length.
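As a rough illustration of how the <n>[:<m>] value maps onto a searched region, here is a minimal Python sketch. The function name startend_offsets is hypothetical and the code does not call the POSIX wrapper; it only shows the defaulting rule for the end offset and why an explicit length matters when the subject contains a binary zero.

```python
def startend_offsets(value, subject):
    """Parse a value of the form 'n' or 'n:m' and return (start, end).

    When only one number is given, the end offset defaults to the end
    of the subject, as the text above describes.
    """
    start_text, sep, end_text = value.partition(":")
    start = int(start_text)
    end = int(end_text) if sep else len(subject)
    return start, end


# A subject containing a binary zero: a zero-terminated view would stop
# after "ab", whereas explicit offsets cover all five bytes.
subject = b"ab\x00cd"
start, end = startend_offsets("0:5", subject)
region = subject[start:end]
```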
Setting match controls Setting match controls
The following modifiers affect the matching process or request additional informa‐ The following modifiers affect the matching process or request additional informa‐
tion. Some of them may also be specified on a pattern line (see above), in which tion. Some of them may also be specified on a pattern line (see above), in which
case they apply to every subject line that is matched against that pattern, but can case they apply to every subject line that is matched against that pattern, but can
be overridden by modifiers on the subject. be overridden by modifiers on the subject.
aftertext show text after match aftertext show text after match
allaftertext show text after captures allaftertext show text after captures
allcaptures show all captures allcaptures show all captures
allvector show the entire ovector allvector show the entire ovector
allusedtext show all consulted text (non-JIT only) allusedtext show all consulted text (non-JIT only)
altglobal alternative global matching altglobal alternative global matching
callout_capture show captures at callout time callout_capture show captures at callout time
skipping to change at line 1074 skipping to change at line 1083
substitute_matched use PCRE2_SUBSTITUTE_MATCHED substitute_matched use PCRE2_SUBSTITUTE_MATCHED
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
substitute_replacement_only use PCRE2_SUBSTITUTE_REPLACEMENT_ONLY substitute_replacement_only use PCRE2_SUBSTITUTE_REPLACEMENT_ONLY
substitute_skip=<n> skip substitution number n substitute_skip=<n> skip substitution number n
substitute_stop=<n> skip substitution number n and greater substitute_stop=<n> skip substitution number n and greater
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
zero_terminate pass the subject as zero-terminated zero_terminate pass the subject as zero-terminated
The effects of these modifiers are described in the following sections. When match‐ The effects of these modifiers are described in the following sections. When match‐
ing via the POSIX wrapper API, the aftertext, allaftertext, and ovector subject ing via the POSIX wrapper API, the aftertext, allaftertext, and ovector subject
modifiers work as described below. All other modifiers are either ignored, with a modifiers work as described below. All other modifiers are either ignored, with a
warning message, or cause an error. warning message, or cause an error.
Showing more text Showing more text
The aftertext modifier requests that as well as outputting the part of the subject The aftertext modifier requests that as well as outputting the part of the subject
string that matched the entire pattern, pcre2test should in addition output the re‐ string that matched the entire pattern, pcre2test should in addition output the re‐
mainder of the subject string. This is useful for tests where the subject contains mainder of the subject string. This is useful for tests where the subject contains
multiple copies of the same substring. The allaftertext modifier requests the same multiple copies of the same substring. The allaftertext modifier requests the same
action for captured substrings as well as the main matched substring. In each case action for captured substrings as well as the main matched substring. In each case
the remainder is output on the following line with a plus character following the the remainder is output on the following line with a plus character following the
capture number. capture number.
The allusedtext modifier requests that all the text that was consulted during a The allusedtext modifier requests that all the text that was consulted during a
successful pattern match by the interpreter should be shown, for both full and par‐ successful pattern match by the interpreter should be shown, for both full and par‐
tial matches. This feature is not supported for JIT matching, and if requested with tial matches. This feature is not supported for JIT matching, and if requested with
JIT it is ignored (with a warning message). Setting this modifier affects the out‐ JIT it is ignored (with a warning message). Setting this modifier affects the out‐
put if there is a lookbehind at the start of a match, or, for a complete match, a put if there is a lookbehind at the start of a match, or, for a complete match, a
lookahead at the end, or if \K is used in the pattern. Characters that precede or lookahead at the end, or if \K is used in the pattern. Characters that precede or
follow the start and end of the actual match are indicated in the output by '<' or follow the start and end of the actual match are indicated in the output by '<' or
'>' characters underneath them. Here is an example: '>' characters underneath them. Here is an example:
re> /(?<=pqr)abc(?=xyz)/ re> /(?<=pqr)abc(?=xyz)/
data> 123pqrabcxyz456\=allusedtext data> 123pqrabcxyz456\=allusedtext
0: pqrabcxyz 0: pqrabcxyz
<<< >>> <<< >>>
data> 123pqrabcxy\=ph,allusedtext data> 123pqrabcxy\=ph,allusedtext
Partial match: pqrabcxy Partial match: pqrabcxy
<<< <<<
The first, complete match shows that the matched string is "abc", with the preced‐ The first, complete match shows that the matched string is "abc", with the preced‐
ing and following strings "pqr" and "xyz" having been consulted during the match ing and following strings "pqr" and "xyz" having been consulted during the match
(when processing the assertions). The partial match can indicate only the preceding (when processing the assertions). The partial match can indicate only the preceding
string. string.
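The marker line in the example above can be reproduced with a small Python sketch. This is an illustration only, not pcre2test's implementation: used_text_lines is a hypothetical helper, and the four offsets are assumed to be known already (in pcre2test they come from the match itself).

```python
def used_text_lines(subject, used_start, match_start, match_end, used_end):
    """Return (text, markers) for a consulted region of the subject.

    text is everything that was consulted; markers has '<' under the
    consulted characters before the match and '>' under those after it.
    """
    text = subject[used_start:used_end]
    markers = (
        "<" * (match_start - used_start)
        + " " * (match_end - match_start)
        + ">" * (used_end - match_end)
    )
    return text, markers.rstrip()


# Complete match of "abc" with "pqr" and "xyz" consulted by the assertions.
full = used_text_lines("123pqrabcxyz456", 3, 6, 9, 12)

# Partial match: only the preceding text can be marked.
partial = used_text_lines("123pqrabcxy", 3, 6, 11, 11)
```

Here full is the pair ("pqrabcxyz", "<<<   >>>") and partial is ("pqrabcxy", "<<<"), matching the two cases shown in the example.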
The startchar modifier requests that the starting character for the match be indi‐ The startchar modifier requests that the starting character for the match be indi‐
cated, if it is different to the start of the matched string. The only time when cated, if it is different to the start of the matched string. The only time when
this occurs is when \K has been processed as part of the match. In this situation, this occurs is when \K has been processed as part of the match. In this situation,
the output for the matched string is displayed from the starting character instead the output for the matched string is displayed from the starting character instead
of from the match point, with circumflex characters under the earlier characters. of from the match point, with circumflex characters under the earlier characters.
For example: For example:
re> /abc\Kxyz/ re> /abc\Kxyz/
data> abcxyz\=startchar data> abcxyz\=startchar
0: abcxyz 0: abcxyz
^^^ ^^^
Unlike allusedtext, the startchar modifier can be used with JIT. However, these Unlike allusedtext, the startchar modifier can be used with JIT. However, these
two modifiers are mutually exclusive. two modifiers are mutually exclusive.
Showing the value of all capture groups Showing the value of all capture groups
The allcaptures modifier requests that the values of all potential captured paren‐ The allcaptures modifier requests that the values of all potential captured paren‐
theses be output after a match. By default, only those up to the highest one actu‐ theses be output after a match. By default, only those up to the highest one actu‐
ally used in the match are output (corresponding to the return code from ally used in the match are output (corresponding to the return code from
pcre2_match()). Groups that did not take part in the match are output as "<unset>". pcre2_match()). Groups that did not take part in the match are output as "<unset>".
This modifier is not relevant for DFA matching (which does no capturing) and does This modifier is not relevant for DFA matching (which does no capturing) and does
not apply when replace is specified; it is ignored, with a warning message, if not apply when replace is specified; it is ignored, with a warning message, if
present. present.
Showing the entire ovector, for all outcomes Showing the entire ovector, for all outcomes
The allvector modifier requests that the entire ovector be shown, whatever the out‐ The allvector modifier requests that the entire ovector be shown, whatever the out‐
come of the match. Compare allcaptures, which shows only up to the maximum number come of the match. Compare allcaptures, which shows only up to the maximum number
of capture groups for the pattern, and then only for a successful complete non-DFA of capture groups for the pattern, and then only for a successful complete non-DFA
match. This modifier, which acts after any match result, and also for DFA matching, match. This modifier, which acts after any match result, and also for DFA matching,
provides a means of checking that there are no unexpected modifications to ovector provides a means of checking that there are no unexpected modifications to ovector
fields. Before each match attempt, the ovector is filled with a special value, and fields. Before each match attempt, the ovector is filled with a special value, and
if this is found in both elements of a capturing pair, "<unchanged>" is output. Af‐ if this is found in both elements of a capturing pair, "<unchanged>" is output. Af‐
ter a successful match, this applies to all groups after the maximum capture group ter a successful match, this applies to all groups after the maximum capture group
for the pattern. In other cases it applies to the entire ovector. After a partial for the pattern. In other cases it applies to the entire ovector. After a partial
match, the first two elements are the only ones that should be set. After a DFA match, the first two elements are the only ones that should be set. After a DFA
match, the amount of ovector that is used depends on the number of matches that match, the amount of ovector that is used depends on the number of matches that
were found. were found.
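The fill-then-check idea behind allvector can be sketched in Python as follows. This is an assumption-laden illustration: SENTINEL is an arbitrary stand-in (the text does not say which special value pcre2test uses), and fill_ovector and unchanged_pairs are hypothetical names.

```python
# Hypothetical fill value; the real special value is not given in the text.
SENTINEL = 0x5A5A5A5A


def fill_ovector(pair_count):
    """Build an ovector of pair_count (start, end) pairs, all pre-filled."""
    return [SENTINEL] * (2 * pair_count)


def unchanged_pairs(ovector):
    """Return the indices of pairs whose two elements both still hold
    the fill value, i.e. pairs the matcher never wrote to."""
    return [
        i // 2
        for i in range(0, len(ovector), 2)
        if ovector[i] == SENTINEL and ovector[i + 1] == SENTINEL
    ]


# Simulate a match that set pair 0 (whole match) and pair 1 (group 1).
ovector = fill_ovector(4)
ovector[0:4] = [3, 6, 3, 4]
untouched = unchanged_pairs(ovector)
```

After the simulated match, untouched lists pairs 2 and 3, the ones that would be reported as "<unchanged>".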
Testing pattern callouts Testing pattern callouts
A callout function is supplied when pcre2test calls the library matching functions, A callout function is supplied when pcre2test calls the library matching functions,
unless callout_none is specified. Its behaviour can be controlled by various modi‐ unless callout_none is specified. Its behaviour can be controlled by various modi‐
fiers listed above whose names begin with callout_. Details are given in the sec‐ fiers listed above whose names begin with callout_. Details are given in the sec‐
tion entitled "Callouts" below. Testing callouts from pcre2_substitute() is de‐ tion entitled "Callouts" below. Testing callouts from pcre2_substitute() is de‐
scribed separately in "Testing the substitution function" below. scribed separately in "Testing the substitution function" below.
Finding all matches in a string Finding all matches in a string
Searching for all possible matches within a subject can be requested by the global Searching for all possible matches within a subject can be requested by the global
or altglobal modifier. After finding a match, the matching function is called again or altglobal modifier. After finding a match, the matching function is called again
to search the remainder of the subject. The difference between global and altglobal to search the remainder of the subject. The difference between global and altglobal
is that the former uses the start_offset argument to pcre2_match() or is that the former uses the start_offset argument to pcre2_match() or
pcre2_dfa_match() to start searching at a new point within the entire string (which pcre2_dfa_match() to start searching at a new point within the entire string (which
is what Perl does), whereas the latter passes over a shortened subject. This makes is what Perl does), whereas the latter passes over a shortened subject. This makes
a difference to the matching process if the pattern begins with a lookbehind asser‐ a difference to the matching process if the pattern begins with a lookbehind asser‐
tion (including \b or \B). tion (including \b or \B).
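The practical difference between the two approaches can be shown by analogy with Python's re module, whose search(string, pos) keeps the whole string visible to lookbehinds, whereas slicing the string does not. This sketch is not PCRE2 and does not use pcre2test; the function names are hypothetical, and empty matches are simply stepped over by one character rather than retried as pcre2test does.

```python
import re


def find_all_offset(pattern, subject):
    """'global' style: keep the whole subject and move the start offset."""
    regex = re.compile(pattern)
    found, pos = [], 0
    while pos <= len(subject):
        m = regex.search(subject, pos)
        if m is None:
            break
        found.append(m.group())
        # Simplification: step past an empty match by one character.
        pos = m.end() if m.end() > m.start() else m.end() + 1
    return found


def find_all_shortened(pattern, subject):
    """'altglobal' style: pass only the unmatched remainder each time."""
    regex = re.compile(pattern)
    found, rest = [], subject
    while True:
        m = regex.search(rest)
        if m is None:
            break
        found.append(m.group())
        if m.end() > m.start():
            rest = rest[m.end():]
        elif m.end() < len(rest):
            rest = rest[m.end() + 1:]
        else:
            break  # empty match at the very end: nothing left to search
    return found


by_offset = find_all_offset(r"(?<=a)b|a", "ab")
by_shortening = find_all_shortened(r"(?<=a)b|a", "ab")
```

With the pattern (?<=a)b|a and the subject "ab", the offset-based loop finds both "a" and "b", because the lookbehind can still see the "a" before the new start point. The shortened-subject loop finds only "a", because the second search sees just "b" with nothing before it.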
If an empty string is matched, the next match is done with the PCR If an empty string is matched, the next match is done with the PCR
E2_NOTEMPTY_AT‐ E2_NOTEMPTY_AT‐
START and PCRE2_ANCHORED flags set, in order to search for anot START and PCRE2_ANCHORED flags set, in order to search for anoth
her, non-empty, er, non-empty,
match at the same point in the subject. If this match fails, the st match at the same point in the subject. If this match fails, the
art offset is start offset is
advanced, and the normal match is retried. This imitates the way Pe advanced, and the normal match is retried. This imitates the way Per
rl handles such l handles such
cases when using the /g modifier or the split() function. Normally, cases when using the /g modifier or the split() function. Normally,
the start off‐ the start off‐
set is advanced by one character, but if the newline convention rec set is advanced by one character, but if the newline convention reco
ognizes CRLF as gnizes CRLF as
a newline, and the current character is CR followed by LF, an advanc a newline, and the current character is CR followed by LF, an advan
e of two char‐ ce of two char‐
acters occurs. acters occurs.
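
The offset-advance rule described above can be sketched in Python. This is only an illustration and not pcre2test's own code: the function name next_start_offset and the crlf_is_newline flag are hypothetical, and offsets are counted in Python string characters rather than PCRE2 code units.

```python
def next_start_offset(subject: str, offset: int, crlf_is_newline: bool) -> int:
    """Return the offset at which the normal match is retried after an
    empty match at `offset` whose anchored, non-empty retry has failed."""
    # Step over CR LF as a single unit only when the newline convention
    # recognizes CRLF as a newline.
    if crlf_is_newline and subject[offset:offset + 2] == "\r\n":
        return offset + 2
    # Otherwise advance by one character.
    return offset + 1
```

With the subject "a\r\nb" and an empty match at offset 1, the sketch returns 3 when CRLF is a recognized newline and 2 when it is not.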

Testing substring extraction functions

The copy and get modifiers can be used to test the pcre2_substring_copy_xxx() and pcre2_substring_get_xxx() functions. They can be given more than once, and each can specify a capture group name or number, for example:

  abcd\=copy=1,copy=3,get=G1

If the #subject command is used to set default copy and/or get lists, these can be unset by specifying a negative number to cancel all numbered groups and an empty name to cancel all named groups.

The getall modifier tests pcre2_substring_list_get(), which extracts all captured substrings.

If the subject line is successfully matched, the substrings extracted by the convenience functions are output with C, G, or L after the string number instead of a colon. This is in addition to the normal full list. The string length (that is, the return from the extraction function) is given in parentheses after each substring, followed by the name when the extraction was by name.

Testing the substitution function

If the replace modifier is set, the pcre2_substitute() function is called instead of one of the matching functions (or after one call of pcre2_match() in the case of PCRE2_SUBSTITUTE_MATCHED). Note that replacement strings cannot contain commas, because a comma signifies the end of a modifier. This is not thought to be an issue in a test program.

Specifying a completely empty replacement string disables this modifier. However, it is possible to specify an empty replacement by providing a buffer length, as described below, for an otherwise empty replacement.

Unlike subject strings, pcre2test does not process replacement strings for escape sequences. In UTF mode, a replacement string is checked to see if it is a valid UTF-8 string. If so, it is correctly converted to a UTF string of the appropriate code unit width. If it is not a valid UTF-8 string, the individual code units are copied directly. This provides a means of passing an invalid UTF-8 string for testing purposes.

The following modifiers set options (in addition to the normal match options) for pcre2_substitute():

  global                       PCRE2_SUBSTITUTE_GLOBAL
  substitute_extended          PCRE2_SUBSTITUTE_EXTENDED
  substitute_literal           PCRE2_SUBSTITUTE_LITERAL
  substitute_matched           PCRE2_SUBSTITUTE_MATCHED
  substitute_overflow_length   PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
  substitute_replacement_only  PCRE2_SUBSTITUTE_REPLACEMENT_ONLY
  substitute_unknown_unset     PCRE2_SUBSTITUTE_UNKNOWN_UNSET
  substitute_unset_empty       PCRE2_SUBSTITUTE_UNSET_EMPTY

See the pcre2api documentation for details of these options.

After a successful substitution, the modified string is output, preceded by the number of replacements. This may be zero if there were no matches. Here is a simple example of a substitution test:

  /abc/replace=xxx
      =abc=abc=
   1: =xxx=abc=
      =abc=abc=\=global
   2: =xxx=xxx=

Subject and replacement strings should be kept relatively short (fewer than 256 characters) for substitution tests, as fixed-size buffers are used. To make it easy to test for buffer overflow, if the replacement string starts with a number in square brackets, that number is passed to pcre2_substitute() as the size of the output buffer, with the replacement string starting at the next character. Here is an example that tests the edge case:

  /abc/
      123abc123\=replace=[10]XYZ
   1: 123XYZ123
      123abc123\=replace=[9]XYZ
  Failed: error -47: no more memory

The default action of pcre2_substitute() is to return PCRE2_ERROR_NOMEMORY when the output buffer is too small. However, if the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option is set (by using the substitute_overflow_length modifier), pcre2_substitute() continues to go through the motions of matching and substituting (but not doing any callouts), in order to compute the size of buffer that is required. When this happens, pcre2test shows the required buffer length (which includes space for the trailing zero) as part of the error message. For example:

  /abc/substitute_overflow_length
      123abc123\=replace=[9]XYZ
  Failed: error -47: no more memory: 10 code units are needed
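
The figure of 10 code units in the example can be checked with a small Python sketch. It is a rough model only: it uses Python's re module instead of pcre2_substitute(), the helper name required_buffer_length is hypothetical, and it assumes plain ASCII text so that one character corresponds to one code unit.

```python
import re

def required_buffer_length(pattern: str, replacement: str, subject: str,
                           global_replace: bool = False) -> int:
    """Length of the substituted string plus one unit for the trailing zero."""
    # In re.sub, count=0 means "replace every match"; count=1 replaces
    # only the first match, like a substitution without the global modifier.
    count = 0 if global_replace else 1
    result = re.sub(pattern, replacement, subject, count=count)
    return len(result) + 1
```

For the pattern abc, the replacement XYZ and the subject 123abc123, the result has 9 characters, so the sketch reports 10. This is why a buffer size of [10] succeeds and [9] fails.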

A replacement string is ignored with POSIX and DFA matching. Specifying partial matching provokes an error return ("bad option value") from pcre2_substitute().

Testing substitute callouts

If the substitute_callout modifier is set, a substitution callout function is set up. The null_context modifier must not be set, because the address of the callout function is passed in a match context. When the callout function is called (after each substitution), details of the input and output strings are output. For example:

  /abc/g,replace=<$0>,substitute_callout
      abcdefabcpqr
   1(1) Old 0 3 "abc" New 0 5 "<abc>"
   2(1) Old 6 9 "abc" New 8 13 "<abc>"
   2: <abc>def<abc>pqr

The first number on each callout line is the count of matches. The parenthesized number is the number of pairs that are set in the ovector (that is, one more than the number of capturing groups that were set). Then are listed the offsets of the old substring, its contents, and the same for the replacement.

By default, the substitution callout function returns zero, which accepts the replacement and causes matching to continue if /g was used. Two further modifiers can be used to test other return values. If substitute_skip is set to a value greater than zero, the callout function returns +1 for the match of that number, and similarly substitute_stop returns -1. These cause the replacement to be rejected, and -1 causes no further matching to take place. If either of them is set, substitute_callout is assumed. For example:

  /abc/g,replace=<$0>,substitute_skip=1
      abcdefabcpqr
   1(1) Old 0 3 "abc" New 0 5 "<abc> SKIPPED"
   2(1) Old 6 9 "abc" New 6 11 "<abc>"
   2: abcdef<abc>pqr
      abcdefabcpqr\=substitute_stop=1
   1(1) Old 0 3 "abc" New 0 5 "<abc> STOPPED"
   1: abcdefabcpqr

If both are set for the same number, stop takes precedence. Only a single skip or stop is supported, which is sufficient for testing that the feature works.
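
The offsets in the callout lines above can be reproduced with a short Python model. This is a sketch under stated assumptions, not pcre2test's implementation: simulate_substitute is a hypothetical helper, it uses Python's re module, and the replacement is hard-wired to the matched text wrapped in angle brackets, which corresponds to replace=<$0>.

```python
import re

def simulate_substitute(pattern, subject, skip=0, stop=0):
    """Replace each match with "<match>" and log callout-style records.

    Returns (match_count, output, records); each record is
    (match_number, old_start, old_end, new_start, new_end, status).
    """
    out = []        # pieces of the output string built so far
    out_len = 0     # current length of the output
    last = 0        # end of the previous match in the subject
    records = []
    count = 0
    for m in re.finditer(pattern, subject):
        count += 1
        gap = subject[last:m.start()]   # unmatched text before this match
        out.append(gap)
        out_len += len(gap)
        replacement = "<" + m.group(0) + ">"
        new_start, new_end = out_len, out_len + len(replacement)
        if count == stop:               # stop takes precedence over skip
            status = "STOPPED"
        elif count == skip:
            status = "SKIPPED"
        else:
            status = ""
        records.append((count, m.start(), m.end(), new_start, new_end, status))
        # A rejected replacement leaves the original text in place.
        kept = replacement if status == "" else m.group(0)
        out.append(kept)
        out_len += len(kept)
        last = m.end()
        if status == "STOPPED":
            break                       # no further matching takes place
    out.append(subject[last:])
    return count, "".join(out), records
```

Running the sketch on abcdefabcpqr with skip=1 gives the second record the new offsets 6 and 11, because the first match was left as the three characters abc rather than the five characters of the replacement.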

Setting the JIT stack size

The jitstack modifier provides a way of setting the maximum stack size that is used by the just-in-time optimization code. It is ignored if JIT optimization is not being used. The value is a number of kibibytes (units of 1024 bytes). Setting zero reverts to the default of 32KiB. Providing a stack that is larger than the default is necessary only for very complicated patterns. If jitstack is set non-zero on a subject line it overrides any value that was set on the pattern.

Setting heap, match, and depth limits

The heap_limit, match_limit, and depth_limit modifiers set the appropriate limits in the match context. These values are ignored when the find_limits or find_limits_noheap modifier is specified.

Finding minimum limits

If the find_limits modifier is present on a subject line, pcre2test calls the relevant matching function several times, setting different values in the match context via pcre2_set_heap_limit(), pcre2_set_match_limit(), or pcre2_set_depth_limit() until it finds the smallest value for each parameter that allows the match to complete without a "limit exceeded" error. The match itself may succeed or fail. An alternative modifier, find_limits_noheap, omits the heap limit. This is used in the standard tests, because the minimum heap limit varies between systems. If JIT is being used, only the match limit is relevant, and the other two are automatically omitted.
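
One plausible way to search for such a minimum is to double a trial limit until the match completes and then bisect. The Python sketch below shows that idea only; it is not taken from the pcre2test source, and the names find_minimum_limit, completes and upper are hypothetical. The callback stands in for "run the match with this limit and report whether it finished without a limit-exceeded error".

```python
def find_minimum_limit(completes, upper=1 << 32):
    """Smallest limit in [0, upper] for which completes(limit) is True.

    `completes` must be monotone: once it is True for some limit, it
    stays True for every larger limit.
    """
    lo, hi = 0, 1
    # Grow the trial bound by doubling until the match completes.
    while not completes(hi):
        if hi >= upper:
            raise ValueError("no limit up to `upper` lets the match complete")
        lo, hi = hi + 1, min(hi * 2, upper)
    # Bisect: every limit below lo fails, and hi is known to succeed.
    while lo < hi:
        mid = (lo + hi) // 2
        if completes(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo
```

The number of trial matches grows only logarithmically with the answer, which matters because each trial is a full run of the matching function.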

When using this modifier, the pattern should not contain any limit settings such as (*LIMIT_MATCH=...) within it. If such a setting is present and is lower than the minimum matching value, the minimum value cannot be found because pcre2_set_match_limit() etc. are only able to reduce the value of an in-pattern limit; they cannot increase it.

For non-DFA matching, the minimum depth_limit number is a measure of how much nested backtracking happens (that is, how deeply the pattern's tree is searched). In the case of DFA matching, depth_limit controls the depth of recursive calls of the internal function that is used for handling pattern recursion, lookaround assertions, and atomic groups.

For non-DFA matching, the match_limit number is a measure of the amount of backtracking that takes place, and learning the minimum value can be instructive. For most simple matches, the number is quite small, but for patterns with very large numbers of matching possibilities, it can become large very quickly with increasing length of subject string. In the case of DFA matching, match_limit controls the total number of calls, both recursive and non-recursive, to the internal matching function, thus controlling the overall amount of computing resource that is used.

For both kinds of matching, the heap_limit number, which is in kibibytes (units of 1024 bytes), limits the amount of heap memory used for matching.

Showing MARK names

The mark modifier causes the names from backtracking control verbs that are returned from calls to pcre2_match() to be displayed. If a mark is returned for a match, non-match, or partial match, pcre2test shows it. For a match, it is on a line by itself, tagged with "MK:". Otherwise, it is added to the non-match message.

Showing memory usage

The memory modifier causes pcre2test to log the sizes of all heap memory allocation and freeing calls that occur during a call to pcre2_match() or pcre2_dfa_match(). In the latter case, heap memory is used only when a match requires more internal workspace than the default allocation on the stack, so in many cases there will be no output. No heap memory is allocated during matching with JIT. For this modifier to work, the null_context modifier must not be set on both the pattern and the subject, though it can be set on one or the other.

Showing the heap frame overall vector size

The heapframes_size modifier is relevant for matches using pcre2_match() without JIT. After a match has run (whether successful or not) the size, in bytes, of the allocated heap frames vector that is left attached to the match data block is shown. If the matching action involved several calls to pcre2_match() (for example, global matching or for timing) only the final value is shown.

This modifier is ignored, with a warning, for POSIX or DFA matching. JIT matching does not use the heap frames vector, so the size is always zero, unless there was a previous non-JIT match. Note that specifying a size of zero for the output vector (see below) causes pcre2test to free its match data block (and associated heap frames vector) and allocate a new one.

Setting a starting offset

The offset modifier sets an offset in the subject string at which matching starts. Its value is a number of code units, not characters.
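
The difference between code units and characters matters only in UTF mode with characters that need more than one code unit. The following Python sketch converts a character index into a code unit offset for each code unit width. The helper name code_unit_offset is hypothetical, and the sketch assumes the subject is valid Unicode text.

```python
def code_unit_offset(subject: str, char_index: int, width: int = 8) -> int:
    """Number of code units that precede character `char_index`."""
    prefix = subject[:char_index]
    if width == 8:
        return len(prefix.encode("utf-8"))           # 1 to 4 units per character
    if width == 16:
        return len(prefix.encode("utf-16-le")) // 2  # 1 or 2 units per character
    if width == 32:
        return len(prefix)                           # always 1 unit per character
    raise ValueError("width must be 8, 16, or 32")
```

For example, in the subject "héllo" the third character is at character index 2, but in the 8-bit library its offset is 3, because "é" occupies two UTF-8 code units.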

Setting an offset limit

The offset_limit modifier sets a limit for unanchored matches. If a match cannot be found starting at or before this offset in the subject, a "no match" return is given. The data value is a number of code units, not characters. When this modifier is used, the use_offset_limit modifier must have been set for the pattern; if not, an error is generated.

Setting the size of the output vector

The ovector modifier applies only to the subject line in which it appears, though of course it can also be used to set a default in a #subject command. It specifies the number of pairs of offsets that are available for storing matching information. The default is 15.

A value of zero is useful when testing the POSIX API because it causes regexec() to be called with a NULL capture vector. When not testing the POSIX API, a value of zero is used to cause pcre2_match_data_create_from_pattern() to be called, in order to create a new match block of exactly the right size for the pattern. (It is not possible to create a match block with a zero-length ovector; there is always at least one pair of offsets.) The old match data block is freed.

Passing the subject as zero-terminated

By default, the subject string is passed to a native API matching function with its correct length. In order to test the facility for passing a zero-terminated string, the zero_terminate modifier is provided. It causes the length to be passed as PCRE2_ZERO_TERMINATED. When matching via the POSIX interface, this modifier is ignored, with a warning.

When testing pcre2_substitute(), this modifier also has the effect of passing the replacement string as zero-terminated.

Passing a NULL context, subject, or replacement

Normally, pcre2test passes a context block to pcre2_match(), pcre2_dfa_match(), pcre2_jit_match() or pcre2_substitute(). If the null_context modifier is set, however, NULL is passed. This is for testing that the matching and substitution functions behave correctly in this case (they use default values). This modifier cannot be used with the find_limits, find_limits_noheap, or substitute_callout modifiers.

Similarly, for testing purposes, if the null_subject or null_replacement modifier is set, the subject or replacement string pointers are passed as NULL, respectively, to the relevant functions.

THE ALTERNATIVE MATCHING FUNCTION

By default, pcre2test uses the standard PCRE2 matching function, pcre2_match() to match each subject line. PCRE2 also supports an alternative matching function, pcre2_dfa_match(), which operates in a different way, and has some restrictions. The differences between the two functions are described in the pcre2matching documentation.

If the dfa modifier is set, the alternative matching function is used. This function finds all possible matches at a given point in the subject. If, however, the dfa_shortest modifier is set, processing stops after the first match is found. This is always the shortest possible match.
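
The idea of "all possible matches at a given point" can be imitated in Python by brute force. The sketch below is not how pcre2_dfa_match() works internally. It tests every candidate end point with Python's re module, the helper name matches_at_first_point is hypothetical, and listing the longest match first is a choice made for this illustration. Because each candidate is tested with an artificial end of string, patterns that use $ or lookahead may behave differently here than in PCRE2.

```python
import re

def matches_at_first_point(pattern: str, subject: str, shortest: bool = False):
    """All matches that start at the first position where any match exists."""
    compiled = re.compile(pattern)
    for start in range(len(subject) + 1):
        # Try every possible end point for this start, longest first.
        found = [subject[start:end]
                 for end in range(len(subject), start - 1, -1)
                 if compiled.fullmatch(subject, start, end)]
        if found:
            # With shortest=True keep only the shortest match.
            return found[-1:] if shortest else found
    return []
```

For the pattern tang|tangerine|tan and the subject "yellow tangerine", the sketch finds three matches starting at the same point: tangerine, tang and tan. With shortest=True only tan is returned.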

DEFAULT OUTPUT FROM pcre2test

This section describes the output when the normal matching function, pcre2_match(), is being used.

When a match succeeds, pcre2test outputs the list of captured substrings, starting with number 0 for the string that matched the whole pattern. Otherwise, it outputs "No match" when the return is PCRE2_ERROR_NOMATCH, or "Partial match:" followed by the partially matching substring when the return is PCRE2_ERROR_PARTIAL. (Note that this is the entire substring that was inspected during the partial match; it may include characters before the actual match start if a lookbehind assertion, \K, \b, or \B was involved.)

For any other return, pcre2test outputs the PCRE2 negative error number and a short descriptive phrase. If the error is a failed UTF string check, the code unit offset of the start of the failing character is also output. Here is an example of an interactive pcre2test run.

  $ pcre2test
  PCRE2 version 10.22 2016-07-29
    re> /^abc(\d+)/
  data> abc123
   0: abc123
   1: 123
  data> xyz
  No match

Unset capturing substrings that are not followed by one that is set are not shown by pcre2test unless the allcaptures modifier is specified. In the following example, there are two capturing substrings, but when the first data line is matched, the second, unset substring is not shown. An "internal" unset substring is shown as "<unset>", as for the second data line.

    re> /(a)|(b)/
  data> a
   0: a
   1: a
  data> b
   0: b
   1: <unset>
   2: b
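
The rule for trailing and internal unset substrings can be expressed as a short Python sketch. It uses Python's re module rather than PCRE2, and the helper name format_captures is hypothetical. The two-column number format is an assumption based on the examples shown here.

```python
import re

def format_captures(match: re.Match, allcaptures: bool = False) -> list:
    """Format a match as numbered lines, one per shown substring."""
    groups = [match.group(0)] + list(match.groups())
    last = len(groups) - 1
    if not allcaptures:
        # Drop unset groups that have no set group after them.
        while last > 0 and groups[last] is None:
            last -= 1
    # An unset group that is followed by a set one is shown as <unset>.
    return ["%2d: %s" % (i, "<unset>" if g is None else g)
            for i, g in enumerate(groups[:last + 1])]
```

Applied to the pattern (a)|(b), the sketch produces two lines for the subject a and three lines, including the <unset> entry, for the subject b.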

If the strings contain any non-printing characters, they are output as \xhh escapes if the value is less than 256 and UTF mode is not set. Otherwise they are output as \x{hh...} escapes. See below for the definition of non-printing characters. If the aftertext modifier is set, the output for substring 0 is followed by the rest of the subject string, identified by "0+" like this:

    re> /cat/aftertext
  data> cataract
   0: cat
   0+ aract
If global matching is requested, the results of successive matching attempts are If global matching is requested, the results of successive matching attempts are
output in sequence, like this: output in sequence, like this:
re> /\Bi(\w\w)/g re> /\Bi(\w\w)/g
data> Mississippi data> Mississippi
0: iss 0: iss
1: ss 1: ss
0: iss 0: iss
1: ss 1: ss
0: ipp 0: ipp
1: pp 1: pp
"No match" is output only if the first match attempt fails. Here is an example of a "No match" is output only if the first match attempt fails. Here is an example of a
failure message (the offset 4 that is specified by the offset modifier is past the failure message (the offset 4 that is specified by the offset modifier is past the
end of the subject string): end of the subject string):
re> /xyz/ re> /xyz/
data> xyz\=offset=4 data> xyz\=offset=4
Error -24 (bad offset value) Error -24 (bad offset value)
Note that whereas patterns can be continued over several lines (a plain ">" prompt Note that whereas patterns can be continued over several lines (a plain ">" prompt
is used for continuations), subject lines may not. However newlines can be included is used for continuations), subject lines may not. However newlines can be included
in a subject by means of the \n escape (or \r, \r\n, etc., depending on the newline in a subject by means of the \n escape (or \r, \r\n, etc., depending on the newline
sequence setting). sequence setting).
OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION
When the alternative matching function, pcre2_dfa_match(), is used, the output con‐ When the alternative matching function, pcre2_dfa_match(), is used, the output con‐
sists of a list of all the matches that start at the first point in the subject sists of a list of all the matches that start at the first point in the subject
where there is at least one match. For example: where there is at least one match. For example:
re> /(tang|tangerine|tan)/ re> /(tang|tangerine|tan)/
data> yellow tangerine\=dfa data> yellow tangerine\=dfa
0: tangerine 0: tangerine
1: tang 1: tang
2: tan 2: tan
Using the normal matching function on this data finds only "tang". The longest Using the normal matching function on this data finds only "tang". The longest
matching string is always given first (and numbered zero). After a PCRE2_ERROR_PAR‐ matching string is always given first (and numbered zero). After a PCRE2_ERROR_PAR‐
TIAL return, the output is "Partial match:", followed by the partially matching TIAL return, the output is "Partial match:", followed by the partially matching
substring. Note that this is the entire substring that was inspected during the substring. Note that this is the entire substring that was inspected during the
partial match; it may include characters before the actual match start if a lookbe‐ partial match; it may include characters before the actual match start if a lookbe‐
hind assertion, \b, or \B was involved. (\K is not supported for DFA matching.) hind assertion, \b, or \B was involved. (\K is not supported for DFA matching.)
If global matching is requested, the search for further matches resumes at the end If global matching is requested, the search for further matches resumes at the end
of the longest match. For example: of the longest match. For example:
re> /(tang|tangerine|tan)/g re> /(tang|tangerine|tan)/g
data> yellow tangerine and tangy sultana\=dfa data> yellow tangerine and tangy sultana\=dfa
0: tangerine 0: tangerine
1: tang 1: tang
2: tan 2: tan
0: tang 0: tang
1: tan 1: tan
0: tan 0: tan
The alternative matching function does not support substring capture, so the modi‐ The alternative matching function does not support substring capture, so the modi‐
fiers that are concerned with captured substrings are not relevant. fiers that are concerned with captured substrings are not relevant.
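The reporting order described in this section can be illustrated with a small Python sketch. It is not how pcre2_dfa_match() works internally: it handles only a plain alternation of literal strings, and the function names (matches_at, dfa_style_matches) are hypothetical. It shows two things from the text: all matches at the first matching position are listed longest first, and a global search resumes at the end of the longest match.

```python
def matches_at(alternatives, subject, pos):
    """Return every literal alternative that matches at pos, longest first."""
    found = [alt for alt in alternatives if alt and subject.startswith(alt, pos)]
    return sorted(found, key=len, reverse=True)


def dfa_style_matches(alternatives, subject, global_search=False):
    """Model the DFA-style report for an alternation of literals.

    Each entry of the result is the list of matches found at one starting
    point. With global_search, scanning resumes at the end of the longest
    match, as described for the g modifier.
    """
    results = []
    pos = 0
    while pos <= len(subject):
        found = matches_at(alternatives, subject, pos)
        if not found:
            pos += 1
            continue
        results.append(found)
        if not global_search:
            break
        pos += len(found[0])  # found[0] is the longest match
    return results


if __name__ == "__main__":
    alts = ["tang", "tangerine", "tan"]
    for group in dfa_style_matches(alts, "yellow tangerine and tangy sultana", True):
        for number, text in enumerate(group):
            print(f"{number}: {text}")
```

Run as a script, this prints the same three groups of numbered matches as the global example above.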
RESTARTING AFTER A PARTIAL MATCH RESTARTING AFTER A PARTIAL MATCH
When the alternative matching function has given the PCRE2_ERROR_PARTIAL return, When the alternative matching function has given the PCRE2_ERROR_PARTIAL return,
indicating that the subject partially matched the pattern, you can restart the indicating that the subject partially matched the pattern, you can restart the
match with additional subject data by means of the dfa_restart modifier. For exam‐ match with additional subject data by means of the dfa_restart modifier. For exam‐
ple: ple:
re> /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/ re> /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/
data> 23ja\=ps,dfa data> 23ja\=ps,dfa
Partial match: 23ja Partial match: 23ja
data> n05\=dfa,dfa_restart data> n05\=dfa,dfa_restart
0: n05 0: n05
For further information about partial matching, see the pcre2partial documentation. For further information about partial matching, see the pcre2partial documentation.
CALLOUTS CALLOUTS
If the pattern contains any callout requests, pcre2test's callout function is If the pattern contains any callout requests, pcre2test's callout function is
called during matching unless callout_none is specified. This works with both called during matching unless callout_none is specified. This works with both
matching functions, and with JIT, though there are some differences in behaviour. matching functions, and with JIT, though there are some differences in behaviour.
The output for callouts with numerical arguments and those with string arguments is The output for callouts with numerical arguments and those with string arguments is
slightly different. slightly different.
Callouts with numerical arguments Callouts with numerical arguments
By default, the callout function displays the callout number, the start and current By default, the callout function displays the callout number, the start and current
positions in the subject text at the callout time, and the next pattern item to be positions in the subject text at the callout time, and the next pattern item to be
tested. For example: tested. For example:
--->pqrabcdef --->pqrabcdef
0 ^ ^ \d 0 ^ ^ \d
This output indicates that callout number 0 occurred for a match attempt starting This output indicates that callout number 0 occurred for a match attempt starting
at the fourth character of the subject string, when the pointer was at the seventh at the fourth character of the subject string, when the pointer was at the seventh
character, and when the next pattern item was \d. Just one circumflex is output if character, and when the next pattern item was \d. Just one circumflex is output if
the start and current positions are the same, or if the current position precedes the start and current positions are the same, or if the current position precedes
the start position, which can happen if the callout is in a lookbehind assertion. the start position, which can happen if the callout is in a lookbehind assertion.
Callouts numbered 255 are assumed to be automatic callouts, inserted as a result of Callouts numbered 255 are assumed to be automatic callouts, inserted as a result of
the auto_callout pattern modifier. In this case, instead of showing the callout the auto_callout pattern modifier. In this case, instead of showing the callout
number, the offset in the pattern, preceded by a plus, is output. For example: number, the offset in the pattern, preceded by a plus, is output. For example:
re> /\d?[A-E]\*/auto_callout re> /\d?[A-E]\*/auto_callout
data> E* data> E*
--->E* --->E*
+0 ^ \d? +0 ^ \d?
+3 ^ [A-E] +3 ^ [A-E]
+8 ^^ \* +8 ^^ \*
+10 ^ ^ +10 ^ ^
0: E* 0: E*
skipping to change at line 1631 skipping to change at line 1640
data> abc data> abc
--->abc --->abc
+0 ^ a +0 ^ a
+1 ^^ (*MARK:X) +1 ^^ (*MARK:X)
+10 ^^ b +10 ^^ b
Latest Mark: X Latest Mark: X
+11 ^ ^ c +11 ^ ^ c
+12 ^ ^ +12 ^ ^
0: abc 0: abc
The mark changes between matching "a" and "b", but stays the same for the rest of The mark changes between matching "a" and "b", but stays the same for the rest of
the match, so nothing more is output. If, as a result of backtracking, the mark re‐ the match, so nothing more is output. If, as a result of backtracking, the mark re‐
verts to being unset, the text "<unset>" is output. verts to being unset, the text "<unset>" is output.
Callouts with string arguments Callouts with string arguments
The output for a callout with a string argument is similar, except that instead of The output for a callout with a string argument is similar, except that instead of
outputting a callout number before the position indicators, the callout string and outputting a callout number before the position indicators, the callout string and
its offset in the pattern string are output before the reflection of the subject its offset in the pattern string are output before the reflection of the subject
string, and the subject string is reflected for each callout. For example: string, and the subject string is reflected for each callout. For example:
re> /^ab(?C'first')cd(?C"second")ef/ re> /^ab(?C'first')cd(?C"second")ef/
data> abcdefg data> abcdefg
Callout (7): 'first' Callout (7): 'first'
--->abcdefg --->abcdefg
^ ^ c ^ ^ c
Callout (20): "second" Callout (20): "second"
--->abcdefg --->abcdefg
^ ^ e ^ ^ e
0: abcdef 0: abcdef
Callout modifiers Callout modifiers
The callout function in pcre2test returns zero (carry on matching) by default, but The callout function in pcre2test returns zero (carry on matching) by default, but
you can use a callout_fail modifier in a subject line to change this and other pa‐ you can use a callout_fail modifier in a subject line to change this and other pa‐
rameters of the callout (see below). rameters of the callout (see below).
If the callout_capture modifier is set, the current captured groups are output when If the callout_capture modifier is set, the current captured groups are output when
a callout occurs. This is useful only for non-DFA matching, as pcre2_dfa_match() a callout occurs. This is useful only for non-DFA matching, as pcre2_dfa_match()
does not support capturing, so no captures are ever shown. does not support capturing, so no captures are ever shown.
The normal callout output, showing the callout number or pattern offset (as de‐ The normal callout output, showing the callout number or pattern offset (as de‐
scribed above) is suppressed if the callout_no_where modifier is set. scribed above) is suppressed if the callout_no_where modifier is set.
When using the interpretive matching function pcre2_match() without JIT, setting When using the interpretive matching function pcre2_match() without JIT, setting
the callout_extra modifier causes additional output from pcre2test's callout func‐ the callout_extra modifier causes additional output from pcre2test's callout func‐
tion to be generated. For the first callout in a match attempt at a new starting tion to be generated. For the first callout in a match attempt at a new starting
position in the subject, "New match attempt" is output. If there has been a back‐ position in the subject, "New match attempt" is output. If there has been a back‐
track since the last callout (or start of matching if this is the first callout), track since the last callout (or start of matching if this is the first callout),
"Backtrack" is output, followed by "No other matching paths" if the backtrack ended "Backtrack" is output, followed by "No other matching paths" if the backtrack ended
the previous match attempt. For example: the previous match attempt. For example:
re> /(a+)b/auto_callout,no_start_optimize,no_auto_possess re> /(a+)b/auto_callout,no_start_optimize,no_auto_possess
data> aac\=callout_extra data> aac\=callout_extra
New match attempt New match attempt
--->aac --->aac
+0 ^ ( +0 ^ (
+1 ^ a+ +1 ^ a+
+3 ^ ^ ) +3 ^ ^ )
skipping to change at line 1707 skipping to change at line 1716
+0 ^ ( +0 ^ (
+1 ^ a+ +1 ^ a+
Backtrack Backtrack
No other matching paths No other matching paths
New match attempt New match attempt
--->aac --->aac
+0 ^ ( +0 ^ (
+1 ^ a+ +1 ^ a+
No match No match
Notice that various optimizations must be turned off if you want all possible Notice that various optimizations must be turned off if you want all possible
matching paths to be scanned. If no_start_optimize is not used, there is an immedi‐ matching paths to be scanned. If no_start_optimize is not used, there is an immedi‐
ate "no match", without any callouts, because the starting optimization fails to ate "no match", without any callouts, because the starting optimization fails to
find "b" in the subject, which it knows must be present for any match. If find "b" in the subject, which it knows must be present for any match. If
no_auto_possess is not used, the "a+" item is turned into "a++", which reduces the no_auto_possess is not used, the "a+" item is turned into "a++", which reduces the
number of backtracks. number of backtracks.
The callout_extra modifier has no effect if used with the DFA matching function, or The callout_extra modifier has no effect if used with the DFA matching function, or
with JIT. with JIT.
Return values from callouts Return values from callouts
The default return from the callout function is zero, which allows matching to con‐ The default return from the callout function is zero, which allows matching to con‐
tinue. The callout_fail modifier can be given one or two numbers. If there is only tinue. The callout_fail modifier can be given one or two numbers. If there is only
one number, 1 is returned instead of 0 (causing matching to backtrack) when a call‐ one number, 1 is returned instead of 0 (causing matching to backtrack) when a call‐
out of that number is reached. If two numbers (<n>:<m>) are given, 1 is returned out of that number is reached. If two numbers (<n>:<m>) are given, 1 is returned
when callout <n> is reached and there have been at least <m> callouts. The call‐ when callout <n> is reached and there have been at least <m> callouts. The call‐
out_error modifier is similar, except that PCRE2_ERROR_CALLOUT is returned, causing out_error modifier is similar, except that PCRE2_ERROR_CALLOUT is returned, causing
the entire matching process to be aborted. If both these modifiers are set for the the entire matching process to be aborted. If both these modifiers are set for the
same callout number, callout_error takes precedence. Note that callouts with string same callout number, callout_error takes precedence. Note that callouts with string
arguments are always given the number zero. arguments are always given the number zero.
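The return rules above can be summarized as a small decision function. The following Python sketch is only a model of the description, not pcre2test's code. The function name and parameters are hypothetical, the numeric value given for PCRE2_ERROR_CALLOUT is an assumption that should be checked against pcre2.h, and counting the current callout towards <m> is also an assumption.

```python
# Assumed value; check pcre2.h before relying on the number itself.
PCRE2_ERROR_CALLOUT = -37


def callout_return(number, count, fail=None, error=None):
    """Model the value returned by the callout function.

    number -- callout number (string-argument callouts count as 0)
    count  -- how many callouts have happened so far, including this
              one (the inclusive count is an assumption)
    fail   -- None, or (n, m) as given to callout_fail=<n>:<m>
    error  -- None, or (n, m) as given to callout_error=<n>:<m>
    A single-number modifier is modelled as (n, 0).
    """
    def triggered(spec):
        if spec is None:
            return False
        n, m = spec
        return number == n and count >= m

    if triggered(error):      # callout_error takes precedence
        return PCRE2_ERROR_CALLOUT
    if triggered(fail):       # 1 forces a backtrack
        return 1
    return 0                  # default: carry on matching
```

For instance, with callout_fail=3:2 modelled as fail=(3, 2), the first callout numbered 3 returns 0 if it is the first callout overall, and 1 once at least two callouts have occurred.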
The callout_data modifier can be given an unsigned or a negative number. This is The callout_data modifier can be given an unsigned or a negative number. This is
set as the "user data" that is passed to the matching function, and passed back set as the "user data" that is passed to the matching function, and passed back
when the callout function is invoked. Any value other than zero is used as a return when the callout function is invoked. Any value other than zero is used as a return
from pcre2test's callout function. from pcre2test's callout function.
Inserting callouts can be helpful when using pcre2test to check complicated regular Inserting callouts can be helpful when using pcre2test to check complicated regular
expressions. For further information about callouts, see the pcre2callout documen‐ expressions. For further information about callouts, see the pcre2callout documen‐
tation. tation.
NON-PRINTING CHARACTERS NON-PRINTING CHARACTERS
When pcre2test is outputting text in the compiled version of a pattern, bytes other When pcre2test is outputting text in the compiled version of a pattern, bytes other
than 32-126 are always treated as non-printing characters and are therefore shown than 32-126 are always treated as non-printing characters and are therefore shown
as hex escapes. as hex escapes.
When pcre2test is outputting text that is a matched part of a subject string, it When pcre2test is outputting text that is a matched part of a subject string, it
behaves in the same way, unless a different locale has been set for the pattern behaves in the same way, unless a different locale has been set for the pattern
(using the locale modifier). In this case, the isprint() function is used to dis‐ (using the locale modifier). In this case, the isprint() function is used to dis‐
tinguish printing and non-printing characters. tinguish printing and non-printing characters.
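As a rough illustration of the escaping rule, the Python sketch below renders a string the way this section and the earlier description of \xhh and \x{hh...} escapes suggest. It covers only the default case (no locale set, so 32-126 are the printing characters). The function name show_escaped, the lowercase hex digits, and the two-digit minimum inside braces are assumptions rather than documented behaviour.

```python
def show_escaped(text, utf=False):
    """Render text with non-printing characters as hex escapes.

    Characters 32-126 are copied unchanged. Other values below 256 become
    \\xhh when UTF mode is off; everything else becomes \\x{hh...}.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if 32 <= cp <= 126:
            out.append(ch)
        elif cp < 256 and not utf:
            out.append("\\x%02x" % cp)
        else:
            out.append("\\x{%02x}" % cp)
    return "".join(out)
```

A tab between two letters is therefore shown as a\x09b without UTF mode, while a character such as U+0100 always needs the braced form.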
SAVING AND RESTORING COMPILED PATTERNS SAVING AND RESTORING COMPILED PATTERNS
It is possible to save compiled patterns on disc or elsewhere, and reload them It is possible to save compiled patterns on disc or elsewhere, and reload them
later, subject to a number of restrictions. JIT data cannot be saved. The host on later, subject to a number of restrictions. JIT data cannot be saved. The host on
which the patterns are reloaded must be running the same version of PCRE2, with the which the patterns are reloaded must be running the same version of PCRE2, with the
same code unit width, and must also have the same endianness, pointer width and same code unit width, and must also have the same endianness, pointer width and
PCRE2_SIZE type. Before compiled patterns can be saved they must be serialized, PCRE2_SIZE type. Before compiled patterns can be saved they must be serialized,
that is, converted to a stream of bytes. A single byte stream may contain any num‐ that is, converted to a stream of bytes. A single byte stream may contain any num‐
ber of compiled patterns, but they must all use the same character tables. A single ber of compiled patterns, but they must all use the same character tables. A single
copy of the tables is included in the byte stream (its size is 1088 bytes). copy of the tables is included in the byte stream (its size is 1088 bytes).
The functions whose names begin with pcre2_serialize_ are used for serializing and The functions whose names begin with pcre2_serialize_ are used for serializing and
de-serializing. They are described in the pcre2serialize documentation. In this de-serializing. They are described in the pcre2serialize documentation. In this
section we describe the features of pcre2test that can be used to test these func‐ section we describe the features of pcre2test that can be used to test these func‐
tions. tions.
Note that "serialization" in PCRE2 does not convert compiled patterns to an ab‐ Note that "serialization" in PCRE2 does not convert compiled patterns to an ab‐
stract format like Java or .NET. It just makes a reloadable byte code stream. stract format like Java or .NET. It just makes a reloadable byte code stream.
Hence the restrictions on reloading mentioned above. Hence the restrictions on reloading mentioned above.
In pcre2test, when a pattern with push modifier is successfully compiled, it is In pcre2test, when a pattern with push modifier is successfully compiled, it is
pushed onto a stack of compiled patterns, and pcre2test expects the next line to pushed onto a stack of compiled patterns, and pcre2test expects the next line to
contain a new pattern (or command) instead of a subject line. By contrast, the contain a new pattern (or command) instead of a subject line. By contrast, the
pushcopy modifier causes a copy of the compiled pattern to be stacked, leaving the pushcopy modifier causes a copy of the compiled pattern to be stacked, leaving the
original available for immediate matching. By using push and/or pushcopy, a number original available for immediate matching. By using push and/or pushcopy, a number
of patterns can be compiled and retained. These modifiers are incompatible with of patterns can be compiled and retained. These modifiers are incompatible with
posix, and control modifiers that act at match time are ignored (with a message) posix, and control modifiers that act at match time are ignored (with a message)
for the stacked patterns. The jitverify modifier applies only at compile time. for the stacked patterns. The jitverify modifier applies only at compile time.
The command The command
#save <filename> #save <filename>
causes all the stacked patterns to be serialized and the result written to the causes all the stacked patterns to be serialized and the result written to the
named file. Afterwards, all the stacked patterns are freed. The command named file. Afterwards, all the stacked patterns are freed. The command
#load <filename> #load <filename>
reads the data in the file, and then arranges for it to be de-serialized, with the reads the data in the file, and then arranges for it to be de-serialized, with the
resulting compiled patterns added to the pattern stack. The pattern on the top of resulting compiled patterns added to the pattern stack. The pattern on the top of
the stack can be retrieved by the #pop command, which must be followed by lines of the stack can be retrieved by the #pop command, which must be followed by lines of
subjects that are to be matched with the pattern, terminated as usual by an empty subjects that are to be matched with the pattern, terminated as usual by an empty
line or end of file. This command may be followed by a modifier list containing line or end of file. This command may be followed by a modifier list containing
only control modifiers that act after a pattern has been compiled. In particular, only control modifiers that act after a pattern has been compiled. In particular,
hex, posix, posix_nosub, push, and pushcopy are not allowed, nor are any option- hex, posix, posix_nosub, push, and pushcopy are not allowed, nor are any option-
setting modifiers. The JIT modifiers are, however permitted. Here is an example setting modifiers. The JIT modifiers are, however permitted. Here is an example
that saves and reloads two patterns. that saves and reloads two patterns.
/abc/push /abc/push
/xyz/push /xyz/push
#save tempfile #save tempfile
#load tempfile #load tempfile
#pop info #pop info
xyz xyz
#pop jit,bincode #pop jit,bincode
abc abc
If jitverify is used with #pop, it does not automatically imply jit, which is dif‐ If jitverify is used with #pop, it does not automatically imply jit, which is dif‐
ferent behaviour from when it is used on a pattern. ferent behaviour from when it is used on a pattern.
The #popcopy command is analogous to the pushcopy modifier in that it makes current The #popcopy command is analogous to the pushcopy modifier in that it makes current
a copy of the topmost stack pattern, leaving the original still on the stack. a copy of the topmost stack pattern, leaving the original still on the stack.
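The stack behaviour of push, pushcopy, #save, #load, #pop and #popcopy can be pictured with the toy Python model below. It is a sketch under stated assumptions, not pcre2test's implementation: patterns are plain strings, "files" are entries in an in-memory dictionary, and the class and method names are hypothetical.

```python
import copy


class PatternStack:
    """Toy model of pcre2test's stack of compiled patterns."""

    def __init__(self):
        self.stack = []
        self.files = {}  # filename -> saved patterns (stand-in for real files)

    def push(self, pattern):
        """push: the pattern itself goes onto the stack."""
        self.stack.append(pattern)

    def pushcopy(self, pattern):
        """pushcopy: stack a copy; the original stays available."""
        self.stack.append(copy.deepcopy(pattern))
        return pattern

    def save(self, filename):
        """#save: write all stacked patterns out, then free the stack."""
        self.files[filename] = list(self.stack)
        self.stack.clear()

    def load(self, filename):
        """#load: add the saved patterns back onto the stack."""
        self.stack.extend(copy.deepcopy(self.files[filename]))

    def pop(self):
        """#pop: remove and return the topmost pattern."""
        return self.stack.pop()

    def popcopy(self):
        """#popcopy: return a copy of the topmost pattern, leaving it stacked."""
        return copy.deepcopy(self.stack[-1])
```

Replaying the example above (push abc, push xyz, #save, #load) leaves xyz on top, so the first #pop yields xyz and the second yields abc.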
SEE ALSO SEE ALSO
pcre2(3), pcre2api(3), pcre2callout(3), pcre2jit, pcre2matching(3), pcre2par‐ pcre2(3), pcre2api(3), pcre2callout(3), pcre2jit, pcre2matching(3), pcre2par‐
tial(d), pcre2pattern(3), pcre2serialize(3). tial(d), pcre2pattern(3), pcre2serialize(3).
AUTHOR AUTHOR
Philip Hazel Philip Hazel
Retired from University Computing Service Retired from University Computing Service
Cambridge, England. Cambridge, England.
REVISION REVISION
Last updated: 27 January 2024 Last updated: 24 April 2024
Copyright (c) 1997-2024 University of Cambridge. Copyright (c) 1997-2024 University of Cambridge.
PCRE 10.43 27 January 2024 PCRE2TEST(1) PCRE 10.44 24 April 2024 PCRE2TEST(1)
 End of changes. 138 change blocks. 
473 lines changed or deleted 486 lines changed or added
