pcre2test.1 | pcre2test.1 | |||
---|---|---|---|---|
skipping to change at line 57 | skipping to change at line 57 | |||
The input is processed using C's string functions, so must not contain binary zeros, even though | The input is processed using C's string functions, so must not contain binary zeros, even though | |||
in Unix-like environments, fgets() treats any bytes other than newline as data characters. An | in Unix-like environments, fgets() treats any bytes other than newline as data characters. An | |||
error is generated if a binary zero is encountered. By default subject lines are processed for | error is generated if a binary zero is encountered. By default subject lines are processed for | |||
backslash escapes, which makes it possible to include any data value in strings that are passed | backslash escapes, which makes it possible to include any data value in strings that are passed | |||
to the library for matching. For patterns, there is a facility for specifying some or all of the | to the library for matching. For patterns, there is a facility for specifying some or all of the | |||
8-bit input characters as hexadecimal pairs, which makes it possible to include binary zeros. | 8-bit input characters as hexadecimal pairs, which makes it possible to include binary zeros. | |||
Input for the 16-bit and 32-bit libraries | Input for the 16-bit and 32-bit libraries | |||
When testing the 16-bit or 32-bit libraries, there is a need to be able to generate character | When testing the 16-bit or 32-bit libraries, there is a need to be able to generate character | |||
code points greater than 255 in the strings that are passed to the library. For subject lines, | code points greater than 255 in the strings that are passed to the library. For subject lines | |||
backslash escapes can be used. In addition, when the utf modifier (see "Setting compilation | and some patterns, backslash escapes can be used. In addition, when the utf modifier (see | |||
options" below) is set, the pattern and any following subject lines are interpreted as UTF-8 | "Setting compilation options" below) is set, the pattern and any following subject lines are | |||
strings and translated to UTF-16 or UTF-32 as appropriate. | interpreted as UTF-8 strings and translated to UTF-16 or UTF-32 as appropriate. | |||
For non-UTF testing of wide characters, the utf8_input modifier can be used. This is mutually | For non-UTF testing of wide characters, the utf8_input modifier can be used. This is mutually | |||
exclusive with utf, and is allowed only in 16-bit or 32-bit mode. It causes the pattern and | exclusive with utf, and is allowed only in 16-bit or 32-bit mode. It causes the pattern and | |||
following subject lines to be treated as UTF-8 according to the original definition (RFC 2279), | following subject lines to be treated as UTF-8 according to the original definition (RFC 2279), | |||
which allows for character values up to 0x7fffffff. Each character is placed in one 16-bit or | which allows for character values up to 0x7fffffff. Each character is placed in one 16-bit or | |||
32-bit code unit (in the 16-bit case, values greater than 0xffff cause an error to occur). | 32-bit code unit (in the 16-bit case, values greater than 0xffff cause an error to occur). | |||
UTF-8 (in its original definition) is not capable of encoding values greater than 0x7fffffff, | UTF-8 (in its original definition) is not capable of encoding values greater than 0x7fffffff, | |||
but such values can be handled by the 32-bit library. When testing this library in non-UTF mode | but such values can be handled by the 32-bit library. When testing this library in non-UTF mode | |||
with utf8_input set, if any character is preceded by the byte 0xff (which is an invalid byte in | with utf8_input set, if any character is preceded by the byte 0xff (which is an invalid byte in | |||
UTF-8) 0x80000000 is added to the character's value. This is the only way of passing such code | UTF-8) 0x80000000 is added to the character's value. For subject strings, using an escape | |||
points in a pattern string. For subject strings, using an escape sequence is preferable. | sequence is preferable. | |||
COMMAND LINE OPTIONS | COMMAND LINE OPTIONS | |||
-8 If the 8-bit library has been built, this option causes it to be used (this is the | -8 If the 8-bit library has been built, this option causes it to be used (this is the | |||
default). If the 8-bit library has not been built, this option causes an error. | default). If the 8-bit library has not been built, this option causes an error. | |||
-16 If the 16-bit library has been built, this option causes it to be used. If the 8-bit | -16 If the 16-bit library has been built, this option causes it to be used. If the 8-bit | |||
library has not been built, this is the default. If the 16-bit library has not been | library has not been built, this is the default. If the 16-bit library has not been | |||
built, this option causes an error. | built, this option causes an error. | |||
skipping to change at line 105 | skipping to change at line 105 | |||
-C Output the version number of the PCRE2 library, and all available information about | -C Output the version number of the PCRE2 library, and all available information about | |||
the optional features that are included, and then exit with zero exit code. All other | the optional features that are included, and then exit with zero exit code. All other | |||
options are ignored. If both -C and -LM are present, whichever is first is recognized. | options are ignored. If both -C and -LM are present, whichever is first is recognized. | |||
-C option Output information about a specific build-time option, then exit. This functionality | -C option Output information about a specific build-time option, then exit. This functionality | |||
is intended for use in scripts such as RunTest. The following options output the value | is intended for use in scripts such as RunTest. The following options output the value | |||
and set the exit code as indicated: | and set the exit code as indicated: | |||
ebcdic-nl the code for LF (= NL) in an EBCDIC environment: | ebcdic-nl the code for LF (= NL) in an EBCDIC environment: | |||
0x15 or 0x25 | either 0x15 or 0x25 | |||
0 if used in an ASCII environment | 0 if used in an ASCII/Unicode environment | |||
exit code is always 0 | exit code is always 0 | |||
linksize the configured internal link size (2, 3, or 4) | linksize the configured internal link size (2, 3, or 4) | |||
exit code is set to the link size | exit code is set to the link size | |||
newline the default newline setting: | newline the default newline setting: | |||
CR, LF, CRLF, ANYCRLF, ANY, or NUL | CR, LF, CRLF, ANYCRLF, ANY, or NUL | |||
exit code is always 0 | exit code is always 0 | |||
bsr the default setting for what \R matches: | bsr the default setting for what \R matches: | |||
ANYCRLF or ANY | ANYCRLF or ANY | |||
exit code is always 0 | exit code is always 0 | |||
skipping to change at line 128 | skipping to change at line 128 | |||
same value: | same value: | |||
backslash-C \C is supported (not locked out) | backslash-C \C is supported (not locked out) | |||
ebcdic compiled for an EBCDIC environment | ebcdic compiled for an EBCDIC environment | |||
jit just-in-time support is available | jit just-in-time support is available | |||
pcre2-16 the 16-bit library was built | pcre2-16 the 16-bit library was built | |||
pcre2-32 the 32-bit library was built | pcre2-32 the 32-bit library was built | |||
pcre2-8 the 8-bit library was built | pcre2-8 the 8-bit library was built | |||
unicode Unicode support is available | unicode Unicode support is available | |||
Note that the availability of JIT support in the library does not guarantee that it | ||||
can actually be used because in some environments it is unable to allocate executable | ||||
memory. The option "jitusable" gives more detailed information. It returns one of the | ||||
following values: | ||||
0 JIT is available and usable | ||||
1 JIT is available but cannot allocate executable memory | ||||
2 JIT is not available | ||||
3 Unexpected return from test call to pcre2_jit_compile() | ||||
If an unknown option is given, an error message is output; the exit code is 0. | If an unknown option is given, an error message is output; the exit code is 0. | |||
-d Behave as if each pattern has the debug modifier; the internal form and information | -d Behave as if each pattern has the debug modifier; the internal form and information | |||
about the compiled pattern is output after compilation; -d is equivalent to -b -i. | about the compiled pattern is output after compilation; -d is equivalent to -b -i. | |||
-dfa Behave as if each subject line has the dfa modifier; matching is done using the | -dfa Behave as if each subject line has the dfa modifier; matching is done using the | |||
pcre2_dfa_match() function instead of the default pcre2_match(). | pcre2_dfa_match() function instead of the default pcre2_match(). | |||
-error number[,number,...] | -error number[,number,...] | |||
Call pcre2_get_error_message() for each of the error numbers in the comma-separated | Call pcre2_get_error_message() for each of the error numbers in the comma-separated | |||
list, display the resulting messages on the standard output, then exit with zero exit | list, display the resulting messages on the standard output, then exit with zero exit | |||
code. The numbers may be positive or negative. This is a convenience facility for | code. The numbers may be positive or negative. This is a convenience facility for | |||
PCRE2 maintainers. | PCRE2 maintainers. | |||
-help Output a brief summary of these options and then exit. | -help Output a brief summary of these options and then exit. | |||
-i Behave as if each pattern has the info modifier; information about the compiled | -i Behave as if each pattern has the info modifier; information about the compiled | |||
pattern is given after compilation. | pattern is given after compilation. | |||
-jit Behave as if each pattern line has the jit modifier; after successful compilation, | -jit Behave as if each pattern line has the jit modifier; after successful compilation, | |||
each pattern is passed to the just-in-time compiler, if available. | each pattern is passed to the just-in-time compiler, if available. | |||
-jitfast Behave as if each pattern line has the jitfast modifier; after successful compilation, | -jitfast Behave as if each pattern line has the jitfast modifier; after successful compilation, | |||
each pattern is passed to the just-in-time compiler, if available, and each subject | each pattern is passed to the just-in-time compiler, if available, and each subject | |||
line is passed directly to the JIT matcher via its "fast path". | line is passed directly to the JIT matcher via its "fast path". | |||
-jitverify | -jitverify | |||
Behave as if each pattern line has the jitverify modifier; after successful | Behave as if each pattern line has the jitverify modifier; after successful | |||
compilation, each pattern is passed to the just-in-time compiler, if available, and the use | compilation, each pattern is passed to the just-in-time compiler, if available, and the use | |||
of JIT for matching is verified. | of JIT for matching is verified. | |||
-LM List modifiers: write a list of available pattern and subject modifiers to the | -LM List modifiers: write a list of available pattern and subject modifiers to the | |||
standard output, then exit with zero exit code. All other options are ignored. If both -C | standard output, then exit with zero exit code. All other options are ignored. If both -C | |||
and any -Lx options are present, whichever is first is recognized. | and any -Lx options are present, whichever is first is recognized. | |||
-LP List properties: write a list of recognized Unicode properties to the standard output, | -LP List properties: write a list of recognized Unicode properties to the standard output, | |||
then exit with zero exit code. All other options are ignored. If both -C and any -Lx | then exit with zero exit code. All other options are ignored. If both -C and any -Lx | |||
options are present, whichever is first is recognized. | options are present, whichever is first is recognized. | |||
-LS List scripts: write a list of recognized Unicode script names to the standard output, | -LS List scripts: write a list of recognized Unicode script names to the standard output, | |||
then exit with zero exit code. All other options are ignored. If both -C and any -Lx | then exit with zero exit code. All other options are ignored. If both -C and any -Lx | |||
options are present, whichever is first is recognized. | options are present, whichever is first is recognized. | |||
-pattern modifier-list | -pattern modifier-list | |||
Behave as if each pattern line contains the given modifiers. | Behave as if each pattern line contains the given modifiers. | |||
-q Do not output the version number of pcre2test at the start of execution. | -q Do not output the version number of pcre2test at the start of execution. | |||
-S size On Unix-like systems, set the size of the run-time stack to size mebibytes (units of | -S size On Unix-like systems, set the size of the run-time stack to size mebibytes (units of | |||
1024*1024 bytes). | 1024*1024 bytes). | |||
-subject modifier-list | -subject modifier-list | |||
Behave as if each subject line contains the given modifiers. | Behave as if each subject line contains the given modifiers. | |||
-t Run each compile and match many times with a timer, and output the resulting times per | -t Run each compile and match many times with a timer, and output the resulting times per | |||
compile or match. When JIT is used, separate times are given for the initial compile | compile or match. When JIT is used, separate times are given for the initial compile | |||
and the JIT compile. You can control the number of iterations that are used for timing | and the JIT compile. You can control the number of iterations that are used for timing | |||
by following -t with a number (as a separate item on the command line). For example, | by following -t with a number (as a separate item on the command line). For example, | |||
"-t 1000" iterates 1000 times. The default is to iterate 500,000 times. | "-t 1000" iterates 1000 times. The default is to iterate 500,000 times. | |||
-tm This is like -t except that it times only the matching phase, not the compile phase. | -tm This is like -t except that it times only the matching phase, not the compile phase. | |||
-T -TM These behave like -t and -tm, but in addition, at the end of a run, the total times | -T -TM These behave like -t and -tm, but in addition, at the end of a run, the total times | |||
for all compiles and matches are output. | for all compiles and matches are output. | |||
-version Output the PCRE2 version number and then exit. | -version Output the PCRE2 version number and then exit. | |||
DESCRIPTION | DESCRIPTION | |||
If pcre2test is given two filename arguments, it reads from the first and writes to the second. | If pcre2test is given two filename arguments, it reads from the first and writes to the second. | |||
If the first name is "-", input is taken from the standard input. If pcre2test is given only one | If the first name is "-", input is taken from the standard input. If pcre2test is given only one | |||
argument, it reads from that file and writes to stdout. Otherwise, it reads from stdin and | argument, it reads from that file and writes to stdout. Otherwise, it reads from stdin and | |||
writes to stdout. | writes to stdout. | |||
When pcre2test is built, a configuration option can specify that it should be linked with the | When pcre2test is built, a configuration option can specify that it should be linked with the | |||
libreadline or libedit library. When this is done, if the input is from a terminal, it is read | libreadline or libedit library. When this is done, if the input is from a terminal, it is read | |||
using the readline() function. This provides line-editing and history facilities. The output | using the readline() function. This provides line-editing and history facilities. The output | |||
from the -help option states whether or not readline() will be used. | from the -help option states whether or not readline() will be used. | |||
The program handles any number of tests, each of which consists of a set of input lines. Each | The program handles any number of tests, each of which consists of a set of input lines. Each | |||
set starts with a regular expression pattern, followed by any number of subject lines to be | set starts with a regular expression pattern, followed by any number of subject lines to be | |||
matched against that pattern. In between sets of test data, command lines that begin with # may | matched against that pattern. In between sets of test data, command lines that begin with # may | |||
appear. This file format, with some restrictions, can also be processed by the perltest.sh | appear. This file format, with some restrictions, can also be processed by the perltest.sh | |||
script that is distributed with PCRE2 as a means of checking that the behaviour of PCRE2 and | script that is distributed with PCRE2 as a means of checking that the behaviour of PCRE2 and | |||
Perl is the same. For a specification of perltest.sh, see the comments near its beginning. See | Perl is the same. For a specification of perltest.sh, see the comments near its beginning. See | |||
also the #perltest command below. | also the #perltest command below. | |||
When the input is a terminal, pcre2test prompts for each line of input, using "re>" to prompt | When the input is a terminal, pcre2test prompts for each line of input, using "re>" to prompt | |||
for regular expression patterns, and "data>" to prompt for subject lines. Command lines starting | for regular expression patterns, and "data>" to prompt for subject lines. Command lines starting | |||
with # can be entered only in response to the "re>" prompt. | with # can be entered only in response to the "re>" prompt. | |||
Each subject line is matched separately and independently. If you want to do multi-line matches, | Each subject line is matched separately and independently. If you want to do multi-line matches, | |||
you have to use the \n escape sequence (or \r or \r\n, etc., depending on the newline setting) | you have to use the \n escape sequence (or \r or \r\n, etc., depending on the newline setting) | |||
in a single line of input to encode the newline sequences. There is no limit on the length of | in a single line of input to encode the newline sequences. There is no limit on the length of | |||
subject lines; the input buffer is automatically extended if it is too small. There are | subject lines; the input buffer is automatically extended if it is too small. There are | |||
replication features that make it possible to generate long repetitive pattern or subject lines | replication features that make it possible to generate long repetitive pattern or subject lines | |||
without having to supply them explicitly. | without having to supply them explicitly. | |||
An empty line or the end of the file signals the end of the subject lines for a test, at which | An empty line or the end of the file signals the end of the subject lines for a test, at which | |||
point a new pattern or command line is expected if there is still input to be read. | point a new pattern or command line is expected if there is still input to be read. | |||
COMMAND LINES | COMMAND LINES | |||
In between sets of test data, a line that begins with # is interpreted as a command line. If the | In between sets of test data, a line that begins with # is interpreted as a command line. If the | |||
first character is followed by white space or an exclamation mark, the line is treated as a | first character is followed by white space or an exclamation mark, the line is treated as a | |||
comment, and ignored. Otherwise, the following commands are recognized: | comment, and ignored. Otherwise, the following commands are recognized: | |||
#forbid_utf | #forbid_utf | |||
Subsequent patterns automatically have the PCRE2_NEVER_UTF and PCRE2_NEVER_UCP options set, | Subsequent patterns automatically have the PCRE2_NEVER_UTF and PCRE2_NEVER_UCP options set, | |||
which locks out the use of the PCRE2_UTF and PCRE2_UCP options and the use of (*UTF) and (*UCP) | which locks out the use of the PCRE2_UTF and PCRE2_UCP options and the use of (*UTF) and (*UCP) | |||
at the start of patterns. This command also forces an error if a subsequent pattern contains any | at the start of patterns. This command also forces an error if a subsequent pattern contains any | |||
occurrences of \P, \p, or \X, which are still supported when PCRE2_UTF is not set, but which | occurrences of \P, \p, or \X, which are still supported when PCRE2_UTF is not set, but which | |||
require Unicode property support to be included in the library. | require Unicode property support to be included in the library. | |||
This is a trigger guard that is used in test files to ensure that UTF or Unicode property tests | This is a trigger guard that is used in test files to ensure that UTF or Unicode property tests | |||
are not accidentally added to files that are used when Unicode support is not included in the | are not accidentally added to files that are used when Unicode support is not included in the | |||
library. Setting PCRE2_NEVER_UTF and PCRE2_NEVER_UCP as a default can also be obtained by the | library. Setting PCRE2_NEVER_UTF and PCRE2_NEVER_UCP as a default can also be obtained by the | |||
use of #pattern; the difference is that #forbid_utf cannot be unset, and the automatic options | use of #pattern; the difference is that #forbid_utf cannot be unset, and the automatic options | |||
are not displayed in pattern information, to avoid cluttering up test output. | are not displayed in pattern information, to avoid cluttering up test output. | |||
#load <filename> | #load <filename> | |||
This command is used to load a set of precompiled patterns from a file, as described in the | This command is used to load a set of precompiled patterns from a file, as described in the | |||
section entitled "Saving and restoring compiled patterns" below. | section entitled "Saving and restoring compiled patterns" below. | |||
#loadtables <filename> | #loadtables <filename> | |||
This command is used to load a set of binary character tables that can be accessed by the | This command is used to load a set of binary character tables that can be accessed by the | |||
tables=3 qualifier. Such tables can be created by the pcre2_dftables program with the -b option. | tables=3 qualifier. Such tables can be created by the pcre2_dftables program with the -b option. | |||
#newline_default [<newline-list>] | #newline_default [<newline-list>] | |||
When PCRE2 is built, a default newline convention can be specified. This determines which | When PCRE2 is built, a default newline convention can be specified. This determines which | |||
characters and/or character pairs are recognized as indicating a newline in a pattern or subject | characters and/or character pairs are recognized as indicating a newline in a pattern or subject | |||
string. The default can be overridden when a pattern is compiled. The standard test files | string. The default can be overridden when a pattern is compiled. The standard test files | |||
contain tests of various newline conventions, but the majority of the tests expect a single | contain tests of various newline conventions, but the majority of the tests expect a single | |||
linefeed to be recognized as a newline by default. Without special action the tests would fail when | linefeed to be recognized as a newline by default. Without special action the tests would fail when | |||
PCRE2 is compiled with either CR or CRLF as the default newline. | PCRE2 is compiled with either CR or CRLF as the default newline. | |||
The #newline_default command specifies a list of newline types that are acceptable as the | The #newline_default command specifies a list of newline types that are acceptable as the | |||
default. The types must be one of CR, LF, CRLF, ANYCRLF, ANY, or NUL (in upper or lower case), for | default. The types must be one of CR, LF, CRLF, ANYCRLF, ANY, or NUL (in upper or lower case), for | |||
example: | example: | |||
#newline_default LF Any anyCRLF | #newline_default LF Any anyCRLF | |||
If the default newline is in the list, this command has no effect. Otherwise, except when | If the default newline is in the list, this command has no effect. Otherwise, except when | |||
testing the POSIX API, a newline modifier that specifies the first newline convention in the list | testing the POSIX API, a newline modifier that specifies the first newline convention in the list | |||
(LF in the above example) is added to any pattern that does not already have a newline modifier. | (LF in the above example) is added to any pattern that does not already have a newline modifier. | |||
If the newline list is empty, the feature is turned off. This command is present in a number of | If the newline list is empty, the feature is turned off. This command is present in a number of | |||
the standard test input files. | the standard test input files. | |||
When the POSIX API is being tested there is no way to override the default newline convention, | When the POSIX API is being tested there is no way to override the default newline convention, | |||
though it is possible to set the newline convention from within the pattern. A warning is given | though it is possible to set the newline convention from within the pattern. A warning is given | |||
if the posix or posix_nosub modifier is used when #newline_default would set a default for the | if the posix or posix_nosub modifier is used when #newline_default would set a default for the | |||
non-POSIX API. | non-POSIX API. | |||
#pattern <modifier-list> | #pattern <modifier-list> | |||
This command sets a default modifier list that applies to all subsequent patterns. Modifiers on | This command sets a default modifier list that applies to all subsequent patterns. Modifiers on | |||
a pattern can change these settings. | a pattern can change these settings. | |||
#perltest | #perltest | |||
This line is used in test files that can also be processed by perltest.sh to confirm that Perl | This line is used in test files that can also be processed by perltest.sh to confirm that Perl | |||
gives the same results as PCRE2. Subsequent tests are checked for the use of pcre2test features | gives the same results as PCRE2. Subsequent tests are checked for the use of pcre2test features | |||
that are incompatible with the perltest.sh script. | that are incompatible with the perltest.sh script. | |||
Patterns must use '/' as their delimiter, and only certain modifiers are supported. Comment | Patterns must use '/' as their delimiter, and only certain modifiers are supported. Comment | |||
lines, #pattern commands, and #subject commands that set or unset "mark" are recognized and | lines, #pattern commands, and #subject commands that set or unset "mark" are recognized and | |||
acted on. The #perltest, #forbid_utf, and #newline_default commands, which are needed in the | acted on. The #perltest, #forbid_utf, and #newline_default commands, which are needed in the | |||
relevant pcre2test files, are silently ignored. All other command lines are ignored, but give a | relevant pcre2test files, are silently ignored. All other command lines are ignored, but give a | |||
warning message. The #perltest command helps detect tests that are accidentally put in the wrong | warning message. The #perltest command helps detect tests that are accidentally put in the wrong | |||
file or use the wrong delimiter. For more details of the perltest.sh script see the comments it | file or use the wrong delimiter. For more details of the perltest.sh script see the comments it | |||
contains. | contains. | |||
#pop [<modifiers>] | #pop [<modifiers>] | |||
#popcopy [<modifiers>] | #popcopy [<modifiers>] | |||
These commands are used to manipulate the stack of compiled patterns, as described in the | These commands are used to manipulate the stack of compiled patterns, as described in the | |||
section entitled "Saving and restoring compiled patterns" below. | section entitled "Saving and restoring compiled patterns" below. | |||
#save <filename> | #save <filename> | |||
This command is used to save a set of compiled patterns to a file, as described in the section | This command is used to save a set of compiled patterns to a file, as described in the section | |||
entitled "Saving and restoring compiled patterns" below. | entitled "Saving and restoring compiled patterns" below. | |||
#subject <modifier-list> | #subject <modifier-list> | |||
This command sets a default modifier list that applies to all subsequent subject lines. | This command sets a default modifier list that applies to all subsequent subject lines. | |||
Modifiers on a subject line can change these settings. | Modifiers on a subject line can change these settings. | |||
MODIFIER SYNTAX | MODIFIER SYNTAX | |||
Modifier lists are used with both pattern and subject lines. Items in a list are separated by | Modifier lists are used with both pattern and subject lines. Items in a list are separated by | |||
commas followed by optional white space. Trailing whitespace in a modifier list is ignored. Some | commas followed by optional white space. Trailing whitespace in a modifier list is ignored. Some | |||
modifiers may be given for both patterns and subject lines, whereas others are valid only for | modifiers may be given for both patterns and subject lines, whereas others are valid only for | |||
one or the other. Each modifier has a long name, for example "anchored", and some of them must | one or the other. Each modifier has a long name, for example "anchored", and some of them must | |||
be followed by an equals sign and a value, for example, "offset=12". Values cannot contain comma | be followed by an equals sign and a value, for example, "offset=12". Values cannot contain comma | |||
characters, but may contain spaces. Modifiers that do not take values may be preceded by a minus | characters, but may contain spaces. Modifiers that do not take values may be preceded by a minus | |||
sign to turn off a previous setting. | sign to turn off a previous setting. | |||
A few of the more common modifiers can also be specified as single letters, for example "i" for | A few of the more common modifiers can also be specified as single letters, for example "i" for | |||
"caseless". In documentation, following the Perl convention, these are written with a slash | "caseless". In documentation, following the Perl convention, these are written with a slash | |||
("the /i modifier") for clarity. Abbreviated modifiers must all be concatenated in the first | ("the /i modifier") for clarity. Abbreviated modifiers must all be concatenated in the first | |||
item of a modifier list. If the first item is not recognized as a long modifier name, it is | item of a modifier list. If the first item is not recognized as a long modifier name, it is | |||
interpreted as a sequence of these abbreviations. For example: | interpreted as a sequence of these abbreviations. For example: | |||
/abc/ig,newline=cr,jit=3 | /abc/ig,newline=cr,jit=3 | |||
This is a pattern line whose modifier list starts with two one-lette r modifiers (/i and /g). The | This is a pattern line whose modifier list starts with two one-lette r modifiers (/i and /g). The | |||
lower-case abbreviated modifiers are the same as used in Perl. | lower-case abbreviated modifiers are the same as used in Perl. | |||
PATTERN SYNTAX | PATTERN SYNTAX | |||
A pattern line must start with one of the following characters (common symbols, excluding pat‐ | A pattern line must start with one of the following characters (common symbols, excluding pat‐ | |||
tern meta-characters): | tern meta-characters): | |||
/ ! " ' ` - = _ : ; , % & @ ~ | / ! " ' ` - = _ : ; , % & @ ~ | |||
This is interpreted as the pattern's delimiter. A regular expression may be continued over sev‐ | This is interpreted as the pattern's delimiter. A regular expression may be continued over sev‐ | |||
eral input lines, in which case the newline characters are included within it. It is possible to | eral input lines, in which case the newline characters are included within it. It is possible to | |||
include the delimiter as a literal within the pattern by escaping it with a backslash, for exam‐ | include the delimiter as a literal within the pattern by escaping it with a backslash, for exam‐ | |||
ple | ple | |||
/abc\/def/ | /abc\/def/ | |||
If you do this, the escape and the delimiter form part of the pattern, but since the delimiters | If you do this, the escape and the delimiter form part of the pattern, but since the delimiters | |||
are all non-alphanumeric, the inclusion of the backslash does not affect the pattern's interpre‐ | are all non-alphanumeric, the inclusion of the backslash does not affect the pattern's interpre‐ | |||
tation. Note, however, that this trick does not work within \Q...\E literal bracketing because | tation. Note, however, that this trick does not work within \Q...\E literal bracketing because | |||
the backslash will itself be interpreted as a literal. If the terminating delimiter is immedi‐ | the backslash will itself be interpreted as a literal. If the terminating delimiter is immedi‐ | |||
ately followed by a backslash, for example, | ately followed by a backslash, for example, | |||
/abc/\ | /abc/\ | |||
a backslash is added to the end of the pattern. This is done to provide a way of testing the er‐ | a backslash is added to the end of the pattern. This is done to provide a way of testing the er‐ | |||
ror condition that arises if a pattern finishes with a backslash, because | ror condition that arises if a pattern finishes with a backslash, because | |||
/abc\/ | /abc\/ | |||
is interpreted as the first line of a pattern that starts with "abc/", causing pcre2test to read | is interpreted as the first line of a pattern that starts with "abc/", causing pcre2test to read | |||
the next line as a continuation of the regular expression. | the next line as a continuation of the regular expression. | |||
A pattern can be followed by a modifier list (details below). | A pattern can be followed by a modifier list (details below). | |||
SUBJECT LINE SYNTAX | SUBJECT LINE SYNTAX | |||
Before each subject line is passed to pcre2_match(), pcre2_dfa_match(), or pcre2_jit_match(), | Before each subject line is passed to pcre2_match(), pcre2_dfa_match(), or pcre2_jit_match(), | |||
leading and trailing white space is removed, and the line is scanned for backslash escapes, un‐ | leading and trailing white space is removed, and the line is scanned for backslash escapes, un‐ | |||
less the subject_literal modifier was set for the pattern. The following provide a means of en‐ | less the subject_literal modifier was set for the pattern. The following provide a means of en‐ | |||
coding non-printing characters in a visible way: | coding non-printing characters in a visible way: | |||
\a alarm (BEL, \x07) | \a alarm (BEL, \x07) | |||
\b backspace (\x08) | \b backspace (\x08) | |||
\e escape (\x27) | \e escape (\x27) | |||
\f form feed (\x0c) | \f form feed (\x0c) | |||
\n newline (\x0a) | \n newline (\x0a) | |||
\N{U+hh...} unicode character (any number of hex digits) | |||
\r carriage return (\x0d) | \r carriage return (\x0d) | |||
\t tab (\x09) | \t tab (\x09) | |||
\v vertical tab (\x0b) | \v vertical tab (\x0b) | |||
\nnn octal character (up to 3 octal digits); always a byte unless > 255 in UTF-8 or 16-bit or 32-bit mode | \ddd octal number (up to 3 octal digits); represent a single code point unless larger than 255 with the 8-bit library | |||
\o{dd...} octal character (any number of octal digits} | \o{dd...} octal number (any number of octal digits} representing a character in UTF mode or a code point | |||
\xhh hexadecimal byte (up to 2 hex digits) | \xhh hexadecimal byte (up to 2 hex digits) | |||
\x{hh...} hexadecimal character (any number of hex digits) | \x{hh...} hexadecimal number (up to 8 hex digits) representing a character in UTF mode or a code point | |||
The use of \x{hh...} is not dependent on the use of the utf modifier on the pattern. It is recognized always. There may be any number of hexadecimal digits inside the braces; invalid values provoke error messages. | Invoking \N{U+hh...} or \x{hh...} doesn't require the use of the utf modifier on the pattern. It is always recognized. There may be any number of hexadecimal digits inside the braces; invalid values provoke error messages but when using \N{U+hh...} with some invalid unicode characters they will be accepted with a warning instead. | |||
Note that \xhh specifies one byte rather than one character in UTF-8 mode; this makes it possible to construct invalid UTF-8 sequences for testing purposes. On the other hand, \x{hh} is interpreted as a UTF-8 character in UTF-8 mode, generating more than one byte if the value is greater than 127. When testing the 8-bit library not in UTF-8 mode, \x{hh} generates one byte for values less than 256, and causes an error for greater values. | Note that even in UTF-8 mode, \xhh (and depending of how large, \ddd) describe one byte rather than one character; this makes it possible to construct invalid UTF-8 sequences for testing purposes. On the other hand, \x{hh...} is interpreted as a UTF-8 character in UTF-8 mode, only generating more than one byte if the value is greater than 127. To avoid the ambiguity it is preferred to use \N{U+hh...} when describing characters. When testing the 8-bit library not in UTF-8 mode, \x{hh} generates one byte for values that could fit on it, and causes an error for greater values. | |||
In UTF-16 mode, all 4-digit \x{hhhh} values are accepted. This makes it possible to construct invalid UTF-16 sequences for testing purposes. | When testing the 16-bit library, not in UTF-16 mode, all 4-digit \x{hhhh} values are accepted. This makes it possible to construct invalid UTF-16 sequences for testing purposes. | |||
In UTF-32 mode, all 4- to 8-digit \x{...} values are accepted. This makes it possible to construct invalid UTF-32 sequences for testing purposes. | When testing the 32-bit library, not in UTF-32 mode, all 4 to 8-digit \x{...} values are accepted. This makes it possible to construct invalid UTF-32 sequences for testing purposes. | |||
There is a special backslash sequence that specifies replication of one or more characters: | There is a special backslash sequence that specifies replication of one or more characters: | |||
\[<characters>]{<count>} | \[<characters>]{<count>} | |||
This makes it possible to test long strings without having to provide them as part of the file. | This makes it possible to test long strings without having to provide them as part of the file. | |||
For example: | For example: | |||
\[abc]{4} | \[abc]{4} | |||
is converted to "abcabcabcabc". This feature does not support nesting. To include a closing | is converted to "abcabcabcabc". This feature does not support nesting. To include a closing | |||
square bracket in the characters, code it as \x5D. | square bracket in the characters, code it as \x5D. | |||
A backslash followed by an equals sign marks the end of the subject string and the start of a | A backslash followed by an equals sign marks the end of the subject string and the start of a | |||
modifier list. For example: | modifier list. For example: | |||
abc\=notbol,notempty | abc\=notbol,notempty | |||
If the subject string is empty and \= is followed by whitespace, the line is treated as a com‐ | If the subject string is empty and \= is followed by whitespace, the line is treated as a com‐ | |||
ment line, and is not used for matching. For example: | ment line, and is not used for matching. For example: | |||
\= This is a comment. | \= This is a comment. | |||
abc\= This is an invalid modifier list. | abc\= This is an invalid modifier list. | |||
A backslash followed by any other non-alphanumeric character just escapes that character. A | A backslash followed by any other non-alphanumeric character just escapes that character. A | |||
backslash followed by anything else causes an error. However, if the very last character in the | backslash followed by anything else causes an error. However, if the very last character in the | |||
line is a backslash (and there is no modifier list), it is ignored. This gives a way of passing | line is a backslash (and there is no modifier list), it is ignored. This gives a way of passing | |||
an empty line as data, since a real empty line terminates the data input. | an empty line as data, since a real empty line terminates the data input. | |||
If the subject_literal modifier is set for a pattern, all subject lines that follow are treated | If the subject_literal modifier is set for a pattern, all subject lines that follow are treated | |||
as literals, with no special treatment of backslashes. No replication is possible, and any sub‐ | as literals, with no special treatment of backslashes. No replication is possible, and any sub‐ | |||
ject modifiers must be set as defaults by a #subject command. | ject modifiers must be set as defaults by a #subject command. | |||
PATTERN MODIFIERS | PATTERN MODIFIERS | |||
There are several types of modifier that can appear in pattern lines. Except where noted below, | There are several types of modifier that can appear in pattern lines. Except where noted below, | |||
they may also be used in #pattern commands. A pattern's modifier list can add to or override de‐ | they may also be used in #pattern commands. A pattern's modifier list can add to or override de‐ | |||
fault modifiers that were set by a previous #pattern command. | fault modifiers that were set by a previous #pattern command. | |||
Setting compilation options | Setting compilation options | |||
The following modifiers set options for pcre2_compile(). Most of them set bits in the options | The following modifiers set options for pcre2_compile(). Most of them set bits in the options | |||
argument of that function, but those whose names start with PCRE2_EXTRA are additional options | argument of that function, but those whose names start with PCRE2_EXTRA are additional options | |||
that are set in the compile context. Some of these options have single-letter abbreviations. | that are set in the compile context. Some of these options have single-letter abbreviations. | |||
There is special handling for /x: if a second x is present, PCRE2_EXTENDED is converted into | There is special handling for /x: if a second x is present, PCRE2_EXTENDED is converted into | |||
PCRE2_EXTENDED_MORE as in Perl. A third appearance adds PCRE2_EXTENDED as well, though this | PCRE2_EXTENDED_MORE as in Perl. A third appearance adds PCRE2_EXTENDED as well, though this | |||
makes no difference to the way pcre2_compile() behaves. See pcre2api for a description of the | makes no difference to the way pcre2_compile() behaves. See pcre2api for a description of the | |||
effects of these options. | effects of these options. | |||
allow_empty_class set PCRE2_ALLOW_EMPTY_CLASS | allow_empty_class set PCRE2_ALLOW_EMPTY_CLASS | |||
allow_lookaround_bsk set PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK | allow_lookaround_bsk set PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK | |||
allow_surrogate_escapes set PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES | allow_surrogate_escapes set PCRE2_EXTRA_ALLOW_SURROGATE_ESCAPES | |||
alt_bsux set PCRE2_ALT_BSUX | alt_bsux set PCRE2_ALT_BSUX | |||
alt_circumflex set PCRE2_ALT_CIRCUMFLEX | alt_circumflex set PCRE2_ALT_CIRCUMFLEX | |||
alt_extended_class set PCRE2_ALT_EXTENDED_CLASS | ||||
alt_verbnames set PCRE2_ALT_VERBNAMES | alt_verbnames set PCRE2_ALT_VERBNAMES | |||
anchored set PCRE2_ANCHORED | anchored set PCRE2_ANCHORED | |||
/a ascii_all set all ASCII options | /a ascii_all set all ASCII options | |||
ascii_bsd set PCRE2_EXTRA_ASCII_BSD | ascii_bsd set PCRE2_EXTRA_ASCII_BSD | |||
ascii_bss set PCRE2_EXTRA_ASCII_BSS | ascii_bss set PCRE2_EXTRA_ASCII_BSS | |||
ascii_bsw set PCRE2_EXTRA_ASCII_BSW | ascii_bsw set PCRE2_EXTRA_ASCII_BSW | |||
ascii_digit set PCRE2_EXTRA_ASCII_DIGIT | ascii_digit set PCRE2_EXTRA_ASCII_DIGIT | |||
ascii_posix set PCRE2_EXTRA_ASCII_POSIX | ascii_posix set PCRE2_EXTRA_ASCII_POSIX | |||
auto_callout set PCRE2_AUTO_CALLOUT | auto_callout set PCRE2_AUTO_CALLOUT | |||
bad_escape_is_literal set PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL | bad_escape_is_literal set PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL | |||
skipping to change at line 491 | skipping to change at line 508 | |||
/xx extended_more set PCRE2_EXTENDED_MORE | /xx extended_more set PCRE2_EXTENDED_MORE | |||
extra_alt_bsux set PCRE2_EXTRA_ALT_BSUX | extra_alt_bsux set PCRE2_EXTRA_ALT_BSUX | |||
firstline set PCRE2_FIRSTLINE | firstline set PCRE2_FIRSTLINE | |||
literal set PCRE2_LITERAL | literal set PCRE2_LITERAL | |||
match_line set PCRE2_EXTRA_MATCH_LINE | match_line set PCRE2_EXTRA_MATCH_LINE | |||
match_invalid_utf set PCRE2_MATCH_INVALID_UTF | match_invalid_utf set PCRE2_MATCH_INVALID_UTF | |||
match_unset_backref set PCRE2_MATCH_UNSET_BACKREF | match_unset_backref set PCRE2_MATCH_UNSET_BACKREF | |||
match_word set PCRE2_EXTRA_MATCH_WORD | match_word set PCRE2_EXTRA_MATCH_WORD | |||
/m multiline set PCRE2_MULTILINE | /m multiline set PCRE2_MULTILINE | |||
never_backslash_c set PCRE2_NEVER_BACKSLASH_C | never_backslash_c set PCRE2_NEVER_BACKSLASH_C | |||
never_callout set PCRE2_EXTRA_NEVER_CALLOUT | ||||
never_ucp set PCRE2_NEVER_UCP | never_ucp set PCRE2_NEVER_UCP | |||
never_utf set PCRE2_NEVER_UTF | never_utf set PCRE2_NEVER_UTF | |||
/n no_auto_capture set PCRE2_NO_AUTO_CAPTURE | /n no_auto_capture set PCRE2_NO_AUTO_CAPTURE | |||
no_auto_possess set PCRE2_NO_AUTO_POSSESS | no_auto_possess set PCRE2_NO_AUTO_POSSESS | |||
no_bs0 set PCRE2_EXTRA_NO_BS0 | ||||
no_dotstar_anchor set PCRE2_NO_DOTSTAR_ANCHOR | no_dotstar_anchor set PCRE2_NO_DOTSTAR_ANCHOR | |||
no_start_optimize set PCRE2_NO_START_OPTIMIZE | no_start_optimize set PCRE2_NO_START_OPTIMIZE | |||
no_utf_check set PCRE2_NO_UTF_CHECK | no_utf_check set PCRE2_NO_UTF_CHECK | |||
python_octal set PCRE2_EXTRA_PYTHON_OCTAL | ||||
turkish_casing set PCRE2_EXTRA_TURKISH_CASING | ||||
ucp set PCRE2_UCP | ucp set PCRE2_UCP | |||
ungreedy set PCRE2_UNGREEDY | ungreedy set PCRE2_UNGREEDY | |||
use_offset_limit set PCRE2_USE_OFFSET_LIMIT | use_offset_limit set PCRE2_USE_OFFSET_LIMIT | |||
utf set PCRE2_UTF | utf set PCRE2_UTF | |||
As well as turning on the PCRE2_UTF option, the utf modifier causes all non-printing characters | As well as turning on the PCRE2_UTF option, the utf modifier causes all non-printing characters | |||
in output strings to be printed using the \x{hh...} notation. Otherwise, those less than 0x100 | in output strings to be printed using the \x{hh...} notation. Otherwise, those less than 0x100 | |||
are output in hex without the curly brackets. Setting utf in 16-bit or 32-bit mode also causes | are output in hex without the curly brackets. Setting utf in 16-bit or 32-bit mode also causes | |||
pattern and subject strings to be translated to UTF-16 or UTF-32, respectively, before being | pattern and subject strings to be translated to UTF-16 or UTF-32, respectively, before being | |||
passed to library functions. | passed to library functions. | |||
The following modifiers enable or disable performance optimizations by calling pcre2_set_opti‐ | |||
mize() before invoking the regex compiler. | |||
optimization_full enable all optional optimizations | ||||
optimization_none disable all optional optimizations | ||||
auto_possess auto-possessify variable quantifiers | ||||
auto_possess_off don't auto-possessify variable quantifiers | |||
dotstar_anchor anchor patterns starting with .* | ||||
dotstar_anchor_off don't anchor patterns starting with .* | ||||
start_optimize enable pre-scan of subject string | ||||
start_optimize_off disable pre-scan of subject string | ||||
See the pcre2_set_optimize documentation for details on these optimizations. | |||
Setting compilation controls | Setting compilation controls | |||
The following modifiers affect the compilation process or request information about the pattern. | The following modifiers affect the compilation process or request information about the pattern. | |||
There are single-letter abbreviations for some that are heavily used in the test files. | There are single-letter abbreviations for some that are heavily used in the test files. | |||
bsr=[anycrlf|unicode] specify \R handling | ||||
/B bincode show binary code without lengths | /B bincode show binary code without lengths | |||
bsr=[anycrlf|unicode] specify \R handling | ||||
callout_info show callout information | callout_info show callout information | |||
convert=<options> request foreign pattern conversion | convert=<options> request foreign pattern conversion | |||
convert_glob_escape=c set glob escape character | convert_glob_escape=c set glob escape character | |||
convert_glob_separator=c set glob separator character | convert_glob_separator=c set glob separator character | |||
convert_length set convert buffer length | convert_length set convert buffer length | |||
debug same as info,fullbincode | debug same as info,fullbincode | |||
expand expand repetition syntax in pattern | ||||
framesize show matching frame size | framesize show matching frame size | |||
fullbincode show binary code with lengths | fullbincode show binary code with lengths | |||
/I info show info about compiled pattern | /I info show info about compiled pattern | |||
hex unquoted characters are hexadecimal | hex unquoted characters are hexadecimal | |||
jit[=<number>] use JIT | jit[=<number>] use JIT | |||
jitfast use JIT fast path | jitfast use JIT fast path | |||
jitverify verify JIT use | jitverify verify JIT use | |||
locale=<name> use this locale | locale=<name> use this locale | |||
max_pattern_compiled ) set maximum compiled pattern | max_pattern_compiled ) set maximum compiled pattern | |||
_length=<n> ) length (bytes) | _length=<n> ) length (bytes) | |||
skipping to change at line 543 | skipping to change at line 579 | |||
max_varlookbehind=<n> set maximum variable lookbehind length | max_varlookbehind=<n> set maximum variable lookbehind length | |||
memory show memory used | memory show memory used | |||
newline=<type> set newline type | newline=<type> set newline type | |||
null_context compile with a NULL context | null_context compile with a NULL context | |||
null_pattern pass pattern as NULL | null_pattern pass pattern as NULL | |||
parens_nest_limit=<n> set maximum parentheses depth | parens_nest_limit=<n> set maximum parentheses depth | |||
posix use the POSIX API | posix use the POSIX API | |||
posix_nosub use the POSIX API with REG_NOSUB | posix_nosub use the POSIX API with REG_NOSUB | |||
push push compiled pattern onto the stack | push push compiled pattern onto the stack | |||
pushcopy push a copy onto the stack | pushcopy push a copy onto the stack | |||
pushtablescopy push a copy with tables onto the stack | |||
stackguard=<number> test the stackguard feature | stackguard=<number> test the stackguard feature | |||
subject_literal treat all subject lines as literal | subject_literal treat all subject lines as literal | |||
tables=[0|1|2|3] select internal tables | tables=[0|1|2|3] select internal tables | |||
use_length do not zero-terminate the pattern | use_length do not zero-terminate the pattern | |||
utf8_input treat input as UTF-8 | utf8_input treat input as UTF-8 | |||
The effects of these modifiers are described in the following sections. | The effects of these modifiers are described in the following sections. | |||
Newline and \R handling | Newline and \R handling | |||
skipping to change at line 858 | skipping to change at line 895 | |||
allvector show the entire ovector | allvector show the entire ovector | |||
allusedtext show all consulted text | allusedtext show all consulted text | |||
altglobal alternative global matching | altglobal alternative global matching | |||
/g global global matching | /g global global matching | |||
heapframes_size show match data heapframes size | heapframes_size show match data heapframes size | |||
jitstack=<n> set size of JIT stack | jitstack=<n> set size of JIT stack | |||
mark show mark values | mark show mark values | |||
replace=<string> specify a replacement string | replace=<string> specify a replacement string | |||
startchar show starting character when relevant | startchar show starting character when relevant | |||
substitute_callout use substitution callouts | substitute_callout use substitution callouts | |||
substitute_case_callout use substitution case callouts | ||||
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED | substitute_extended use PCRE2_SUBSTITUTE_EXTENDED | |||
substitute_literal use PCRE2_SUBSTITUTE_LITERAL | substitute_literal use PCRE2_SUBSTITUTE_LITERAL | |||
substitute_matched use PCRE2_SUBSTITUTE_MATCHED | substitute_matched use PCRE2_SUBSTITUTE_MATCHED | |||
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH | substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH | |||
substitute_replacement_only use PCRE2_SUBSTITUTE_REPLACEMENT_ONLY | substitute_replacement_only use PCRE2_SUBSTITUTE_REPLACEMENT_ONLY | |||
substitute_skip=<n> skip substitution <n> | substitute_skip=<n> skip substitution <n> | |||
substitute_stop=<n> skip substitution <n> and following | substitute_stop=<n> skip substitution <n> and following | |||
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET | substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET | |||
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY | substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY | |||
skipping to change at line 925 | skipping to change at line 963 | |||
The convert_glob_escape and convert_glob_separator modifiers can be used to specify the escape | The convert_glob_escape and convert_glob_separator modifiers can be used to specify the escape | |||
and separator characters for glob processing, overriding the defaults, which are operating-sys‐ | and separator characters for glob processing, overriding the defaults, which are operating-sys‐ | |||
tem dependent. | tem dependent. | |||
SUBJECT MODIFIERS | SUBJECT MODIFIERS | |||
The modifiers that can appear in subject lines and the #subject command are of two types. | The modifiers that can appear in subject lines and the #subject command are of two types. | |||
Setting match options | Setting match options | |||
The following modifiers set options for pcre2_match() or pcre2_dfa_match(). See pcreapi for a | The following modifiers set options for pcre2_match() or pcre2_dfa_match(). See pcre2api for a | |||
description of their effects. | description of their effects. | |||
anchored set PCRE2_ANCHORED | anchored set PCRE2_ANCHORED | |||
copy_matched_subject set PCRE2_COPY_MATCHED_SUBJECT | ||||
endanchored set PCRE2_ENDANCHORED | endanchored set PCRE2_ENDANCHORED | |||
dfa_restart set PCRE2_DFA_RESTART | dfa_restart set PCRE2_DFA_RESTART | |||
dfa_shortest set PCRE2_DFA_SHORTEST | dfa_shortest set PCRE2_DFA_SHORTEST | |||
disable_recurseloop_check set PCRE2_DISABLE_RECURSELOOP_CHECK | disable_recurseloop_check set PCRE2_DISABLE_RECURSELOOP_CHECK | |||
no_jit set PCRE2_NO_JIT | no_jit set PCRE2_NO_JIT | |||
no_utf_check set PCRE2_NO_UTF_CHECK | no_utf_check set PCRE2_NO_UTF_CHECK | |||
notbol set PCRE2_NOTBOL | notbol set PCRE2_NOTBOL | |||
notempty set PCRE2_NOTEMPTY | notempty set PCRE2_NOTEMPTY | |||
notempty_atstart set PCRE2_NOTEMPTY_ATSTART | notempty_atstart set PCRE2_NOTEMPTY_ATSTART | |||
noteol set PCRE2_NOTEOL | noteol set PCRE2_NOTEOL | |||
skipping to change at line 972 | skipping to change at line 1011 | |||
Setting match controls | Setting match controls | |||
The following modifiers affect the matching process or request additional information. Some of | The following modifiers affect the matching process or request additional information. Some of | |||
them may also be specified on a pattern line (see above), in which case they apply to every sub‐ | them may also be specified on a pattern line (see above), in which case they apply to every sub‐ | |||
ject line that is matched against that pattern, but can be overridden by modifiers on the sub‐ | ject line that is matched against that pattern, but can be overridden by modifiers on the sub‐ | |||
ject. | ject. | |||
aftertext show text after match | aftertext show text after match | |||
allaftertext show text after captures | allaftertext show text after captures | |||
allcaptures show all captures | allcaptures show all captures | |||
allvector show the entire ovector | ||||
allusedtext show all consulted text (non-JIT only) | allusedtext show all consulted text (non-JIT only) | |||
allvector show the entire ovector | ||||
altglobal alternative global matching | altglobal alternative global matching | |||
callout_capture show captures at callout time | callout_capture show captures at callout time | |||
callout_data=<n> set a value to pass via callouts | callout_data=<n> set a value to pass via callouts | |||
callout_error=<n>[:<m>] control callout error | callout_error=<n>[:<m>] control callout error | |||
callout_extra show extra callout information | callout_extra show extra callout information | |||
callout_fail=<n>[:<m>] control callout failure | callout_fail=<n>[:<m>] control callout failure | |||
callout_no_where do not show position of a callout | callout_no_where do not show position of a callout | |||
callout_none do not supply a callout function | callout_none do not supply a callout function | |||
copy=<number or name> copy captured substring | copy=<number or name> copy captured substring | |||
depth_limit=<n> set a depth limit | depth_limit=<n> set a depth limit | |||
skipping to change at line 1007 | skipping to change at line 1046 | |||
null_replacement substitute with NULL replacement | null_replacement substitute with NULL replacement | |||
null_subject match with NULL subject | null_subject match with NULL subject | |||
offset=<n> set starting offset | offset=<n> set starting offset | |||
offset_limit=<n> set offset limit | offset_limit=<n> set offset limit | |||
ovector=<n> set size of output vector | ovector=<n> set size of output vector | |||
recursion_limit=<n> obsolete synonym for depth_limit | recursion_limit=<n> obsolete synonym for depth_limit | |||
replace=<string> specify a replacement string | replace=<string> specify a replacement string | |||
startchar show startchar when relevant | startchar show startchar when relevant | |||
startoffset=<n> same as offset=<n> | startoffset=<n> same as offset=<n> | |||
substitute_callout use substitution callouts | substitute_callout use substitution callouts | |||
substitute_extedded use PCRE2_SUBSTITUTE_EXTENDED | substitute_case_callout use substitution case callouts | |||
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED | ||||
substitute_literal use PCRE2_SUBSTITUTE_LITERAL | substitute_literal use PCRE2_SUBSTITUTE_LITERAL | |||
substitute_matched use PCRE2_SUBSTITUTE_MATCHED | substitute_matched use PCRE2_SUBSTITUTE_MATCHED | |||
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH | substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH | |||
substitute_replacement_only use PCRE2_SUBSTITUTE_REPLACEMENT_ONLY | substitute_replacement_only use PCRE2_SUBSTITUTE_REPLACEMENT_ONLY | |||
substitute_skip=<n> skip substitution number n | substitute_skip=<n> skip substitution number n | |||
substitute_stop=<n> skip substitution number n and greater | substitute_stop=<n> skip substitution number n and greater | |||
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET | substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET | |||
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY | substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY | |||
zero_terminate pass the subject as zero-terminated | zero_terminate pass the subject as zero-terminated | |||
skipping to change at line 1236 | skipping to change at line 1276 | |||
1(1) Old 0 3 "abc" New 0 5 "<abc> SKIPPED" | 1(1) Old 0 3 "abc" New 0 5 "<abc> SKIPPED" | |||
2(1) Old 6 9 "abc" New 6 11 "<abc>" | 2(1) Old 6 9 "abc" New 6 11 "<abc>" | |||
2: abcdef<abc>pqr | 2: abcdef<abc>pqr | |||
abcdefabcpqr\=substitute_stop=1 | abcdefabcpqr\=substitute_stop=1 | |||
1(1) Old 0 3 "abc" New 0 5 "<abc> STOPPED" | 1(1) Old 0 3 "abc" New 0 5 "<abc> STOPPED" | |||
1: abcdefabcpqr | 1: abcdefabcpqr | |||
If both are set for the same number, stop takes precedence. Only a single skip or stop is sup‐ | If both are set for the same number, stop takes precedence. Only a single skip or stop is sup‐ | |||
ported, which is sufficient for testing that the feature works. | ported, which is sufficient for testing that the feature works. | |||
Testing substitute case callouts | ||||
If the substitute_case_callout modifier is set, a substitution case callout function is set up. | |||
The callout function is called for each substituted chunk which is to be case-transformed. | |||
The callout function passed is a fixed function with implementation for certain behaviours: in‐ | |||
puts which shrink when case-transformed; inputs which grow; inputs with distinct upper/lower/ti‐ | |||
tlecase forms. The characters which are not special-cased for testing purposes are left unmodi‐ | |||
fied, as if they are caseless characters. | |||
Setting the JIT stack size | Setting the JIT stack size | |||
The jitstack modifier provides a way of setting the maximum stack size that is used by the just- | The jitstack modifier provides a way of setting the maximum stack size that is used by the just- | |||
in-time optimization code. It is ignored if JIT optimization is not being used. The value is a | in-time optimization code. It is ignored if JIT optimization is not being used. The value is a | |||
number of kibibytes (units of 1024 bytes). Setting zero reverts to the default of 32KiB. Provid‐ | number of kibibytes (units of 1024 bytes). Setting zero reverts to the default of 32KiB. Provid‐ | |||
ing a stack that is larger than the default is necessary only for very complicated patterns. If | ing a stack that is larger than the default is necessary only for very complicated patterns. If | |||
jitstack is set non-zero on a subject line it overrides any value that was set on the pattern. | jitstack is set non-zero on a subject line it overrides any value that was set on the pattern. | |||
Setting heap, match, and depth limits | Setting heap, match, and depth limits | |||
The heap_limit, match_limit, and depth_limit modifiers set the appropriate limits in the match | The heap_limit, match_limit, and depth_limit modifiers set the appropriate limits in the match | |||
context. These values are ignored when the find_limits or find_limits_noheap modifier is speci‐ | context. These values are ignored when the find_limits or find_limits_noheap modifier is speci‐ | |||
fied. | fied. | |||
Finding minimum limits | Finding minimum limits | |||
If the find_limits modifier is present on a subject line, pcre2test calls the relevant matching | If the find_limits modifier is present on a subject line, pcre2test calls the relevant matching | |||
function several times, setting different values in the match context via | function several times, setting different values in the match context via | |||
pcre2_set_heap_limit(), pcre2_set_match_limit(), or pcre2_set_depth_limit() until it finds the | pcre2_set_heap_limit(), pcre2_set_match_limit(), or pcre2_set_depth_limit() until it finds the | |||
smallest value for each parameter that allows the match to complete without a "limit exceeded" | smallest value for each parameter that allows the match to complete without a "limit exceeded" | |||
error. The match itself may succeed or fail. An alternative modifier, find_limits_noheap, omits | error. The match itself may succeed or fail. An alternative modifier, find_limits_noheap, omits | |||
the heap limit. This is used in the standard tests, because the minimum heap limit varies be‐ | the heap limit. This is used in the standard tests, because the minimum heap limit varies be‐ | |||
tween systems. If JIT is being used, only the match limit is relevant, and the other two are au‐ | tween systems. If JIT is being used, only the match limit is relevant, and the other two are au‐ | |||
tomatically omitted. | tomatically omitted. | |||
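The search for a smallest workable limit described above can be sketched in Python as a binary search. This is only an illustration under stated assumptions: find_minimum_limit and fake_match_completes are hypothetical names, the stand-in function replaces a real matching run, and nothing here claims to mirror how pcre2test itself steps through candidate values.

```python
from typing import Callable


def find_minimum_limit(completes: Callable[[int], bool], upper: int = 2**32 - 1) -> int:
    """Return the smallest limit in [0, upper] for which completes(limit) is True.

    Assumption: if a run completes under some limit, it also completes
    under every larger limit (the predicate is monotonic).
    """
    if not completes(upper):
        raise ValueError("run does not complete even at the upper bound")
    lo, hi = 0, upper
    while lo < hi:
        mid = (lo + hi) // 2
        if completes(mid):
            hi = mid          # mid is enough; the minimum is mid or smaller
        else:
            lo = mid + 1      # mid is too small; the minimum is larger
    return lo


# Hypothetical stand-in for one matching run: pretend the match needs a
# limit of at least 37 to finish without a "limit exceeded" error.
def fake_match_completes(limit: int) -> bool:
    return limit >= 37


minimum_match_limit = find_minimum_limit(fake_match_completes)
```

With a real library binding, the predicate would set the limit in a match context, run the match, and report whether a "limit exceeded" error was returned.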
When using this modifier, the pattern should not contain any limit settings such as | When using this modifier, the pattern should not contain any limit settings such as | |||
(*LIMIT_MATCH=...) within it. If such a setting is present and is lower than the minimum match‐ | (*LIMIT_MATCH=...) within it. If such a setting is present and is lower than the minimum match‐ | |||
ing value, the minimum value cannot be found because pcre2_set_match_limit() etc. are only able | ing value, the minimum value cannot be found because pcre2_set_match_limit() etc. are only able | |||
to reduce the value of an in-pattern limit; they cannot increase it. | to reduce the value of an in-pattern limit; they cannot increase it. | |||
For non-DFA matching, the minimum depth_limit number is a measure of how much nested backtrack‐ | For non-DFA matching, the minimum depth_limit number is a measure of how much nested backtrack‐ | |||
ing happens (that is, how deeply the pattern's tree is searched). In the case of DFA matching, | ing happens (that is, how deeply the pattern's tree is searched). In the case of DFA matching, | |||
depth_limit controls the depth of recursive calls of the internal function that is used for han‐ | depth_limit controls the depth of recursive calls of the internal function that is used for han‐ | |||
dling pattern recursion, lookaround assertions, and atomic groups. | dling pattern recursion, lookaround assertions, and atomic groups. | |||
For non-DFA matching, the match_limit number is a measure of the amount of backtracking that | For non-DFA matching, the match_limit number is a measure of the amount of backtracking that | |||
takes place, and learning the minimum value can be instructive. For most simple matches, the | takes place, and learning the minimum value can be instructive. For most simple matches, the | |||
number is quite small, but for patterns with very large numbers of matching possibilities, it | number is quite small, but for patterns with very large numbers of matching possibilities, it | |||
can become large very quickly with increasing length of subject string. In the case of DFA | can become large very quickly with increasing length of subject string. In the case of DFA | |||
matching, match_limit controls the total number of calls, both recursive and non-recursive, to | matching, match_limit controls the total number of calls, both recursive and non-recursive, to | |||
the internal matching function, thus controlling the overall amount of computing resource that | the internal matching function, thus controlling the overall amount of computing resource that | |||
is used. | is used. | |||
For both kinds of matching, the heap_limit number, which is in kibibytes (units of 1024 bytes), | For both kinds of matching, the heap_limit number, which is in kibibytes (units of 1024 bytes), | |||
limits the amount of heap memory used for matching. | limits the amount of heap memory used for matching. | |||
Showing MARK names | Showing MARK names | |||
The mark modifier causes the names from backtracking control verbs that are returned from calls | The mark modifier causes the names from backtracking control verbs that are returned from calls | |||
to pcre2_match() to be displayed. If a mark is returned for a match, non-match, or partial | to pcre2_match() to be displayed. If a mark is returned for a match, non-match, or partial | |||
match, pcre2test shows it. For a match, it is on a line by itself, tagged with "MK:". Other‐ | match, pcre2test shows it. For a match, it is on a line by itself, tagged with "MK:". Other‐ | |||
wise, it is added to the non-match message. | wise, it is added to the non-match message. | |||
Showing memory usage | Showing memory usage | |||
The memory modifier causes pcre2test to log the sizes of all heap memory allocation and freeing | The memory modifier causes pcre2test to log the sizes of all heap memory allocation and freeing | |||
calls that occur during a call to pcre2_match() or pcre2_dfa_match(). In the latter case, heap | calls that occur during a call to pcre2_match() or pcre2_dfa_match(). In the latter case, heap | |||
memory is used only when a match requires more internal workspace than the default allocation on | memory is used only when a match requires more internal workspace than the default allocation on | |||
the stack, so in many cases there will be no output. No heap memory is allocated during matching | the stack, so in many cases there will be no output. No heap memory is allocated during matching | |||
with JIT. For this modifier to work, the null_context modifier must not be set on both the pat‐ | with JIT. For this modifier to work, the null_context modifier must not be set on both the pat‐ | |||
tern and the subject, though it can be set on one or the other. | tern and the subject, though it can be set on one or the other. | |||
Showing the heap frame overall vector size | Showing the heap frame overall vector size | |||
The heapframes_size modifier is relevant for matches using pcre2_match() without JIT. After a | The heapframes_size modifier is relevant for matches using pcre2_match() without JIT. After a | |||
match has run (whether successful or not) the size, in bytes, of the allocated heap frames vec‐ | match has run (whether successful or not) the size, in bytes, of the allocated heap frames vec‐ | |||
tor that is left attached to the match data block is shown. If the matching action involved sev‐ | tor that is left attached to the match data block is shown. If the matching action involved sev‐ | |||
eral calls to pcre2_match() (for example, global matching or for timing) only the final value is | eral calls to pcre2_match() (for example, global matching or for timing) only the final value is | |||
shown. | shown. | |||
This modifier is ignored, with a warning, for POSIX or DFA matching. JIT matching does not use | This modifier is ignored, with a warning, for POSIX or DFA matching. JIT matching does not use | |||
the heap frames vector, so the size is always zero, unless there was a previous non-JIT match. | the heap frames vector, so the size is always zero, unless there was a previous non-JIT match. | |||
Note that specifying a size of zero for the output vector (see below) causes pcre2test to free | Note that specifying a size of zero for the output vector (see below) causes pcre2test to free | |||
its match data block (and associated heap frames vector) and allocate a new one. | its match data block (and associated heap frames vector) and allocate a new one. | |||
Setting a starting offset | Setting a starting offset | |||
The offset modifier sets an offset in the subject string at which matching starts. Its value is | The offset modifier sets an offset in the subject string at which matching starts. Its value is | |||
a number of code units, not characters. | a number of code units, not characters. | |||
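Because the offset counts code units, the value to supply for a given character position depends on the code unit width whenever UTF mode is in use. The following Python sketch shows the conversion under that assumption; code_unit_offset is a hypothetical helper for illustration and is not part of pcre2test or the PCRE2 API.

```python
def code_unit_offset(subject: str, char_index: int, width: int) -> int:
    """Convert a character index into a code unit offset.

    Assumes the subject is treated as UTF-8, UTF-16, or UTF-32 according
    to the library width (8, 16, or 32 bits per code unit).
    """
    encodings = {8: ("utf-8", 1), 16: ("utf-16-le", 2), 32: ("utf-32-le", 4)}
    encoding, unit_bytes = encodings[width]
    # Encode only the characters before the index and count whole code units.
    return len(subject[:char_index].encode(encoding)) // unit_bytes


# "a", then U+1F600 (outside the Basic Multilingual Plane), then "b".
subject = "a\U0001F600b"

# Code unit offset of the "b" (character index 2) for each library width.
offsets = {width: code_unit_offset(subject, 2, width) for width in (8, 16, 32)}
```

In this example the same character position corresponds to offset 5 in the 8-bit library, 3 in the 16-bit library, and 2 in the 32-bit library.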
Setting an offset limit | Setting an offset limit | |||
The offset_limit modifier sets a limit for unanchored matches. If a match cannot be found start‐ | The offset_limit modifier sets a limit for unanchored matches. If a match cannot be found start‐ | |||
ing at or before this offset in the subject, a "no match" return is given. The data value is a | ing at or before this offset in the subject, a "no match" return is given. The data value is a | |||
number of code units, not characters. When this modifier is used, the use_offset_limit modifier | number of code units, not characters. When this modifier is used, the use_offset_limit modifier | |||
must have been set for the pattern; if not, an error is generated. | must have been set for the pattern; if not, an error is generated. | |||
Setting the size of the output vector | Setting the size of the output vector | |||
The ovector modifier applies only to the subject line in which it appears, though of course it | The ovector modifier applies only to the subject line in which it appears, though of course it | |||
can also be used to set a default in a #subject command. It specifies the number of pairs of | can also be used to set a default in a #subject command. It specifies the number of pairs of | |||
offsets that are available for storing matching information. The default is 15. | offsets that are available for storing matching information. The default is 15. | |||
A value of zero is useful when testing the POSIX API because it causes regexec() to be called | A value of zero is useful when testing the POSIX API because it causes regexec() to be called | |||
with a NULL capture vector. When not testing the POSIX API, a value of zero is used to cause | with a NULL capture vector. When not testing the POSIX API, a value of zero is used to cause | |||
pcre2_match_data_create_from_pattern() to be called, in order to create a new match block of ex‐ | pcre2_match_data_create_from_pattern() to be called, in order to create a new match block of ex‐ | |||
actly the right size for the pattern. (It is not possible to create a match block with a zero- | actly the right size for the pattern. (It is not possible to create a match block with a zero- | |||
length ovector; there is always at least one pair of offsets.) The old match data block is | length ovector; there is always at least one pair of offsets.) The old match data block is | |||
freed. | freed. | |||
Passing the subject as zero-terminated | Passing the subject as zero-terminated | |||
By default, the subject string is passed to a native API matching function with its correct | By default, the subject string is passed to a native API matching function with its correct | |||
length. In order to test the facility for passing a zero-terminated string, the zero_terminate | length. In order to test the facility for passing a zero-terminated string, the zero_terminate | |||
modifier is provided. It causes the length to be passed as PCRE2_ZERO_TERMINATED. When matching | modifier is provided. It causes the length to be passed as PCRE2_ZERO_TERMINATED. When matching | |||
via the POSIX interface, this modifier is ignored, with a warning. | via the POSIX interface, this modifier is ignored, with a warning. | |||
When testing pcre2_substitute(), this modifier also has the effect of passing the replacement | When testing pcre2_substitute(), this modifier also has the effect of passing the replacement | |||
string as zero-terminated. | string as zero-terminated. | |||
Passing a NULL context, subject, or replacement | Passing a NULL context, subject, or replacement | |||
Normally, pcre2test passes a context block to pcre2_match(), pcre2_dfa_match(), | Normally, pcre2test passes a context block to pcre2_match(), pcre2_dfa_match(), | |||
pcre2_jit_match() or pcre2_substitute(). If the null_context modifier is set, however, NULL is | pcre2_jit_match() or pcre2_substitute(). If the null_context modifier is set, however, NULL is | |||
passed. This is for testing that the matching and substitution functions behave correctly in | passed. This is for testing that the matching and substitution functions behave correctly in | |||
this case (they use default values). This modifier cannot be used with the find_limits, | this case (they use default values). This modifier cannot be used with the find_limits, | |||
find_limits_noheap, or substitute_callout modifiers. | find_limits_noheap, or substitute_callout modifiers. | |||
Similarly, for testing purposes, if the null_subject or null_replacement modifier is set, the | Similarly, for testing purposes, if the null_subject or null_replacement modifier is set, the | |||
subject or replacement string pointers are passed as NULL, respectively, to the relevant func‐ | subject or replacement string pointers are passed as NULL, respectively, to the relevant func‐ | |||
tions. | tions. | |||
THE ALTERNATIVE MATCHING FUNCTION | THE ALTERNATIVE MATCHING FUNCTION | |||
By default, pcre2test uses the standard PCRE2 matching function, pcre2_match() to match each | By default, pcre2test uses the standard PCRE2 matching function, pcre2_match() to match each | |||
subject line. PCRE2 also supports an alternative matching function, pcre2_dfa_match(), which op‐ | subject line. PCRE2 also supports an alternative matching function, pcre2_dfa_match(), which op‐ | |||
erates in a different way, and has some restrictions. The differences between the two functions | erates in a different way, and has some restrictions. The differences between the two functions | |||
are described in the pcre2matching documentation. | are described in the pcre2matching documentation. | |||
If the dfa modifier is set, the alternative matching function is used. This function finds all | If the dfa modifier is set, the alternative matching function is used. This function finds all | |||
possible matches at a given point in the subject. If, however, the dfa_shortest modifier is set, | possible matches at a given point in the subject. If, however, the dfa_shortest modifier is set, | |||
processing stops after the first match is found. This is always the shortest possible match. | processing stops after the first match is found. This is always the shortest possible match. | |||
DEFAULT OUTPUT FROM pcre2test | DEFAULT OUTPUT FROM pcre2test | |||
This section describes the output when the normal matching function, pcre2_match(), is being | This section describes the output when the normal matching function, pcre2_match(), is being | |||
used. | used. | |||
When a match succeeds, pcre2test outputs the list of captured substrings, starting with number 0 | When a match succeeds, pcre2test outputs the list of captured substrings, starting with number 0 | |||
for the string that matched the whole pattern. Otherwise, it outputs "No match" when the return | for the string that matched the whole pattern. Otherwise, it outputs "No match" when the return | |||
is PCRE2_ERROR_NOMATCH, or "Partial match:" followed by the partially matching substring when | is PCRE2_ERROR_NOMATCH, or "Partial match:" followed by the partially matching substring when | |||
the return is PCRE2_ERROR_PARTIAL. (Note that this is the entire substring that was inspected | the return is PCRE2_ERROR_PARTIAL. (Note that this is the entire substring that was inspected | |||
during the partial match; it may include characters before the actual match start if a lookbe‐ | during the partial match; it may include characters before the actual match start if a lookbe‐ | |||
hind assertion, \K, \b, or \B was involved.) | hind assertion, \K, \b, or \B was involved.) | |||
For any other return, pcre2test outputs the PCRE2 negative error number and a short descriptive | For any other return, pcre2test outputs the PCRE2 negative error number and a short descriptive | |||
phrase. If the error is a failed UTF string check, the code unit offset of the start of the | phrase. If the error is a failed UTF string check, the code unit offset of the start of the | |||
failing character is also output. Here is an example of an interactive pcre2test run. | failing character is also output. Here is an example of an interactive pcre2test run. | |||
$ pcre2test | $ pcre2test | |||
PCRE2 version 10.22 2016-07-29 | PCRE2 version 10.22 2016-07-29 | |||
re> /^abc(\d+)/ | re> /^abc(\d+)/ | |||
data> abc123 | data> abc123 | |||
0: abc123 | 0: abc123 | |||
1: 123 | 1: 123 | |||
data> xyz | data> xyz | |||
No match | No match | |||
Unset capturing substrings that are not followed by one that is set are not shown by pcre2test | Unset capturing substrings that are not followed by one that is set are not shown by pcre2test | |||
unless the allcaptures modifier is specified. In the following example, there are two capturing | unless the allcaptures modifier is specified. In the following example, there are two capturing | |||
substrings, but when the first data line is matched, the second, unset substring is not shown. | substrings, but when the first data line is matched, the second, unset substring is not shown. | |||
An "internal" unset substring is shown as "<unset>", as for the second data line. | An "internal" unset substring is shown as "<unset>", as for the second data line. | |||
re> /(a)|(b)/ | re> /(a)|(b)/ | |||
data> a | data> a | |||
0: a | 0: a | |||
1: a | 1: a | |||
data> b | data> b | |||
0: b | 0: b | |||
1: <unset> | 1: <unset> | |||
2: b | 2: b | |||
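The display rule for unset substrings can be imitated in a few lines of Python. This is a rough sketch that uses Python's re module as a stand-in for PCRE2; format_captures is a hypothetical helper, and it covers only the default behaviour (no allcaptures modifier).

```python
import re


def format_captures(match: re.Match) -> list[str]:
    """Render capture groups in the style of the listing above.

    Trailing unset groups are dropped; an unset group that is followed
    by a set one is shown as "<unset>".
    """
    groups = [match.group(0), *match.groups()]
    # Drop unset groups at the end, but always keep substring 0.
    while len(groups) > 1 and groups[-1] is None:
        groups.pop()
    return [
        f"{number:2d}: {'<unset>' if text is None else text}"
        for number, text in enumerate(groups)
    ]


pattern = re.compile(r"(a)|(b)")
lines_a = format_captures(pattern.match("a"))  # group 2 is unset and trailing
lines_b = format_captures(pattern.match("b"))  # group 1 is unset but internal
```

For the subject "a" the second group is omitted, while for "b" the first group appears as "<unset>", matching the two data lines in the example.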
If the strings contain any non-printing characters, they are output as \xhh escapes if the value | If the strings contain any non-printing characters, they are output as \xhh escapes if the value | |||
is less than 256 and UTF mode is not set. Otherwise they are output as \x{hh...} escapes. See | is less than 256 and UTF mode is not set. Otherwise they are output as \x{hh...} escapes. See | |||
below for the definition of non-printing characters. If the aftertext modifier is set, the out‐ | below for the definition of non-printing characters. If the aftertext modifier is set, the out‐ | |||
put for substring 0 is followed by the rest of the subject string, identified by "0+" like this: | put for substring 0 is followed by the rest of the subject string, identified by "0+" like this: | |||
re> /cat/aftertext | re> /cat/aftertext | |||
data> cataract | data> cataract | |||
0: cat | 0: cat | |||
0+ aract | 0+ aract | |||
If global matching is requested, the results of successive matching attempts are output in se‐ | If global matching is requested, the results of successive matching attempts are output in se‐ | |||
quence, like this: | quence, like this: | |||
re> /\Bi(\w\w)/g | re> /\Bi(\w\w)/g | |||
data> Mississippi | data> Mississippi | |||
0: iss | 0: iss | |||
1: ss | 1: ss | |||
0: iss | 0: iss | |||
1: ss | 1: ss | |||
0: ipp | 0: ipp | |||
1: pp | 1: pp | |||
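The sequence of results in the global example can be reproduced approximately with Python's re module, which is used here only as a stand-in for PCRE2 (the two engines happen to agree on this pattern). Each iteration resumes the search where the previous match ended.

```python
import re

# Same pattern and subject as the example above: an "i" that is not at a
# word boundary, followed by two captured word characters.
pattern = re.compile(r"\Bi(\w\w)")
subject = "Mississippi"

# finditer() yields successive non-overlapping matches, left to right.
results = [(m.group(0), m.group(1)) for m in pattern.finditer(subject)]
```

The final "i" of the subject does not match because fewer than two word characters follow it.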
"No match" is output only if the first match attempt fails. Here is an example of a failure mes‐ | "No match" is output only if the first match attempt fails. Here is an example of a failure mes‐ | |||
sage (the offset 4 that is specified by the offset modifier is past the end of the subject | sage (the offset 4 that is specified by the offset modifier is past the end of the subject | |||
string): | string): | |||
re> /xyz/ | re> /xyz/ | |||
data> xyz\=offset=4 | data> xyz\=offset=4 | |||
Error -24 (bad offset value) | Error -24 (bad offset value) | |||
Note that whereas patterns can be continued over several lines (a plain ">" prompt is used for | Note that whereas patterns can be continued over several lines (a plain ">" prompt is used for | |||
continuations), subject lines may not. However newlines can be included in a subject by means of | continuations), subject lines may not. However newlines can be included in a subject by means of | |||
the \n escape (or \r, \r\n, etc., depending on the newline sequence setting). | the \n escape (or \r, \r\n, etc., depending on the newline sequence setting). | |||
OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION | OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION | |||
When the alternative matching function, pcre2_dfa_match(), is used, the output consists of a | When the alternative matching function, pcre2_dfa_match(), is used, the output consists of a | |||
list of all the matches that start at the first point in the subject where there is at least one | list of all the matches that start at the first point in the subject where there is at least one | |||
match. For example: | match. For example: | |||
re> /(tang|tangerine|tan)/ | re> /(tang|tangerine|tan)/ | |||
data> yellow tangerine\=dfa | data> yellow tangerine\=dfa | |||
0: tangerine | 0: tangerine | |||
1: tang | 1: tang | |||
2: tan | 2: tan | |||
Using the normal matching function on this data finds only "tang". The longest matching string | Using the normal matching function on this data finds only "tang". The longest matching string | |||
is always given first (and numbered zero). After a PCRE2_ERROR_PARTIAL return, the output is | is always given first (and numbered zero). After a PCRE2_ERROR_PARTIAL return, the output is | |||
"Partial match:", followed by the partially matching substring. Note that this is the entire | "Partial match:", followed by the partially matching substring. Note that this is the entire | |||
substring that was inspected during the partial match; it may include characters before the ac‐ | substring that was inspected during the partial match; it may include characters before the ac‐ | |||
tual match start if a lookbehind assertion, \b, or \B was involved. (\K is not supported for DFA | tual match start if a lookbehind assertion, \b, or \B was involved. (\K is not supported for DFA | |||
matching.) | matching.) | |||
If global matching is requested, the search for further matches resumes at the end of the | If global matching is requested, the search for further matches resumes at the end of the | |||
longest match. For example: | longest match. For example: | |||
re> /(tang|tangerine|tan)/g | re> /(tang|tangerine|tan)/g | |||
data> yellow tangerine and tangy sultana\=dfa | data> yellow tangerine and tangy sultana\=dfa | |||
0: tangerine | 0: tangerine | |||
1: tang | 1: tang | |||
2: tan | 2: tan | |||
0: tang | 0: tang | |||
1: tan | 1: tan | |||
0: tan | 0: tan | |||
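The shape of this output (every match at the first matching position, longest first, then resuming after the longest match when global matching is requested) can be imitated for a pattern that is a plain alternation of literal strings. The Python sketch below is an illustration only: dfa_style_matches is a hypothetical helper, it handles literal alternatives rather than general patterns, and it does not reflect how pcre2_dfa_match() works internally.

```python
def dfa_style_matches(alternatives, subject, global_search=False):
    """List all literal alternatives matching at each match point.

    Returns one list per match point, with the longest match first.
    Without global_search, only the first match point is reported.
    """
    alternatives = [alt for alt in alternatives if alt]  # ignore empty strings
    results = []
    pos = 0
    while pos < len(subject):
        found = sorted(
            (alt for alt in alternatives if subject.startswith(alt, pos)),
            key=len,
            reverse=True,
        )
        if not found:
            pos += 1              # nothing matches here; try the next position
            continue
        results.append(found)
        if not global_search:
            break
        pos += len(found[0])      # resume at the end of the longest match
    return results


words = ["tang", "tangerine", "tan"]
single = dfa_style_matches(words, "yellow tangerine")
multiple = dfa_style_matches(
    words, "yellow tangerine and tangy sultana", global_search=True
)
```

Here single holds one group of three matches, and multiple holds three groups corresponding to "tangerine", "tangy", and "sultana" in the subject.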
The alternative matching function does not support substring capture, so the modifiers that are | The alternative matching function does not support substring capture, so the modifiers that are | |||
concerned with captured substrings are not relevant. | concerned with captured substrings are not relevant. | |||
RESTARTING AFTER A PARTIAL MATCH | RESTARTING AFTER A PARTIAL MATCH | |||
When the alternative matching function has given the PCRE2_ERROR_PARTIAL return, indicating that | When the alternative matching function has given the PCRE2_ERROR_PARTIAL return, indicating that | |||
the subject partially matched the pattern, you can restart the match with additional subject | the subject partially matched the pattern, you can restart the match with additional subject | |||
data by means of the dfa_restart modifier. For example: | data by means of the dfa_restart modifier. For example: | |||
re> /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/ | re> /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/ | |||
data> 23ja\=ps,dfa | data> 23ja\=ps,dfa | |||
Partial match: 23ja | Partial match: 23ja | |||
data> n05\=dfa,dfa_restart | data> n05\=dfa,dfa_restart | |||
0: n05 | 0: n05 | |||
For further information about partial matching, see the pcre2partial documentation. | For further information about partial matching, see the pcre2partial documentation. | |||
CALLOUTS | CALLOUTS | |||
If the pattern contains any callout requests, pcre2test's callout function is called during | If the pattern contains any callout requests, pcre2test's callout function is called during | |||
matching unless callout_none is specified. This works with both matching functions, and with | matching unless callout_none is specified. This works with both matching functions, and with | |||
JIT, though there are some differences in behaviour. The output for callouts with numerical ar‐ | JIT, though there are some differences in behaviour. The output for callouts with numerical ar‐ | |||
guments and those with string arguments is slightly different. | guments and those with string arguments is slightly different. | |||
Callouts with numerical arguments | Callouts with numerical arguments | |||
By default, the callout function displays the callout number, the start and current positions in | By default, the callout function displays the callout number, the start and current positions in | |||
the subject text at the callout time, and the next pattern item to be tested. For example: | the subject text at the callout time, and the next pattern item to be tested. For example: | |||
--->pqrabcdef | --->pqrabcdef | |||
0 ^ ^ \d | 0 ^ ^ \d | |||
This output indicates that callout number 0 occurred for a match attempt starting at the fourth | This output indicates that callout number 0 occurred for a match attempt starting at the fourth | |||
character of the subject string, when the pointer was at the seventh character, and when the | character of the subject string, when the pointer was at the seventh character, and when the | |||
next pattern item was \d. Just one circumflex is output if the start and current positions are | next pattern item was \d. Just one circumflex is output if the start and current positions are | |||
the same, or if the current position precedes the start position, which can happen if the call‐ | the same, or if the current position precedes the start position, which can happen if the call‐ | |||
out is in a lookbehind assertion. | out is in a lookbehind assertion. | |||
Callouts numbered 255 are assumed to be automatic callouts, inserted as a result of the | Callouts numbered 255 are assumed to be automatic callouts, inserted as a result of the | |||
auto_callout pattern modifier. In this case, instead of showing the callout number, the offset | auto_callout pattern modifier. In this case, instead of showing the callout number, the offset | |||
in the pattern, preceded by a plus, is output. For example: | in the pattern, preceded by a plus, is output. For example: | |||
re> /\d?[A-E]\*/auto_callout | re> /\d?[A-E]\*/auto_callout | |||
data> E* | data> E* | |||
--->E* | --->E* | |||
+0 ^ \d? | +0 ^ \d? | |||
+3 ^ [A-E] | +3 ^ [A-E] | |||
+8 ^^ \* | +8 ^^ \* | |||
+10 ^ ^ | +10 ^ ^ | |||
0: E* | 0: E* | |||
If a pattern contains (*MARK) items, an additional line is output whenever a change of latest | If a pattern contains (*MARK) items, an additional line is output whenever a change of latest | |||
mark is passed to the callout function. For example: | mark is passed to the callout function. For example: | |||
re> /a(*MARK:X)bc/auto_callout | re> /a(*MARK:X)bc/auto_callout | |||
data> abc | data> abc | |||
--->abc | --->abc | |||
+0 ^ a | +0 ^ a | |||
+1 ^^ (*MARK:X) | +1 ^^ (*MARK:X) | |||
+10 ^^ b | +10 ^^ b | |||
Latest Mark: X | Latest Mark: X | |||
+11 ^ ^ c | +11 ^ ^ c | |||
+12 ^ ^ | +12 ^ ^ | |||
0: abc | 0: abc | |||
The mark changes between matching "a" and "b", but stays the same for the rest of the match, so | The mark changes between matching "a" and "b", but stays the same for the rest of the match, so | |||
nothing more is output. If, as a result of backtracking, the mark reverts to being unset, the | nothing more is output. If, as a result of backtracking, the mark reverts to being unset, the | |||
text "<unset>" is output. | text "<unset>" is output. | |||
Callouts with string arguments | Callouts with string arguments | |||
The output for a callout with a string argument is similar, except that instead of outputting a | The output for a callout with a string argument is similar, except that instead of outputting a | |||
callout number before the position indicators, the callout string and its offset in the pattern | callout number before the position indicators, the callout string and its offset in the pattern | |||
string are output before the reflection of the subject string, and the subject string is re‐ | string are output before the reflection of the subject string, and the subject string is re‐ | |||
flected for each callout. For example: | flected for each callout. For example: | |||
re> /^ab(?C'first')cd(?C"second")ef/ | re> /^ab(?C'first')cd(?C"second")ef/ | |||
data> abcdefg | data> abcdefg | |||
Callout (7): 'first' | Callout (7): 'first' | |||
--->abcdefg | --->abcdefg | |||
^ ^ c | ^ ^ c | |||
Callout (20): "second" | Callout (20): "second" | |||
--->abcdefg | --->abcdefg | |||
^ ^ e | ^ ^ e | |||
0: abcdef | 0: abcdef | |||
Callout modifiers | Callout modifiers | |||
The callout function in pcre2test returns zero (carry on matching) by default, but you can use a | The callout function in pcre2test returns zero (carry on matching) by default, but you can use a | |||
callout_fail modifier in a subject line to change this and other parameters of the callout (see | callout_fail modifier in a subject line to change this and other parameters of the callout (see | |||
below). | below). | |||
If the callout_capture modifier is set, the current captured groups are output when a callout | If the callout_capture modifier is set, the current captured groups are output when a callout | |||
occurs. This is useful only for non-DFA matching, as pcre2_dfa_match() does not support captur‐ | occurs. This is useful only for non-DFA matching, as pcre2_dfa_match() does not support captur‐ | |||
ing, so no captures are ever shown. | ing, so no captures are ever shown. | |||
The normal callout output, showing the callout number or pattern offset (as described above) is | The normal callout output, showing the callout number or pattern offset (as described above) is | |||
suppressed if the callout_no_where modifier is set. | suppressed if the callout_no_where modifier is set. | |||
When using the interpretive matching function pcre2_match() without JIT, setting the callout_ex‐ | When using the interpretive matching function pcre2_match() without JIT, setting the callout_ex‐ | |||
tra modifier causes additional output from pcre2test's callout function to be generated. For the | tra modifier causes additional output from pcre2test's callout function to be generated. For the | |||
first callout in a match attempt at a new starting position in the subject, "New match attempt" | first callout in a match attempt at a new starting position in the subject, "New match attempt" | |||
is output. If there has been a backtrack since the last callout (or start of matching if this is | is output. If there has been a backtrack since the last callout (or start of matching if this is | |||
the first callout), "Backtrack" is output, followed by "No other matching paths" if the back‐ | the first callout), "Backtrack" is output, followed by "No other matching paths" if the back‐ | |||
track ended the previous match attempt. For example: | track ended the previous match attempt. For example: | |||
re> /(a+)b/auto_callout,no_start_optimize,no_auto_possess | re> /(a+)b/auto_callout,no_start_optimize,no_auto_possess | |||
data> aac\=callout_extra | data> aac\=callout_extra | |||
New match attempt | New match attempt | |||
--->aac | --->aac | |||
+0 ^ ( | +0 ^ ( | |||
+1 ^ a+ | +1 ^ a+ | |||
+3 ^ ^ ) | +3 ^ ^ ) | |||
+4 ^ ^ b | +4 ^ ^ b | |||
skipping to change at line 1614 | skipping to change at line 1664 | |||
+0 ^ ( | +0 ^ ( | |||
+1 ^ a+ | +1 ^ a+ | |||
Backtrack | Backtrack | |||
No other matching paths | No other matching paths | |||
New match attempt | New match attempt | |||
--->aac | --->aac | |||
+0 ^ ( | +0 ^ ( | |||
+1 ^ a+ | +1 ^ a+ | |||
No match | No match | |||
Notice that various optimizations must be turned off if you want all possible matching paths to | Notice that various optimizations must be turned off if you want all possible matching paths to | |||
be scanned. If no_start_optimize is not used, there is an immediate "no match", without any | be scanned. If no_start_optimize is not used, there is an immediate "no match", without any | |||
callouts, because the starting optimization fails to find "b" in the subject, which it knows | callouts, because the starting optimization fails to find "b" in the subject, which it knows | |||
must be present for any match. If no_auto_possess is not used, the "a+" item is turned into | must be present for any match. If no_auto_possess is not used, the "a+" item is turned into | |||
"a++", which reduces the number of backtracks. | "a++", which reduces the number of backtracks. | |||
The callout_extra modifier has no effect if used with the DFA matching function, or with JIT. | The callout_extra modifier has no effect if used with the DFA matching function, or with JIT. | |||
Return values from callouts | Return values from callouts | |||
The default return from the callout function is zero, which allows matching to continue. The | The default return from the callout function is zero, which allows matching to continue. The | |||
callout_fail modifier can be given one or two numbers. If there is only one number, 1 is | callout_fail modifier can be given one or two numbers. If there is only one number, 1 is | |||
returned instead of 0 (causing matching to backtrack) when a callout of that number is reached. If | returned instead of 0 (causing matching to backtrack) when a callout of that number is reached. If | |||
two numbers (<n>:<m>) are given, 1 is returned when callout <n> is reached and there have been | two numbers (<n>:<m>) are given, 1 is returned when callout <n> is reached and there have been | |||
at least <m> callouts. The callout_error modifier is similar, except that PCRE2_ERROR_CALLOUT is | at least <m> callouts. The callout_error modifier is similar, except that PCRE2_ERROR_CALLOUT is | |||
returned, causing the entire matching process to be aborted. If both these modifiers are set for | returned, causing the entire matching process to be aborted. If both these modifiers are set for | |||
the same callout number, callout_error takes precedence. Note that callouts with string | the same callout number, callout_error takes precedence. Note that callouts with string | |||
arguments are always given the number zero. | arguments are always given the number zero. | |||
The callout_data modifier can be given an unsigned or a negative number. This is set as the | The callout_data modifier can be given an unsigned or a negative number. This is set as the | |||
"user data" that is passed to the matching function, and passed back when the callout function | "user data" that is passed to the matching function, and passed back when the callout function | |||
is invoked. Any value other than zero is used as a return from pcre2test's callout function. | is invoked. Any value other than zero is used as a return from pcre2test's callout function. | |||
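The return-value rules for callout_fail, callout_error and callout_data can be modelled in a few lines. The following Python sketch is only an illustration of the rules as described here, not pcre2test's actual code. The function name callout_return and its parameters are hypothetical, the numeric value given to PCRE2_ERROR_CALLOUT is an assumption that should be checked against pcre2.h, and two details are assumed rather than documented: that a non-zero callout_data value is considered before the other two modifiers, and that the callout count includes the current callout.

```python
from typing import Optional, Tuple

# Assumed numeric value, for illustration only; check pcre2.h.
PCRE2_ERROR_CALLOUT = -37


def callout_return(
    number: int,
    count: int,
    fail: Optional[Tuple[int, int]] = None,
    error: Optional[Tuple[int, int]] = None,
    data: int = 0,
) -> int:
    """Model the value returned by a callout (hypothetical helper).

    number: callout number (0 for callouts with string arguments)
    count:  callouts seen so far, assumed to include this one
    fail:   (n, m) as in callout_fail=n:m; use m = 0 when only n is given
    error:  (n, m) as in callout_error=n:m; use m = 0 when only n is given
    data:   the value set with callout_data
    """
    # Assumption: non-zero user data is checked first and returned as is.
    if data != 0:
        return data
    # callout_error takes precedence over callout_fail for the same number.
    if error is not None and number == error[0] and count >= error[1]:
        return PCRE2_ERROR_CALLOUT
    if fail is not None and number == fail[0] and count >= fail[1]:
        return 1
    # Default: carry on matching.
    return 0
```

With fail=(2, 3), for instance, the model returns 0 for callout number 2 until at least three callouts have occurred, and 1 after that.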
Inserting callouts can be helpful when using pcre2test to check complicated regular expressions. | Inserting callouts can be helpful when using pcre2test to check complicated regular expressions. | |||
For further information about callouts, see the pcre2callout documentation. | For further information about callouts, see the pcre2callout documentation. | |||
NON-PRINTING CHARACTERS | NON-PRINTING CHARACTERS | |||
When pcre2test is outputting text in the compiled version of a pattern, bytes other than 32-126 | When pcre2test is outputting text in the compiled version of a pattern, bytes other than 32-126 | |||
are always treated as non-printing characters and are therefore shown as hex escapes. | are always treated as non-printing characters and are therefore shown as hex escapes. | |||
When pcre2test is outputting text that is a matched part of a subject string, it behaves in the | When pcre2test is outputting text that is a matched part of a subject string, it behaves in the | |||
same way, unless a different locale has been set for the pattern (using the locale modifier). In | same way, unless a different locale has been set for the pattern (using the locale modifier). In | |||
this case, the isprint() function is used to distinguish printing and non-printing characters. | this case, the isprint() function is used to distinguish printing and non-printing characters. | |||
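The default rule (no locale set) can be illustrated with a short Python sketch. It is a model of the rule stated above, not pcre2test's own output routine: the helper name show is hypothetical, the \xhh spelling with two lowercase hex digits is an assumption about the escape format, and the locale case that relies on isprint() is not modelled.

```python
def show(data: bytes) -> str:
    """Render bytes 32-126 literally and all others as hex escapes."""
    out = []
    for byte in data:
        if 32 <= byte <= 126:
            out.append(chr(byte))
        else:
            # Assumed escape spelling: backslash, x, two hex digits.
            out.append("\\x%02x" % byte)
    return "".join(out)
```

For example, show(b"tab\there") yields the text tab\x09here, because byte 9 lies outside the 32-126 range.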
SAVING AND RESTORING COMPILED PATTERNS | SAVING AND RESTORING COMPILED PATTERNS | |||
It is possible to save compiled patterns on disc or elsewhere, and reload them later, subject to | It is possible to save compiled patterns on disc or elsewhere, and reload them later, subject to | |||
a number of restrictions. JIT data cannot be saved. The host on which the patterns are reloaded | a number of restrictions. JIT data cannot be saved. The host on which the patterns are reloaded | |||
must be running the same version of PCRE2, with the same code unit width, and must also have the | must be running the same version of PCRE2, with the same code unit width, and must also have the | |||
same endianness, pointer width and PCRE2_SIZE type. Before compiled patterns can be saved they | same endianness, pointer width and PCRE2_SIZE type. Before compiled patterns can be saved they | |||
must be serialized, that is, converted to a stream of bytes. A single byte stream may contain | must be serialized, that is, converted to a stream of bytes. A single byte stream may contain | |||
any number of compiled patterns, but they must all use the same character tables. A single copy | any number of compiled patterns, but they must all use the same character tables. A single copy | |||
of the tables is included in the byte stream (its size is 1088 bytes). | of the tables is included in the byte stream (its size is 1088 bytes). | |||
The functions whose names begin with pcre2_serialize_ are used for serializing and | The functions whose names begin with pcre2_serialize_ are used for serializing and | |||
de-serializing. They are described in the pcre2serialize documentation. In this section we | de-serializing. They are described in the pcre2serialize documentation. In this section we | |||
describe the features of pcre2test that can be used to test these functions. | describe the features of pcre2test that can be used to test these functions. | |||
Note that "serialization" in PCRE2 does not convert compiled patterns to an abstract format like | Note that "serialization" in PCRE2 does not convert compiled patterns to an abstract format like | |||
Java or .NET. It just makes a reloadable byte code stream. Hence the restrictions on reloading | Java or .NET. It just makes a reloadable byte code stream. Hence the restrictions on reloading | |||
mentioned above. | mentioned above. | |||
In pcre2test, when a pattern with the push modifier is successfully compiled, it is pushed onto a | In pcre2test, when a pattern with the push modifier is successfully compiled, it is pushed onto a | |||
stack of compiled patterns, and pcre2test expects the next line to contain a new pattern (or | stack of compiled patterns, and pcre2test expects the next line to contain a new pattern (or | |||
command) instead of a subject line. By contrast, the pushcopy modifier causes a copy of the | command) instead of a subject line. By contrast, the pushcopy modifier causes a copy of the | |||
compiled pattern to be stacked, leaving the original available for immediate matching. By using | compiled pattern to be stacked, leaving the original available for immediate matching. By using | |||
push and/or pushcopy, a number of patterns can be compiled and retained. These modifiers are | push and/or pushcopy, a number of patterns can be compiled and retained. These modifiers are | |||
incompatible with posix, and control modifiers that act at match time are ignored (with a message) | incompatible with posix, and control modifiers that act at match time are ignored (with a message) | |||
for the stacked patterns. The jitverify modifier applies only at compile time. | for the stacked patterns. The jitverify modifier applies only at compile time. | |||
The command | The command | |||
#save <filename> | #save <filename> | |||
causes all the stacked patterns to be serialized and the result written to the named file. | causes all the stacked patterns to be serialized and the result written to the named file. | |||
Afterwards, all the stacked patterns are freed. The command | Afterwards, all the stacked patterns are freed. The command | |||
#load <filename> | #load <filename> | |||
reads the data in the file, and then arranges for it to be de-serialized, with the resulting | reads the data in the file, and then arranges for it to be de-serialized, with the resulting | |||
compiled patterns added to the pattern stack. The pattern on the top of the stack can be | compiled patterns added to the pattern stack. The pattern on the top of the stack can be | |||
retrieved by the #pop command, which must be followed by lines of subjects that are to be matched | retrieved by the #pop command, which must be followed by lines of subjects that are to be matched | |||
with the pattern, terminated as usual by an empty line or end of file. This command may be | with the pattern, terminated as usual by an empty line or end of file. This command may be | |||
followed by a modifier list containing only control modifiers that act after a pattern has been | followed by a modifier list containing only control modifiers that act after a pattern has been | |||
compiled. In particular, hex, posix, posix_nosub, push, and pushcopy are not allowed, nor are | compiled. In particular, hex, posix, posix_nosub, push, and pushcopy are not allowed, nor are | |||
any option-setting modifiers. The JIT modifiers are, however, permitted. Here is an example that | any option-setting modifiers. The JIT modifiers are, however, permitted. Here is an example that | |||
saves and reloads two patterns. | saves and reloads two patterns. | |||
/abc/push | /abc/push | |||
/xyz/push | /xyz/push | |||
#save tempfile | #save tempfile | |||
#load tempfile | #load tempfile | |||
#pop info | #pop info | |||
xyz | xyz | |||
#pop jit,bincode | #pop jit,bincode | |||
abc | abc | |||
If jitverify is used with #pop, it does not automatically imply jit, which is different | If jitverify is used with #pop, it does not automatically imply jit, which is different | |||
behaviour from when it is used on a pattern. | behaviour from when it is used on a pattern. | |||
The #popcopy command is analogous to the pushcopy modifier in that it makes current a copy of | The #popcopy command is analogous to the pushcopy modifier in that it makes current a copy of | |||
the topmost stack pattern, leaving the original still on the stack. | the topmost stack pattern, leaving the original still on the stack. | |||
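The stack behaviour of push, pushcopy, #save, #load, #pop and #popcopy can be summarised with a toy model. The Python sketch below is an illustration under stated assumptions and has nothing to do with how pcre2test is implemented: the PatternStack class and its method names are hypothetical, plain strings stand in for compiled patterns, an in-memory dictionary stands in for the filesystem, and pickle stands in for the pcre2_serialize_ functions.

```python
import copy
import pickle


class PatternStack:
    """Toy model of pcre2test's stack of compiled patterns."""

    def __init__(self):
        self.stack = []
        self.files = {}  # stand-in for files on disc: name -> bytes

    def push(self, pattern):
        # /.../push: the pattern goes onto the stack.
        self.stack.append(pattern)

    def pushcopy(self, pattern):
        # /.../pushcopy: a copy is stacked, the original stays usable.
        self.stack.append(copy.deepcopy(pattern))
        return pattern

    def save(self, name):
        # #save: serialize every stacked pattern, then free the stack.
        self.files[name] = pickle.dumps(self.stack)
        self.stack = []

    def load(self, name):
        # #load: de-serialize and add the patterns to the stack.
        self.stack.extend(pickle.loads(self.files[name]))

    def pop(self):
        # #pop: retrieve and remove the pattern on top of the stack.
        return self.stack.pop()

    def popcopy(self):
        # #popcopy: copy the top pattern, leaving it on the stack.
        return copy.deepcopy(self.stack[-1])
```

Replaying the example above against this model (push "abc", push "xyz", save, load) makes the first pop return "xyz" and the second return "abc", matching the order of the subject lines in the example.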
SEE ALSO | SEE ALSO | |||
pcre2(3), pcre2api(3), pcre2callout(3), pcre2jit(3), pcre2matching(3), pcre2partial(3), | pcre2(3), pcre2api(3), pcre2callout(3), pcre2jit(3), pcre2matching(3), pcre2partial(3), | |||
pcre2pattern(3), pcre2serialize(3). | pcre2pattern(3), pcre2serialize(3). | |||
AUTHOR | AUTHOR | |||
Philip Hazel | Philip Hazel | |||
Retired from University Computing Service | Retired from University Computing Service | |||
Cambridge, England. | Cambridge, England. | |||
REVISION | REVISION | |||
Last updated: 24 April 2024 | Last updated: 26 December 2024 | |||
Copyright (c) 1997-2024 University of Cambridge. | Copyright (c) 1997-2024 University of Cambridge. | |||
PCRE 10.44 24 April 2024 PCRE2TEST(1) | PCRE2 10.45-RC1 26 December 2024 PCRE2TEST(1) | |||
End of changes. 146 change blocks. | ||||
440 lines changed or deleted | 511 lines changed or added | |||