pcre2grep.1 | pcre2grep.1 | |||
---|---|---|---|---|
skipping to change at line 290 | skipping to change at line 290 | |||
selected if any of the fixed strings are found in it (subject to -w or -x, if | selected if any of the fixed strings are found in it (subject to -w or -x, if | |||
present). This option applies only to the patterns that are matched against the con‐ | present). This option applies only to the patterns that are matched against the con‐ | |||
tents of files; it does not apply to patterns specified by any of the --include or | tents of files; it does not apply to patterns specified by any of the --include or | |||
--exclude options. | --exclude options. | |||
-f filename, --file=filename | -f filename, --file=filename | |||
Read patterns from the file, one per line. As is the case with patterns on the command | Read patterns from the file, one per line. As is the case with patterns on the command | |||
line, no delimiters should be used. What constitutes a new line when reading the file | line, no delimiters should be used. What constitutes a new line when reading the file | |||
is the operating system's default interpretation of \n. The --newline option has no | is the operating system's default interpretation of \n. The --newline option has no | |||
effect on this option. Trailing white space is removed from each line, and blank lines | effect on this option. Trailing white space is removed from each line, and blank lines | |||
are ignored. An empty file contains no patterns and therefore matches nothing. Pat‐ | are ignored unless the --posix-pattern-file option is also provided. An empty file | |||
terns read from a file in this way may contain binary zeros, which are treated as or‐ | contains no patterns and therefore matches nothing. Patterns read from a file in this | |||
dinary data characters. | way may contain binary zeros, which are treated as ordinary character literals. | |||
If this option is given more than once, all the specified files are read. A data line | If this option is given more than once, all the specified files are read. A data line | |||
is output if any of the patterns match it. A file name can be given as "-" to refer to | is output if any of the patterns match it. A file name can be given as "-" to refer to | |||
the standard input. When -f is used, patterns specified on the command line using -e | the standard input. When -f is used, patterns specified on the command line using -e | |||
may also be present; they are matched before the file's patterns. However, no pattern | may also be present; they are matched before the file's patterns. However, no pattern | |||
is taken from the command line; all arguments are treated as the names of paths to be | is taken from the command line; all arguments are treated as the names of paths to be | |||
searched. | searched. | |||
--file-list=filename | --file-list=filename | |||
Read a list of files and/or directories that are to be scanned from the given file, | Read a list of files and/or directories that are to be scanned from the given file, | |||
skipping to change at line 557 | skipping to change at line 557 | |||
this mode, --colour has no effect, and no context is shown. That is, the -A, -B, and | this mode, --colour has no effect, and no context is shown. That is, the -A, -B, and | |||
-C options are ignored. The --newline option has no effect on this option, which is | -C options are ignored. The --newline option has no effect on this option, which is | |||
mutually exclusive with --only-matching, --file-offsets, and --line-offsets. However, | mutually exclusive with --only-matching, --file-offsets, and --line-offsets. However, | |||
like --only-matching, if there is more than one match in a line, each of them causes a | like --only-matching, if there is more than one match in a line, each of them causes a | |||
line of output. | line of output. | |||
Escape sequences starting with a dollar character may be used to insert the contents | Escape sequences starting with a dollar character may be used to insert the contents | |||
of the matched part of the line and/or captured substrings into the text. | of the matched part of the line and/or captured substrings into the text. | |||
$<digits> or ${<digits>} is replaced by the captured substring of the given decimal | $<digits> or ${<digits>} is replaced by the captured substring of the given decimal | |||
number; zero substitutes the whole match. If the number is greater than the number of | number; $& (or the legacy $0) substitutes the whole match. If the number is greater | |||
capturing substrings, or if the capture is unset, the replacement is empty. | than the number of capturing substrings, or if the capture is unset, the replacement | |||
is empty. | ||||
$a is replaced by bell; $b by backspace; $e by escape; $f by form feed; $n by newline; | $a is replaced by bell; $b by backspace; $e by escape; $f by form feed; $n by newline; | |||
$r by carriage return; $t by tab; $v by vertical tab. | $r by carriage return; $t by tab; $v by vertical tab. | |||
$o<digits> or $o{<digits>} is replaced by the character whose code point is the given | $o<digits> or $o{<digits>} is replaced by the character whose code point is the given | |||
octal number. In the first form, up to three octal digits are processed. When more | octal number. In the first form, up to three octal digits are processed. When more | |||
digits are needed in Unicode mode to specify a wide character, the second form must be | digits are needed in Unicode mode to specify a wide character, the second form must be | |||
used. | used. | |||
$x<digits> or $x{<digits>} is replaced by the character represented by the given hexa‐ | $x<digits> or $x{<digits>} is replaced by the character represented by the given hexa‐ | |||
decimal number. In the first form, up to two hexadecimal digits are processed. When | decimal number. In the first form, up to two hexadecimal digits are processed. When | |||
more digits are needed in Unicode mode to specify a wide character, the second form | more digits are needed in Unicode mode to specify a wide character, the second form | |||
must be used. | must be used. | |||
Any other character is substituted by itself. In particular, $$ is replaced by a sin‐ | Any other character is substituted by itself. In particular, $$ is replaced by a sin‐ | |||
gle dollar. | gle dollar. | |||
-o, --only-matching | -o, --only-matching | |||
Show only the part of the line that matched a pattern instead of the whole line. In | Show only the part of the line that matched a pattern instead of the whole line. In | |||
this mode, no context is shown. That is, the -A, -B, and -C options are ignored. If | this mode, no context is shown. That is, the -A, -B, and -C options are ignored. If | |||
there is more than one match in a line, each of them is shown separately, on a sepa‐ | there is more than one match in a line, each of them is shown separately, on a sepa‐ | |||
rate line of output. If -o is combined with -v (invert the sense of the match to find | rate line of output. If -o is combined with -v (invert the sense of the match to find | |||
non-matching lines), no output is generated, but the return code is set appropriately. | non-matching lines), no output is generated, but the return code is set appropriately. | |||
If the matched portion of the line is empty, nothing is output unless the file name or | If the matched portion of the line is empty, nothing is output unless the file name or | |||
line number are being printed, in which case they are shown on an otherwise empty | line number are being printed, in which case they are shown on an otherwise empty | |||
line. This option is mutually exclusive with --output, --file-offsets and --line-off‐ | line. This option is mutually exclusive with --output, --file-offsets and --line-off‐ | |||
sets. | sets. | |||
-onumber, --only-matching=number | -onumber, --only-matching=number | |||
Show only the part of the line that matched the capturing parentheses of the given | Show only the part of the line that matched the capturing parentheses of the given | |||
number. Up to 50 capturing parentheses are supported by default. This limit can be | number. Up to 50 capturing parentheses are supported by default. This limit can be | |||
changed via the --om-capture option. A pattern may contain any number of capturing | changed via the --om-capture option. A pattern may contain any number of capturing | |||
parentheses, but only those whose number is within the limit can be accessed by -o. An | parentheses, but only those whose number is within the limit can be accessed by -o. An | |||
error occurs if the number specified by -o is greater than the limit. | error occurs if the number specified by -o is greater than the limit. | |||
-o0 is the same as -o without a number. Because these options can be given without an | -o0 is the same as -o without a number. Because these options can be given without an | |||
argument (see above), if an argument is present, it must be given in the same shell | argument (see above), if an argument is present, it must be given in the same shell | |||
item, for example, -o3 or --only-matching=2. The comments given for the non-argument | item, for example, -o3 or --only-matching=2. The comments given for the non-argument | |||
case above also apply to this option. If the specified capturing parentheses do not | case above also apply to this option. If the specified capturing parentheses do not | |||
exist in the pattern, or were not set in the match, nothing is output unless the file | exist in the pattern, or were not set in the match, nothing is output unless the file | |||
name or line number are being output. | name or line number are being output. | |||
If this option is given multiple times, multiple substrings are output for each match, | If this option is given multiple times, multiple substrings are output for each match, | |||
in the order the options are given, and all on one line. For example, -o3 -o1 -o3 | in the order the options are given, and all on one line. For example, -o3 -o1 -o3 | |||
causes the substrings matched by capturing parentheses 3 and 1 and then 3 again to be | causes the substrings matched by capturing parentheses 3 and 1 and then 3 again to be | |||
output. By default, there is no separator (but see the next but one option). | output. By default, there is no separator (but see the next but one option). | |||
--om-capture=number | --om-capture=number | |||
Set the number of capturing parentheses that can be accessed by -o. The default is 50. | Set the number of capturing parentheses that can be accessed by -o. The default is 50. | |||
--om-separator=text | --om-separator=text | |||
Specify a separating string for multiple occurrences of -o. The default is an empty | Specify a separating string for multiple occurrences of -o. The default is an empty | |||
string. Separating strings are never coloured. | string. Separating strings are never coloured. | |||
-P, --no-ucp | -P, --no-ucp | |||
Starting from release 10.43, when UTF/Unicode mode is specified with -u or -U, the | Starting from release 10.43, when UTF/Unicode mode is specified with -u or -U, the | |||
PCRE2_UCP option is used by default. This means that the POSIX classes in patterns | PCRE2_UCP option is used by default. This means that the POSIX classes in patterns | |||
match more than just ASCII characters. For example, [:digit:] matches any Unicode dec‐ | match more than just ASCII characters. For example, [:digit:] matches any Unicode dec‐ | |||
imal digit. The --no-ucp option suppresses PCRE2_UCP, thus restricting the POSIX | imal digit. The --no-ucp option suppresses PCRE2_UCP, thus restricting the POSIX | |||
classes to ASCII characters, as was the case in earlier releases. Note that there are | classes to ASCII characters, as was the case in earlier releases. Note that there are | |||
now more fine-grained option settings within patterns that affect individual classes. | now more fine-grained option settings within patterns that affect individual classes. | |||
For example, when in UCP mode, the sequence (?aP) restricts [:word:] to ASCII letters, | For example, when in UCP mode, the sequence (?aP) restricts [:word:] to ASCII letters, | |||
while allowing \w to match Unicode letters and digits. | while allowing \w to match Unicode letters and digits. | |||
--posix-pattern-file | ||||
When patterns are provided with the -f option, do not trim trailing spaces or ignore | ||||
empty lines in a similar way than other grep tools. To keep the behaviour consistent | ||||
with older versions, if the pattern read was terminated with CRLF (as character liter‐ | ||||
als) then both characters won't be included as part of it, so if you really need to | ||||
have pattern ending in '\r', use a escape sequence or provide it by a different | ||||
method. | ||||
-q, --quiet | -q, --quiet | |||
Work quietly, that is, display nothing except error messages. The exit status indi‐ | Work quietly, that is, display nothing except error messages. The exit status indi‐ | |||
cates whether or not any matches were found. | cates whether or not any matches were found. | |||
-r, --recursive | -r, --recursive | |||
If any given path is a directory, recursively scan the files it contains, taking note | If any given path is a directory, recursively scan the files it contains, taking note | |||
of any --include and --exclude settings. By default, a directory is read as a normal | of any --include and --exclude settings. By default, a directory is read as a normal | |||
file; in some operating systems this gives an immediate end-of-file. This option is a | file; in some operating systems this gives an immediate end-of-file. This option is a | |||
shorthand for setting the -d option to "recurse". | shorthand for setting the -d option to "recurse". | |||
--recursion-limit=number | --recursion-limit=number | |||
This is an obsolete synonym for --depth-limit. See --match-limit above for details. | This is an obsolete synonym for --depth-limit. See --match-limit above for details. | |||
-s, --no-messages | -s, --no-messages | |||
Suppress error messages about non-existent or unreadable files. Such files are quietly | Suppress error messages about non-existent or unreadable files. Such files are quietly | |||
skipped. However, the return code is still 2, even if matches were found in other | skipped. However, the return code is still 2, even if matches were found in other | |||
files. | files. | |||
-t, --total-count | -t, --total-count | |||
This option is useful when scanning more than one file. If used on its own, -t sup‐ | This option is useful when scanning more than one file. If used on its own, -t sup‐ | |||
presses all output except for a grand total number of matching lines (or non-matching | presses all output except for a grand total number of matching lines (or non-matching | |||
lines if -v is used) in all the files. If -t is used with -c, a grand total is output | lines if -v is used) in all the files. If -t is used with -c, a grand total is output | |||
except when the previous output is just one line. In other words, it is not output | except when the previous output is just one line. In other words, it is not output | |||
when just one file's count is listed. If file names are being output, the grand total | when just one file's count is listed. If file names are being output, the grand total | |||
is preceded by "TOTAL:". Otherwise, it appears as just another number. The -t option | is preceded by "TOTAL:". Otherwise, it appears as just another number. The -t option | |||
is ignored when used with -L (list files without matches), because the grand total | is ignored when used with -L (list files without matches), because the grand total | |||
would always be zero. | would always be zero. | |||
-u, --utf Operate in UTF/Unicode mode. This option is available only if PCRE2 has been compiled | -u, --utf Operate in UTF/Unicode mode. This option is available only if PCRE2 has been compiled | |||
with UTF-8 support. All patterns (including those for any --exclude and --include op‐ | with UTF-8 support. All patterns (including those for any --exclude and --include op‐ | |||
tions) and all lines that are scanned must be valid strings of UTF-8 characters. If an | tions) and all lines that are scanned must be valid strings of UTF-8 characters. If an | |||
invalid UTF-8 string is encountered, an error occurs. | invalid UTF-8 string is encountered, an error occurs. | |||
-U, --utf-allow-invalid | -U, --utf-allow-invalid | |||
As --utf, but in addition subject lines may contain invalid UTF-8 code unit sequences. | As --utf, but in addition subject lines may contain invalid UTF-8 code unit sequences. | |||
These can never form part of any pattern match. Patterns themselves, however, must | These can never form part of any pattern match. Patterns themselves, however, must | |||
still be valid UTF-8 strings. This facility allows valid UTF-8 strings to be sought | still be valid UTF-8 strings. This facility allows valid UTF-8 strings to be sought | |||
within arbitrary byte sequences in executable or other binary files. For more details | within arbitrary byte sequences in executable or other binary files. For more details | |||
about matching in non-valid UTF-8 strings, see the pcre2unicode(3) documentation. | about matching in non-valid UTF-8 strings, see the pcre2unicode(3) documentation. | |||
-V, --version | -V, --version | |||
Write the version numbers of pcre2grep and the PCRE2 library to the standard output | Write the version numbers of pcre2grep and the PCRE2 library to the standard output | |||
and then exit. Anything else on the command line is ignored. | and then exit. Anything else on the command line is ignored. | |||
-v, --invert-match | -v, --invert-match | |||
Invert the sense of the match, so that lines which do not match any of the patterns | Invert the sense of the match, so that lines which do not match any of the patterns | |||
are the ones that are found. When this option is set, options such as --only-matching | are the ones that are found. When this option is set, options such as --only-matching | |||
and --output, which specify parts of a match that are to be output, are ignored. | and --output, which specify parts of a match that are to be output, are ignored. | |||
-w, --word-regex, --word-regexp | -w, --word-regex, --word-regexp | |||
Force the patterns only to match "words". That is, there must be a word boundary at | Force the patterns only to match "words". That is, there must be a word boundary at | |||
the start and end of each matched string. This is equivalent to having "\b(?:" at the | the start and end of each matched string. This is equivalent to having "\b(?:" at the | |||
start of each pattern, and ")\b" at the end. This option applies only to the patterns | start of each pattern, and ")\b" at the end. This option applies only to the patterns | |||
that are matched against the contents of files; it does not apply to patterns speci‐ | that are matched against the contents of files; it does not apply to patterns speci‐ | |||
fied by any of the --include or --exclude options. | fied by any of the --include or --exclude options. | |||
-x, --line-regex, --line-regexp | -x, --line-regex, --line-regexp | |||
Force the patterns to start matching only at the beginnings of lines, and in addition, | Force the patterns to start matching only at the beginnings of lines, and in addition, | |||
require them to match entire lines. In multiline mode the match may be more than one | require them to match entire lines. In multiline mode the match may be more than one | |||
line. This is equivalent to having "^(?:" at the start of each pattern and ")$" at the | line. This is equivalent to having "^(?:" at the start of each pattern and ")$" at the | |||
end. This option applies only to the patterns that are matched against the contents of | end. This option applies only to the patterns that are matched against the contents of | |||
files; it does not apply to patterns specified by any of the --include or --exclude | files; it does not apply to patterns specified by any of the --include or --exclude | |||
options. | options. | |||
-Z, --null | -Z, --null | |||
Terminate files names in the regular output with a zero byte (the NUL character) in‐ | Terminate files names in the regular output with a zero byte (the NUL character) in‐ | |||
stead of what would normally appear. This is useful when file names contain unusual | stead of what would normally appear. This is useful when file names contain unusual | |||
characters such as colons, hyphens, or even newlines. The option does not apply to | characters such as colons, hyphens, or even newlines. The option does not apply to | |||
file names in error messages. | file names in error messages. | |||
ENVIRONMENT VARIABLES | ENVIRONMENT VARIABLES | |||
The environment variables LC_ALL and LC_CTYPE are examined, in that order, for a locale. The | The environment variables LC_ALL and LC_CTYPE are examined, in that order, for a locale. The | |||
first one that is set is used. This can be overridden by the --locale option. If no locale is | first one that is set is used. This can be overridden by the --locale option. If no locale is | |||
set, the PCRE2 library's default (usually the "C" locale) is used. | set, the PCRE2 library's default (usually the "C" locale) is used. | |||
NEWLINES | NEWLINES | |||
The -N (--newline) option allows pcre2grep to scan files with newline conventions that differ | The -N (--newline) option allows pcre2grep to scan files with newline conventions that differ | |||
from the default. This option affects only the way scanned files are processed. It does not af‐ | from the default. This option affects only the way scanned files are processed. It does not af‐ | |||
fect the interpretation of files specified by the -f, --file-list, --exclude-from, or --include- | fect the interpretation of files specified by the -f, --file-list, --exclude-from, or --include- | |||
from options. | from options. | |||
Any parts of the scanned input files that are written to the standard output are copied with | Any parts of the scanned input files that are written to the standard output are copied with | |||
whatever newline sequences they have in the input. However, if the final line of a file is out‐ | whatever newline sequences they have in the input. However, if the final line of a file is out‐ | |||
put, and it does not end with a newline sequence, a newline sequence is added. If the newline | put, and it does not end with a newline sequence, a newline sequence is added. If the newline | |||
setting is CR, LF, CRLF or NUL, that line ending is output; for the other settings (ANYCRLF or | setting is CR, LF, CRLF or NUL, that line ending is output; for the other settings (ANYCRLF or | |||
ANY) a single NL is used. | ANY) a single NL is used. | |||
The newline setting does not affect the way in which pcre2grep writes newlines in informational | The newline setting does not affect the way in which pcre2grep writes newlines in informational | |||
messages to the standard output and error streams. Under Windows, the standard output is set to | messages to the standard output and error streams. Under Windows, the standard output is set to | |||
be binary, so that "\r\n" at the ends of output lines that are copied from the input is not con‐ | be binary, so that "\r\n" at the ends of output lines that are copied from the input is not con‐ | |||
verted to "\r\r\n" by the C I/O library. This means that any messages written to the standard | verted to "\r\r\n" by the C I/O library. This means that any messages written to the standard | |||
output must end with "\r\n". For all other operating systems, and for all messages to the stan‐ | output must end with "\r\n". For all other operating systems, and for all messages to the stan‐ | |||
dard error stream, "\n" is used. | dard error stream, "\n" is used. | |||
OPTIONS COMPATIBILITY WITH GNU GREP | OPTIONS COMPATIBILITY WITH GNU GREP | |||
Many of the short and long forms of pcre2grep's options are the same as in the GNU grep program. | Many of the short and long forms of pcre2grep's options are the same as in the GNU grep program. | |||
Any long option of the form --xxx-regexp (GNU terminology) is also available as --xxx-regex | Any long option of the form --xxx-regexp (GNU terminology) is also available as --xxx-regex | |||
(PCRE2 terminology). However, the --case-restrict, --depth-limit, -E, --file-list, --file-off‐ | (PCRE2 terminology). However, the --case-restrict, --depth-limit, -E, --file-list, --file-off‐ | |||
sets, --heap-limit, --include-dir, --line-offsets, --locale, --match-limit, -M, --multiline, -N, | sets, --heap-limit, --include-dir, --line-offsets, --locale, --match-limit, -M, --multiline, -N, | |||
--newline, --no-ucp, --om-separator, --output, -P, -u, --utf, -U, and --utf-allow-invalid op‐ | --newline, --no-ucp, --om-separator, --output, -P, -u, --utf, -U, and --utf-allow-invalid op‐ | |||
tions are specific to pcre2grep, as is the use of the --only-matching option with a capturing | tions are specific to pcre2grep, as is the use of the --only-matching option with a capturing | |||
parentheses number. | parentheses number. | |||
Although most of the common options work the same way, a few are different in pcre2grep. For ex‐ | Although most of the common options work the same way, a few are different in pcre2grep. For ex‐ | |||
ample, the --include option's argument is a glob for GNU grep, but in pcre2grep it is a regular | ample, the --include option's argument is a glob for GNU grep, but in pcre2grep it is a regular | |||
expression to which the -i option applies. If both the -c and -l options are given, GNU grep | expression to which the -i option applies. If both the -c and -l options are given, GNU grep | |||
lists only file names, without counts, but pcre2grep gives the counts as well. | lists only file names, without counts, but pcre2grep gives the counts as well. | |||
OPTIONS WITH DATA | OPTIONS WITH DATA | |||
There are four different ways in which an option with data can be specified. If a short form | There are four different ways in which an option with data can be specified. If a short form | |||
option is used, the data may follow immediately, or (with one exception) in the next command | option is used, the data may follow immediately, or (with one exception) in the next command | |||
line item. For example: | line item. For example: | |||
-f/some/file | -f/some/file | |||
-f /some/file | -f /some/file | |||
The exception is the -o option, which may appear with or without data. Because of this, if data | The exception is the -o option, which may appear with or without data. Because of this, if data | |||
is present, it must follow immediately in the same item, for example -o3. | is present, it must follow immediately in the same item, for example -o3. | |||
If a long form option is used, the data may appear in the same command line item, separated by | If a long form option is used, the data may appear in the same command line item, separated by | |||
an equals character, or (with two exceptions) it may appear in the next command line item. For | an equals character, or (with two exceptions) it may appear in the next command line item. For | |||
example: | example: | |||
--file=/some/file | --file=/some/file | |||
--file /some/file | --file /some/file | |||
Note, however, that if you want to supply a file name beginning with ~ as data in a shell com‐ | Note, however, that if you want to supply a file name beginning with ~ as data in a shell com‐ | |||
mand, and have the shell expand ~ to a home directory, you must separate the file name from the | mand, and have the shell expand ~ to a home directory, you must separate the file name from the | |||
option, because the shell does not treat ~ specially unless it is at the start of an item. | option, because the shell does not treat ~ specially unless it is at the start of an item. | |||
The exceptions to the above are the --colour (or --color) and --only-matching options, for which | The exceptions to the above are the --colour (or --color) and --only-matching options, for which | |||
the data is optional. If one of these options does have data, it must be given in the first | the data is optional. If one of these options does have data, it must be given in the first | |||
form, using an equals character. Otherwise pcre2grep will assume that it has no data. | form, using an equals character. Otherwise pcre2grep will assume that it has no data. | |||
USING PCRE2'S CALLOUT FACILITY | USING PCRE2'S CALLOUT FACILITY | |||
pcre2grep has, by default, support for calling external programs or scripts or echoing specific | pcre2grep has, by default, support for calling external programs or scripts or echoing specific | |||
strings during matching by making use of PCRE2's callout facility. However, this support can be | strings during matching by making use of PCRE2's callout facility. However, this support can be | |||
completely or partially disabled when pcre2grep is built. You can find out whether your binary | completely or partially disabled when pcre2grep is built. You can find out whether your binary | |||
has support for callouts by running it with the --help option. If callout support is completely | has support for callouts by running it with the --help option. If callout support is completely | |||
disabled, all callouts in patterns are ignored by pcre2grep. If the facility is partially dis‐ | disabled, callouts in patterns are forbidden by pcre2grep. If the facility is partially dis‐ | |||
abled, calling external programs is not supported, and callouts that request it are ignored. | abled, calling external programs is not supported, and callouts that request it are ignored. | |||
A callout in a PCRE2 pattern is of the form (?C<arg>) where the argument is either a number or a | A callout in a PCRE2 pattern is of the form (?C<arg>) where the argument is either a number or a | |||
quoted string (see the pcre2callout documentation for details). Numbered callouts are ignored by | quoted string (see the pcre2callout documentation for details). Numbered callouts are ignored by | |||
pcre2grep; only callouts with string arguments are useful. | pcre2grep; only callouts with string arguments are useful. | |||
Echoing a specific string | Echoing a specific string | |||
Starting the callout string with a pipe character invokes an echoing facility that avoids call‐ | Starting the callout string with a pipe character invokes an echoing facility that avoids call‐ | |||
ing an external program or script. This facility is always available, provided that callouts | ing an external program or script. This facility is always available, provided that callouts | |||
were not completely disabled when pcre2grep was built. The rest of the callout string is | were not completely disabled when pcre2grep was built. The rest of the callout string is | |||
processed as a zero-terminated string, which means it should not contain any internal binary ze‐ | processed as a zero-terminated string, which means it should not contain any internal binary ze‐ | |||
ros. It is written to the output, having first been passed through the same escape processing as | ros. It is written to the output, having first been passed through the same escape processing as | |||
text from the --output (-O) option (see above). However, $0 cannot be used to insert a matched | text from the --output (-O) option (see above). However, $0 or $& cannot be used to insert a | |||
substring because the match is still in progress. Instead, the single character '0' is inserted. | matched substring because the match is still in progress. Instead, the single character '0' is | |||
Any syntax errors in the string (for example, a dollar not followed by another character) causes | inserted. Any syntax errors in the string (for example, a dollar not followed by another char‐ | |||
the callout to be ignored. No terminator is added to the output string, so if you want a new‐ | acter) causes the callout to be ignored. No terminator is added to the output string, so if you | |||
line, you must include it explicitly using the escape $n. For example: | want a newline, you must include it explicitly using the escape $n. For example: | |||
pcre2grep '(.)(..(.))(?C"|[$1] [$2] [$3]$n")' <some file> | pcre2grep '(.)(..(.))(?C"|[$1] [$2] [$3]$n")' <some file> | |||
Matching continues normally after the string is output. If you want to see only the callout out‐ | Matching continues normally after the string is output. If you want to see only the callout out‐ | |||
put but not any output from an actual match, you should end the pattern with (*FAIL). | put but not any output from an actual match, you should end the pattern with (*FAIL). | |||
Calling external programs or scripts | Calling external programs or scripts | |||
This facility can be independently disabled when pcre2grep is built. It is supported for Windows, | This facility can be independently disabled when pcre2grep is built. It is supported for Windows, | |||
where a call to _spawnvp() is used, for VMS, where lib$spawn() is used, and for any Unix- | where a call to _spawnvp() is used, for VMS, where lib$spawn() is used, and for any Unix- | |||
like environment where fork() and execv() are available. | like environment where fork() and execv() are available. | |||
If the callout string does not start with a pipe (vertical bar) character, it is parsed into a | If the callout string does not start with a pipe (vertical bar) character, it is parsed into a | |||
list of substrings separated by pipe characters. The first substring must be an executable name, | list of substrings separated by pipe characters. The first substring must be an executable name, | |||
with the following substrings specifying arguments: | with the following substrings specifying arguments: | |||
executable_name|arg1|arg2|... | executable_name|arg1|arg2|... | |||
Any substring (including the executable name) may contain escape sequences started by a dollar | Any substring (including the executable name) may contain escape sequences started by a dollar | |||
character. These are the same as for the --output (-O) option documented above, except that $0 | character. These are the same as for the --output (-O) option documented above, except that $0 | |||
cannot insert the matched string because the match is still in progress. Instead, the character | or $& cannot insert the matched string because the match is still in progress. Instead, the character | |||
'0' is inserted. If you need a literal dollar or pipe character in any substring, use $$ or $| | '0' is inserted. If you need a literal dollar or pipe character in any substring, use $$ or $| | |||
respectively. Here is an example: | respectively. Here is an example: | |||
echo -e "abcde\n12345" | pcre2grep \ | echo -e "abcde\n12345" | pcre2grep \ | |||
'(?x)(.)(..(.)) | '(?x)(.)(..(.)) | |||
(?C"/bin/echo|Arg1: [$1] [$2] [$3]|Arg2: $|${1}$| ($4)")()' - | (?C"/bin/echo|Arg1: [$1] [$2] [$3]|Arg2: $|${1}$| ($4)")()' - | |||
Output: | Output: | |||
Arg1: [a] [bcd] [d] Arg2: |a| () | Arg1: [a] [bcd] [d] Arg2: |a| () | |||
abcde | abcde | |||
Arg1: [1] [234] [4] Arg2: |1| () | Arg1: [1] [234] [4] Arg2: |1| () | |||
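
The splitting and escape rules described above can be illustrated with a small Python sketch. This is not pcre2grep's implementation: the function name split_callout and the captures dictionary are hypothetical, and only the $$, $|, $<digits> and ${<digits>} escapes are modelled (the other --output escapes, such as $n, are left out). It assumes that an unset capture group expands to an empty string, as ($4) does in the example output.

```python
DIGITS = "0123456789"


def split_callout(callout: str, captures: dict[int, str]) -> list[str]:
    """Split a callout string into [executable, arg1, ...].

    Simplified model (an assumption, not pcre2grep's code): an unescaped
    pipe separates substrings, "$$" and "$|" give literal characters, and
    "$N" or "${N}" insert capture group N.
    """

    def expand(number: int) -> str:
        # $0 cannot insert the match (it is still in progress), so the
        # single character '0' is inserted instead.
        if number == 0:
            return "0"
        # Assumed behaviour: an unset group contributes an empty string.
        return captures.get(number, "")

    parts: list[str] = []
    current: list[str] = []
    i, n = 0, len(callout)
    while i < n:
        ch = callout[i]
        if ch == "|":
            # Unescaped pipe: close the current substring.
            parts.append("".join(current))
            current = []
            i += 1
        elif ch == "$":
            i += 1
            if i >= n:
                raise ValueError("dollar not followed by another character")
            nxt = callout[i]
            if nxt in "$|":
                # "$$" or "$|": literal dollar or pipe.
                current.append(nxt)
                i += 1
            elif nxt == "{":
                end = callout.find("}", i)
                body = callout[i + 1:end] if end != -1 else ""
                if not body or not all(c in DIGITS for c in body):
                    raise ValueError("malformed ${...} escape")
                current.append(expand(int(body)))
                i = end + 1
            elif nxt in DIGITS:
                j = i
                while j < n and callout[j] in DIGITS:
                    j += 1
                current.append(expand(int(callout[i:j])))
                i = j
            else:
                raise ValueError("escape not covered by this sketch")
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts


# Captures for the first input line of the example ("abcde").
argv = split_callout(
    r"/bin/echo|Arg1: [$1] [$2] [$3]|Arg2: $|${1}$| ($4)",
    {1: "a", 2: "bcd", 3: "d"},
)
print(argv)
```

Under these assumptions, the list for "abcde" is the executable name followed by the two argument strings that /bin/echo joins with a space in the output shown above. In pcre2grep itself, a syntax error causes the callout to be ignored; the sketch raises ValueError so that the error is visible.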
skipping to change at line 863 | skipping to change at line 871 | |||
pcre2pattern(3), pcre2syntax(3), pcre2callout(3), pcre2unicode(3). | pcre2pattern(3), pcre2syntax(3), pcre2callout(3), pcre2unicode(3). | |||
AUTHOR | AUTHOR | |||
Philip Hazel | Philip Hazel | |||
Retired from University Computing Service | Retired from University Computing Service | |||
Cambridge, England. | Cambridge, England. | |||
REVISION | REVISION | |||
Last updated: 22 December 2023 | Last updated: 09 October 2024 | |||
Copyright (c) 1997-2023 University of Cambridge. | Copyright (c) 1997-2023 University of Cambridge. | |||
PCRE2 10.43 22 December 2023 PCRE2GREP(1) | PCRE2 10.45-RC1 09 October 2024 PCRE2GREP(1) | |||
End of changes. 46 change blocks. | ||||
201 lines changed or deleted | 214 lines changed or added | |||