pcre2jit.3   pcre2jit.3 
skipping to change at line 37 skipping to change at line 37
ARM 64-bit ARM 64-bit
IBM s390x 64 bit IBM s390x 64 bit
Intel x86 32-bit and 64-bit Intel x86 32-bit and 64-bit
LoongArch 64 bit LoongArch 64 bit
MIPS 32-bit and 64-bit MIPS 32-bit and 64-bit
Power PC 32-bit and 64-bit Power PC 32-bit and 64-bit
RISC-V 32-bit and 64-bit RISC-V 32-bit and 64-bit
If --enable-jit is set on an unsupported platform, compilation fails. If --enable-jit is set on an unsupported platform, compilation fails.
A client program can tell if JIT support is available by calling pcre2_config() with the A client program can tell if JIT support has been compiled by calling pcre2_config() with the
PCRE2_CONFIG_JIT option. The result is one if PCRE2 was built with JIT support, and zero other‐ PCRE2_CONFIG_JIT option. The result is one if PCRE2 was built with JIT support, and zero other‐
wise. However, having the JIT code available does not guarantee that it will be used for any wise. However, having the JIT code available does not guarantee that it will be used for any
particular match. One reason for this is that there are a number of options and pattern items particular match. One reason for this is that there are a number of options and pattern items
that are not supported by JIT (see below). Another reason is that in some environments JIT is that are not supported by JIT (see below). Another reason is that in some environments JIT is
unable to get memory in which to build its compiled code. The only guarantee from pcre2_config() unable to get executable memory in which to build its compiled code. The only guarantee from
is that if it returns zero, JIT will definitely not be used. pcre2_config() is that if it returns zero, JIT will definitely not be used.
As of release 10.45 there is a more informative way to test for JIT support. If pcre2_jit_compile()
is called with the single option PCRE2_JIT_TEST_ALLOC it returns zero if JIT is
available and has a working allocator. Otherwise it returns PCRE2_ERROR_NOMEMORY if JIT is
available but cannot allocate executable memory, or PCRE2_ERROR_JIT_UNSUPPORTED if JIT support
is not compiled. The code argument is ignored, so it can be a NULL value.
A simple program does not need to check availability in order to use JIT when possible. The API A simple program does not need to check availability in order to use JIT when possible. The API
is implemented in a way that falls back to the interpretive code if JIT is not available or cannot is implemented in a way that falls back to the interpretive code if JIT is not available or cannot
be used for a given match. For programs that need the best possible performance, there is a be used for a given match. For programs that need the best possible performance, there is a
"fast path" API that is JIT-specific. "fast path" API that is JIT-specific.
SIMPLE USE OF JIT SIMPLE USE OF JIT
To make use of the JIT support in the simplest way, all you have to do is to call pcre2_jit_compile() To make use of the JIT support in the simplest way, all you have to do is to call pcre2_jit_compile()
after successfully compiling a pattern with pcre2_compile(). This function has two arguments: after successfully compiling a pattern with pcre2_compile(). This function has two arguments:
the first is the compiled pattern pointer that was returned by pcre2_compile(), and the the first is the compiled pattern pointer that was returned by pcre2_compile(), and the
second is zero or more of the following option bits: PCRE2_JIT_COMPLETE, PCRE2_JIT_PARTIAL_HARD, second is zero or more of the following option bits: PCRE2_JIT_COMPLETE, PCRE2_JIT_PARTIAL_HARD,
or PCRE2_JIT_PARTIAL_SOFT. or PCRE2_JIT_PARTIAL_SOFT.
If JIT support is not available, a call to pcre2_jit_compile() does nothing and returns If JIT support is not available, a call to pcre2_jit_compile() does nothing and returns
PCRE2_ERROR_JIT_BADOPTION. Otherwise, the compiled pattern is passed to the JIT compiler, which PCRE2_ERROR_JIT_BADOPTION. Otherwise, the compiled pattern is passed to the JIT compiler, which
turns it into machine code that executes much faster than the normal interpretive code, but turns it into machine code that executes much faster than the normal interpretive code, but
yields exactly the same results. The returned value from pcre2_jit_compile() is zero on success, yields exactly the same results. The returned value from pcre2_jit_compile() is zero on success,
or a negative error code. or a negative error code.
There is a limit to the size of pattern that JIT supports, imposed by the size of machine stack There is a limit to the size of pattern that JIT supports, imposed by the size of machine stack
that it uses. The exact rules are not documented because they may change at any time, in particular, that it uses. The exact rules are not documented because they may change at any time, in particular,
when new optimizations are introduced. If a pattern is too big, a call to pcre2_jit_compile() when new optimizations are introduced. If a pattern is too big, a call to pcre2_jit_compile()
returns PCRE2_ERROR_NOMEMORY. returns PCRE2_ERROR_NOMEMORY.
PCRE2_JIT_COMPLETE requests the JIT compiler to generate code for complete matches. If you want PCRE2_JIT_COMPLETE requests the JIT compiler to generate code for complete matches. If you want
to run partial matches using the PCRE2_PARTIAL_HARD or PCRE2_PARTIAL_SOFT options of to run partial matches using the PCRE2_PARTIAL_HARD or PCRE2_PARTIAL_SOFT options of
pcre2_match(), you should set one or both of the other options as well as, or instead of pcre2_match(), you should set one or both of the other options as well as, or instead of
PCRE2_JIT_COMPLETE. The JIT compiler generates different optimized code for each of the three PCRE2_JIT_COMPLETE. The JIT compiler generates different optimized code for each of the three
modes (normal, soft partial, hard partial). When pcre2_match() is called, the appropriate code modes (normal, soft partial, hard partial). When pcre2_match() is called, the appropriate code
is run if it is available. Otherwise, the pattern is matched using interpretive code. is run if it is available. Otherwise, the pattern is matched using interpretive code.
You can call pcre2_jit_compile() multiple times for the same compiled pattern. It does nothing You can call pcre2_jit_compile() multiple times for the same compiled pattern. It does nothing
if it has previously compiled code for any of the option bits. For example, you can call it once if it has previously compiled code for any of the option bits. For example, you can call it once
with PCRE2_JIT_COMPLETE and (perhaps later, when you find you need partial matching) again with with PCRE2_JIT_COMPLETE and (perhaps later, when you find you need partial matching) again with
PCRE2_JIT_COMPLETE and PCRE2_JIT_PARTIAL_HARD. This time it will ignore PCRE2_JIT_COMPLETE and PCRE2_JIT_COMPLETE and PCRE2_JIT_PARTIAL_HARD. This time it will ignore PCRE2_JIT_COMPLETE and
just compile code for partial matching. If pcre2_jit_compile() is called with no option bits just compile code for partial matching. If pcre2_jit_compile() is called with no option bits
set, it immediately returns zero. This is an alternative way of testing whether JIT is set, it immediately returns zero. This is an alternative way of testing whether JIT support has
available. been compiled.
At present, it is not possible to free JIT compiled code except when the entire compiled pattern At present, it is not possible to free JIT compiled code except when the entire compiled pattern
is freed by calling pcre2_code_free(). is freed by calling pcre2_code_free().
In some circumstances you may need to call additional functions. These are described in the section In some circumstances you may need to call additional functions. These are described in the section
entitled "Controlling the JIT stack" below. entitled "Controlling the JIT stack" below.
There are some pcre2_match() options that are not supported by JIT, and there are also some pattern There are some pcre2_match() options that are not supported by JIT, and there are also some pattern
items that JIT cannot handle. Details are given below. In both cases, matching automatically items that JIT cannot handle. Details are given below. In both cases, matching automatically
falls back to the interpretive code. If you want to know whether JIT was actually used for falls back to the interpretive code. If you want to know whether JIT was actually used for
a particular match, you should arrange for a JIT callback function to be set up as described in a particular match, you should arrange for a JIT callback function to be set up as described in
the section entitled "Controlling the JIT stack" below, even if you do not need to supply a non-default the section entitled "Controlling the JIT stack" below, even if you do not need to supply a non-default
JIT stack. Such a callback function is called whenever JIT code is about to be obeyed. JIT stack. Such a callback function is called whenever JIT code is about to be obeyed.
If the match-time options are not right for JIT execution, the callback function is not obeyed. If the match-time options are not right for JIT execution, the callback function is not obeyed.
If the JIT compiler finds an unsupported item, no JIT data is generated. You can find out if JIT If the JIT compiler finds an unsupported item, no JIT data is generated. You can find out if JIT
compilation was successful for a compiled pattern by calling pcre2_pattern_info() with the compilation was successful for a compiled pattern by calling pcre2_pattern_info() with the
PCRE2_INFO_JITSIZE option. A non-zero result means that JIT compilation was successful. A result PCRE2_INFO_JITSIZE option. A non-zero result means that JIT compilation was successful. A result
of 0 means that JIT support is not available, or the pattern was not processed by pcre2_jit_compile(), of 0 means that JIT support is not available, or the pattern was not processed by pcre2_jit_compile(),
or the JIT compiler was not able to handle the pattern. Successful JIT compilation does or the JIT compiler was not able to handle the pattern. Successful JIT compilation does
not, however, guarantee the use of JIT at match time because there are some match time options not, however, guarantee the use of JIT at match time because there are some match time options
that are not supported by JIT. that are not supported by JIT.
MATCHING SUBJECTS CONTAINING INVALID UTF MATCHING SUBJECTS CONTAINING INVALID UTF
When a pattern is compiled with the PCRE2_UTF option, subject strings are normally expected to When a pattern is compiled with the PCRE2_UTF option, subject strings are normally expected to
be a valid sequence of UTF code units. By default, this is checked at the start of matching and be a valid sequence of UTF code units. By default, this is checked at the start of matching and
an error is generated if invalid UTF is detected. The PCRE2_NO_UTF_CHECK option can be passed to an error is generated if invalid UTF is detected. The PCRE2_NO_UTF_CHECK option can be passed to
pcre2_match() to skip the check (for improved performance) if you are sure that a subject string pcre2_match() to skip the check (for improved performance) if you are sure that a subject string
is valid. If this option is used with an invalid string, the result is undefined. The calling is valid. If this option is used with an invalid string, the result is undefined. The calling
program may crash or loop or otherwise misbehave. program may crash or loop or otherwise misbehave.
However, a way of running matches on strings that may contain invalid UTF sequences is available. However, a way of running matches on strings that may contain invalid UTF sequences is available.
Calling pcre2_compile() with the PCRE2_MATCH_INVALID_UTF option has two effects: it tells Calling pcre2_compile() with the PCRE2_MATCH_INVALID_UTF option has two effects: it tells
the interpreter in pcre2_match() to support invalid UTF, and, if pcre2_jit_compile() is subsequently the interpreter in pcre2_match() to support invalid UTF, and, if pcre2_jit_compile() is subsequently
called, the compiled JIT code also supports invalid UTF. Details of how this support called, the compiled JIT code also supports invalid UTF. Details of how this support
works, in both the JIT and the interpretive cases, are given in the pcre2unicode documentation. works, in both the JIT and the interpretive cases, are given in the pcre2unicode documentation.
There is also an obsolete option for pcre2_jit_compile() called PCRE2_JIT_INVALID_UTF, which There is also an obsolete option for pcre2_jit_compile() called PCRE2_JIT_INVALID_UTF, which
currently exists only for backward compatibility. It is superseded by the pcre2_compile() option currently exists only for backward compatibility. It is superseded by the pcre2_compile() option
PCRE2_MATCH_INVALID_UTF and should no longer be used. It may be removed in future. PCRE2_MATCH_INVALID_UTF and should no longer be used. It may be removed in future.
UNSUPPORTED OPTIONS AND PATTERN ITEMS UNSUPPORTED OPTIONS AND PATTERN ITEMS
The pcre2_match() options that are supported for JIT matching are PCRE2_COPY_MATCHED_SUBJECT, The pcre2_match() options that are supported for JIT matching are PCRE2_COPY_MATCHED_SUBJECT,
PCRE2_NOTBOL, PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, PCRE2_NO_UTF_CHECK, PCRE2_NOTBOL, PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, PCRE2_NO_UTF_CHECK,
PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. The PCRE2_ANCHORED and PCRE2_ENDANCHORED options are PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. The PCRE2_ANCHORED and PCRE2_ENDANCHORED options are
not supported at match time. not supported at match time.
If the PCRE2_NO_JIT option is passed to pcre2_match() it disables the use of JIT, forcing matching If the PCRE2_NO_JIT option is passed to pcre2_match() it disables the use of JIT, forcing matching
by the interpreter code. by the interpreter code.
The only unsupported pattern items are \C (match a single data unit) when running in a UTF mode, The only unsupported pattern items are \C (match a single data unit) when running in a UTF mode,
and a callout immediately before an assertion condition in a conditional group. and a callout immediately before an assertion condition in a conditional group.
RETURN VALUES FROM JIT MATCHING RETURN VALUES FROM JIT MATCHING
When a pattern is matched using JIT, the return values are the same as those given by the interpretive When a pattern is matched using JIT, the return values are the same as those given by the interpretive
pcre2_match() code, with the addition of one new error code: PCRE2_ERROR_JIT_STACKLIMIT. pcre2_match() code, with the addition of one new error code: PCRE2_ERROR_JIT_STACKLIMIT.
This means that the memory used for the JIT stack was insufficient. See "Controlling the JIT This means that the memory used for the JIT stack was insufficient. See "Controlling the JIT
stack" below for a discussion of JIT stack usage. stack" below for a discussion of JIT stack usage.
The error code PCRE2_ERROR_MATCHLIMIT is returned by the JIT code if searching a very large pattern The error code PCRE2_ERROR_MATCHLIMIT is returned by the JIT code if searching a very large pattern
tree goes on for too long, as it is in the same circumstance when JIT is not used, but the tree goes on for too long, as it is in the same circumstance when JIT is not used, but the
details of exactly what is counted are not the same. The PCRE2_ERROR_DEPTHLIMIT error code is details of exactly what is counted are not the same. The PCRE2_ERROR_DEPTHLIMIT error code is
never returned when JIT matching is used. never returned when JIT matching is used.
CONTROLLING THE JIT STACK CONTROLLING THE JIT STACK
When the compiled JIT code runs, it needs a block of memory to use as a stack. By default, it When the compiled JIT code runs, it needs a block of memory to use as a stack. By default, it
uses 32KiB on the machine stack. However, some large or complicated patterns need more than uses 32KiB on the machine stack. However, some large or complicated patterns need more than
this. The error PCRE2_ERROR_JIT_STACKLIMIT is given when there is not enough stack. Three functions this. The error PCRE2_ERROR_JIT_STACKLIMIT is given when there is not enough stack. Three functions
are provided for managing blocks of memory for use as JIT stacks. There is further discussion are provided for managing blocks of memory for use as JIT stacks. There is further discussion
about the use of JIT stacks in the section entitled "JIT stack FAQ" below. about the use of JIT stacks in the section entitled "JIT stack FAQ" below.
The pcre2_jit_stack_create() function creates a JIT stack. Its arguments are a starting size, a The pcre2_jit_stack_create() function creates a JIT stack. Its arguments are a starting size, a
maximum size, and a general context (for memory allocation functions, or NULL for standard memory maximum size, and a general context (for memory allocation functions, or NULL for standard memory
allocation). It returns a pointer to an opaque structure of type pcre2_jit_stack, or NULL if allocation). It returns a pointer to an opaque structure of type pcre2_jit_stack, or NULL if
there is an error. The pcre2_jit_stack_free() function is used to free a stack that is no longer there is an error. The pcre2_jit_stack_free() function is used to free a stack that is no longer
needed. If its argument is NULL, this function returns immediately, without doing anything. (For needed. If its argument is NULL, this function returns immediately, without doing anything. (For
the technically minded: the address space is allocated by mmap or VirtualAlloc.) A maximum stack the technically minded: the address space is allocated by mmap or VirtualAlloc.) A maximum stack
size of 512KiB to 1MiB should be more than enough for any pattern. size of 512KiB to 1MiB should be more than enough for any pattern.
The pcre2_jit_stack_assign() function specifies which stack JIT code should use. Its arguments The pcre2_jit_stack_assign() function specifies which stack JIT code should use. Its arguments
are as follows: are as follows:
pcre2_match_context *mcontext pcre2_match_context *mcontext
pcre2_jit_callback callback pcre2_jit_callback callback
void *data void *data
The first argument is a pointer to a match context. When this is subsequently passed to a matching The first argument is a pointer to a match context. When this is subsequently passed to a matching
function, its information determines which JIT stack is used. If this argument is NULL, the function, its information determines which JIT stack is used. If this argument is NULL, the
function returns immediately, without doing anything. There are three cases for the values of function returns immediately, without doing anything. There are three cases for the values of
the other two arguments: the other two arguments:
(1) If callback is NULL and data is NULL, an internal 32KiB block (1) If callback is NULL and data is NULL, an internal 32KiB block
on the machine stack is used. This is the default when a match on the machine stack is used. This is the default when a match
context is created. context is created.
(2) If callback is NULL and data is not NULL, data must be (2) If callback is NULL and data is not NULL, data must be
a pointer to a valid JIT stack, the result of calling a pointer to a valid JIT stack, the result of calling
pcre2_jit_stack_create(). pcre2_jit_stack_create().
(3) If callback is not NULL, it must point to a function that is (3) If callback is not NULL, it must point to a function that is
called with data as an argument at the start of matching, in called with data as an argument at the start of matching, in
order to set up a JIT stack. If the return from the callback order to set up a JIT stack. If the return from the callback
function is NULL, the internal 32KiB stack is used; otherwise the function is NULL, the internal 32KiB stack is used; otherwise the
return value must be a valid JIT stack, the result of calling return value must be a valid JIT stack, the result of calling
pcre2_jit_stack_create(). pcre2_jit_stack_create().
A callback function is obeyed whenever JIT code is about to be run; it is not obeyed when A callback function is obeyed whenever JIT code is about to be run; it is not obeyed when
pcre2_match() is called with options that are incompatible for JIT matching. A callback function pcre2_match() is called with options that are incompatible for JIT matching. A callback function
can therefore be used to determine whether a match operation was executed by JIT or by the interpreter. can therefore be used to determine whether a match operation was executed by JIT or by the interpreter.
You may safely use the same JIT stack for more than one pattern (either by assigning directly or You may safely use the same JIT stack for more than one pattern (either by assigning directly or
by callback), as long as the patterns are matched sequentially in the same thread. Currently, by callback), as long as the patterns are matched sequentially in the same thread. Currently,
the only way to set up non-sequential matches in one thread is to use callouts: if a callout the only way to set up non-sequential matches in one thread is to use callouts: if a callout
function starts another match, that match must use a different JIT stack to the one used for function starts another match, that match must use a different JIT stack to the one used for
currently suspended match(es). currently suspended match(es).
In a multithread application, if you do not specify a JIT stack, or if you assign or pass back In a multithread application, if you do not specify a JIT stack, or if you assign or pass back
NULL from a callback, that is thread-safe, because each thread has its own machine stack. However, NULL from a callback, that is thread-safe, because each thread has its own machine stack. However,
if you assign or pass back a non-NULL JIT stack, this must be a different stack for each if you assign or pass back a non-NULL JIT stack, this must be a different stack for each
thread so that the application is thread-safe. thread so that the application is thread-safe.
Strictly speaking, even more is allowed. You can assign the same non-NULL stack to a match context Strictly speaking, even more is allowed. You can assign the same non-NULL stack to a match context
that is used by any number of patterns, as long as they are not used for matching by multiple that is used by any number of patterns, as long as they are not used for matching by multiple
threads at the same time. For example, you could use the same stack in all compiled patterns, threads at the same time. For example, you could use the same stack in all compiled patterns,
with a global mutex in the callback to wait until the stack is available for use. However, with a global mutex in the callback to wait until the stack is available for use. However,
this is an inefficient solution, and not recommended. this is an inefficient solution, and not recommended.
This is a suggestion for how a multithreaded program that needs to set up non-default JIT stacks This is a suggestion for how a multithreaded program that needs to set up non-default JIT stacks
might operate: might operate:
During thread initialization During thread initialization
thread_local_var = pcre2_jit_stack_create(...) thread_local_var = pcre2_jit_stack_create(...)
During thread exit During thread exit
pcre2_jit_stack_free(thread_local_var) pcre2_jit_stack_free(thread_local_var)
Use a one-line callback function Use a one-line callback function
return thread_local_var return thread_local_var
All the functions described in this section do nothing if JIT is not available. All the functions described in this section do nothing if JIT is not available.
JIT STACK FAQ JIT STACK FAQ
(1) Why do we need JIT stacks? (1) Why do we need JIT stacks?
PCRE2 (and JIT) is a recursive, depth-first engine, so it needs a stack where the local data of PCRE2 (and JIT) is a recursive, depth-first engine, so it needs a stack where the local data of
the current node is pushed before checking its child nodes. Allocating real machine stack on the current node is pushed before checking its child nodes. Allocating real machine stack on
some platforms is difficult. For example, the stack chain needs to be updated every time if we some platforms is difficult. For example, the stack chain needs to be updated every time if we
extend the stack on PowerPC. Although it is possible, its updating time overhead decreases performance. extend the stack on PowerPC. Although it is possible, its updating time overhead decreases performance.
So we do the recursion in memory. So we do the recursion in memory.
(2) Why don't we simply allocate blocks of memory with malloc()? (2) Why don't we simply allocate blocks of memory with malloc()?
Modern operating systems have a nice feature: they can reserve an address space instead of allocating Modern operating systems have a nice feature: they can reserve an address space instead of allocating
memory. We can safely allocate memory pages inside this address space, so the stack could memory. We can safely allocate memory pages inside this address space, so the stack could
grow without moving memory data (this is important because of pointers). Thus we can allocate grow without moving memory data (this is important because of pointers). Thus we can allocate
1MiB address space, and use only a single memory page (usually 4KiB) if that is enough. However, 1MiB address space, and use only a single memory page (usually 4KiB) if that is enough. However,
we can still grow up to 1MiB anytime if needed. we can still grow up to 1MiB anytime if needed.
(3) Who "owns" a JIT stack? (3) Who "owns" a JIT stack?
The owner of the stack is the user program, not the JIT studied pattern or anything else. The The owner of the stack is the user program, not the JIT studied pattern or anything else. The
user program must ensure that if a stack is being used by pcre2_match(), (that is, it is assigned user program must ensure that if a stack is being used by pcre2_match(), (that is, it is assigned
to a match context that is passed to the pattern currently running), that stack must not to a match context that is passed to the pattern currently running), that stack must not
be used by any other threads (to avoid overwriting the same memory area). The best practice for be used by any other threads (to avoid overwriting the same memory area). The best practice for
multithreaded programs is to allocate a stack for each thread, and return this stack through the multithreaded programs is to allocate a stack for each thread, and return this stack through the
JIT callback function. JIT callback function.
(4) When should a JIT stack be freed? (4) When should a JIT stack be freed?
You can free a JIT stack at any time, as long as it will not be used by pcre2_match() again. You can free a JIT stack at any time, as long as it will not be used by pcre2_match() again.
When you assign the stack to a match context, only a pointer is set. There is no reference When you assign the stack to a match context, only a pointer is set. There is no reference
counting or any other magic. You can free compiled patterns, contexts, and stacks in any order, counting or any other magic. You can free compiled patterns, contexts, and stacks in any order,
anytime. Just do not call pcre2_match() with a match context pointing to an already freed anytime. Just do not call pcre2_match() with a match context pointing to an already freed
stack, as that will cause SEGFAULT. (Also, do not free a stack currently used by pcre2_match() stack, as that will cause SEGFAULT. (Also, do not free a stack currently used by pcre2_match()
in another thread). You can also replace the stack in a context at any time when it is not in in another thread). You can also replace the stack in a context at any time when it is not in
use. You should free the previous stack before assigning a replacement. use. You should free the previous stack before assigning a replacement.
(5) Should I allocate/free a stack every time before/after calling pcre2_match()? (5) Should I allocate/free a stack every time before/after calling pcre2_match()?
No, because this is too costly in terms of resources. However, you could implement some clever No, because this is too costly in terms of resources. However, you could implement some clever
idea which releases the stack if it is not used in, let's say, two minutes. The JIT callback can idea which releases the stack if it is not used in, let's say, two minutes. The JIT callback can
help to achieve this without keeping a list of patterns. help to achieve this without keeping a list of patterns.
(6) OK, the stack is for long term memory allocation. But what happens if a pattern causes stack (6) OK, the stack is for long term memory allocation. But what happens if a pattern causes stack
overflow with a stack of 1MiB? Is that 1MiB kept until the stack is freed? overflow with a stack of 1MiB? Is that 1MiB kept until the stack is freed?
Especially on embedded systems, it might be a good idea to release memory sometimes without Especially on embedded systems, it might be a good idea to release memory sometimes without
freeing the stack. There is no API for this at the moment. Probably a function call which returns freeing the stack. There is no API for this at the moment. Probably a function call which returns
with the currently allocated memory for any stack and another which allows releasing memory with the currently allocated memory for any stack and another which allows releasing memory
(shrinking the stack) would be a good idea if someone needs this. (shrinking the stack) would be a good idea if someone needs this.
(7) This is too much of a headache. Isn't there any better solution for JIT stack handling? (7) This is too much of a headache. Isn't there any better solution for JIT stack handling?
No, thanks to Windows. If POSIX threads were used everywhere, we could throw out this complicated No, thanks to Windows. If POSIX threads were used everywhere, we could throw out this complicated
API. API.
FREEING JIT SPECULATIVE MEMORY FREEING JIT SPECULATIVE MEMORY
void pcre2_jit_free_unused_memory(pcre2_general_context *gcontext); void pcre2_jit_free_unused_memory(pcre2_general_context *gcontext);
The JIT executable allocator does not free all memory when it is possible. It expects new allocations, The JIT executable allocator does not free all memory when it is possible. It expects new allocations,
and keeps some free memory around to improve allocation speed. However, in low memory and keeps some free memory around to improve allocation speed. However, in low memory
conditions, it might be better to free all possible memory. You can cause this to happen by conditions, it might be better to free all possible memory. You can cause this to happen by
calling pcre2_jit_free_unused_memory(). Its argument is a general context, for custom memory calling pcre2_jit_free_unused_memory(). Its argument is a general context, for custom memory
management, or NULL for standard memory management. management, or NULL for standard memory management.
EXAMPLE CODE EXAMPLE CODE
This is a single-threaded example that specifies a JIT stack without using a callback. A real This is a single-threaded example that specifies a JIT stack without using a callback. A real
program should include error checking after all the function calls. program should include error checking after all the function calls.
int rc; int rc;
pcre2_code *re; pcre2_code *re;
pcre2_match_data *match_data; pcre2_match_data *match_data;
pcre2_match_context *mcontext; pcre2_match_context *mcontext;
pcre2_jit_stack *jit_stack; pcre2_jit_stack *jit_stack;
re = pcre2_compile(pattern, PCRE2_ZERO_TERMINATED, 0, re = pcre2_compile(pattern, PCRE2_ZERO_TERMINATED, 0,
&errornumber, &erroffset, NULL); &errornumber, &erroffset, NULL);
skipping to change at line 325 skipping to change at line 331
pcre2_code_free(re); pcre2_code_free(re);
pcre2_match_data_free(match_data); pcre2_match_data_free(match_data);
pcre2_match_context_free(mcontext); pcre2_match_context_free(mcontext);
pcre2_jit_stack_free(jit_stack); pcre2_jit_stack_free(jit_stack);
JIT FAST PATH API JIT FAST PATH API
Because the API described above falls back to interpreted matching when JIT is not available, it Because the API described above falls back to interpreted matching when JIT is not available, it
is convenient for programs that are written for general use in many environments. However, is convenient for programs that are written for general use in many environments. However,
calling JIT via pcre2_match() does have a performance impact. Programs that are written for use calling JIT via pcre2_match() does have a performance impact. Programs that are written for use
where JIT is known to be available, and which need the best possible performance, can instead where JIT is known to be available, and which need the best possible performance, can instead
use a "fast path" API to call JIT matching directly instead of calling pcre2_match() (obviously use a "fast path" API to call JIT matching directly instead of calling pcre2_match() (obviously
only for patterns that have been successfully processed by pcre2_jit_compile()). only for patterns that have been successfully processed by pcre2_jit_compile()).
The fast path function is called pcre2_jit_match(), and it takes exactly the same arguments as The fast path function is called pcre2_jit_match(), and it takes exactly the same arguments as
pcre2_match(). However, the subject string must be specified with a length; pcre2_match(). However, the subject string must be specified with a length;
PCRE2_ZERO_TERMINATED is not supported. Unsupported option bits (for example, PCRE2_ANCHORED and PCRE2_ZERO_TERMINATED is not supported. Unsupported option bits (for example, PCRE2_ANCHORED and
PCRE2_ENDANCHORED) are ignored, as is the PCRE2_NO_JIT option. The return values are also the same as for PCRE2_ENDANCHORED) are ignored, as is the PCRE2_NO_JIT option. The return values are also the same as for
pcre2_match(), plus PCRE2_ERROR_JIT_BADOPTION if a matching mode (partial or complete) is pcre2_match(), plus PCRE2_ERROR_JIT_BADOPTION if a matching mode (partial or complete) is
requested that was not compiled. requested that was not compiled.
When you call pcre2_match(), as well as testing for invalid options, a number of other sanity When you call pcre2_match(), as well as testing for invalid options, a number of other sanity
checks are performed on the arguments. For example, if the subject pointer is NULL but the checks are performed on the arguments. For example, if the subject pointer is NULL but the
length is non-zero, an immediate error is given. Also, unless PCRE2_NO_UTF_CHECK is set, a UTF length is non-zero, an immediate error is given. Also, unless PCRE2_NO_UTF_CHECK is set, a UTF
subject string is tested for validity. In the interests of speed, these checks do not happen on subject string is tested for validity. In the interests of speed, these checks do not happen on
the JIT fast path. If invalid UTF data is passed when PCRE2_MATCH_INVALID_UTF was not set for the JIT fast path. If invalid UTF data is passed when PCRE2_MATCH_INVALID_UTF was not set for
pcre2_compile(), the result is undefined. The program may crash or loop or give wrong results. pcre2_compile(), the result is undefined. The program may crash or loop or give wrong results.
In the absence of PCRE2_MATCH_INVALID_UTF you should call pcre2_jit_match() in UTF mode only if In the absence of PCRE2_MATCH_INVALID_UTF you should call pcre2_jit_match() in UTF mode only if
you are sure the subject is valid. you are sure the subject is valid.
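One way a caller might become "sure the subject is valid" is to validate untrusted input once, before handing it to the fast path. The sketch below is a hand-written UTF-8 check and is not part of PCRE2; the name is_valid_utf8() is hypothetical, and it is only meant to show the kind of test involved (well-formed sequences, no overlong forms, no surrogates, nothing above U+10FFFF). Compiling with PCRE2_MATCH_INVALID_UTF, as mentioned above, is the alternative that avoids a separate pass.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a PCRE2 function).
   Returns 1 if buf[0..len) is well-formed UTF-8, 0 otherwise. */
int is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = buf[i];
        size_t need;    /* continuation bytes expected after the lead byte */
        uint32_t cp;    /* code point being decoded */
        uint32_t min;   /* smallest code point allowed for this length */

        if (c < 0x80) { i++; continue; }                      /* ASCII */
        else if ((c & 0xE0) == 0xC0) { need = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { need = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { need = 3; cp = c & 0x07; min = 0x10000; }
        else return 0;  /* stray continuation byte or invalid lead byte */

        if (len - i - 1 < need)
            return 0;   /* sequence truncated by end of buffer */

        for (size_t k = 1; k <= need; k++) {
            unsigned char cc = buf[i + k];
            if ((cc & 0xC0) != 0x80)
                return 0;   /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min) return 0;                        /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;    /* surrogate */
        if (cp > 0x10FFFF) return 0;                   /* beyond Unicode */

        i += need + 1;
    }
    return 1;
}
```

A caller could run this once per subject and use pcre2_jit_match() only when it returns 1, falling back to pcre2_match() (which performs its own check) otherwise.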
Bypassing the sanity checks and the pcre2_match() wrapping can give speedups of more than 10%. Bypassing the sanity checks and the pcre2_match() wrapping can give speedups of more than 10%.
SEE ALSO SEE ALSO
pcre2api(3), pcre2unicode(3) pcre2api(3), pcre2unicode(3)
AUTHOR AUTHOR
Philip Hazel (FAQ by Zoltan Herczeg) Philip Hazel (FAQ by Zoltan Herczeg)
Retired from University Computing Service Retired from University Computing Service
Cambridge, England. Cambridge, England.
REVISION REVISION
Last updated: 21 February 2024 Last updated: 22 August 2024
Copyright (c) 1997-2024 University of Cambridge. Copyright (c) 1997-2024 University of Cambridge.
PCRE2 10.43 21 February 2024 PCRE2JIT(3) PCRE2 10.45-RC1 22 August 2024 PCRE2JIT(3)
 End of changes. 47 change blocks. 
185 lines changed or deleted 197 lines changed or added
