instead of C<p>, which a match
without the C<\G> anchor would have done. Also note that the final match
did not update C<pos>. C<pos> is only updated on a C</g> match. If the
final match did indeed match C<p>, it's a good bet that you're running a
very old (pre-5.6.0) version of Perl.
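
The rule about C<pos> can be checked with a small script. The following
is a minimal sketch rather than part of the example above: the string
C<"abc"> and the variable names are assumptions chosen for illustration.
It runs the same C<\G>-anchored pattern once without C</g> and once with it.

```perl
use strict;
use warnings;

my $str = "abc";                      # hypothetical sample string

# A /g match in scalar context sets pos to just past the match.
$str =~ /a/g or die "no match";
my $pos_after_g = pos($str);          # 1

# Without /g, \G still anchors at pos, but pos itself is left alone.
my ($plain) = $str =~ /\G(.)/;        # captures "b"
my $pos_after_plain = pos($str);      # still 1

# With /g, the same pattern matches at the same place and advances pos.
$str =~ /\G(.)/g or die "no match";
my $with_g = $1;                      # "b" again
my $pos_after_second_g = pos($str);   # 2

print "plain='$plain' pos=$pos_after_plain; ",
      "g='$with_g' pos=$pos_after_second_g\n";
```

The match without C</g> reads the current position through C<\G> but
leaves it at 1; only the C</g> match moves it on to 2.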

A useful idiom for C<lex>-like scanners is C</\G.../gc>. You can
combine several regexps like this to process a string part by part,
performing different actions depending on which regexp matched. Each
regexp tries to match where the previous one leaves off.

    $_ = <<'EOL';
    $url = URI::URL->new( "http://example.com/" ); die if $url eq "xXx";
    EOL

    LOOP: {
        print(" digits"),       redo LOOP if /\G\d+\b[,.;]?\s*/gc;
        print(" lowercase"),    redo LOOP if /\G\p{Ll}+\b[,.;]?\s*/gc;
        print(" UPPERCASE"),    redo LOOP if /\G\p{Lu}+\b[,.;]?\s*/gc;
        print(" Capitalized"),  redo LOOP if /\G\p{Lu}\p{Ll}+\b[,.;]?\s*/gc;
        print(" MiXeD"),        redo LOOP if /\G\pL+\b[,.;]?\s*/gc;
        print(" alphanumeric"), redo LOOP if /\G[\p{Alpha}\pN]+\b[,.;]?\s*/gc;
        print(" line-noise"),   redo LOOP if /\G\W+/gc;
        print ". That's all!\n";
    }

Here is the output (split into several lines):

    line-noise lowercase line-noise UPPERCASE line-noise UPPERCASE
    line-noise lowercase line-noise lowercase line-noise lowercase
    lowercase line-noise lowercase lowercase line-noise lowercase
    lowercase line-noise MiXeD line-noise. That's all!

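
The same C</\G.../gc> idiom can collect tokens instead of printing
labels. The following is a minimal sketch under stated assumptions: the
C<tokenize> helper, its token names (C<NUM>, C<IDENT>, C<OP>) and the
sample expression are all hypothetical and chosen only for illustration.
Because of C</c>, a failed alternative leaves C<pos> where it was, so the
next pattern is tried at the same offset.

```perl
use strict;
use warnings;

# Hypothetical helper: split a simple arithmetic expression into
# [TYPE, text] pairs using the /\G.../gc idiom.
sub tokenize {
    my ($src) = @_;
    my @tokens;
    pos($src) = 0;                    # start scanning at the beginning
    while (pos($src) < length $src) {
        if    ($src =~ /\G\s+/gc)            { next }   # skip whitespace
        elsif ($src =~ /\G(\d+)/gc)          { push @tokens, [NUM   => $1] }
        elsif ($src =~ /\G([A-Za-z_]\w*)/gc) { push @tokens, [IDENT => $1] }
        elsif ($src =~ /\G([-+*\/()=])/gc)   { push @tokens, [OP    => $1] }
        else {
            # /c kept pos intact, so it still marks the offending offset.
            die "Unexpected character at offset " . pos($src) . "\n";
        }
    }
    return @tokens;
}

my @tokens = tokenize("total = 3 * (x1 + 42)");
print join(" ", map { "$_->[0]:$_->[1]" } @tokens), "\n";
```

Pattern order matters here in the same way as in the C<LOOP> block: the
first alternative that matches at the current position wins.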

=item m?PATTERN?msixpodualgc
X<?> X<operator, match-once>

=item ?PATTERN?msixpodualgc

This is just like the C<m/PATTERN/> search, except that it matches
only once between calls to the reset() operator. This is a useful
optimization when you want to see only the first occurrence of
something in each file of a set of files, for instance. Only C<m??>
patterns local to the current package are reset.

    while (<>) {
        if (m?^$?) {
            # blank line between header and body
        }
    } continue {
        reset if eof;       # clear m?? status for next file
    }

Another example switches the first "latin1" encoding it finds
to "utf8" in a pod file:

    s//utf8/ if m? ^ =encoding \h+ \K latin1 ?x;

The match-once behavior is controlled by the match delimiter being
C<?>; with any other delimiter this is the normal C<m//> operator.
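
The difference between the two delimiters can be seen side by side. The
following is a minimal sketch: the C<seen_first> helper and the sample
lines are assumptions made for illustration. The C<m??> match lives in a
subroutine so that every call goes through the same match operator, since
each C<m??> in the source keeps its own match-once state.

```perl
use strict;
use warnings;

# Hypothetical helper: true only for the first line containing "alpha"
# since the last reset, because it uses the match-once form.
sub seen_first {
    my ($line) = @_;
    return 1 if $line =~ m?alpha?;
    return 0;
}

my @lines = ("beta", "alpha one", "alpha two");   # hypothetical input

my @once  = grep { seen_first($_) } @lines;   # m??: first hit only
my @every = grep { /alpha/ }        @lines;   # m//: every hit

reset;                                        # re-arm m?? in this package
my $rearmed = seen_first("alpha two");        # matches again after reset

print "once: @once\n";
print "every: @every\n";
print "rearmed: $rearmed\n";
```

Called without an argument, reset() only re-arms the C<m??> patterns of
the current package; it does not touch any variables.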

For historical reasons, the leading C<m> in C<m?PATTERN?> is optional,
but the resulting C<?PATTERN?> syntax is deprecated, will warn on
usage and might be removed from a future stable release of Perl (without
further notice!).

=item s/PATTERN/REPLACEMENT/msixpodualgcer