=encoding utf8
=head1 NAME
perl5160delta - what is new for perl v5.16.0
=head1 DESCRIPTION
This document describes differences between the 5.14.0 release and
the 5.16.0 release.
If you are upgrading from an earlier release such as 5.12.0, first read
L<perl5140delta>, which describes differences between 5.12.0 and
5.14.0.
Some bug fixes in this release have been backported to later
releases of 5.14.x. Those are indicated with the 5.14.x version in
parentheses.
=head1 Notice
With the release of Perl 5.16.0, the 5.12.x series of releases is now out of
its support period. There may be future 5.12.x releases, but only in the
event of a critical security issue. Users of Perl 5.12 or earlier should
consider upgrading to a more recent release of Perl.
This policy is described in greater detail in
L<perlpolicy>.
=head1 Core Enhancements
=head2 C<use I<VERSION>>
As of this release, version declarations like C<use v5.16> now disable
all features before enabling the new feature bundle. This means that
the following holds true:
    use 5.016;
    # only 5.16 features enabled here
    use 5.014;
    # only 5.14 features enabled here (not 5.16)
C<use v5.12> and higher continue to enable strict, but explicit
C<use strict> and C<no strict> now override the version declaration, even
when they come first:
    no strict;
    use 5.012;
    # no strict here
There is a new ":default" feature bundle that represents the set of
features enabled before any version declaration or C<use feature> has
been seen. Version declarations below 5.10 now enable the ":default"
feature set. This does not actually change the behavior of
C<use v5.8>, because features added to the ":default" set are those that were
traditionally enabled by default, before they could be turned off.
C<< no feature >> now resets to the default feature set. To disable all
features (which is likely to be a pretty special-purpose request, since
it presumably won't match any named set of semantics) you can now
write C<< no feature ':all' >>.
C<$[> is now disabled under C<use v5.16>. It is part of the default
feature set and can be turned on or off explicitly with
C<use feature 'array_base'>.
=head2 C<__SUB__>
The new C<__SUB__> token, available under the C<current_sub> feature
(see L<feature>) or C<use v5.16>, returns a reference to the current
subroutine, making it easier to write recursive closures.
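As an illustration, here is a minimal sketch of a recursive closure
written with C<__SUB__>. The C<$factorial> name is chosen for this
example only; the point is that the anonymous subroutine calls itself
without referring to the variable that holds it.

```perl
use strict;
use warnings;
use feature 'current_sub';

# An anonymous sub that recurses through __SUB__ rather than through
# a variable that would have to be declared before the closure.
my $factorial = sub {
    my ($n) = @_;
    return $n <= 1 ? 1 : $n * __SUB__->($n - 1);
};

print $factorial->(5), "\n";    # prints 120
```

Because the closure never captures C<$factorial>, it does not create a
reference cycle on itself.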
=head2 New and Improved Built-ins
=head3 More consistent C<eval>
The C<eval> operator sometimes treats a string argument as a sequence of
characters and sometimes as a sequence of bytes, depending on the
internal encoding. The internal encoding is not supposed to make any
difference, but there is code that relies on this inconsistency.
The new C<unicode_eval> and C<evalbytes> features (enabled under
C<use 5.16.0>) resolve this. The C<unicode_eval> feature causes
C<eval $string> to treat the string always as Unicode. The C<evalbytes>
feature provides a function, itself called C<evalbytes>, which
evaluates its argument always as a string of bytes.
These features also fix oddities with source filters leaking to outer
dynamic scopes.
See L<feature> for more detail.
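The difference between the two can be sketched as follows. This is an
illustrative example, not taken from the Perl documentation: the
C<$source> string and the variable names are assumptions, and the
behaviour relies on C<unicode_eval> ignoring any C<use utf8> inside the
evaluated text, as described in L<feature>.

```perl
use strict;
use warnings;
use feature qw(unicode_eval evalbytes);

# Source text held as bytes: "use utf8" followed by a string literal
# containing the two-byte UTF-8 encoding of U+00E9.
my $source = "use utf8; length('\xc3\xa9')";

# evalbytes parses a stream of bytes, so "use utf8" takes effect and the
# two bytes are decoded into a single character.
my $as_bytes = evalbytes $source;

# eval under unicode_eval parses a string of characters and ignores
# "use utf8", so the literal keeps its two characters.
my $as_chars = eval $source;

print "evalbytes: $as_bytes, eval: $as_chars\n";
```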
=head3 C<substr> lvalue revamp
=for comment Does this belong here, or under Incompatible Changes?
When C<substr> is called in lvalue or potential lvalue context with two
or three arguments, a special lvalue scalar is returned that modifies
the original string (the first argument) when assigned to.
Previously, the offsets (the second and third arguments) passed to
C<substr> would be converted immediately to match the string, negative
offsets being translated to positive and offsets beyond the end of the
string being truncated.
Now, the offsets are recorded without modification in the special
lvalue scalar that is returned, and the original string is not even
looked at by C<substr> itself, but only when the returned lvalue is
read or modified.
These changes result in an incompatible change:
If the original string changes length after the call to C<substr> but
before assignment to its return value, negative offsets will remember
their position from the end of the string, affecting code like this:
    my $string = "string";
    my $lvalue = \substr $string, -4, 2;
    print $$lvalue, "\n"; # prints "ri"
    $string = "bailing twine";
    print $$lvalue, "\n"; # prints "wi"; used to print "il"
The same thing happens with an omitted third argument. The returned
lvalue will always extend to the end of the string, even if the string
becomes longer.
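The omitted-length case can be sketched in the same way. The strings and
variable names below are made up for illustration, and the second result
assumes Perl 5.16 or later.

```perl
use strict;
use warnings;

my $string = "abcdef";
my $tail   = \substr $string, 3;    # offset 3, no length given

my $before = $$tail;                # "def"
$string    = "abcdefghi";           # the string grows after the call
my $after  = $$tail;                # "defghi": still runs to the end

print "$before $after\n";
```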
Since this change also allowed many bugs to be fixed (see
L</The C<substr> operator>), and since the behavior
of negative offsets has never been specified, the
change was deemed acceptable.
=head3 Return value of C<tied>
The value returned by C<tied> on a tied variable is now the actual
scalar that holds the object to which the variable is tied. This
lets ties be weakened with
C<Scalar::Util::weaken(tied $tied_variable)>.
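A minimal sketch of weakening a tie follows. The C<Local::Watched>
class and the C<$destroyed> counter are hypothetical and exist only to
make the effect observable; the sketch assumes Perl 5.16 or later.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $destroyed = 0;

# Hypothetical tie class, defined only for this sketch.
package Local::Watched {
    sub TIESCALAR { my ($class) = @_; return bless { value => undef }, $class }
    sub FETCH     { $_[0]{value} }
    sub STORE     { $_[0]{value} = $_[1] }
    sub DESTROY   { $destroyed++ }
}

# tie returns the object, so $object is a second, strong reference.
my $object = tie my $variable, 'Local::Watched';

# Weaken the reference that the tie itself holds on the object.
weaken(tied $variable);

# Dropping the last strong reference now frees the object even though
# $variable has not been untied.
undef $object;

print "destroyed: $destroyed\n";
```

Before this change C<tied> returned a copy of the reference, so
weakening it had no effect on the tie itself.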
=head2 Unicode Support
=head3 Supports (I<almost>) Unicode 6.1
Besides the addition of whole new scripts, and new characters in
existing scripts, this new version of Unicode, as always, makes some
changes to existing characters. One change that may trip up some
applications is that the General Category of two characters in the
Latin-1 range, PILCROW SIGN and SECTION SIGN, has been changed from
Other_Symbol to Other_Punctuation. The same change has been made for
a character in each of Tibetan, Ethiopic, and Aegean.
The code points U+3248..U+324F (CIRCLED NUMBER TEN ON BLACK SQUARE
through CIRCLED NUMBER EIGHTY ON BLACK SQUARE) have had their General
Category changed from Other_Symbol to Other_Numeric. The Line Break
property has changes for Hebrew and Japanese; and because of
other changes in 6.1, the Perl regular expression construct C<\X> now
works differently for some characters in Thai and Lao.
New aliases (synonyms) have been defined for many property values;
these, along with the previously existing ones, are all cross-indexed in
L<perluniprops/Properties accessible through \p{} and \P{}>.
The return value of C<charnames::viacode()> is affected by other
changes:

 Code point  Old Name              New Name
   U+000A    LINE FEED (LF)        LINE FEED
   U+000C    FORM FEED (FF)        FORM FEED
   U+000D    CARRIAGE RETURN (CR)  CARRIAGE RETURN
   U+0085    NEXT LINE (NEL)       NEXT LINE
   U+008E    SINGLE-SHIFT 2        SINGLE-SHIFT-2
   U+008F    SINGLE-SHIFT 3        SINGLE-SHIFT-3
   U+0091    PRIVATE USE 1         PRIVATE USE-1
   U+0092    PRIVATE USE 2         PRIVATE USE-2
   U+2118    SCRIPT CAPITAL P      WEIERSTRASS ELLIPTIC FUNCTION

Perl will accept any of these names as input, but
C<charnames::viacode()> now returns the new name of each pair. The
change for U+2118 is considered by Unicode to be a correction, that is
the original name was a mistake (but again, it will remain forever valid
to use it to refer to U+2118). But most of these changes are the
fallout of the mistake Unicode 6.0 made in naming a character used in
Japanese cell phones to be "BELL", which conflicts with the longstanding
industry use of (and Unicode's recommendation to use) that name
to mean the ASCII control character at U+0007. Therefore, that name
has been deprecated in Perl since v5.14, and any use of it will raise a
warning message (unless turned off). The name "ALERT" is now the
preferred name for this code point, with "BEL" an acceptable short
form. The name for the new cell phone character, at code point U+1F514,
remains undefined in this version of Perl (hence we don't
implement quite all of Unicode 6.1), but starting in v5.18, BELL will mean
this character, and not U+0007.
Unicode has taken steps to make sure that this sort of mistake does not
happen again. The Standard now includes all generally accepted
names and abbreviations for control characters, whereas previously it
didn't (though there were recommended names for most of them, which Perl
used). This means that most of those recommended names are now
officially in the Standard. Unicode did not recommend names for the
four code points listed above between U+008E and U+0092, and in
standardizing them Unicode subtly changed the names that Perl had
previously given them, by replacing the final blank in each name by a
hyphen. Unicode also officially accepts names that Perl had deprecated,
such as FILE SEPARATOR. Now the only deprecated name is BELL.
Finally, Perl now uses the new official names instead of the old
(now considered obsolete) names for the first four code points in the
list above (the ones which have the parentheses in them).
Now that the names have been placed in the Unicode standard, these kinds
of changes should not happen again, though corrections, such as to
U+2118, are still possible.
Unicode also added some name abbreviations, which Perl now accepts:
SP for SPACE;
TAB for CHARACTER TABULATION;
NEW LINE, END OF LINE, NL, and EOL for LINE FEED;
LOCKING-SHIFT ONE for SHIFT OUT;
LOCKING-SHIFT ZERO for SHIFT IN;
and ZWNBSP for ZERO WIDTH NO-BREAK SPACE.
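The accepted names can be checked with a short sketch such as the one
below. It is illustrative only: the variable names are made up, and
C<charnames> is loaded explicitly here because the sketch calls
C<charnames::vianame()> directly.

```perl
use strict;
use warnings;
use charnames ':full';

# The new abbreviations resolve to the same characters as the long names.
my $space_ok = "\N{SP}"  eq "\N{SPACE}";
my $tab_ok   = "\N{TAB}" eq "\N{CHARACTER TABULATION}";
my $eol_ok   = "\N{EOL}" eq "\N{LINE FEED}";

# Both the old and the corrected name are accepted as input for U+2118.
my $old = charnames::vianame("SCRIPT CAPITAL P");
my $new = charnames::vianame("WEIERSTRASS ELLIPTIC FUNCTION");

# ALERT is the preferred name for the control character at U+0007.
my $alert = charnames::vianame("ALERT");

printf "%d %d %d U+%04X U+%04X U+%04X\n",
    $space_ok, $tab_ok, $eol_ok, $old, $new, $alert;
```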
More details on this version of Unicode are provided in
L<http://www.unicode.org/versions/Unicode6.1.0/>.
=head3 C<use charnames> is no longer needed for C<\N{I<name>}>
When C<\N{I<name>}> is encountered, the C<charnames> module is now
automatically loaded when needed as if the C<:full> and C<:short>
options had been specified. See L<charnames> for more information.
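For example, the following sketch uses both naming styles without any
C<use charnames> line. The particular characters are arbitrary choices
for illustration.

```perl
use strict;
use warnings;

# No "use charnames" line: the module is loaded on demand.
my $alpha = "\N{GREEK SMALL LETTER ALPHA}";    # :full style name
my $beta  = "\N{greek:beta}";                  # :short style name

printf "U+%04X U+%04X\n", ord $alpha, ord $beta;    # prints "U+03B1 U+03B2"
```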
=head3 C<\N{...}> can now have Unicode loose name matching
This is described in the C<charnames> item in
L</Updated Modules and Pragmata> below.
=head3 Unicode Symbol Names
Perl now has proper support for Unicode in symbol names. It used to be
that C<*{$foo}> would ignore the internal UTF8 flag and use the bytes of
the underlying representation to look up the symbol. That meant that
C<*{"\x{100}"}> and C<*{"\xc4\x80"}> would return the same thing. All
these parts of Perl have been fixed to account for Unicode:
=over
=item *
Method names (including those passed to C<use overload>)
=item *
Typeglob names (including names of variables, subroutines, and filehandles)
=item *
Package names
=item *
C<goto>
=item *
Symbolic dereferencing
=item *
Second argument to C<bless()> and C<tie()>
=item *
Return value of C<ref()>
=item *
Subroutine prototypes
=item *
Attributes
=item *
Various warnings and error messages that mention variable names or values,
methods, etc.
=back
In addition, a parsing bug has been fixed that prevented C<*{é}> from
implicitly quoting the name, but instead interpreted it as C<*{+é}>, which
would cause a strict violation.
C<*{"*a::b"}> automatically strips off the * if it is followed by an ASCII
letter. That has been extended to all Unicode identifier characters.
One-character non-ASCII non-punctuation variables (like C<$é>) are now
subject to "Used only once" warnings. They used to be exempt, as they
were treated as punctuation variables.
Also, single-character Unicode punctuation variables (like C<$‰>) are now
supported [perl #69032].
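The distinction between a character name and its UTF-8 bytes can be
sketched as follows. The subroutine installed here is hypothetical and
the sketch assumes Perl 5.16 or later; on earlier versions both lookups
would find the same symbol.

```perl
use strict;
use warnings;

{
    no strict 'refs';
    # Install a sub whose name is the single character U+0100.
    *{"main::\x{100}"} = sub { 'wide' };
}

# Look the sub up by the character name, then by the two bytes that
# make up its UTF-8 encoding.
my $by_char  = defined &{"main::\x{100}"}   ? 1 : 0;
my $by_bytes = defined &{"main::\xc4\x80"}  ? 1 : 0;

print "$by_char $by_bytes\n";
```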
=head3 Improved ability to mix locales and Unicode, including UTF-8 locales
An optional parameter has been added to C<use locale>:

    use locale ':not_characters';

which tells Perl to use all but the C<LC_CTYPE> and C<LC_COLLATE>
portions of the current locale. Instead, the character set is assumed
to be Unicode. This lets locales and Unicode be seamlessly mixed,
including the increasingly frequent UTF-8 locales. When using this
hybrid form of locales, the C<:locale> layer to the L<open> pragma can
be used to interface with the file system, and there are CPAN modules
available for ARGV and environment variable conversions.

Full details are in L<perllocale>.
=head3 New function C<fc> and corresponding escape sequence C<\F> for Unicode foldcase
Unicode foldcase is an extension to lowercase that gives better results
when comparing two strings case-insensitively. It has long been used
internally in regular expression C</i> matching. Now it is available
explicitly through the new C<fc> function call (enabled by
S<C<"use feature 'fc'">>, or C<use v5.16>, or explicitly callable via
C<CORE::fc>) or through the new C<\F> sequence in double-quotish
strings.
Full details are in L<perlfunc/fc>.
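A short sketch of the difference between lowercasing and folding
follows. The two example strings are chosen for illustration;
C<use v5.16> is used so that both C<fc> and Unicode string semantics
are in effect.

```perl
use v5.16;    # enables fc, say, and unicode_strings
use warnings;

# \x{DF} is LATIN SMALL LETTER SHARP S.
my ($left, $right) = ("stra\x{DF}e", "STRASSE");

my $lc_equal = lc($left) eq lc($right);       # false: lc keeps the sharp s
my $fc_equal = fc($left) eq fc($right);       # true: both fold to "strasse"
my $escaped  = "\F$right\E" eq fc($right);    # \F folds inside a string

say join ' ', map { $_ ? 'yes' : 'no' } $lc_equal, $fc_equal, $escaped;
# prints "no yes yes"
```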
=head3 The Unicode C<Script_Extensions> property is now supported.
New in Unicode 6.0, this is an improved C<Script>