They added some new scripts.
There were a few changes to the line break algorithm,
most notably there is more rules that require more context than before.
While not major, there was some shuffling and additions to our
implementation to match the new rules.
IDNA test data now disallows the trailing dot/empty root label,
technically to be toggled off by an option that controls a few things,
but we don't have options. For test-data they changed the format a
little - "" is used to mean empty string, while a blank segment is
null/no string, update the parser to read this.
Changes in this cherry-pick:
- Reran tool to resolve conflicts due to
emoji-data not being extracted in this branch
[ChangeLog][Third-Party Code] Updated the Unicode Character Database to
UCD revision 34/Unicode 16.
Fixes: QTBUG-132902
Task-number: QTBUG-132851
Pick-to: 6.5
Change-Id: I4569703659f6fd0f20943110a03301c1cf8cc1ed
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
(cherry picked from commit 85899ff181984a1310cd1ad10cdb0824f1ca5118)
Reviewed-by: Qt Cherry-pick Bot <cherrypick_bot@qt-project.org>
(cherry picked from commit 5985c90d37a096f35b68546f916bec29a218e112)
Add enumerator for the new Unicode version to QChar::UnicodeVersion.
Remap new line breaking classes to their Unicode 15.0 values:
* AK, AP and AS to AL,
* VI and VF to CM.
These are classes for new line breaking support for Indic scripts
that require more work.
Blacklist failing tests for now:
* tst_QUrlUts46::idnaTestV2
* tst_QTextBoundaryFinder::lineBoundariesDefault
* tst_QTextBoundaryFinder::graphemeBoundariesDefault
Regenerate the source files.
Task-number: QTBUG-121529
Change-Id: I869cc9fbaa53765d8ae6265c22cdbef9f19d05bf
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
This corresponds to Unicode version 15.0.0.
Added the following scripts:
* Kawi
* Nag Mundari
Full support of these scripts requires harfbuzz version 5.2.0,
this version adds support for Unicode 15.0:
https://github.com/harfbuzz/harfbuzz/releases/tag/5.2.0
Fixes: QTBUG-106810
Change-Id: Ib06c526e49b0f01ef9f21123bcf875c6b19f2601
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
This corresponds to Unicode version 14.0.0.
Added the following scripts:
* CyproMinoan
* OldUyghur
* Tangsa
* Toto
* Vithkuqi
Full support of these scripts requires harfbuzz version 3.0.0,
this version adds support for Unicode 14.0:
https://github.com/harfbuzz/harfbuzz/releases/tag/3.0.0
With this release 10 test cases in tst_qurluts46 were fixed, one
additional test case is failing in tst_qtextboundaryfinder and
is commented out. In total 62 line break test cases and 44 word
break test cases are failing.
A comment in src/corelib/text/qt_attribution.json was updated to
include the URL of the page containing UCD version number.
Fixes: QTBUG-94359
Change-Id: Iefc9ff13f3df279f91cbdb1246d56f75b20ecb35
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Update the Unicode data processing tool to generate properties
and mapping tables needed to implement UTS #46
(https://unicode.org/reports/tr46/). The implementation extends
the standard to allow usage of underscores in URLs. This is done
for compatibility with DNS-SD and SMB protocols.
The data file needed to generate the new properties was taken from
https://www.unicode.org/Public/idna/13.0.0/IdnaMappingTable.txt
Task-number: QTBUG-85323
Change-Id: I2c303bf8a08aefb18a7491fb9b55385563bfa219
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>