They added some new scripts.
There were a few changes to the line break algorithm,
most notably there is more rules that require more context than before.
While not major, there was some shuffling and additions to our
implementation to match the new rules.
IDNA test data now disallows the trailing dot/empty root label,
technically to be toggled off by an option that controls a few things,
but we don't have options. For test-data they changed the format a
little - "" is used to mean empty string, while a blank segment is
null/no string, update the parser to read this.
Changes in this cherry-pick:
- Reran tool to resolve conflicts due to
emoji-data not being extracted in this branch
[ChangeLog][Third-Party Code] Updated the Unicode Character Database to
UCD revision 34/Unicode 16.
Fixes: QTBUG-132902
Task-number: QTBUG-132851
Pick-to: 6.5
Change-Id: I4569703659f6fd0f20943110a03301c1cf8cc1ed
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
(cherry picked from commit 85899ff181984a1310cd1ad10cdb0824f1ca5118)
Reviewed-by: Qt Cherry-pick Bot <cherrypick_bot@qt-project.org>
(cherry picked from commit 5985c90d37a096f35b68546f916bec29a218e112)
Also remove the now unused license and update the qt_attribution.json
[ChangeLog][Third-Party Code] UCD-generated data files now come under Unicode-3.0
Change-Id: I133b1f20643e29a412053eb08ae4c250d07c561e
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
(cherry picked from commit d0bf0660b17af2545c7566329e4bad621c369fee)
New languages added with v46
- Kara-Kalpak
- Swampy Cree
Several new Chinese-language locales have been added, including one
using Latin script, which invalidated some prior QLocale tests, which
have been adjusted to fit.
Some obsolete time-zone identifiers are now treated as deprecated
aliases. These have lost their AnyTerritory association, implying
changes to QTimeZone tests.
Many redundant likely sub-tag rules for unspecified language have been
dropped, in favor of simpler rules.
[ChangeLog][Third-Party Code] Updated CLDR data, used by QLocale, to
v46.
Task-number: QTBUG-130877
Change-Id: I92cf210422c7759dd829a7ca2f845d20e263d25b
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
(cherry picked from commit e316276b76b9c3768ca4e19a04d03308ef21fe12)
Reviewed-by: Qt Cherry-pick Bot <cherrypick_bot@qt-project.org>
(cherry picked from commit 9413c19cc1f394bc39a9f46d7d12a71fb42c8d1a)
The change adds CPE and PURL keys to all qt_attribution.json files in
the repo.
In case if no sensible CPE or PURL exists, a "Comment" field is added
with the text "no relevant CPE or PURL found". If only one of them
does not exist, it is written as such in the Comment field.
This allows filtering for files that haven't had the information added
yet vs those that were looked up but no relevant information was
found.
For sources that are not hosted on github, a generic PURL is used with
a download_url fragment pointing either to the exact location where
the sources can be downloaded, or to the homepage of the project.
The generic package name was chosen based on the 'Id' key of the
attribution entry where it was present, and is not authoritative.
For PURL github packages, the 'git tag' name was specified into the
'version' part of the PURL, rather than the 'version number', because
SBOM processing tooling handle that better than the version number.
For example for the freetype package, we specify the string
'VER-2-13-3' rather than the tag name '2.13.3'.
We might revisit this in the future.
[ChangeLog][Third-Party Code] Added PURL and CPE information to the
attribution files of 3rd party sources.
Task-number: QTBUG-122899
Task-number: QTBUG-129602
Change-Id: Iad126242cafc3ea0b678c5c36b26f857039b1dbd
Reviewed-by: Alexey Edelev <alexey.edelev@qt.io>
(cherry picked from commit 36dca3c04f759449f74008a3e79021a179b0f35e)
This was in fact present in v44, but we overlooked it somehow. The new
version also fixes some inconsistencies in the data, that I reported
against v44.1; in particular, Tamil no longer claims to override the
root AM/PM markers (probably because it uses 24-hour time so doesn't
need them).
Add the test-file under util to the list of files containing generated
content.
Conflict at 6.8 resolved by regenerating the data; this only changed
the date of generation, not the data. Then hand-edited the date to
match the picked upstream commit, to avoid future conflicts.
[ChangeLog][Third-Party Code] Updated CLDR data, used by QLocale, to
v45.
Task-number: QTBUG-126060
Pick-to: 6.7 6.5 6.2
Change-Id: I81a5bcca49519b55091fc541de6b73b606661bb4
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
(cherry picked from commit f79548e268a496698d77d0e78365334d0e507212)
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
The existing data comes under Unicode-DFS-2016 but future updates
shall come under Unicode-3.0, so update the existing headers with the
former and the generator script with the latter. Leave a note in the
attribution file about this transitional state and how to resolve it.
Replaced UNICODE_LICENSE.txt from src/corelib/text/ with
LICENSES/Unicode-DFS-2016.txt, as fetched using reuse download.
This doesn't look like a rename but only actually adds some irrelevant
lines about where on the Unicode website the upstream files (to which
we do not apply this license) come from and changes some spacing.
Pick-to: 6.7 6.5
Fixes: QTBUG-121653
Change-Id: I50c9f4badc77a9aa402af946561aff58ae9e3e7a
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
Reviewed-by: Kai Köhne <kai.koehne@qt.io>
The SPDX database lists the license as 'Unicode-3.0', and 'Unicode
License v3'. Now, the SPDX standard actually says that
License identifiers (including license exception identifiers) used
in SPDX documents or source code files should be matched in a case-
insensitive manner.
But the website at https://spdx.org/licenses/ doesn't treat it this way,
so the link we generate out of the identifier actually gives a 404. So
it's just easier to use the 'original' capitalization.
Amends 063026cc50
Pick-to: 6.5 6.6 6.7
Change-Id: I826077a914721b7b9499ad62c08fdf20be94e88d
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
(This turns out to be identical to v44, for our purposes.)
The CLDR license has been revised at v44 to "UNICODE LICENSE V3",
which is now included (as LICENSES/UNICODE-3.0.txt) in addition to the
old license (still in use, presumably, by UCD - at least until its
next update). Some new QLocale::Language entries are needed. There is
no change to the time-zone data.
Some tests needed changes:
* Various Arabic locales now use U+0623 (Arabic letter aleph with
hamza above) in exponent separator, replacing plain U+0627 (Arabic
letter aleph); it is still followed by U+0633 (Arabic letter seen).
* Where likely sub-tags used to fill in world, 001, as territory for a
language, they now (e.g. for Prussian and Yiddish) give specific
countries.
* Tamil locales now have something of a mix of inherited and localized
forms for AM/PM, which looks a lot like a mistake in CLDR.
* New likely sub-tag rules fix ctor(und_US) and ctor(und_GB), which
previously failed.
[ChangeLog][Third-Party Code] Updated QLocale's data extracted from
the Unicode Common Locale Data Repository (CLDR) to v44.1. The license
changed to Unicode License V3.
Pick-to: 6.7 6.6 6.5
Fixes: QTBUG-121485
Task-number: QTBUG-121325
Change-Id: Ide1a68016129526d7a5aa3fc67f1a674858696bc
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
This is now the official format for Files, when there's more than one,
rather than using space-joined lists.
Pick-to: 6.5
Change-Id: I4a6247fff0ece8ece2944178af38894fd5a2e1e2
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Joerg Bornemann <joerg.bornemann@qt.io>
Reviewed-by: Kai Köhne <kai.koehne@qt.io>
Replace the old abuse of other fields as comments, to be overwritten
by a later setting to a proper value, with actual Comment fields, now
that we have them.
Added a new comment to the valgrind files to say where they come from
in the upstream.
Pick-to: 6.5
Change-Id: I2edcfa2949fa9e59f3f67d3e578d8e5009854cf6
Reviewed-by: Joerg Bornemann <joerg.bornemann@qt.io>
Reviewed-by: Kai Köhne <kai.koehne@qt.io>
The corelib/text/qt_attribution.json didn't mention the
time/q*calendar_data_p.h files which are also generated from CLDR.
Pick-to: 6.5 6.4 6.2 5.15
Change-Id: I768555d4623204245006897c45af58635761bfa1
Reviewed-by: Kai Köhne <kai.koehne@qt.io>
New languages (and one local for each) added with v42
- Haryanvi
- Moksha
- Northern Frisian
- Obolo
- Pijin
- Rajasthani
- Toki Pona
It also appears that Canada has changed its date format. Modify the
relevant test case to reflect this change.
Task-number: QTBUG-110333
Pick-to: 6.5
Change-Id: Ia8975c2866cd54c9e565543d05bacd52f4987909
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
qtattributionsscanner expects file paths to be separated by a space.
Pick-to: 6.2 6.4
Change-Id: I4c9dfea0f086fc9631cb06f40e2d3cab0a32ca4e
Reviewed-by: Jörg Bornemann <joerg.bornemann@qt.io>
This corresponds to Unicode version 15.0.0.
Added the following scripts:
* Kawi
* Nag Mundari
Full support of these scripts requires harfbuzz version 5.2.0,
this version adds support for Unicode 15.0:
https://github.com/harfbuzz/harfbuzz/releases/tag/5.2.0
Fixes: QTBUG-106810
Change-Id: Ib06c526e49b0f01ef9f21123bcf875c6b19f2601
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
CLDR was updated to version 41 in 59860685a1
but this file was not updated.
Task-number: QTBUG-103663
Change-Id: I163a4a3f6ce16d611c013656fa569be01880e72c
Reviewed-by: Ivan Solovev <ivan.solovev@qt.io>
Update tst_qlocale to take into account "narrow" day representation
change for Russian locales. This version of CLDR changes narrow forms
to one letter. Previously those forms were identical to short forms
(two letter). The new representation is consistent with other languages
and so does not appear to be a bug.
Fixes: QTBUG-94358
Pick-to: 6.2
Change-Id: I9724c281a250685da8232e5c05c9c375a8c79253
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
This corresponds to Unicode version 14.0.0.
Added the following scripts:
* CyproMinoan
* OldUyghur
* Tangsa
* Toto
* Vithkuqi
Full support of these scripts requires harfbuzz version 3.0.0,
this version adds support for Unicode 14.0:
https://github.com/harfbuzz/harfbuzz/releases/tag/3.0.0
With this release 10 test cases in tst_qurluts46 were fixed, one
additional test case is failing in tst_qtextboundaryfinder and
is commented out. In total 62 line break test cases and 44 word
break test cases are failing.
A comment in src/corelib/text/qt_attribution.json was updated to
include the URL of the page containing UCD version number.
Fixes: QTBUG-94359
Change-Id: Iefc9ff13f3df279f91cbdb1246d56f75b20ecb35
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
We updated to v39 in 6235893d54
Task-number: QTBUG-94410
Pick-to: 6.2 6.1 5.15
Change-Id: I73d539d677c9066dc5ceb6b4fc65fb544f39ac7f
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Fresh on the heels of our update to v37, they've released a new version.
No new languages to complicate life, fortunately.
Updated license (year range) and attribution. One test also needed an
update: Catalan's long time format now parenthesizes the zone.
Task-number: QTBUG-87925
Pick-to: 5.15
Change-Id: I54fb9b7f084b5cd019c983c1e3862dc03865a272
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
Include WordBreakTest.html, since a test uses sample strings from it,
albeit without actually reading the file.
Had to comment out more of the new tests, as at Revision 24, pending
an update to harfbuzz and the text boundary detection code.
Task-number: QTBUG-79631
Task-number: QTBUG-79418
Task-number: QTBUG-82747
Change-Id: I0082294b09d67ffdc6a9b5c15acf77ad3b86f65f
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
Had to teach the update program to accept category Lm as for
Joining_Transparent, for the sake of a new ArabicShaping.txt entry.
Added three new Unicode versions, several new scripts and a new
word-break class.
Updated UCD's test data for tst_QTextBoundaryFinder. This left 57
tests failing; I have commented out the data rows for those tests,
pending someone with more knowledge addressing this.
Task-number: QTBUG-79631
Task-number: QTBUG-79418
Change-Id: Ic33d3b3551195d47a84d98e84020f57a68f0b201
Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@qt.io>
Released on October 4th.
Adds Windows names for two time zones, Qyzylorda and Volgograd.
Added languages Chickasaw (cic), Muscogee (mus) and Silesian (szl).
Norwegian number formatting has flipped back to using colon rather
than dot as time separator; it's flipped back and forth over the last
several CLDR releases. The dot form is present as a variant, the
colon form was long given as the normal pattern, then went away; but
now it's back as a contributed draft and that's what we pick up.
The MS-Win time-zone ID script was iterating a dict, causing random
reshuffling when new entries are added. Fixed that by doing the
critical iteration in sorted order.
Omitted locales ccp_BD and ccp_IN due to QTBUG-69324.
Task-number: QTBUG-79418
Change-Id: I43869ee1810ecc1fe876523947ddcbcddf4e550a
Reviewed-by: Lars Knoll <lars.knoll@qt.io>