13 Commits (dbff2edaa169cf33ce78266fd23d3502dadf4fbd)

Author SHA1 Message Date
Mårten Nordheim dbff2edaa1 Update UCD to Unicode 16.0.0
Unicode 16 added some new scripts.

There were a few changes to the line break algorithm; most notably,
more rules now require more context than before. While the changes are
not major, our implementation needed some shuffling and additions to
match the new rules.

IDNA test data now disallows the trailing dot/empty root label.
Technically, this can be toggled off by an option that controls a few
things, but we don't have options. The test-data format also changed a
little: "" now means the empty string, while a blank segment means
null/no string. Update the parser to read this.

Changes in this cherry-pick:
  - Reran tool to resolve conflicts due to
    emoji-data not being extracted in this branch

[ChangeLog][Third-Party Code] Updated the Unicode Character Database to
UCD revision 34/Unicode 16.

Fixes: QTBUG-132902
Task-number: QTBUG-132851
Pick-to: 6.5
Change-Id: I4569703659f6fd0f20943110a03301c1cf8cc1ed
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
(cherry picked from commit 85899ff181984a1310cd1ad10cdb0824f1ca5118)
Reviewed-by: Qt Cherry-pick Bot <cherrypick_bot@qt-project.org>
(cherry picked from commit 5985c90d37a096f35b68546f916bec29a218e112)
2025-02-17 14:39:31 +01:00
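
The field convention described in the Unicode 16.0.0 commit message above ("" for the empty string, a blank segment for null/no string) can be illustrated with a minimal Python sketch. This is not Qt's actual test parser; the names `parse_field` and `parse_line` are hypothetical, and the sketch assumes that fields are separated by `;` and that `#` starts a trailing comment. It does not handle escape sequences or decide what a null field means to the caller.

```python
from typing import List, Optional


def parse_field(raw: str) -> Optional[str]:
    """Decode one field: '""' is the empty string, blank is None (no string)."""
    text = raw.strip()
    if text == '""':
        return ""
    if text == "":
        return None
    return text


def parse_line(line: str) -> List[Optional[str]]:
    """Split one data line into decoded fields.

    Assumption: ';' separates fields and '#' starts a trailing comment.
    Lines holding only whitespace or a comment yield an empty list.
    """
    data = line.split("#", 1)[0]
    if not data.strip():
        return []
    return [parse_field(field) for field in data.split(";")]
```

Returning `None` for a blank segment keeps "no string" distinct from the empty string, which is the distinction the commit message says the parser had to learn.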
Ievgenii Meshcheriakov bfd09ec38c unicode: Import version 15.1 (UCD version 32)
Add enumerator for the new Unicode version to QChar::UnicodeVersion.

Remap new line breaking classes to their Unicode 15.0 values:
* AK, AP and AS to AL,
* VI and VF to CM.
These are classes for new line breaking support for Indic scripts
that require more work.

Blacklist failing tests for now:
* tst_QUrlUts46::idnaTestV2
* tst_QTextBoundaryFinder::lineBoundariesDefault
* tst_QTextBoundaryFinder::graphemeBoundariesDefault

Regenerate the source files.

Task-number: QTBUG-121529
Change-Id: I869cc9fbaa53765d8ae6265c22cdbef9f19d05bf
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2024-02-08 16:43:58 +00:00
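
The class remapping listed in the Unicode 15.1 commit message above amounts to a small lookup table. The following Python sketch only illustrates that mapping; it is not the code of Qt's generator tool, and the names `LINE_BREAK_REMAP` and `remap_line_break_class` are hypothetical.

```python
# New Unicode 15.1 line breaking classes folded back onto the classes the
# commit message names: AK, AP and AS become AL; VI and VF become CM.
LINE_BREAK_REMAP = {
    "AK": "AL",
    "AP": "AL",
    "AS": "AL",
    "VI": "CM",
    "VF": "CM",
}


def remap_line_break_class(name: str) -> str:
    """Return the remapped class, or the class itself if it is not remapped."""
    return LINE_BREAK_REMAP.get(name, name)
```

Classes outside the table pass through unchanged, so only the five new classes are affected.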
Ievgenii Meshcheriakov c4e550703c Update UCD to Revision 30
This corresponds to Unicode version 15.0.0.

Added the following scripts:

    * Kawi
    * Nag Mundari

Full support of these scripts requires harfbuzz version 5.2.0,
which adds support for Unicode 15.0:

    https://github.com/harfbuzz/harfbuzz/releases/tag/5.2.0

Fixes: QTBUG-106810
Change-Id: Ib06c526e49b0f01ef9f21123bcf875c6b19f2601
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2022-10-11 14:10:59 +00:00
Ievgenii Meshcheriakov 826fc8c9bd Update UCD to Revision 28
This corresponds to Unicode version 14.0.0.

Added the following scripts:

    * CyproMinoan
    * OldUyghur
    * Tangsa
    * Toto
    * Vithkuqi

Full support of these scripts requires harfbuzz version 3.0.0,
which adds support for Unicode 14.0:

    https://github.com/harfbuzz/harfbuzz/releases/tag/3.0.0

With this release, 10 test cases in tst_qurluts46 were fixed; one
additional test case is failing in tst_qtextboundaryfinder and
is commented out. In total, 62 line break test cases and 44 word
break test cases are failing.

A comment in src/corelib/text/qt_attribution.json was updated to
include the URL of the page containing the UCD version number.

Fixes: QTBUG-94359
Change-Id: Iefc9ff13f3df279f91cbdb1246d56f75b20ecb35
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
2021-10-18 16:45:10 +00:00
Edward Welbourne 54f8be6cc0 Update UCD to Revision 26
Include WordBreakTest.html, since a test uses sample strings from it,
albeit without actually reading the file.

Had to comment out more of the new tests, as at Revision 24, pending
an update to harfbuzz and the text boundary detection code.

Task-number: QTBUG-79631
Task-number: QTBUG-79418
Task-number: QTBUG-82747
Change-Id: I0082294b09d67ffdc6a9b5c15acf77ad3b86f65f
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
2020-03-14 11:26:59 +01:00
Edward Welbourne c3eb521a0f Update UCD data to Unicode 12.1.0's Revision 24
Had to teach the update program to accept category Lm as for
Joining_Transparent, for the sake of a new ArabicShaping.txt entry.
Added three new Unicode versions, several new scripts and a new
word-break class.

Updated UCD's test data for tst_QTextBoundaryFinder.  This left 57
tests failing; I have commented out the data rows for those tests,
pending someone with more knowledge addressing this.

Task-number: QTBUG-79631
Task-number: QTBUG-79418
Change-Id: Ic33d3b3551195d47a84d98e84020f57a68f0b201
Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@qt.io>
2019-10-30 17:38:02 +01:00
Lars Knoll 8bfabb34de Update most Unicode data to version 10.0
The text segmentation data is not being updated in this change,
as it requires additional code changes. Updating those will
come in a follow-up commit.

Change-Id: I5d6b6bc96044e8dd0c25cf6f79756e7f68bf6e7c
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@qt.io>
2018-01-03 07:46:31 +00:00
Konstantin Ritt a98b541f26 Update Unicode data files to v8.0
Change-Id: I0aa368cb07353924031a9af4f0bdc33692eb1053
Reviewed-by: Lars Knoll <lars.knoll@theqtcompany.com>
2015-11-05 08:24:58 +00:00
Konstantin Ritt ecdd5648bd Update UCD source files to v7.0
Change-Id: I47277963c926128ad0c4ac5141835e767bb440a7
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
2015-03-27 16:39:53 +00:00
Konstantin Ritt a6046be428 Update UCD source files up to Unicode 6.3.0
Change-Id: I9ab58a659af1e758b172a24aa95bce1fea89c33d
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
2014-01-14 15:38:43 +01:00
Konstantin Ritt 2672c4fa91 Update the Unicode Data and Algorithms up to Unicode 6.2
Version 6.2 of the Unicode Standard is a special release
dedicated to the early publication of the newly encoded Turkish lira sign.
In addition, there are some significant changes to the Unicode algorithms
for text segmentation and line breaking to improve breaking for emoji symbols.

For more details, see http://www.unicode.org/versions/Unicode6.2.0/

Change-Id: I21cfd4f307e41b41a19d36cce87f7a44c2661bc2
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
2012-10-09 03:04:41 +02:00
Konstantin Ritt c9100bcce7 Update the Unicode data files up to v6.1.0
Change-Id: I20b94634b1f4ebff10757c2348cfdbbd906e8797
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
2012-06-10 15:57:54 +02:00
Qt by Nokia 38be0d1383 Initial import from the monolithic Qt.
This is the beginning of revision history for this module. If you
want to look at revision history older than this, please refer to the
Qt Git wiki for how to use Git history grafting. At the time of
writing, this wiki is located here:

http://qt.gitorious.org/qt/pages/GitIntroductionWithQt

If you have already performed the grafting and you don't see any
history beyond this commit, try running "git log" with the "--follow"
argument.

Branched from the monolithic repo, Qt master branch, at commit
896db169ea224deb96c59ce8af800d019de63f12
2011-04-27 12:05:43 +02:00