Change-Id: I53eaaea149324d2495e794ba8bd58544e648e48e
Reviewed-by: Janne Koskinen <janne.p.koskinen@qt.io>
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
Add an #include for a header that was only accidentally included
transitively.
Pick-to: 5.15 6.0 6.1
Task-number: QTBUG-92822
Change-Id: Ie29bb0e065f2db712e9cf9539b15124ff0ced349
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
Reviewed-by: Andreas Buhr <andreas.buhr@qt.io>
Reviewed-by: Shawn Rutledge <shawn.rutledge@qt.io>
UAX #29 in Unicode 11 changed the EGC algorithm to its current form.
Although Qt has upgraded the Unicode tables all the way up to
Unicode 13, the algorithm has never been adapted; in other words,
it has been working by chance for years. Luckily, most
of the cases were dealt with correctly, but emoji handling
is actually broken.
This commit:
* Adds parsing of emoji-data.txt to the Unicode table generator.
  This is necessary to extract the Extended_Pictographic property,
  which is used by the EGC algorithm.
* Regenerates the tables.
* Removes some obsolete grapheme cluster break properties and
  adds the ones introduced in the meantime.
* Rewrites the EGC algorithm according to Unicode 13. This is
  done by greatly simplifying the lookup table. Some rules (GB11,
  GB12, GB13) cannot be expressed by the table alone, so some
  hand-rolled code is necessary for them.
* Removes the "edited" version of GraphemeBreakTest that ignored
  some rows (because they were failing). Thanks to these fixes,
  the complete upstream GraphemeBreakTest now passes.
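To illustrate why GB11, GB12 and GB13 need state beyond a pairwise
lookup table, here is a minimal sketch. It is not the code in
qunicodetools.cpp: the Gcb enum, the breaksBefore() function and the
reduced set of classes are assumptions made for the example, and every
pair not covered by GB9, GB11, GB12 or GB13 is simply treated as a break.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, heavily reduced set of grapheme cluster break classes.
// UAX #29 defines more (CR, LF, Control, Hangul syllable types, ...).
enum class Gcb { Other, Extend, ZWJ, ExtPict, RegionalIndicator };

// Returns one flag per element: true if a cluster boundary precedes it.
std::vector<bool> breaksBefore(const std::vector<Gcb> &cls)
{
    std::vector<bool> result(cls.size(), true);
    bool inEmojiSeq = false;    // saw ExtPict Extend* up to the previous element
    bool zwjAfterEmoji = false; // previous element is the ZWJ of ExtPict Extend* ZWJ
    std::size_t riRun = 0;      // consecutive Regional_Indicators just before

    for (std::size_t i = 0; i < cls.size(); ++i) {
        const Gcb cur = cls[i];
        bool brk = true;                                  // GB1 / GB999
        if (i > 0) {
            if (cur == Gcb::Extend || cur == Gcb::ZWJ)
                brk = false;                              // GB9
            else if (cur == Gcb::ExtPict && zwjAfterEmoji)
                brk = false;                              // GB11
            else if (cur == Gcb::RegionalIndicator && riRun % 2 == 1)
                brk = false;                              // GB12, GB13
        }
        result[i] = brk;

        // Update the look-behind state that a pairwise table cannot hold.
        if (cur == Gcb::ExtPict) {
            inEmojiSeq = true;
            zwjAfterEmoji = false;
        } else if (cur == Gcb::Extend && inEmojiSeq) {
            zwjAfterEmoji = false;                        // still inside Extend*
        } else if (cur == Gcb::ZWJ && inEmojiSeq) {
            zwjAfterEmoji = true;
            inEmojiSeq = false;
        } else {
            inEmojiSeq = false;
            zwjAfterEmoji = false;
        }
        riRun = (cur == Gcb::RegionalIndicator) ? riRun + 1 : 0;
    }
    return result;
}
```

In this sketch, four consecutive Regional_Indicators form two clusters,
and ExtPict ZWJ ExtPict stays a single cluster, whereas ZWJ ExtPict
without a preceding pictograph does not join.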
Change-Id: Iaa07cb2e6d0ab9deac28397f46d9af189d2edf8b
Pick-to: 6.1 6.0 5.15
Fixes: QTBUG-92822
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
Other affected rows have also been fixed.
Change-Id: Ie0a32f724bd2e40e7bfacfaa43a78190b58e4a21
Reviewed-by: Volker Hilsheimer <volker.hilsheimer@qt.io>
Reviewed-by: Edward Welbourne <edward.welbourne@qt.io>
Make QTextBoundaryFinder (QTBF) ready for Qt 6 by using qsizetype in
the API and QStringView where it makes sense.
Change the exported API of qunicodetools to use QStringView as
well, and use char16_t internally.
Change-Id: I853537bcabf40546a8e60fdf2ee7d751bc371761
Reviewed-by: Mårten Nordheim <marten.nordheim@qt.io>
src/corelib/text/qunicodetools.cpp:1243:13: warning: this statement may fall through [-Wimplicit-fallthrough=]
src/corelib/text/qunicodetools.cpp:1247:55: warning: this statement may fall through [-Wimplicit-fallthrough=]
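For reference, a minimal sketch of how an intentional fall-through can
be annotated so that -Wimplicit-fallthrough stays quiet. The Kind enum
and weight() function are hypothetical and are not the code at the
lines quoted above. Standard C++17 [[fallthrough]] is used here; Qt
code would typically use the Q_FALLTHROUGH() macro instead.

```cpp
// Hypothetical example; not the switch from qunicodetools.cpp.
enum class Kind { Digit, Letter, Mark };

int weight(Kind k)
{
    int w = 0;
    switch (k) {
    case Kind::Digit:
        w += 1;
        [[fallthrough]]; // intentional: a digit also gets the letter weight
    case Kind::Letter:
        w += 10;
        break;
    case Kind::Mark:
        w += 100;
        break;
    }
    return w;
}
```

Where the fall-through is not intended, the fix is a plain break
rather than an annotation.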
Change-Id: I441000db46cb6d85a5dcd0534ea2168b39a3f3bd
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
This makes existing calls passing uint or ushort ambiguous, so
fix all the callers. There do not appear to be callers outside
QtBase. In fact, the ...BreakClass() functions appear to be
utterly unused.
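A minimal sketch of the kind of ambiguity described above, assuming an
overload pair on char32_t and char16_t. The function names are
hypothetical and the overload set is an assumption, not taken from the
actual change.

```cpp
// Hypothetical overload pair standing in for the changed functions.
inline int codeUnitWidth(char32_t) noexcept { return 32; }
inline int codeUnitWidth(char16_t) noexcept { return 16; }

int widthOfUcs4(unsigned int ucs4)
{
    // codeUnitWidth(ucs4) would not compile: converting unsigned int to
    // char32_t or to char16_t are both integral conversions of equal rank,
    // so overload resolution is ambiguous. An explicit cast selects one.
    return codeUnitWidth(char32_t(ucs4));
}

int widthOfUtf16(unsigned short unit)
{
    // The same ambiguity applies to unsigned short arguments.
    return codeUnitWidth(char16_t(unit));
}
```

Callers therefore have to state which character type they mean, either
with a cast as here or by changing the type of the variable they pass.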
Change-Id: I1c2251920beba48d4909650bc1d501375c6a3ecf
Reviewed-by: Lars Knoll <lars.knoll@qt.io>
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
Copy the relevant code over from HarfBuzz into qunicodetools.cpp.
These are basically the attribute functions from the different HarfBuzz
shapers. Those functions do not require any font support but operate
purely on Unicode input data.
Adjusted the code to use Qt's own data structures and enums (QChar::Script
and friends) instead of the harfbuzz equivalents.
The code is 100% copyright The Qt Company, so we can do this without
requiring any attribution.
Change-Id: I8262ba34eae1837f031f07d1b6d9917c0224e160
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
This avoids one additional copy of data that we've been doing before.
Change-Id: I3fae0ebe0cded632b41fdcf7efc01d5c7f2dc181
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
Had to teach the update program to accept category Lm for
Joining_Transparent, for the sake of a new ArabicShaping.txt entry.
Added three new Unicode versions, several new scripts and a new
word-break class.
Updated UCD's test data for tst_QTextBoundaryFinder. This left 57
tests failing; I have commented out the data rows for those tests,
pending someone with more knowledge addressing this.
Task-number: QTBUG-79631
Task-number: QTBUG-79418
Change-Id: Ic33d3b3551195d47a84d98e84020f57a68f0b201
Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@qt.io>