Commit Graph

24 Commits (c83c08d84fe00a4033d7ba61fd95c150f643736d)

Author SHA1 Message Date
Konstantin Ritt 2e0a4b13ad [2/2] Implement Unicode Normalization Form Quick Check (NF QC)
Use QuickCheck data from DerivedNormalizationProps.txt to check
if the input text is already in the desired Normalization Form.
\sa http://www.unicode.org/reports/tr15/#Detecting_Normalization_Forms

Using NF QC makes a significant boost to most operations that rely on
normalized input data, i.e. file path conversions on Mac, where "native"
form is a decomposed Unicode string.

Change-Id: I292a9da479c6beed730528fc7000c45bf1befc34
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
2013-08-14 22:46:58 +02:00
Frederik Gladhorn c608ec8254 Merge remote-tracking branch 'origin/stable' into dev
Conflicts:
	src/corelib/io/qsavefile_p.h
	src/corelib/tools/qregularexpression.cpp
	src/gui/util/qvalidator.cpp
	src/gui/util/qvalidator.h

Change-Id: I58fdf0358bd86e2fad5d9ad0556f3d3f1f535825
2013-01-22 18:40:13 +01:00
Sergio Ahumada 48e0c4df23 Update copyright year in Digia's license headers
Change-Id: Ic804938fc352291d011800d21e549c10acac66fb
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
2013-01-18 09:07:35 +01:00
Konstantin Ritt 9b0fab6b62 Update Qt internals to use QChar::Script
...and remove the outdated QUnicodeTables::Script enum.
QFontEngineData now has one extra slot that never used
(engines[QChar::Script_Inherited]). engines[QChar::Script_Unknown],
if accessed, would be set with a Box engine instance, and could be used
as a minor optimization some time later.

In order to preserve the existing behavior, we map all scripts up to Latin to Common.

Change-Id: Ide4182a0f8447b4bf25713ecc3fe8097b8fed040
Reviewed-by: Pierre Rossi <pierre.rossi@gmail.com>
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
2012-12-21 19:01:35 +01:00
Konstantin Ritt f7639c0a6d Add QChar::Script enum
...where the values are not aliased to Common script.

The old QUnicodeTables::Script enum was retained for compatibility reasons
until Qt internals are updated to use QChar::script().
Using QChar::Script instead of QUnicodeTables::Script would improve both
the text analysis (itemization, boundary finding) and the text shaping quality.
This also a required step for switching to Hurfbuzz-NG.

/* This adds 6668 more .rodata bytes */

Change-Id: I5aa3d12c550528d0052542436990f8d0779ea8e5
Reviewed-by: Eskil Abrahamsen Blomfeldt <eskil.abrahamsen-blomfeldt@digia.com>
2012-12-20 14:48:32 +01:00
Konstantin Ritt 2fbb69a093 QTBF: Fix issue with no splitting the words at "." (FULL STOP)
As of Unicode 5.1, some punctuation marks were mapped to MidLetter and MidNumLet
for better URL and abbreviations handling which caused "hi.there" to be treated
like if it were just a single word;
until we have the Unicode Text Segmentation tailoring mechanism, retain
the old behavior by remapping (some of) those characters back to their old values.

Change-Id: I49dea6064f2ea40a82fc0b1bc3c4f0b4e803919f
Reviewed-by: David Faure <david.faure@kdab.com>
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
2012-11-23 11:59:50 +01:00
Konstantin Ritt 2672c4fa91 Update the Unicode Data and Algorithms up to Unicode 6.2
Version 6.2 of the Unicode Standard is a special release
dedicated to the early publication of the newly encoded Turkish lira sign.
In addition, there are some significant changes to the Unicode algorithms
for text segmentation and line breaking to improve breaking for emoji symbols.

For more details, see http://www.unicode.org/versions/Unicode6.2.0/

Change-Id: I21cfd4f307e41b41a19d36cce87f7a44c2661bc2
Reviewed-by: Thiago Macieira <thiago.macieira@intel.com>
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
2012-10-09 03:04:41 +02:00
Iikka Eklund be15856f61 Change copyrights from Nokia to Digia
Change copyrights and license headers from Nokia to Digia

Change-Id: If1cc974286d29fd01ec6c19dd4719a67f4c3f00e
Reviewed-by: Lars Knoll <lars.knoll@digia.com>
Reviewed-by: Sergio Ahumada <sergio.ahumada@digia.com>
2012-09-22 19:20:11 +02:00
Konstantin Ritt b57e2162ef QUnicodeTables: some internal API renamings
enums GraphemeBreak, WordBreak, and SentenceBreak has been renamed to
GraphemeBreakClass, WordBreakClass, and SentenceBreakClass respectively,
their values has been renamed to contain a '_' as logical enum-value separator
(just like many other nums in Qt, e.g. LineBreakClass);
*BreakFormat has been replaced with *Break_Extend (some format characters are
kind of subtype of the extender characters, not vice versa).

Change-Id: I9ddbcf8848da87409736c2d6d1798a62fa28cab8
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
2012-06-22 09:47:59 +02:00
Konstantin Ritt f00012aa89 Regenerate the Unicode tables
Change-Id: I64b93ba8ec85eff5e308d92c57e98e8745c43d66
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
2012-06-14 05:22:15 +02:00
Konstantin Ritt c631927b76 Regenerate the Unicode tables with UCD 6.1.0
Task-number: QTBUG-1963
Task-number: QTBUG-5472
Task-number: QTBUG-12144
Task-number: QTBUG-18360
Task-number: QTBUG-23654

Change-Id: Ida09ad657c4b012eca654fcb79608b7cdeb5d60d
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
2012-06-10 15:58:17 +02:00
Konstantin Ritt 2b15c1b30f Move ScriptSentinel enum from header to .cpp
Change-Id: Ic74e8e2471e92aa2014735f6ab0bb4f3b88de206
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
2012-05-25 21:48:44 +02:00
Konstantin Ritt 8c0048a377 add some useful methods to QUnicodeTables::
in order to reduce code duplication and prepare the ground for upcoming changes

Change-Id: I980244149f65384c9484bbec4682de8b7b848b08
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
2012-05-10 11:34:25 +02:00
Konstantin Ritt 46b78113b2 add support for non-BMP ligatures
> http://www.unicode.org/versions/Unicode5.2.0/
D. Character Additions:
There are three new characters in the newly-encoded Kaithi script that will
require changes in implementations which make hard-coded assumptions about
composition during normalization. Most new characters added to the standard
with decompositions cannot be generated by the operations toNFC() or toNFKC),
but these three can. Implementers should check their code carefully
to ensure that it handles these three characters correctly.
U+1109A KAITHI LETTER DDDHA
U+1109C KAITHI LETTER RHA
U+110AB KAITHI LETTER VA

UCD 6.1 adds two more of them:
U+1112E CHAKMA VOWEL SIGN O
U+1112F CHAKMA VOWEL SIGN AU

Change-Id: I781a26848078d8b83a182b0fd4e681be2a6d9a27
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
2012-05-04 15:24:52 +02:00
Konstantin Ritt f948bb3c6c qunicodetables generator: improve the output and the generated code
better memory usage report;
an additional asserts with conditions the implementation is depends on;
a namespace for the internal static data;
styling fixes

Change-Id: Id4048ff6104c56b5f590f9ac6fbf7c0bce79ec47
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
Reviewed-by: Konstantin Ritt <ritt.ks@gmail.com>
2012-04-24 21:45:00 +02:00
Konstantin Ritt 73b24486ed UCD-5.0: apply Corrigendum #6
http://unicode.org/versions/corrigendum6.html:
> in Unicode 5.0, the list of characters with the Bidi_Mirrored property
> was made consistent for brackets and quotation marks, in preparation for
> new constraints on bidi mirroring. However, after publication of
> Unicode 5.0.0 it was discovered that this change adversely affected
> several quotation mark characters in deployed data.

Task-number: QTBUG-25169
Change-Id: Id49caf401af2d5a1e6dbcc32b2f350aa20b7f901
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
2012-04-15 03:53:07 +02:00
Konstantin Ritt 3b778df102 minor improvement for NormalizationCorrections
let's don't hardcode the latests affected version value and simply use
the one parsed from NormalizationCorrections.txt

Change-Id: I37021e8238d77deada4c5ba7a2d160c87186b9dd
Reviewed-by: Lars Knoll <lars.knoll@nokia.com>
2012-04-11 01:42:12 +02:00
Oswald Buddenhagen 249489d141 regenerate unicode tables after rittk's patches
Change-Id: I60b416fc2dc2f0ccbcf13288a9ba2a42547269ec
Reviewed-by: Olivier Goffart <ogoffart@woboq.com>
2012-02-21 22:31:00 +01:00
Jason McDonald 5635823e17 Remove "All rights reserved" line from license headers.
As in the past, to avoid rewriting various autotests that contain
line-number information, an extra blank line has been inserted at the
end of the license text to ensure that this commit does not change the
total number of lines in the license header.

Change-Id: I311e001373776812699d6efc045b5f742890c689
Reviewed-by: Rohan McGovern <rohan.mcgovern@nokia.com>
2012-01-30 03:54:59 +01:00
Jason McDonald 629d6eda5c Update contact information in license headers.
Replace Nokia contact email address with Qt Project website.

Change-Id: I431bbbf76d7c27d8b502f87947675c116994c415
Reviewed-by: Rohan McGovern <rohan.mcgovern@nokia.com>
2012-01-23 04:04:33 +01:00
Jason McDonald 1fdfc2abfe Update copyright year in license headers.
Change-Id: I02f2c620296fcd91d4967d58767ea33fc4e1e7dc
Reviewed-by: Rohan McGovern <rohan.mcgovern@nokia.com>
2012-01-05 06:36:56 +01:00
Olivier Goffart 4bb5167177 Regenerate unicode tables
Required after d17c76feee

Change-Id: I6820b5cbb78636aa5348bdddf8824ea5c78ba934
Reviewed-on: http://codereview.qt.nokia.com/1673
Reviewed-by: Qt Sanity Bot <qt_sanity_bot@ovi.com>
Reviewed-by: Olivier Goffart <olivier.goffart@nokia.com>
2011-07-15 09:49:42 +02:00
Jyri Tahtela f9f395c28b Update licenseheader text in source files for qtbase Qt module
Updated version of LGPL and FDL licenseheaders.
Apply release phase licenseheaders for all source files.

Reviewed-by: Trust Me
2011-05-24 12:34:08 +03:00
Qt by Nokia 38be0d1383 Initial import from the monolithic Qt.
This is the beginning of revision history for this module. If you
want to look at revision history older than this, please refer to the
Qt Git wiki for how to use Git history grafting. At the time of
writing, this wiki is located here:

http://qt.gitorious.org/qt/pages/GitIntroductionWithQt

If you have already performed the grafting and you don't see any
history beyond this commit, try running "git log" with the "--follow"
argument.

Branched from the monolithic repo, Qt master branch, at commit
896db169ea224deb96c59ce8af800d019de63f12
2011-04-27 12:05:43 +02:00