- Jan 29, 2016
Jungshik Shin authored
* Update the pre-built ICU data files for all platforms
    source/data/in/icudtl.dat for non-Android platforms
    {linux,mac}/icudt*.S for linux/mac
    android/icudtl.dat and android/icudt*.S for Android
    windows/icudt.dll for Windows
* Update Android data trimming script
  1. Make sure that the 'default' calendar is kept in locales where it's relevant: root, th, fa, ar_SA, etc.
  2. Add minimal region data to work around a bug in ICU with pool.res handling
* Update gn and gyp files
* And add a TODO comment to update.sh to automate the build file update.
* Add it_CH to the locale list.
* Add sr_Latn to unit/reslocal.mk (required by sh) and line_normal_fi to brkitr/brklocal.mk (referred to in brkitr/fi.txt) in place of line_fi.
* Update and add scripts for data building
* Completely rewrite README.chromium
* Check in the prebuilt ICU data files/assembly sources for Linux, Mac, Windows, Chrome OS and Android.

BUG=575007
TEST=Blink layout tests, webkit unittests
TEST=All bots can build successfully
TEST=net_unittests --gtest_filter="*ilenameUtil*"
TEST=net_unittests --gtest_filter="*IDN*" (pending bug 336973)
TEST=base_unittests --gtest_filter="*Conv*"
TEST=browser_tests --gtest_filter="*ncoding*"
TEST=base_unittests --gtest_filter="*essage*"
TEST=ui_base_unittests --gtest_filter="*ormat*"
TEST=ui_base_unittests --gtest_filter="L10n*"
R=mark@chromium.org

Review URL: https://codereview.chromium.org/1639543006 .
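The TEST= lines in this commit pass wildcard patterns to --gtest_filter. As a rough illustration of how such a pattern selects tests, here is a minimal Python sketch of Google Test's documented filter rules ('*' matches any string, '?' matches one character, ':' separates patterns, and patterns after the first '-' are exclusions). The function names and the test names in the usage lines are hypothetical, chosen only for illustration; this is not Google Test's own implementation.

```python
import re


def _glob_to_regex(pattern):
    """Translate a gtest-style glob ('*' and '?' only) into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def _matches_any(full_name, patterns):
    """True if full_name fully matches at least one ':'-separated pattern."""
    return any(
        _glob_to_regex(p).fullmatch(full_name) for p in patterns.split(":") if p
    )


def matches_filter(full_name, gtest_filter):
    """Approximate --gtest_filter matching for a 'Suite.TestName' string."""
    positive, _, negative = gtest_filter.partition("-")
    if not positive:
        positive = "*"  # an empty positive part means "everything"
    if not _matches_any(full_name, positive):
        return False
    return not (negative and _matches_any(full_name, negative))


# Hypothetical test names, for illustration only.
print(matches_filter("FilenameUtilTest.FileURLConversion", "*ilenameUtil*"))
print(matches_filter("L10nUtilTest.GetString", "L10n*"))
print(matches_filter("UrlUtilTest.Parse", "*ilenameUtil*"))
```

Matching is case-sensitive, so a pattern such as "*ilenameUtil*" would select names containing either "FilenameUtil" or "filenameUtil".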
- Jan 21, 2015
Jungshik Shin (jungshik at google) authored
A. Converter update per HTML encoding spec along with changes in the encoding name alias table.
B. Remove all the code for converters Blink and Chromium do not need (SCSU, Lotus, ISO-2022-xx other than JP, BOCU, UTF-7, etc).

This is reapplying the following CLs (that we used for ICU 52.1) to ICU 54.1:
  https://codereview.chromium.org/598383002
  https://codereview.chromium.org/654153002

We have two upstream bugs filed for A and B above:
  http://www.icu-project.org/trac/ticket/11296
  http://www.icu-project.org/trac/ticket/10303

In addition to A and B, we unified Big5 and Big5-HKSCS per the encoding spec (bug 277868). That also includes properly supporting the four 2-character sequences (see http://crbug.com/277868#c3). big5_gen.sh deviates from the current spec to work around a bug in the spec (see https://www.w3.org/Bugs/Public/show_bug.cgi?id=27878).

Moreover, ucmlocal.mk is added to list only the encodings we want to support. Also, tighten the state table for windows-949-2000.ucm, which we use for EUC-KR for now. And, drop the 'base' map for windows-{936,949}-2000.ucm.

Finally, add euc-kr-html.ucm along with scripts/euckr_gen.sh, but it is not yet used pending the resolution of bug 450312.

Data size checkpoint: 20,566,864 bytes (the original ICU 54=25,343,024)

BUG=277868, 428145, 450312
TEST=net_unittests --gtest_filter="*ilenameUtil*"
TEST=base_unittests --gtest_filter="*Conv*"
TEST=browser_tests --gtest_filter="*ncoding*"
TEST=Blink: fast/encoding/*
R=jsbell@chromium.org, mark@chromium.org

Review URL: https://codereview.chromium.org/839713003
- Oct 13, 2014
jshin@chromium.org authored
1. Replace the current encoding alias list (heavily patched) with our own HTML5-specific alias list. It's mostly generated from encoding.json, which is in turn derived from the WHATWG Encoding living standard. The most notable difference is that UTF-32 entries are kept until bug 417850 is resolved. Two other differences are:
   a. Two aliases for iso-8859-8-i (logical and csiso88598i) are not listed. They're dealt with in Blink.
   b. Chinese (gb*, big5*) aliases are not yet aligned to the encoding spec pending our decision on the unification of Big5 / Big5-HKSCS and GBK / GB18030.

2. Replace all the single-byte mapping tables with what's automatically generated by scripts/single-byte-gen.sh, which uses index-* files downloaded from the WHATWG spec site. This will fix the decoding (ToUnicode) of windows-874 and windows-1253 while removing a lot of fallback/spurious mapping entries in the encoding direction ('FromUnicode') in a number of encodings.

3. Regenerate the ICU binary data files for Linux/Mac/Android/Windows/CrOS.

4. Remove the now obsolete noop-*ucm files used to make the ISO-2022-CN* decoders return an empty string. They're not necessary any more because ISO-2022-CN* were made 'replacement' encodings in Blink and our version of ICU does not have any code for ISO-2022-CN* any more.

This cuts down the data size by 15kB. On Android, there's virtually no change in the data size because the previous data file on Android accidentally had smaller locale data for nb and ms.

BUG=412053
TEST=browser_tests --gtest_filter="*ncoding*"
TEST=net_unittests --gtest_filter="*ilenameUtil*"
TEST=base_unittests --gtest_filter="*Conv*"
TEST=Blink: fast/encoding/*
TEST=http://www.w3.org/International/tests/repository/encoding/indexes/results-indexes
TEST=http://www.w3.org/International/tests/repository/encoding/indexes/results-aliases
TEST=http://www.w3.org/International/tests/repository/run?manifest=encoding/indexes&test=windows-1253_test
TEST=http://www.w3.org/International/tests/repository/run?manifest=encoding/indexes&test=windows-874_test
R=jsbell@chromium.org

Review URL: https://codereview.chromium.org/598383002

git-svn-id: http://src.chromium.org/svn/trunk/deps/third_party/icu52@292447 4ff67af0-8c30-449e-8e8b-ad334ec8d88c
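Item 2 of this commit describes generating single-byte mapping tables from the WHATWG index-* files. The following is a minimal Python sketch of that kind of conversion, not the contents of scripts/single-byte-gen.sh. It assumes the published index-file layout (a pointer, a 0x-prefixed code point, then a comment, with '#' lines ignored) and the Encoding standard's rule that a single-byte pointer p corresponds to byte 0x80 + p. The function name and the sample rows are illustrative assumptions.

```python
def index_to_ucm_lines(index_text):
    """Convert WHATWG single-byte index rows to ICU .ucm roundtrip lines.

    Only the upper half (bytes 0x80-0xFF) is covered; a complete .ucm
    file would also need a header and the ASCII range.
    """
    ucm_lines = []
    for raw in index_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(None, 2)
        pointer = int(fields[0])          # decimal index into the table
        code_point = int(fields[1], 16)   # accepts the "0x" prefix
        byte = 0x80 + pointer             # single-byte pointer -> byte value
        # "|0" marks a roundtrip mapping in the .ucm format.
        ucm_lines.append(f"<U{code_point:04X}> \\x{byte:02X} |0")
    return ucm_lines


# Illustrative rows in the index-file layout (not a full index).
sample = """\
# comment lines are skipped
  0\t0x20AC\t\u20ac (EURO SIGN)
  1\t0x0081\t (<control>)
  5\t0x2026\t\u2026 (HORIZONTAL ELLIPSIS)
"""

for ucm_line in index_to_ucm_lines(sample):
    print(ucm_line)
```

Because every emitted line is a roundtrip mapping taken directly from the index, a table built this way carries no extra fallback entries in the FromUnicode direction.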