- Apr 11, 2017
-
-
Jungshik Shin authored
Delete empty units, units{Narrow,Short} blocks after trimming units data.

Empty units* blocks left in en_GB and a few other locales after trimming cause ICU to fail to fall back when looking up the duration data for those locales.

In addition, fix source/data/translit/root_subset.txt. The Rule*Ids block has to be present even if it is empty. When the Hans-Hant transform rules were dropped, root_subset.txt was changed to be completely empty, which broke "components_unittests --gtest_filter=AutofillProfileComparato*".

With these changes, regenerate the ICU data files. The size is slightly smaller:

android/icudtl.dat 6573872 => 6573792
common/icudt*dat 10130560 => 10130480

BUG=707515,677043,684609
TEST=components_unittests --gtest_filter=AutofillProfileComparato*
TEST=ui_base_unittests --gtest_filter=L10nUtilTest.TimeDurationForm*
R=derat@chromium.org

Review-Url: https://codereview.chromium.org/2812943003 .
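The empty-block cleanup described in this commit can be sketched as follows. This is only an illustration in Python, not the code in scripts/trim_data.sh (which is a shell script); the helper name `drop_empty_units_blocks` and the sample resource text are made up for the example, and the sketch assumes each empty block has its closing brace on a following line with only whitespace in between.

```python
import re

# Hypothetical helper: matches a units, unitsNarrow or unitsShort block whose
# body contains nothing but whitespace, e.g.
#     units{
#     }
EMPTY_UNITS_BLOCK = re.compile(
    r"^[ \t]*units(?:Narrow|Short)?\{\s*\}[ \t]*\n", re.MULTILINE
)


def drop_empty_units_blocks(text):
    """Return the resource text with empty units* blocks removed."""
    return EMPTY_UNITS_BLOCK.sub("", text)


# Made-up sample data shaped like an ICU locale resource text file.
sample = """en_GB{
    units{
    }
    unitsNarrow{
    }
    unitsShort{
        duration{
            hour{
                one{"{0} hr"}
            }
        }
    }
}
"""

cleaned = drop_empty_units_blocks(sample)
print(cleaned)
```

Once the empty blocks are gone, the locale no longer defines units or unitsNarrow at all, so a lookup can fall back to the parent locale instead of stopping at an empty table.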
-
- Oct 28, 2016
-
-
Jungshik Shin authored
Follow-up to https://chromium.googlesource.com/chromium/deps/icu/+/5feb9ad5 (due to a Rietveld issue, part 1 was manually pushed).

Update ICU to 58.1 release from ICU 56.1, part 2.

Listed below is a tiny subset of what's new in 58.1:

1. Unicode 9.0 from Unicode 8.0
   - Updated character properties including Emoji data up to 4.0beta.
   - Updated grapheme/word/line breaking rules for Emoji sequences and others.
2. CLDR 30.0.2 from CLDR 28
   - Numerous locale data updates/improvements
3. Spoofing API changes
4. Greek uppercasing support as a part of regular case-mapping API.
5. Line breaking rule file format optimization. This change enables me to add the CJ loose line breaking rules back (previously, they were dropped to save space) so that Blink can use them for CJ.

See http://site.icu-project.org/download/58 for more details on ICU 58.1 and http://site.icu-project.org/download/57 for more details on ICU 57.1. For CLDR 30, see http://cldr.unicode.org/index/downloads/cldr-30 .

The size impact:

Non-Android: 10,127,200 => 10,128,624 (delta = 1,424 / 0.014%)
Android: 6,563,152 => 6,571,936 (delta = 8,784 / 0.13%)

Below is the list of changes made on top of the upstream ICU 58.1, in reverse order. Most of these changes were made in the 58staging branch to run trybots and cherry-picked back for this CL. See https://chromium.googlesource.com/chromium/deps/icu/+/log/chromium/58staging

https://codereview.chromium.org/2447513002/ : cr+blink update cl with 58staging branch head.

* Fix a build on Win without std::string (v8)
* Add ms932 alias to Shift_JIS
* Apply Google-specific locale data patches
* Fix a bug in scriptset
* Update windows-1255 mapping
* Disable C4333 warning by MSVC (harmless)
* Apply and update utf32.patch and README.chromium
* Update and apply vscomp.patch
  stringpiece patch removed. VS2015 seems to be fine with a redefinition.
* Update pre-built ICU data files
  Update *local.mk with a new copyright line
* Apply more patches
  The following patches were applied and updated: data_symb, vscomp, wpo
  The unnecessary part was dropped from vscomp
* Update BUILD.gn and icu.gyp* files
* Update android/brkitr.patch
* Update and apply more patches
* Update and apply cjdict.patch
  Apply data.build.patch
* Delete obsolete patches: cmemory, regex
* Update README.chromium and apply brkitr patches
  - Update README.chromium
  - Remove obsolete patches
  - Update linebrk.patch and apply it: add back line_loose_cj
* Update wordbrk.patch and apply it
* Update and apply khmer-dictbe.patch
* Update data trimming
  - android/patch_locale.sh
  - scripts/trim_data.sh
    ExemplarCh* removed
    charac*Label removed
    relative/relativeTime removed for daysOfWeek and quarter
* Update the following patches
  android/brkitr.patch
  patches/linebrk.patch
  patches/data.build.patch
* Update cjdict.patch and linebrk.patch

BUG=637001
TEST=Layout tests, all unittests, browser tests, ui tests.
R=jsbell@chromium.org, mark@chromium.org

Review URL: https://codereview.chromium.org/2442923002 .
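The size-impact figures quoted in this commit message can be rechecked with a few lines of arithmetic. A minimal sketch in Python; the function name `size_delta` is made up for the example, and the byte counts are the ones given in the message.

```python
def size_delta(before, after):
    """Return (absolute delta in bytes, delta as a percentage of `before`)."""
    delta = after - before
    return delta, 100.0 * delta / before


# Byte counts taken from the commit message.
non_android = size_delta(10_127_200, 10_128_624)
android = size_delta(6_563_152, 6_571_936)

print("Non-Android: +%d bytes (%.3f%%)" % non_android)
print("Android:     +%d bytes (%.2f%%)" % android)
```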
-
- Jan 29, 2016
-
-
Jungshik Shin authored
Make the tree ready for the application of Google's and Chrome's data and post-56 code patches.

1. Fix trim_data.sh to run from anywhere.
2. Update patch_locale.sh for Android and add en_IN to the locale list
3. Apply data.build.patch
4. Exclude non-UI locale data for unit locale category
5. Add some regional variant locales to locale, unit, zone and coll.
6. Update locale lists for locale, unit, zone, and coll

BUG=575007
TEST=None
R=mark@chromium.org

Review URL: https://codereview.chromium.org/1624643003 .
-
- Feb 19, 2015
-
-
Jungshik Shin (jungshik at google) authored
data/lang/en_GB.txt has an empty "Languages" block, which causes getDisplay{Name,Language} to fail in en-GB.

Update trim_data.sh to remove an empty "Languages" block, and run the script to fix data/lang/en_GB.txt and other locales, if any (only en_GB.txt is affected).

Rebuild the icu data with the above changes for both Android and non-Android platforms.

BUG=428145
TEST=linux_chromeos bots: browser_tests --gtest_filter=*GetUILang*
TBR=mark@chromium.org

Review URL: https://codereview.chromium.org/930203004
-
- Jan 23, 2015
-
-
Jungshik Shin (jungshik at google) authored
1. Add {coll,curr,lang,locales,rbnf,region,sprep,translit,unit,zone}/*local.mk to exclude locale data for languages/locales that Chromium does not need.
2. Run scripts/trim_data.sh to cut down the data size further by excluding unused entries in each locale file.
   - Keep the display names for languages/scripts/locales in Chrome's Accept-Language list and remove the display names outside the set.
   - Minimize the locale data in data/{locales,lang} for non-UI languages in the A-L list. For them, we just need the "native" display name and exemplar character set.
   - Exclude historic, obscure and otherwise unnecessary currency display names.
   - Drop unnecessary Chinese collation rules: Big5/GB2312/UniHan.
   - Keep only the minimal unit data for duration and compound units.
3. Add css3transform.txt to data/translit for Greek upper/lowercasing support.
4. Add the minimal locale data for ckb and ku.
5. The tz db was updated previously to 2014j (the latest), so no change is made except for the README.chromium update.
6. Check in the pre-built data (icudtl.dat) shared by all non-Android platforms and assembly files for Linux/Mac.

The final data size is 10,255,584 bytes, which is about 200kB smaller than that for ICU 52.1. The pristine upstream ICU has the data of 25,343,024 bytes.

The remaining steps are to build a smaller data file for Android and to build icudtl.dll for Windows (non-default build option).

BUG=428145
TEST=net_unittests --gtest_filter="*ilenameUtil*"
TEST=net_unittests --gtest_filter="*IDN*"
TEST=base_unittests --gtest_filter="*Conv*"
TEST=browser_tests --gtest_filter="*ncoding*"
TEST=Blink: layout tests
R=mark@chromium.org

Review URL: https://codereview.chromium.org/872903002
-
- May 05, 2014
-
-
jshin@chromium.org authored
I was too aggressive in trimming the data and dropped the display names for languages that Chromium needs (for non-UI languages that are in the A-L list). That was not my intention (the comment in trim_data.sh said one thing, but the code did another).

Besides, add Norwegian (nb) and Malay (ms) locale data that were left out by mistake.

Also update the trim_data.sh script NOT to drop 'ALIAS' lines, which are used to indicate that a given locale is an alias to another locale. That also required adding ro_MD.txt (null locale which mo.txt is aliased to).

The above three changes add about 110kB to the icu data (from 10.3MB to 10.4MB).

Also update the pre-built icu data files for Linux, Mac and Windows. The Android data will be updated in a follow-up patch.

BUG=132145
TEST=When ICU is rolled, unit_tests:ExtensionL10* pass.
TBR=mark

Review URL: https://codereview.chromium.org/264973016

git-svn-id: http://src.chromium.org/svn/trunk/deps/third_party/icu52@268285 4ff67af0-8c30-449e-8e8b-ad334ec8d88c
-
- Apr 22, 2014
-
-
jshin@chromium.org authored
Add 'filter_locale_data' function to trim_data.sh

Chromium/Blink do not use most of the unit* sections in locale data. Keep only the duration and compound sub-sections.

Update the icudtl.dat and two assembly source files for Mac/Linux.

It saves ~200kB (uncompressed). The 7z-compressed size reduction is 34kB.

With all these changes (up to this CL) applied, the net increase of the ICU data from icu 46 to 52 is 49kB 7z-compressed (3,070,246 vs 3,021,457) and ~390kB uncompressed (10,370,656 vs 9,980,368).

BUG=132145
TEST=None.
TBR=mark

Review URL: https://codereview.chromium.org/247663002

git-svn-id: http://src.chromium.org/svn/trunk/deps/third_party/icu52@265354 4ff67af0-8c30-449e-8e8b-ad334ec8d88c
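The idea of keeping only the duration and compound sub-sections can be sketched as below. This is a simplified model in Python, not the actual 'filter_locale_data' shell function: it assumes the locale data has already been parsed into nested dicts, and the function name `filter_units`, the `KEEP` set and the sample entries are all made up for illustration.

```python
# Sub-sections of units* that the commit message says are kept.
KEEP = {"duration", "compound"}


def filter_units(locale_data):
    """Return a copy of locale_data with units* tables cut down to KEEP."""
    filtered = {}
    for key, value in locale_data.items():
        if key.startswith("units") and isinstance(value, dict):
            # Drop every sub-section except duration and compound.
            value = {name: sub for name, sub in value.items() if name in KEEP}
        filtered[key] = value
    return filtered


# Made-up sample: a dict stand-in for a parsed locale resource.
locale_data = {
    "units": {
        "duration": {"hour": "{0} hours"},
        "length": {"meter": "{0} meters"},
        "compound": {"per": "{0} per {1}"},
    },
    "unitsShort": {
        "duration": {"hour": "{0} hr"},
        "mass": {"gram": "{0} g"},
    },
    "layout": {"characters": "left-to-right"},
}

trimmed = filter_units(locale_data)
print(trimmed)
```

Tables outside units* pass through untouched, which mirrors the message's statement that only the unit* sections are being reduced.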
-
- Apr 18, 2014
-
-
jshin@chromium.org authored
1. {big5,gb2312}han collation data is not used by anybody because they're useless as a sorting order. Add a function to trim_data.sh to remove them from zh.txt
2. Remove remove_unihan.sh and add back unihan rules to coll/{zh,ja,ko}.txt. In ICU 52, tools/genrb does NOT include unihan collation by default so that we don't have to bother to remove it from the rule files.
3. Remove obsolete patch files (locale[23].patch)
4. Add LICENSE file (converted from license.html)
5. Update README.chromium accordingly.
6. Check in the updated data file/assembly files.

The net saving in icudtl.dat is ~220kB.

BUG=132145
TEST=icudtl.dat is 10576480
TBR=mark

Review URL: https://codereview.chromium.org/243763002

git-svn-id: http://src.chromium.org/svn/trunk/deps/third_party/icu52@264857 4ff67af0-8c30-449e-8e8b-ad334ec8d88c
-
jshin@chromium.org authored
Add a shell script to trim the ICU data further: trim_data.sh, along with locale list files. The script does the following:

1. Remove the display names of languages NOT listed in Chrome's Accept-Language list. (800kB)
2. Minimize the locale data for locales listed in the A-L list that are not a UI locale in Chrome. For those locales, exemplar characters, the display name in the native language and layout direction are included. (640kB)
3. Filter the region data to drop numeric region display names other than 419 (Latin-America). (50kB)
4. Filter the currency data (display name and plurals) for historic currencies. (200kB)

This CL also checks in icudtl.dat (source/data/in) and icudt_dat.S (mac and linux). Note that I dropped '52' (the version number) in the assembly source file name and icu.gyp was adjusted accordingly.

With all these changes, icudtl.dat is ~800kB larger than that in ICU 4.6. The 7z compression (as used by the installer) makes the size difference go down to ~130kB.

BUG=132145
TEST=The icudtl.dat (uncompressed) is about 10.7MB instead of 12.4MB without this CL.
R=mark@chromium.org

Review URL: https://codereview.chromium.org/239543018

git-svn-id: http://src.chromium.org/svn/trunk/deps/third_party/icu52@264811 4ff67af0-8c30-449e-8e8b-ad334ec8d88c
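Steps 1 and 3 of the script amount to simple key filters over display-name tables. A minimal sketch in Python, assuming the tables are plain dicts keyed by language or region code; the real trim_data.sh is a shell script that edits ICU resource text files, and the names `ACCEPT_LANGUAGES`, `trim_language_names` and `trim_region_names`, as well as the tiny sample tables, are hypothetical.

```python
# Hypothetical stand-in for Chrome's Accept-Language list.
ACCEPT_LANGUAGES = {"en", "fr", "ko"}


def trim_language_names(names, keep=ACCEPT_LANGUAGES):
    """Step 1: keep display names only for languages in the A-L list."""
    return {code: name for code, name in names.items() if code in keep}


def trim_region_names(names):
    """Step 3: drop numeric region codes, except 419 (Latin America)."""
    return {
        code: name
        for code, name in names.items()
        if not code.isdigit() or code == "419"
    }


# Made-up sample tables.
languages = {"en": "English", "fr": "French", "ko": "Korean", "tlh": "Klingon"}
regions = {"001": "World", "419": "Latin America", "KR": "South Korea"}

trimmed_languages = trim_language_names(languages)
trimmed_regions = trim_region_names(regions)
print(sorted(trimmed_languages))
print(sorted(trimmed_regions))
```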
-