  1. Jul 17, 2017
  2. May 14, 2017
    • Update ICU to 59.1 · 87232d8d
      Jungshik Shin authored
      * Highlights:
        - Emoji 5.0 data (partial; Emoji_Component property not included)
        - CLDR 31.0.1 (http://blog.unicode.org/2017/03/cldr-version-31-released.html;
          UTC and GMT are treated as distinct)
        - New case mapping API for styled text
        - C++ 11 is required
        - char16_t for UChar (UTF-16)
        - Source code is in UTF-8
      
      * Size changes
      
        common: 10,130,560 => 10,175,056
        android: 6,573,872 => 6,616,864
        iOS: 6,562,352 => 6,605,152
      
      On top of ICU 59.1 from the upstream, the following changes were applied.
      See https://chromium.googlesource.com/chromium/deps/icu/+log/chromium/59staging
      
        - Fix C++ 11 string literal assignment issue (upstream bug: 13192)
        - Fix C4229 warning by MSVC
        - Apply utf32.patch and include unistr.h in fuzzer_util
        - Update ICU data files
        - Fix wpo.patch
        - Apply Google locale patch and locale1.patch
        - update readme
        - Apply breakiterator related patches
        - Apply and update wpo.patch
        - Drop unused patch, apply data.build.win.patch, update README.chromium
        - Add /utf-8 flag for Windows/Visual Studio
        - Update BUILD.gn for UChar, stubdata and apply data_sym.patch
        - use stubdata.cpp instead of stubdata.c in icu.gyp
        - Update icu.gyp* files for v8
        - Update BUILD.gn, apply data.build.patch and vscomp.patch
        - Add new files in ICU 59.1
        - Get a fresh copy of ICU 59.1 from the upstream
        - Update update.sh script
      
      TBR=drott@chromium.org, yangguo@chromium.org
      Bug:699469
      TEST: layout tests, all unittests, browser tests
      Change-Id: Ie1e77323aa0c7f872153680c4deca6471a771a5c
      Reviewed-on: https://chromium-review.googlesource.com/505173
      
      
      Reviewed-by: Jungshik Shin <jshin@chromium.org>
      87232d8d
  3. May 05, 2017
    • Customize the ICU data for iOS · 4b06aadd
      Jungshik Shin authored
      Add ios/icudtl.dat and ios/patch_locale.sh.
      
      Update README.chromium and BUILD.gn accordingly.
      Update scripts/copy_data.sh to take (ios|common|android).
      
      At the moment, iOS data is almost identical to that of Android, but in the future
      more cuts may be made (e.g. dictionary data for breakiterator).
      
      Bug: 718955
      TEST: iOS Chrome works as before.
      Review-Url: https://codereview.chromium.org/2743123002 .
      4b06aadd
  4. Apr 11, 2017
    • Update trim_data to deal with locale fallback failure for units · d5c238dc
      Jungshik Shin authored
      Delete empty units,units{Narrow,Short} blocks after trimming units data.
      Empty units* blocks in en_GB and a few other locales after trimming
      cause ICU to fail to fall back to get the duration data for those
      locales.
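
A minimal sketch of the clean-up step described above, written in JavaScript for illustration only. The function name `dropEmptyUnitBlocks` and the sample resource text are assumptions; this is not the logic of scripts/trim_data.sh itself. It removes `units`, `unitsNarrow`, and `unitsShort` blocks that contain nothing but whitespace.

```javascript
// Hypothetical helper, not taken from scripts/trim_data.sh.
// Removes units{}, unitsNarrow{} and unitsShort{} blocks whose body is
// empty (only whitespace between the braces) from ICU resource text.
function dropEmptyUnitBlocks(resourceText) {
  // ^ anchors at a line start (m flag); \s* may span the line break
  // between the opening and the closing brace.
  const emptyBlock = /^[ \t]*units(?:Narrow|Short)?\{\s*\}[ \t]*\r?\n?/gm;
  return resourceText.replace(emptyBlock, "");
}

// Made-up sample in the shape of a trimmed locale file.
const sampleLocale = [
  "en_GB{",
  "    units{",
  "    }",
  "    unitsNarrow{",
  "    }",
  "    unitsShort{",
  "        duration{",
  "            placeholder{\"kept\"}",
  "        }",
  "    }",
  "}",
  "",
].join("\n");

const cleanedLocale = dropEmptyUnitBlocks(sampleLocale);
console.log(cleanedLocale);
```

Blocks that still hold data, such as `unitsShort` in the sample, are left untouched because the pattern only matches braces with whitespace between them.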
      
      In addition, fix source/data/translit/root_subset.txt. Rule*Ids block has
      to be present even though it's empty. When dropping Hans-Hant transform
      rules, root_subset.txt was changed to be completely empty, which broke
      "components_unittests --gtest_filter=AutofillProfileComparato*" .
      
      With these changes, regenerate ICU data files. The size is slightly smaller.
      
      android/icudtl.dat  6573872 => 6573792
      common/icudt*dat    10130560 => 10130480
      
      BUG=707515,677043,684609
      TEST=components_unittests --gtest_filter=AutofillProfileComparato*
      TEST=ui_base_unittests --gtest_filter=L10nUtilTest.TimeDurationForm*
      R=derat@chromium.org
      
      Review-Url: https://codereview.chromium.org/2812943003 .
      d5c238dc
  5. Mar 07, 2017
  6. Feb 21, 2017
  7. Oct 28, 2016
    • ICU update to 58 part 2 · e0d9b90c
      Jungshik Shin authored
      Follow-up to https://chromium.googlesource.com/chromium/deps/icu/+/5feb9ad5
      (due to a rietveld issue, part 1 was manually pushed).
      
      Update ICU to 58.1 release from ICU 56.1 part2.
      
      Listed below is a tiny subset of what's new in 58.1:
      
        1. Unicode 9.0 from Unicode 8.0
          - Updated character properties including Emoji data up to 4.0beta.
          - Updated grapheme/word/line breaking rules for Emoji sequences and others.
      
        2. CLDR 30.0.2 from CLDR 28
          - Numerous locale data updates/improvements
      
        3. Spoofing API changes
        4. Greek uppercasing support as a part of regular case-mapping API.
        5. Line breaking rule file format optimization. This change enables me
           to add CJ loose line breaking rules back (previously, they were
           dropped to save space) so that Blink can use them for CJ.
      
      See http://site.icu-project.org/download/58 for more details on ICU 58.1
      and http://site.icu-project.org/download/57 for more details on ICU 57.1
      
      For CLDR 30, see http://cldr.unicode.org/index/downloads/cldr-30 .
      
      The size impact:
         Non-Android: 10,127,200 => 10,128,624 (delta = 1,424 / 0.014%)
         Android: 6,563,152 => 6,571,936 (delta = 8,784 / 0.13%)
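
The deltas and percentages above can be rechecked with a few lines of JavaScript. This is only an illustrative sketch; `sizeDelta` is a hypothetical helper name, and the byte counts are the ones quoted in this message.

```javascript
// Hypothetical helper: absolute and relative growth of a data file.
function sizeDelta(beforeBytes, afterBytes) {
  const delta = afterBytes - beforeBytes;
  const percent = (delta / beforeBytes) * 100;
  return { delta, percent };
}

// Byte counts quoted in the commit message above.
const nonAndroidDelta = sizeDelta(10127200, 10128624);
const androidDelta = sizeDelta(6563152, 6571936);

console.log(nonAndroidDelta.delta, nonAndroidDelta.percent.toFixed(3) + "%");
console.log(androidDelta.delta, androidDelta.percent.toFixed(2) + "%");
```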
      
      Below is the list of changes made on top of the upstream ICU 58.1
      in reverse order. Most of these changes were made in 58staging branch
      to run trybots and cherry-picked back for this CL. See
         https://chromium.googlesource.com/chromium/deps/icu/+/log/chromium/58staging
         https://codereview.chromium.org/2447513002/ : cr+blink update cl with
             58staging branch head.
      
      * Fix a build on Win without std::string (v8)
      * Add ms932 alias to Shift_JIS
      
      * Apply Google-specific locale data patches
      
      * Fix a bug in scriptset
      
      * Update windows-1255 mapping
      
      * Disable C4333 warning by MSVC (harmless)
      
      * Apply and update utf32.patch and README.chromium
      
      * Update and apply vscomp.patch
        stringpiece patch removed. VS2015 seems to be fine with a redefinition.
      
      * Update pre-built ICU data files
         Update *local.mk with a new copyright line
      
      * Apply more patches
        The following patches were applied and updated: data_symb, vscomp, wpo
      
        The unnecessary part was dropped from vscomp
      
      * Update BUILD.gn and icu.gyp* files
      
      * Update android/brkitr.patch
      
      * Update and apply more patches
      
      * Update and apply cjdict.patch
         Apply data.build.patch
      
      * Delete obsolete patches: cmemory,regex
      
      * Update README.chromium and apply brkitr patches
      
        - Update README.chromium
        - Remove obsolete patches
        - Update linebrk.patch and apply it: add back line_loose_cj
      
      * Update wordbrk.patch and apply it
      
      * Update and apply khmer-dictbe.patch
      
      * Update data trimming
      
        - android/patch_locale.sh
        - scripts/trim_data.sh
           ExemplarCh* removed
           charac*Label removed
           relative/relativeTime removed for daysOfWeek and quarter
      
      * Update the following patches
      
        android/brkitr.patch
        patches/linebrk.patch
        patches/data.build.patch
      
      * Update cjdict.patch and linebrk.patch
      
      BUG=637001
      TEST=Layout tests, all unittests, browser tests, ui tests.
      R=jsbell@chromium.org, mark@chromium.org
      
      Review URL: https://codereview.chromium.org/2442923002 .
      e0d9b90c
  8. Oct 23, 2016
    • Update ICU to 58 part1 · 5feb9ad5
      Jungshik Shin authored
      * Note that this CL will be followed by CLs with local changes.
        Until then, ICU should not be rolled in DEPS. See READ_THIS_FIRST
        for details.
      
      * Adjust scripts/update.sh and scripts/data_files_to_preserve.txt
        - CLDR/ICU added ckb/ast locale data. Drop them from the list to preserve.
        - source/layout does not exist in 58.1 any more.
      
      * Update the tree to ICU 58.1 from the upstream by running
        scripts/update.sh
      
      * Update README.chromium and add READ_THIS_FIRST to warn about the
        status of the tree.
      
      BUG=637001
      TEST=None
      5feb9ad5
  9. Oct 21, 2016
    • Delete Visual Studio build files · f8aa31da
      Jungshik Shin authored
      There's no need for VS build files.
      Besides, update scripts/update.sh to post-edit source/configure
      for missing test/ directory.
      
      This cleanup is necessary to get 'git cl upload/rietveld' to work
      smoothly in an upcoming ICU update.
      
      BUG=637001
      TEST=source/runConfigureICU Linux --disable-tests --disable-layout
      
      Review URL: https://codereview.chromium.org/2443653002 .
      f8aa31da
    • Delete source/test · 2e57f555
      Jungshik Shin authored
      We don't use source/test. It's kept to give API usage examples, but
      it got in the way of a version update (git cl upload keeps timing out).
      
      Also, update update.sh to delete source/test after downloading a new
      version from the upstream.
      
      BUG=637001
      TEST=None
      
      Review URL: https://codereview.chromium.org/2435373002 .
      2e57f555
  10. Jul 27, 2016
    • Big Endian support part 4 · 3655cfba
      Jungshik Shin authored
      Delete three pre-built assembly source files because they're now
      generated at build-time.
      
      Update data build scripts and README.chromium accordingly.
      
      Update copy_data.sh and copy_data_android.sh so that the assembly
      source files are not copied. Besides, convert the little endian
      data bundle to the big endian data bundle for non-Android platforms.
      
      BUG=v8:4828
      TEST=Rebuild icu data following the procedure in README.chromium
      TEST='gn args <builddir>' with icu_use_data_file set to true or false
      TEST=build base_unittests and run with --gtest_filter=ICU*
      TEST=build base_unittests and run with --gtest_filter=Message*ormat*
      TEST=build 'd8' (v8) and try `(new Date()).toLocaleString("de")`
      
      Review URL: https://codereview.chromium.org/2182883004 .
      3655cfba
  11. Jul 22, 2016
  12. Jul 21, 2016
    • Add big endian support · e7d37b69
      Miran Karic authored
      Add a script that generates an assembly file from a .dat file. This is
      needed for generating big endian assembly file after using icupkg to
      convert little endian icudtl.dat to big endian icudtb.dat. Also the
      icu.gyp file is modified so big endian architectures use appropriate
      files.
      
      Patch by miran.karic@  ( https://codereview.chromium.org/1967523002/)
      with a couple of fixes:
      
      1. Two errors mentioned against PS#9 in the above CL.
      2. Support copying icu data file for Big Endian targets.
      
      Besides, icudtb.dat was added to common. icudtb.dat was created by
      running 'icupkg -tb icudt56l.dat icudt56b.dat' and renaming icudt56b.dat
      to icudtb.dat.
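
One way to sanity-check which flavor a .dat file is, sketched in JavaScript. The byte offsets are an assumption based on my reading of the DataHeader layout in ICU's udata.h (magic bytes 0xda 0x27 at offsets 2 and 3, the isBigEndian flag at offset 8); the function name `readIcuDataEndianness` is hypothetical and is not part of this CL.

```javascript
// Hypothetical helper. Assumed header layout (from ICU's udata.h):
//   bytes 0-1: headerSize, byte 2: magic1 (0xda), byte 3: magic2 (0x27),
//   bytes 4-5: UDataInfo.size, bytes 6-7: reserved, byte 8: isBigEndian.
function readIcuDataEndianness(bytes) {
  if (bytes.length < 9 || bytes[2] !== 0xda || bytes[3] !== 0x27) {
    throw new Error("not an ICU data header");
  }
  return bytes[8] === 1 ? "big" : "little";
}

// Synthetic headers for illustration only; a real check would pass the
// first bytes of icudtl.dat or icudtb.dat instead.
const littleHeader = Uint8Array.from([0x20, 0x00, 0xda, 0x27, 0x14, 0x00, 0, 0, 0, 0, 2, 0]);
const bigHeader = Uint8Array.from([0x00, 0x20, 0xda, 0x27, 0x00, 0x14, 0, 0, 1, 0, 2, 0]);

console.log(readIcuDataEndianness(littleHeader));
console.log(readIcuDataEndianness(bigHeader));
```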
      
      BUG=v8:4828
      TEST='d8' is built correctly with icu_use_data_file set to either 0 or 1.
      TEST=run `GYP_DEFINES="target_arch=mips" ./gypfiles/gyp_v8` and make sure
      that ninja files use 'b' data/assembly file for Big Endian.
      
      Review URL: https://codereview.chromium.org/2162393003 .
      
      Patch from Miran Karic <miran.karic@imgtec.com>.
      e7d37b69
  13. May 20, 2016
    • Update IANA timezone DB to 2016d · 54f86bb1
      Jungshik Shin authored
      What's new in 2016d is found at
      
        http://mm.icann.org/pipermail/tz-announce/2016-April/000038.html
      
      Rebuilt ICU data/assembly files are checked in (not shown in the
      codereview due to their sizes).
      
      While I'm at it, add a scripts/LICENSE file that is identical to
      LICENSE at the top of the Chromium tree, because LICENSE in
      third_party/icu is for ICU and is not applicable to files in scripts/.
      
      BUG=473288
      TBR=mark
      TEST=In JavaScript console, run the following.
            apr30_2016_1200 = new Date("04/30/2016 12:00Z")
            may01_2016_1200 = new Date("05/01/2016 12:00Z")
            apr30_2016_1200.toLocaleString("en", {timeZone: "America/Caracas"})
            may01_2016_1200.toLocaleString("en", {timeZone: "America/Caracas"})
      
        On April 30, 2016, Caracas is 4:30 behind UTC. On May 1, it's 4:00 behind.
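
The same check can be made numeric instead of reading the formatted strings by eye. The sketch below is an illustration under assumptions: `utcOffsetMinutes` is a hypothetical helper, it relies on the standard `Intl.DateTimeFormat` API, and its result depends on the time zone data shipped with the JavaScript runtime (2016d or later is needed for the May 1 value).

```javascript
// Hypothetical helper: UTC offset, in minutes, of `timeZone` at `date`.
function utcOffsetMinutes(date, timeZone) {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    hourCycle: "h23",
    year: "numeric",
    month: "numeric",
    day: "numeric",
    hour: "numeric",
    minute: "numeric",
    second: "numeric",
  }).formatToParts(date);
  const field = (type) => Number(parts.find((p) => p.type === type).value);
  // Re-interpret the zone's wall-clock fields as if they were UTC; the
  // difference from the real timestamp is the zone's offset.
  const wallClockAsUtc = Date.UTC(
    field("year"), field("month") - 1, field("day"),
    field("hour"), field("minute"), field("second"));
  return Math.round((wallClockAsUtc - date.getTime()) / 60000);
}

const caracasApr30 = utcOffsetMinutes(new Date("2016-04-30T12:00:00Z"), "America/Caracas");
const caracasMay01 = utcOffsetMinutes(new Date("2016-05-01T12:00:00Z"), "America/Caracas");
console.log(caracasApr30, caracasMay01); // -270 and -240 with 2016d or newer data
```

ISO 8601 strings are used here instead of the "04/30/2016 12:00Z" form because only the ISO form is guaranteed to parse the same way across engines.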
      
      Review URL: https://codereview.chromium.org/1985243002 .
      54f86bb1
  14. Mar 25, 2016
  15. Feb 04, 2016
  16. Jan 29, 2016
    • ICU 56 step 6:Check in the pre-built ICU data · d2c18300
      Jungshik Shin authored
      * Update the pre-built ICU data files for all platforms
      
        source/data/in/icudtl.dat for non-Android platforms
        {linux,mac}/icudt*.S for linux/mac
        android/icudtl.dat and android/icudt*.S for Android
        windows/icudt.dll for Windows
      
      * Update Android data trimming script
      
        1. Make sure that 'default' calendar is kept in locales where it's relevant
          : root, th, fa, ar_SA, etc.
        2. Add a minimal region data to work around a bug in ICU with pool.res
           handling
      
      * Update gn and gyp files
      * And add a TODO comment to update.sh to automate the build file update.
      * Add it_CH to the locale list.
      * Add sr_Latn to unit/reslocal.mk (required by sh) and
        line_normal_fi to brkitr/brklocal.mk (referred to in brkitr/fi.txt) in
        place of line_fi.
      
      * Update and add scripts for data building
      * Completely rewrite README.chromium
      * Check-in the prebuilt ICU data files/assembly sources for
        Linux,Mac,Windows,Chrome OS and Android.
      
      BUG=575007
      TEST=Blink layout tests, webkit unittests
      TEST=All bots can build successfully
      TEST=net_unittests --gtest_filter="*ilenameUtil*"
      TEST=net_unittests --gtest_filter="*IDN*" (pending bug 336973)
      TEST=base_unittests --gtest_filter="*Conv*"
      TEST=browser_tests --gtest_filter="*ncoding*"
      TEST=base_unittests --gtest_filter="*essage*"
      TEST=ui_base_unittests --gtest_filter="*ormat*"
      TEST=ui_base_unittests --gtest_filter="L10n*"
      R=mark@chromium.org
      
      Review URL: https://codereview.chromium.org/1639543006 .
      d2c18300
    • ICU 56 step 4: Apply post-56 fixes for measure/date format · 825221bb
      Jungshik Shin authored
      1. Apply post-56 patches from the trunk for measure/date format
          http://bugs.icu-project.org/trac/ticket/11986
          http://bugs.icu-project.org/trac/ticket/12031
          http://bugs.icu-project.org/trac/ticket/12030
          http://bugs.icu-project.org/trac/ticket/12041
      2. Generate a combined patch (measure_format.patch) for the above.
      3. Split locale_google.patch into 'locale_google.patch' and
         'relative_date.patch'. The latter is taken from Android.
      4. Update README.chromium
      
      Besides, apply two local patches : {tzdetect,xlit..}.patch and
      adjust gb18030.ucm and the corresponding patch
      
      Also, remove obsolete patches and update README.chromium
      
      BUG=575007
      R=mark@chromium.org
      
      Review URL: https://codereview.chromium.org/1621943002 .
      825221bb
    • ICU 56 step 2 · 27b09232
      Jungshik Shin authored
      Make the tree ready for the application of Google's and Chrome's data
      and post-56 code patches.
      
      1. Fix trim_data.sh to run from anywhere.
      2. Update patch_locale.sh for Android and add en_IN to the locale list
      3. Apply data.build.patch
      4. Exclude non-UI locale data for unit locale category
      5. Add some regional variant locales to locale, unit, zone and coll.
      6. Update locale lists for locale, unit, zone, and coll
      
      BUG=575007
      TEST=None
      R=mark@chromium.org
      
      Review URL: https://codereview.chromium.org/1624643003 .
      27b09232
  17. Jan 07, 2016
  18. Dec 14, 2015
  19. Jun 04, 2015
  20. Apr 02, 2015
    • Update tz db to 2015b · 5f18004f
      Jungshik Shin (jungshik at google) authored
      1. Update the IANA tz db to 2015b.
        - http://mm.icann.org/pipermail/tz-announce/2015-March/000029.html
        - Mongolia decided to observe DST again in 2015 starting on the last
          Sunday in March.
        - Palestine's DST start date is corrected to be March 28 instead of 27th.
      
      2. Add a script to download the tz database files (update_tz.sh)
      
      3. Check in scripts/make_n_copy_data.sh that I've been using to build ICU
         data/assembly files and update README.chromium.
      
      4. Update android/patch_locale.sh to apply android/brkitr.patch as well.
      
      BUG=473288
      TEST=1. In JavaScript console, run the following.
        mar27_2015_1200 = new Date("03/27/2015 12:00Z")
        mar28_2015_1200 = new Date("03/28/2015 12:00Z")
        mar27_2015_1200.toLocaleString("en", {timeZone: "Asia/Gaza"})
        mar28_2015_1200.toLocaleString("en", {timeZone: "Asia/Gaza"})
        apr15_2014_1200 = new Date("04/15/2014 12:00Z")
        apr15_2015_1200 = new Date("04/15/2015 12:00Z")
        apr15_2014_1200.toLocaleString("en", {timeZone: "Asia/Ulan_Bator"})
        apr15_2015_1200.toLocaleString("en", {timeZone: "Asia/Ulan_Bator"})
      
      In Asia/Gaza, Mar 27 12:00Z should be 2PM and mar28 12:00Z should be 3PM.
      In Asia/Ulan_Bator, April 15 12:00Z should be 8PM in 2014 and should be 9PM
      in 2015. Ulan_Bator does not work due to http://crbug.com/364374.
      
      R=mark@chromium.org
      
      Review URL: https://codereview.chromium.org/1051193002
      5f18004f
  21. Mar 19, 2015
    • Update CJK converters and their generating scripts · dafa8443
      Jungshik Shin (jungshik at google) authored
      1. Update ucmlocal.mk and convrtrs.txt to refer to euc-kr-html.ucm
      instead of windows-949.ucm
      
      2. Tighten up the valid code range for the following converters:
      
         EUC-KR, Shift_JIS, Big5
      
      This is to add back an ASCII range byte to the stream per
      the encoding spec when they're either illegal as a 'trail byte' or
      there's no assigned code point for a "lead + trail" sequence.
      For instance, with this change, '0xF3 0x41' in EUC-KR is converted to
      'U+FFFD U+0041' instead of 'U+FFFD'.
      
      This change requires adding 2 ~ 8 new states to the conversion
      table of each converter mentioned above leading to 6.5kB net increase
      in the final data size.
      
      3. Tighten the trail byte range for 2-byte sequences starting with 0x8E
      from [A1,E2] to [A1,DF] in EUC-JP and update the corresponding generating
      script.
      
      4. Change the substitution characters for EUC-JP and Shift_JIS to
      match other converters, i.e. make them produce U+FFFD when encountering
      an invalid input. Before this change, they emitted U+001A.
      
      5. Enable 'U_CHARSET_IS_UTF8' configuration flag.
      Chromium/Blink does not rely on ICU for the code conversion between
      the 'system native encoding' (if it's one of legacy encodings)
      and Unicode. With this configuration, we can cut down the code size
      a bit.
      
      6. Update the icudtl.dat (all platforms) and assembly files (mac,linux)
         and the icudata dll (windows)
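
The rule in item 2 (an ASCII-range byte that follows a lead byte without forming an assigned pair is put back into the stream) can be illustrated with a toy decoder. This is a simplified sketch, not the ICU converter or its state table: `decodeTwoByte` and the one-entry `toyIndex` are hypothetical, and lead-byte range validation is omitted.

```javascript
// Toy two-byte decoder illustrating the "put the ASCII trail byte back"
// rule. `index` maps (lead << 8 | trail) to a code point; it stands in for
// a real mapping table and is an assumption of this sketch.
function decodeTwoByte(bytes, index) {
  let out = "";
  for (let i = 0; i < bytes.length; i++) {
    const lead = bytes[i];
    if (lead < 0x80) {
      out += String.fromCharCode(lead); // plain ASCII byte
      continue;
    }
    if (i + 1 >= bytes.length) {
      out += "\uFFFD"; // truncated sequence at end of input
      break;
    }
    const trail = bytes[i + 1];
    const codePoint = index.get((lead << 8) | trail);
    if (codePoint !== undefined) {
      out += String.fromCodePoint(codePoint);
      i++; // both bytes consumed
    } else {
      out += "\uFFFD";
      // A non-ASCII trail byte is consumed with the lead byte. An ASCII
      // trail byte is left in place and decoded on the next iteration.
      if (trail >= 0x80) i++;
    }
  }
  return out;
}

// One real EUC-KR pair for illustration: 0xB0 0xA1 is U+AC00.
const toyIndex = new Map([[0xb0a1, 0xac00]]);
const invalidTrail = decodeTwoByte([0xf3, 0x41], toyIndex);
console.log(JSON.stringify(invalidTrail));
```

With this rule, the bytes 0xF3 0x41 yield U+FFFD followed by U+0041, matching the behavior described in item 2.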
      
      See https://codereview.chromium.org/1026453002 for a new blink test
      added ( fast/encoding/char-decoding-invalid-trail.html )
      
      BUG=450312,430823
      TEST=Blink: fast/encoding/char-decoding-{truncated,invalid-trail}.html
      TEST=base_unittests --gtest_filter=*Conv*, browser_tests --gtest_filter=*ncoding*
      R=jsbell@chromium.org, mark@chromium.org
      
      Review URL: https://codereview.chromium.org/984233002
      dafa8443
  22. Mar 02, 2015
    • Update the ICU data · db16fd86
      Jungshik Shin (jungshik at google) authored
      Fix the following errors found by jochen@ in
        https://codereview.chromium.org/960263002/
      
      1. brkitr: en_US_POSIX is not supported. Remove it from brklocal.mk
         : We don't use en_US_POSIX and the remaining dependency on it in
         some unittests was already removed. (we may need it back later,
         though, for breaking an FQDN into components.)
      2. coll: Explicitly add id.txt required as the alias/parent of "in" and
         "id_ID". This should not affect the collation in Indonesian locale
         because falling back to the root locale should be fine.
      3. lang: Add 'ro_MD.txt' required as the alias of 'mo.txt'.
      
      Also update make_mac_assembly.sh to get it to read off the ICU major
      version automatically.
      
      Besides, update README.chromium to refer to ICU 54 as done by the
      aforementioned CL.
      
      Rebuild the data files and assembly sources (the latter still required
      by stand-alone v8 builds) for all the platforms.
      
      icudtl.dll for Windows will be built and checked in in  another CL.
      
      BUG=428145
      TEST=Usual ICU update tests before rolling DEPS. See https://codereview.chromium.org/878723002
      TBR=jochen@chromium.org
      
      Review URL: https://codereview.chromium.org/962643003
      db16fd86
  23. Feb 19, 2015
    • Fix en_GB's language name failure · 8d46830a
      Jungshik Shin (jungshik at google) authored
      data/lang/en_GB.txt has an empty "Languages" block leading
      getDisplay{Name,Language} to fail in en-GB.
      
      Update trim_data.sh to remove an empty "Languages" block and run the
      script to fix data/lang/en_GB.txt and other locales if any. (only
      en_GB.txt is affected).
      
      Rebuild the icu data with the above changes for both Android and non-Android
      platforms.
      
      BUG=428145
      TEST=linux_chromeos bots: browser_tests --gtest_filter=*GetUILang*
      TBR=mark@chromium.org
      
      Review URL: https://codereview.chromium.org/930203004
      8d46830a
  24. Jan 31, 2015
    • ICU update to 54.1 step 7 · 26c111a7
      Jungshik Shin (jungshik at google) authored
      1. Fix a Windows build failure due to:
         a. 'signed vs unsigned' comparison
         b. 'possible data loss' in conversion : Apply pkasting's patch at
            http://bugs.icu-project.org/trac/ticket/11104
      
      2. Drop a few currencies to cut down the data size by 50kB for non-Android
         platforms.
      
      3. Build the ICU data for Android and check in.
        - Drop all display names for languages/scripts/regions except for zh-Han{s,t}
          as before. ( ~ 1.2MB reduction)
        - Drop cjdict by applying android/brkitr.patch. (~ 2MB reduction)
        - Include the display names for only 60+ currencies ( ~ 400kB reduction
          from the non-Android data).
        - Minimize the locale data for 9 locales Chrome on Android is not localized
          to. Drop currency names for those 9 locales. ( ~ 150kB reduction)
      
      Size change:
        1. Non-android: 10,255,584 to 10,200,880
        2. Android:
           - Final : 6,270,880
             With 60+ currency names added (for bug 370849) and
             9 unnecessary locale data dropped.
             It's 232,240 bytes larger than ICU 52.1 (6,038,640).
           - Without any currency names but with 9 unnecessary locale data: 6,026,816
           - With 60+ currency names and 9 unnecessary locale data: 6,426,368
      
      BUG=370849,428145
      TEST=Build on Windows. Blink layout tests, webkit unittests.
      R=mark@chromium.org, wangxianzhu@chromium.org
      
      Review URL: https://codereview.chromium.org/877193003
      26c111a7
  25. Jan 23, 2015
    • ICU update to 54 - step 6 · b9090ea5
      Jungshik Shin (jungshik at google) authored
      1. Add {coll,curr,lang,locales,rbnf,region,sprep,translit,unit,zone}/*local.mk
      to exclude locale data for languages/locales that Chromium does not need.
      
      2. Run scripts/trim_data.sh to cut down the data size further by excluding
      unused entries in each locale files.
         - Keep the display names for languages/scripts/locales in Chrome's
           Accept-Language list and remove the display names outside the set.
         - Minimize the locale data in data/{locales,lang} for non-UI languages
           in the A-L list. For them,
           we just need the "native" display name and exemplar character set.
         - Exclude historic, obscure and otherwise unnecessary currency display
           names.
         - Drop unnecessary Chinese collation rules; Big5/GB2312/UniHan.
         - Keep only the minimal unit data for duration and compound units.
      
      3. Add css3transform.txt to data/translit for Greek upper/lowercasing support.
      
      4. Add the minimal locale data for ckb and ku.
      
      5. The tz db was updated previously to 2014j (the latest) so that no change
         is made except for README.chromium update.
      
      6. Check in the pre-built data (icudtl.dat) shared by all non-Android
         platforms and assembly files for Linux/Mac
      
      The final data size is 10,255,584 bytes, which is about 200kB smaller than
      that for ICU 52.1.  The pristine upstream ICU has the data of
      25,343,024 bytes.
      
      The remaining steps are to build a smaller data file for Android and
      to build icudtl.dll for Windows (non-default build option).
      
      BUG=428145
      TEST=net_unittests --gtest_filter="*ilenameUtil*"
      TEST=net_unittests --gtest_filter="*IDN*"
      TEST=base_unittests --gtest_filter="*Conv*"
      TEST=browser_tests --gtest_filter="*ncoding*"
      TEST=Blink: layout tests
      R=mark@chromium.org
      
      Review URL: https://codereview.chromium.org/872903002
      b9090ea5
  26. Jan 21, 2015
  27. Oct 13, 2014
  28. Sep 25, 2014
    • Turn on UCONFIG_NO_NON_HTML5_CONVERTER · 52e8245c
      jshin@chromium.org authored
      UCONFIG_NO_NON_HTML5_CONVERTER was added earlier to our copy of ICU, but
      it was never set to 1.  It's my oversight.
      
      1. Turns UCON..CONVERTER on in icu.gyp to drop all the encodings not
         required by the Encoding spec. Dropped encodings include
         UTF-7, BOCU, SCSU, CESU, ISCII, ISO-2022-{KR, CN*}, HZ-GB, ISO-2022-JP's
         other than the original.
      
      2. A lot more sections of the ICU converter code are excluded when
         it's set to 1 including the code for LMB (Lotus Multibyte) encodings and
         X11 compound text encoding (icu common).
      
      3. The character encoding detections for encodings excluded are also disabled.
         (icu i18n)
      
      4. ISO-2022-{KR, CN*} and HZ-GB can be dropped now because Blink treats them
         as replacement encodings. The corresponding alias entries from convrtrs.txt
         are also removed.
      
      5. ibm-874 was removed. We used to need it before Blink started, but not any
         more. We only need windows-874.
      
      6. A mistake in convrtrs.txt was corrected: Big5-HKSCS was pointing to
         an old mapping table.
      
      7. Per ICU upstream's suggestion, use '-html' suffix instead of '-html5'
      for the encoding tables derived from the WHATWG's encoding spec (ibm866,
      shift_jis and euc-jp).
      
      The static 64-bit release build of Chrome on Linux went down from
      141,596,616 to 141,491,968 bytes (~ 100 kB reduction). Besides, the icu data
      size got smaller by ~ 19 kB ( 10,490,576 to 10,471,008 bytes).
      
      See http://bugs.icu-project.org/trac/ticket/11296 for an upstream bug
      I've filed on the issue.
      
      
      BUG=76328
      TEST=browser_tests --gtest_filter="*ncoding*"
      TEST=net_unittests --gtest_filter="*ilenameUtil*"
      TEST=base_unittests --gtest_filter="*Conv*"
      TEST=Blink: fast/encoding/*
      TEST=With shared library build, the following has no match.
        nm libicuuc.so | egrep  -i '(bocu|scsu|utf7|2022kr|2022cn|iscii)'
        nm libicui18n.so | egrep  -i '(2022kr|2022cn|ibm42)'
      TEST=With static library build, the following has no match.
        nm chrome | egrep -i '(bocu|scsu|utf7|2022kr|2022cn|iscii|ibm42)'
      
      R=jsbell@chromium.org, mark@chromium.org
      
      Review URL: https://codereview.chromium.org/587833004
      
      git-svn-id: http://src.chromium.org/svn/trunk/deps/third_party/icu52@292131 4ff67af0-8c30-449e-8e8b-ad334ec8d88c
      52e8245c
  29. Sep 02, 2014
    • Update tz data to 2014f and add SJIS for the encoding spec · ff835309
      jshin@chromium.org authored
      1. Timezone data files (4 of them) in source/data/misc to 2014f (the latest)
         to prepare for an upcoming Russian timezone change.
      2. Add a Shift_JIS converter compliant with the WHATWG encoding spec.
      3. Update convrtrs.txt and ucmlocal.mk accordingly
      4. Update the pre-built data files for Linux/Mac/Android/Windows.
         (icudt.dll is not updated in this CL. It's not used in the default
          configuration. It'll be updated in a separate CL).
      5. Fix a typo in ibm866_gen.sh. The actual table used does not need a change.
      
      
      BUG=277062,404445
      TEST=After rolling icu to this revision, the following tests should pass.
      TEST=Blink: fast/encoding/* all pass except for
      fast/encoding/api/ascii-supersets.html that should fail by *passing*
      the test for Shift_JIS, which is expected to fail. Blink layout tests need
      to be updated.
      TEST=browser_tests --gtest_filter="*ncoding*"
      TEST=In JS console, run the following to check if Europe/Moscow is
      3 hrs ahead of UTC after Oct 26 and 4 hrs ahead before that and
      if Asia/Kamchatka remains 12 hrs ahead of UTC.
        nov1_2014_1500=new Date("11/01/2014 15:00Z")
        nov1_2014_1500.toLocaleString("en", {timeZone: "Europe/Moscow"})
        nov1_2014_1500.toLocaleString("en", {timeZone: "UTC"})
        nov1_2014_1500.toLocaleString("en", {timeZone: "Asia/Kamchatka"})
        oct24_2014_1500=new Date("10/24/2014 15:00Z")
        oct24_2014_1500.toLocaleString("en", {timeZone: "Europe/Moscow"})
        oct24_2014_1500.toLocaleString("en", {timeZone: "UTC"})
        oct24_2014_1500.toLocaleString("en", {timeZone: "Asia/Kamchatka"})
      TEST=net_unittests --gtest_filter="*ilenameUtil*"
      TEST=base_unittests --gtest_filter="*Conv*"
      R=jsbell@chromium.org
      
      Review URL: https://codereview.chromium.org/497543003
      
      git-svn-id: http://src.chromium.org/svn/trunk/deps/third_party/icu52@291774 4ff67af0-8c30-449e-8e8b-ad334ec8d88c
      ff835309
  30. May 24, 2014
  31. May 05, 2014
    • Add back display names for non-UI languages in A-L list · 4266d6d1
      jshin@chromium.org authored
      I was too aggressive in trimming the data and dropped the display
      names for languages that Chromium needs (for non-UI languages
      that are in the A-L list). That was not my intention (the comment in
      trim_data.sh said one thing, but the code did another).
      
      Also add Norwegian (nb) and Malay (ms) locale data, which were
      left out by mistake.
      
      Also update trim_data.sh script NOT to drop 'ALIAS' lines which are
      used to indicate that a given locale is an alias to another locale.
      That also required adding ro_MD.txt (null locale which mo.txt is 
      aliased to).
      
      The three changes above add about 110kB to the icu data (from 10.3MB to 10.4MB).
      
      Also update the pre-built icu data files for Linux, Mac and Windows.
      The Android data will be updated in a follow-up patch.
      
      BUG=132145
      TEST=When ICU is rolled, unit_tests:ExtensionL10* pass.
      TBR=mark
      
      Review URL: https://codereview.chromium.org/264973016
      
      git-svn-id: http://src.chromium.org/svn/trunk/deps/third_party/icu52@268285 4ff67af0-8c30-449e-8e8b-ad334ec8d88c
      4266d6d1
  32. Apr 29, 2014
  33. Apr 28, 2014
    • jshin@chromium.org's avatar
      Add icudt.dll for Windows · 8df7e257
      jshin@chromium.org authored
      1. Generate and add windows/icudt.dll with the procedure outlined
      in README.chromium. It uses an out-of-tree copy of the upstream ICU
      along with our custom-built icudtl.dat and a locally modified version
      of makedata.mak.
      
      We used to have a separate build/ directory for VS solution/project files
      to build icudt.dll. Maintaining them is rather cumbersome now that we
      want to update our ICU (major version changes) more frequently.
      
      Note that icudt.dll is not used by default (icu_use_data_file_flag=1). 
      The GN build still uses it by default and we should not break that build.
      
      2. Add scripts/make_mac_assembly.sh to simplify the generation of the icu
      data assembly source file for Mac.
      
      3. Update README.chromium accordingly.
      
      This CL was uploaded and reviewed at 
      
      https://codereview.chromium.org/255943004/
      
      Due to a malfunction at codereview.chromium.org, I'm landing this CL 
      manually in two parts. 
      This check-in is the 2nd part of the CL dealing with #2 and #3
      above.
      
      BUG=132145
      TEST=None until icu is rolled to this version.
      
      
      git-svn-id: http://src.chromium.org/svn/trunk/deps/third_party/icu52@266602 4ff67af0-8c30-449e-8e8b-ad334ec8d88c
      8df7e257
  34. Apr 22, 2014
    • jshin@chromium.org's avatar
      Trim unit* sections in data/locales/* · 4a39040d
      jshin@chromium.org authored
      Add 'filter_locale_data' function to trim_data.sh
      
      Chromium/Blink do not use most of unit* sections in locale data. Keep
      only duration and compound sub-sections. 
      
      Update the icudtl.dat and two assembly source files for Mac/Linux.
      
      It saves ~200kB (uncompressed). 7z-compressed size reduction is 34kB.
      
      With all these changes (up to this CL) applied, the net increase of the ICU
      data from ICU 4.6 to 52 is 49kB 7z-compressed (3,070,246 vs 3,021,457)
      and ~390kB uncompressed (10,370,656 vs 9,980,368).
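      The two deltas can be rechecked from the byte counts quoted above. This is
      a minimal sketch; the object layout and the names sizes and growthBytes
      are illustrative only and do not come from the CL.

```javascript
// Byte counts copied from the commit message (ICU 52 vs ICU 4.6).
const sizes = {
  compressed7z: { icu52: 3070246, icu46: 3021457 },
  uncompressed: { icu52: 10370656, icu46: 9980368 },
};

// Net growth in bytes for one measurement.
function growthBytes({ icu52, icu46 }) {
  return icu52 - icu46;
}

for (const [kind, pair] of Object.entries(sizes)) {
  const bytes = growthBytes(pair);
  console.log(`${kind}: +${bytes} bytes (~${Math.round(bytes / 1000)}kB)`);
}
```

      The differences come to 48,789 bytes compressed and 390,288 bytes
      uncompressed, matching the rounded 49kB and ~390kB figures.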
      
      BUG=132145
      TEST=None.
      TBR=mark
      
      Review URL: https://codereview.chromium.org/247663002
      
      git-svn-id: http://src.chromium.org/svn/trunk/deps/third_party/icu52@265354 4ff67af0-8c30-449e-8e8b-ad334ec8d88c
      4a39040d
  35. Apr 18, 2014
    • jshin@chromium.org's avatar
      Remove {big5,gb2312}han collation data · 991d1f1e
      jshin@chromium.org authored
      1. {big5,gb2312}han collation data is not used by anybody because they're
      useless as a sorting order.
      
        Add a function to trim_data.sh to remove them from zh.txt
      
      2. Remove remove_unihan.sh and add back unihan rules to coll/{zh,ja,ko}.txt.
      In ICU 52, tools/genrb does NOT include unihan collation by default, so
      we no longer need to remove it from the rule files.
      
      3. Remove obsolete patch files (locale[23].patch)
      
      4. Add LICENSE file (converted from license.html)
      
      5. Update README.chromium accordingly.
      
      6. Check in the updated data file/assembly files.
      
      The net saving in icudtl.dat is ~ 220kB.
      
      
      BUG=132145
      TEST=icudtl.dat is 10576480
      TBR=mark
      
      Review URL: https://codereview.chromium.org/243763002
      
      git-svn-id: http://src.chromium.org/svn/trunk/deps/third_party/icu52@264857 4ff67af0-8c30-449e-8e8b-ad334ec8d88c
      991d1f1e
    • jshin@chromium.org's avatar
      Trim ICU data to reduce the download size/memory usage · 4e493261
      jshin@chromium.org authored
      Add a shell script to trim the ICU data further : trim_data.sh along with
      locale list files.  The script does the following:
      
      1. Remove the display names of languages NOT listed in Chrome's Accept-Language
         list. (800kB)
      2. Minimize the locale data for locales listed in the A-L list that are
         not a UI locale in Chrome. For those locales, exemplar characters,
         the display name in the native language and layout direction are included.
         (640kB)
      3. Filter the region data to drop numeric region display names other than 419
         (Latin-America). (50kB)
      4. Filter the currency data (display name and plurals) for historic currencies.
         (200kB)
      
      This CL also checks in icudtl.dat (source/data/in) and
      icudt_dat.S (mac and linux). Note that I dropped '52' (the version number)
      in the assembly source file name and icu.gyp was adjusted accordingly.
      
      With all these changes, icudtl.dat is ~ 800kB larger than that in ICU 4.6.
      The 7z compression (as used by the installer) makes the size difference
      go down to ~ 130kB.
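      As a sanity check, the per-step savings listed above can be summed and
      compared with the sizes quoted in this commit's TEST line (12.4MB without
      the CL, about 10.7MB with it). This is a minimal sketch; the key names are
      illustrative labels for the four steps, not identifiers from trim_data.sh.

```javascript
// Approximate savings per trimming step, in kB, as listed in the commit message.
const savingsKB = {
  languageDisplayNames: 800, // step 1
  nonUiLocaleData: 640,      // step 2
  numericRegionNames: 50,    // step 3
  historicCurrencies: 200,   // step 4
};

const totalSavedKB = Object.values(savingsKB).reduce((sum, kb) => sum + kb, 0);

// Sizes from the TEST line, expressed in kB to avoid floating-point noise.
const reportedDropKB = 12400 - 10700;

console.log(`sum of steps: ${totalSavedKB}kB, reported drop: ${reportedDropKB}kB`);
```

      The steps add up to 1,690kB, within rounding of the reported drop of
      roughly 1.7MB.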
      
      BUG=132145
      TEST=The icudtl.dat (uncompressed) is about 10.7MB instead of 12.4MB without this CL.
      R=mark@chromium.org
      
      Review URL: https://codereview.chromium.org/239543018
      
      git-svn-id: http://src.chromium.org/svn/trunk/deps/third_party/icu52@264811 4ff67af0-8c30-449e-8e8b-ad334ec8d88c
      4e493261
  36. Apr 07, 2014
    • jshin@chromium.org's avatar
      ICU 52 local changes part1 · 4dfa619c
      jshin@chromium.org authored
      1. Remove all the obsolete patches. There are lots of them because most of
      the local patches to ICU 4.6.1 have either been accepted or become obsolete.
      The largest local patch removed is our CJ word breaker patch, which was
      upstreamed.
      
      Android didn't apply the CJK word breaker patch to ICU 4.6 to reduce the
      data size. In a follow-up CL, we'll have an Android-specific change for this issue.
      
      Besides, we don't include patches for files we locally add because the
      patches for new files are redundant. Instead, they're mentioned in
      README.chromium.
      
      2. We don't need platform-specific headers any more (pmac, plinux, pwin, etc).
      They're combined into a single file and all platforms we care about are
      well-supported except for one issue on Android/QNX. putil.patch takes care
      of it.
      
      
      3. Breakiterator patches for a few remaining issues. We also use
      a much smaller Khmer dictionary (upstream fix pending).
      
      4. Converter
        - Added two WHATWG-encoding-standard-compliant mapping tables
          (derived directly from the spec with a script) for EUC-JP
          and CP866
        - Disabled various non-HTML5-encodings such as SCSU, BOCU, UTF-7, CESU-8
          saving ~30kB in the code size. Even though we link statically, they're
          still pulled in as a part of uconv.
        - Disabled ISO-2022-JP-[1-4] in ucnv2022.c
        - Removed a number of encoding alias entries in the alias table
          leading to ~40kB data size reduction.
      
      5. Locale data: not yet updated. It needs to be trimmed substantially.
      
      6. Unihan collation removal is now done with a script (scripts/remove_unihan.sh)
      
      7. Updated timezone data to the latest (2014b) as of today.
      
      8. Customized transliterator for Greek uppercasing
      
      9. Updated data build related patches. The windows data build patch has yet
         to be updated.
      
      10. The updated ICU data file/assembly source files are not included in this
          CL. They'll be updated in a separate CL.
          With all the size reduction changes applied, the data size went down
          from > 23MB to 12.4MB. However, it's still 2.5MB larger than ICU 4.6.1
          data. The locale data trimming will bring it down further.
      
      11. Update README.chromium accordingly. The only exceptions are
      item #5 and the android entry in item #3 (breakiterator. see #1 above)
      
      
      
      BUG=259715,76328
      TEST=Following the procedure outlined in README.chromium, one can build
      the icu data file.
      
      R=jsbell@chromium.org, mark@chromium.org
      
      Review URL: https://codereview.chromium.org/224943002
      
      git-svn-id: http://src.chromium.org/svn/trunk/deps/third_party/icu52@262192 4ff67af0-8c30-449e-8e8b-ad334ec8d88c
      4dfa619c