  1. Oct 28, 2016
    • ICU update to 58 part 2 · e0d9b90c
      Jungshik Shin authored
      Follow-up to https://chromium.googlesource.com/chromium/deps/icu/+/5feb9ad5
      (due to a rietveld issue, part 1 was manually pushed).
      
      Update ICU from the 56.1 release to the 58.1 release, part 2.
      
      Listed below is a tiny subset of what's new in 58.1:
      
        1. Unicode 9.0 from Unicode 8.0
          - Updated character properties including Emoji data up to 4.0beta.
          - Updated grapheme/word/line breaking rules for Emoji sequences and others.
      
        2. CLDR 30.0.2 from CLDR 28
          - Numerous locale data updates/improvements
      
        3. Spoofing API changes
        4. Greek uppercasing support as a part of regular case-mapping API.
        5. Line breaking rule file format optimization. This change enables me
           to add the CJ loose line breaking rules back (previously, they were
           dropped to save space) so that Blink can use them for CJ.
      
      See http://site.icu-project.org/download/58 for more details on ICU 58.1
      and http://site.icu-project.org/download/57 for more details on ICU 57.1
      
      For CLDR 30, see http://cldr.unicode.org/index/downloads/cldr-30 .
      
      The size impact:
         Non-Android: 10,127,200 => 10,128,624 (delta = 1,424 / 0.014%)
         Android: 6,563,152 => 6,571,936 (delta = 8,784 / 0.13%)
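
The delta and percentage figures above follow directly from the before/after byte counts. A minimal sketch of that arithmetic; the helper name `size_delta` is an assumption made for illustration and is not part of the ICU build scripts:

```python
from typing import Tuple


def size_delta(before: int, after: int) -> Tuple[int, float]:
    """Return (absolute growth in bytes, growth relative to `before` in percent)."""
    delta = after - before
    return delta, 100.0 * delta / before


# Byte counts copied from the commit message above.
sizes = {
    "Non-Android": (10_127_200, 10_128_624),
    "Android": (6_563_152, 6_571_936),
}

for platform, (before, after) in sizes.items():
    delta, pct = size_delta(before, after)
    # Two significant digits, matching the precision used in the message.
    print(f"{platform}: {before:,} => {after:,} (delta = {delta:,} / {pct:.2g}%)")
```

The percentage is taken relative to the pre-update size, which is why the Android build shows the larger relative growth even though both deltas are only a few kilobytes.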
      
      Below is the list of changes made on top of the upstream ICU 58.1
      in reverse order. Most of these changes were made on the 58staging branch
      to run trybots and cherry-picked back for this CL. See
         https://chromium.googlesource.com/chromium/deps/icu/+/log/chromium/58staging
         https://codereview.chromium.org/2447513002/ : cr+blink update cl with
             58staging branch head.
      
      * Fix a build on Win without std::string (v8)
      * Add ms932 alias to Shift_JIS
      
      * Apply Google-specific locale data patches
      
      * Fix a bug in scriptset
      
      * Update windows-1255 mapping
      
      * Disable C4333 warning by MSVC (harmless)
      
      * Apply and update utf32.patch and README.chromium
      
      * Update and apply vscomp.patch
        stringpiece patch removed. VS2015 seems to be fine with a redefinition.
      
      * Update pre-built ICU data files
         Update *local.mk with a new copyright line
      
      * Apply more patches
        The following patches were applied and updated: data_symb, vscomp, wpo
      
        The unnecessary part was dropped from vscomp
      
      * Update BUILD.gn and icu.gyp* files
      
      * Update android/brkitr.patch
      
      * Update and apply more patches
      
      * Update and apply cjdict.patch
         Apply data.build.patch
      
      * Delete obsolete patches: cmemory,regex
      
      * Update README.chromium and apply brkitr patches
      
        - Update README.chromium
        - Remove obsolete patches
        - Update linebrk.patch and apply it: add back line_loose_cj
      
      * Update wordbrk.patch and apply it
      
      * Update and apply khmer-dictbe.patch
      
      * Update data trimming
      
        - android/patch_locale.sh
        - scripts/trim_data.sh
           ExemplarCh* removed
           charac*Label removed
           relative/relativeTime removed for daysOfWeek and quarter
      
      * Update the following patches
      
        android/brkitr.patch
        patches/linebrk.patch
        patches/data.build.patch
      
      * Update cjdict.patch and linebrk.patch
      
      BUG=637001
      TEST=Layout tests, all unittests, browser tests, ui tests.
      R=jsbell@chromium.org, mark@chromium.org
      
      Review URL: https://codereview.chromium.org/2442923002 .
  2. Jan 23, 2015
    • ICU update to 54.1 - step 5 · 4a0ebf11
      Jungshik Shin (jungshik at google) authored
      1. Apply Chrome's locale data change on top of Google's locale data
         changes
      
      2. Breakiterator changes
        - Apply brkitr.patch with update to ICU 54.1; line/word.txt
        - Check in a more compact Khmer dictionary along with
          a parameter adjustment in dictbe.cpp
        - Add a few common words to the CJ dictionary
        - Update brklocal.mk (our customized build file) to ICU 54
        - Update android/brkitr.patch and data/brkitr/word_ja.txt for Android
      
      Data size checkpoint:
       * Non-Android: 19,575,216 bytes. ~500kB reduction relative
         to the previous step comes mainly from the compact Khmer dictionary.
       * Android: 17,601,520 bytes. 2MB difference comes from removing cjdict.
      
      BUG=428145
      TEST=net_unittests --gtest_filter="*IDN*"
      TEST=layout tests
      R=mark@chromium.org
      
      Review URL: https://codereview.chromium.org/858363003
  3. Apr 07, 2014
    • ICU 52 local changes part1 · 4dfa619c
      jshin@chromium.org authored
      1. Remove all the obsolete patches. There are lots of them because most of
      the local patches to ICU 4.6.1 have either been accepted or become obsolete.
      The largest local patch removed is our patches for CJ word breaker because
      they were upstreamed.
      
      Android didn't apply the CJK word breaker patch to ICU 4.6 to reduce the
      data size. In a follow-up CL, we'll have an Android-specific change for this issue.
      
      Besides, we don't include patches for files we locally add because the
      patches for new files are redundant. Instead, they're mentioned in
      README.chromium.
      
      2. We don't need platform-specific headers any more (pmac, plinux, pwin, etc).
      They're combined into a single file and all platforms we care about are
      well-supported except for one issue on Android/QNX. putil.patch takes care
      of it.
      
      
      3. Breakiterator patches for a few remaining issues. We also use
      a much smaller Khmer dictionary (upstream fix pending).
      
      4. Converter
        - Added two WHATWG-encoding-standard-compliant mapping tables
          (derived directly from the spec with a script) for EUC-JP
          and CP866
        - Disabled various non-HTML5 encodings such as SCSU, BOCU, UTF-7, CESU-8
          saving ~30kB in the code size. Even though we link statically, they're
          still pulled in as a part of uconv.
        - Disabled ISO-2022-JP-[1-4] in ucnv2022.c
        - Removed a number of encoding alias entries in the alias table
          leading to ~40kB data size reduction.
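
The script that derived the mapping tables from the spec is not part of this commit message. As a rough sketch of the idea only, the code below parses lines in the WHATWG index file layout (pointer, tab, `0x`-prefixed code point, tab, description, with `#` comment lines) and emits ICU `.ucm`-style mapping lines for a single-byte encoding such as CP866 (IBM866). The function names and the inline sample are assumptions made for illustration. A real `.ucm` file also needs a header and the ASCII range, and EUC-JP is multi-byte, so it is not covered here.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_index(lines: Iterable[str]) -> Iterator[Tuple[int, int]]:
    """Yield (pointer, code point) pairs from WHATWG index-file lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        pointer, code_point = line.split("\t")[:2]
        yield int(pointer), int(code_point, 16)


def single_byte_ucm_lines(lines: Iterable[str]) -> List[str]:
    """Map each pointer to byte 0x80 + pointer and format it as a .ucm entry."""
    out = []
    for pointer, cp in parse_index(lines):
        byte = 0x80 + pointer  # single-byte indexes cover bytes 0x80..0xFF
        out.append(f"<U{cp:04X}> \\x{byte:02X} |0")  # |0 marks a roundtrip mapping
    return out


# A three-line excerpt in the index-file layout (hypothetical sample input).
SAMPLE = (
    "# Sample excerpt\n"
    "\n"
    "  0\t0x0410\t\u0410 (CYRILLIC CAPITAL LETTER A)\n"
    "  1\t0x0411\t\u0411 (CYRILLIC CAPITAL LETTER BE)\n"
    "127\t0x00A0\t\u00a0 (NO-BREAK SPACE)\n"
)

for entry in single_byte_ucm_lines(SAMPLE.splitlines()):
    print(entry)
```

Generating the table mechanically from the spec's index avoids hand-editing the mappings, which is presumably why the tables were derived with a script.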
      
      5. Locale data: not updated yet. We need to trim it substantially.
      
      6. Unihan collation removal is now done with a script (scripts/remove_unihan.sh)
      
      7. Updated timezone data to the latest (2014b) as of today.
      
      8. Customized transliterator for Greek uppercasing
      
      9. Updated data build related patches. The windows data build patch has yet
         to be updated.
      
      10. The updated ICU data file/assembly source files are not included in this
          CL. They'll be updated in a separate CL.
          With all the size reduction changes applied, the data size went down
          from > 23MB to 12.4MB. However, it's still 2.5MB larger than ICU 4.6.1
          data. The locale data trimming will bring it down further.
      
      11. Update README.chromium accordingly. The only exceptions are
      item #5 and the Android entry in item #3 (breakiterator; see #1 above).
      
      
      
      BUG=259715,76328
      TEST=Following the procedure outlined in README.chromium, one can build
      the icu data file.
      
      R=jsbell@chromium.org, mark@chromium.org
      
      Review URL: https://codereview.chromium.org/224943002
      
      git-svn-id: http://src.chromium.org/svn/trunk/deps/third_party/icu52@262192 4ff67af0-8c30-449e-8e8b-ad334ec8d88c
      4dfa619c