    Update ICU to 64.1 + Chromium patches · 69c72a6d
    Frank Tang authored
    What's new in ICU 64.1:
      - Unicode 12: 554 new characters, including 4 new scripts and 61 new
        emoji characters.
      - CLDR 35 locale data
        http://blog.unicode.org/2019/03/unicode-cldr-version-35-languagelocale.html
      - ICU 64 now uses "rearguard" TZ data. (Recent versions have used
        "vanguard" data with certain overrides.) (ICU-20398)
      - ICU data filtering: The ICU4C build accepts an optional filter
        script that specifies a subset of the data to be built, with
        whitelists and blacklists for locales and for resource bundle paths.
        (ICU-10923, design doc)
      - MessageFormat has new pattern syntax for specifying the style of
        a date/time argument via a locale-independent skeleton rather than
        a locale-specific pattern. (ICU-9622)
        * Date/time skeletons use the same "::" prefix as number skeletons.
        * Example MessageFormat pattern string:
          "We close on {closing,date,::MMMMd} at {closing,time,::jm}."
      - Many formatting APIs can now output a new type of result object
        which is-a FormattedValue (Java & C++), or convertible to a
        UFormattedValue (C).
        * These combine the result strings with easy iteration over
          FieldPosition metadata.
      - New C++ class LocaleBuilder for building a Locale from subtags,
        keywords, and extensions. (ICU-20328) Parallel to the existing
        ICU4J ULocale.Builder class.
      - For C++ MeasureUnit instances, there are now additional factory
        methods that return units by value, not by pointer-with-ownership.
        (ICU-20337)
      - Various Out-Of-Memory (OOM) issues have been fixed. (ticket query)
      - See http://site.icu-project.org/download/64 for more details.
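
To illustrate the data filtering item above: the filter script is a JSON document. The sketch below builds a minimal one in Python. The key names (`localeFilter`, `filterType`, `whitelist`) are written from memory of the ICU 64-era data build tool documentation and should be checked against the design doc before use; the locale list is purely hypothetical.

```python
import json

# Hypothetical subset of locales to keep; not taken from this commit.
wanted_locales = ["en", "de", "zh"]

# Assumed filter layout (verify against the ICU data build tool docs):
# keep only the listed languages, matching on the language subtag.
data_filter = {
    "localeFilter": {
        "filterType": "language",
        "whitelist": wanted_locales,
    }
}

filter_json = json.dumps(data_filter, indent=2)
print(filter_json)
```

The resulting text would be saved to a file and handed to the ICU4C data build; how the file is passed in is left out here because it depends on the build setup.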
    
    The update steps are recorded at:
      https://chromium.googlesource.com/chromium/deps/icu/+log/20690c6..6d422ff
    
      - Update update.sh to point to ICU's new repo location
      - Import the pristine copy of ICU 64.1 and update BUILD
        files with update.sh
      - Update and apply locale data patches
    
        1. patches/locale_google.patch:
          * Google's internal ICU locale changes
          * Simpler region names for Hong Kong and Macau in all locales
          * Currency signs in ru and uk locales (do not include 'tr' locale changes)
          * AM/PM, midnight, noon formatting for a few Indian locales
          * Timezone name changes in Korean and Chinese locales
          * Default digits for the Arabic locale are European digits.
          - patches/locale1.patch: Minor fixes for Korean
        2. Breakiterator patches
          - patches/wordbrk.patch for word.txt
            a. Move full stops (U+002E, U+FF0E) from MidNumLet to MidNum so that
               FQDN labels can be split at '.'
            b. Move fullwidth digits (U+FF10 - U+FF19) from Ideographic to Numeric.
               See http://unicode.org/cldr/trac/ticket/6555
          - patches/khmer-dictbe.patch
            Adjust parameters to use a smaller Khmer dictionary (khmerdict.txt).
            https://unicode-org.atlassian.net/browse/ICU-9451
          - Add several common Chinese words that were dropped previously to
            source/data/cjdict/brkitr/cjdict.txt
            patch: patches/cjdict.patch
            upstream bug: https://unicode-org.atlassian.net/browse/ICU-10888
        3. Build-related changes
          - patches/configure.patch:
            * Remove a section of configure that will cause breakage while
              running runConfigureICU.
          - patches/wpo.patch (only needed when icudata dll is used).
            upstream bugs : https://unicode-org.atlassian.net/browse/ICU-8043
                            https://unicode-org.atlassian.net/browse/ICU-5701
          - patches/data_symb.patch :
              Put ICU_DATA_ENTRY_POINT(icudtXX_dat) in common when we use
              the icu data file or icudt.dll
          - patches/staticmutex.patch :
              Change the static UMutex code to avoid static_initializers error.
              upstream bug: https://unicode-org.atlassian.net/browse/ICU-20520
          - patches/buildtool.patch :
              Fix the build tool, which omitted the res_index.res and */res_index.res files
              upstream bug: https://unicode-org.atlassian.net/browse/ICU-20529
              upstream PR: https://github.com/unicode-org/icu/pull/571/
        4. Double conversion library build failure
          - patches/double_conversion.patch
          - upstream bugs:
            https://unicode-org.atlassian.net/browse/ICU-13750
            https://github.com/google/double-conversion/issues/66
        5. ISO-2022-JP encoding (fromUnicode) change per WHATWG encoding spec.
          - patches/iso2022jp.patch
          - upstream bug:
            https://unicode-org.atlassian.net/browse/ICU-20251
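
The effect of the wordbrk.patch change in item 2a can be sketched with a toy word segmenter. This is not ICU's rule engine: it is a heavily simplified model of the UAX #29 rule pairs WB6/WB7 (letters joined across MidLetter or MidNumLet) and WB11/WB12 (digits joined across MidNum or MidNumLet), and all function names are hypothetical. It only shows why classifying '.' as MidNum lets FQDN labels split at '.' while decimal numbers stay whole.

```python
def char_class(ch, dot_class):
    """Toy classification; dot_class is 'MidNumLet' (stock) or 'MidNum' (patched)."""
    if ch in ".\uFF0E":
        return dot_class
    if ch.isdigit():
        return "Numeric"
    if ch.isalpha():
        return "ALetter"
    return "Other"


def no_break(classes, i):
    """Return True if there is no word boundary between positions i-1 and i."""
    prev, cur = classes[i - 1], classes[i]
    nxt = classes[i + 1] if i + 1 < len(classes) else None
    prev2 = classes[i - 2] if i >= 2 else None
    word = ("ALetter", "Numeric")
    mid_for_digits = ("MidNum", "MidNumLet")
    # Letters and digits stick to each other.
    if prev in word and cur in word:
        return True
    # WB6/WB7 (simplified): letter, MidNumLet, letter stays together.
    if prev == "ALetter" and cur == "MidNumLet" and nxt == "ALetter":
        return True
    if prev2 == "ALetter" and prev == "MidNumLet" and cur == "ALetter":
        return True
    # WB11/WB12 (simplified): digit, MidNum or MidNumLet, digit stays together.
    if prev == "Numeric" and cur in mid_for_digits and nxt == "Numeric":
        return True
    if prev2 == "Numeric" and prev in mid_for_digits and cur == "Numeric":
        return True
    return False


def split_words(text, dot_class):
    """Split text at every boundary allowed by the toy rules above."""
    if not text:
        return []
    classes = [char_class(c, dot_class) for c in text]
    pieces, start = [], 0
    for i in range(1, len(text)):
        if not no_break(classes, i):
            pieces.append(text[start:i])
            start = i
    pieces.append(text[start:])
    return pieces


stock = split_words("www.example.com", "MidNumLet")
patched = split_words("www.example.com", "MidNum")
number = split_words("3.14", "MidNum")
print(stock, patched, number)
```

Under these toy rules the stock classification keeps the host name as one segment, while the patched classification splits it into labels and dots and still keeps "3.14" in one piece.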
    
     - ICU data files are rebuilt
       Up to 67 KB increase. Since we also save 43 KB in
       https://chromium-review.googlesource.com/c/v8/v8/+/1478710 ,
       the net increase is only 24 KB.
    
    ** ICU Data Size Change **
    Data Size   ICU63   ICU64-1    DIFF
    chromeos  10326064 10378624   52560
      common  10326064 10394816   68752
        cast   5126144  5101616  -24528
     android   6355520  6406256   50736
         ios   6315248  6372016   56768
     flutter    880928   894752   13824
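
The DIFF column and the net figure quoted above can be re-derived with a short script. This is only a sanity check on the numbers in the table; it assumes the sizes are in bytes and that the quoted "67 KB" refers to the largest increase (the common row) expressed in units of 1024 bytes.

```python
# (ICU63 size, ICU64-1 size) per data file, copied from the table above.
sizes = {
    "chromeos": (10326064, 10378624),
    "common": (10326064, 10394816),
    "cast": (5126144, 5101616),
    "android": (6355520, 6406256),
    "ios": (6315248, 6372016),
    "flutter": (880928, 894752),
}

# Recompute the DIFF column.
diffs = {name: new - old for name, (old, new) in sizes.items()}
for name, diff in diffs.items():
    print(f"{name:>9} {diff:+9d} bytes ({diff / 1024:+.1f} KiB)")

# Largest increase, then subtract the 43 KB saved by the V8 change.
largest_kib = max(diffs.values()) / 1024
net_kib = largest_kib - 43
print(f"largest increase: {largest_kib:.1f} KiB, net: {net_kib:.1f} KiB")
```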
    
    Created by:
    git rev-list --reverse 20690c62..6d422ffa | \
      xargs git cherry-pick --strategy=recursive -X theirs
    
    Bug: chromium:943348
    Change-Id: Ia7f86abfa8625dd24aae2f71456abd679fda3dae
    Reviewed-on: https://chromium-review.googlesource.com/c/chromium/deps/icu/+/1552155
    
    
    Reviewed-by: Jungshik Shin <[email protected]>