  1. Mar 19, 2015
    • Update CJK converters and their generating scripts · dafa8443
      Jungshik Shin (jungshik at google) authored
      1. Update ucmlocal.mk and convertrs.txt to refer to euc-kr-html.ucm
      instead of windows-949.ucm
      
      2. Tighten up the valid code range for the following converters:
      
         EUC-KR, Shift_JIS, Big5
      
      This puts an ASCII-range byte back into the stream, per the Encoding
      spec, when that byte is either illegal as a 'trail byte' or forms a
      "lead + trail" sequence that has no assigned code point.
      For instance, with this change, '0xF3 0x41' in EUC-KR is converted to
      'U+FFFD U+0041' instead of 'U+FFFD'.
      
      This change requires adding 2 ~ 8 new states to the conversion
      table of each converter mentioned above, leading to a 6.5kB net increase
      in the final data size.
      
      3. Tighten the trail byte range for 2-byte sequences starting with 0x8E
      from [A1,E2] to [A1,DF] in EUC-JP and update the corresponding generating
      script.
      
      4. Change the substitution characters for EUC-JP and Shift_JIS to
      match other converters, i.e. make them produce U+FFFD when encountering
      invalid input. Before this change, they emitted U+001A.
      
      5. Enable 'U_CHARSET_IS_UTF8' configuration flag.
      Chromium/Blink does not rely on ICU for the code conversion between
      the 'system native encoding' (if it's one of legacy encodings)
      and Unicode. With this configuration, we can cut down the code size
      a bit.
      
      6. Update the icudtl.dat (all platforms) and assembly files (mac,linux)
         and the icudata dll (windows)
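The tightened 0x8E trail range in item 3 can be illustrated with a small sketch. It assumes the mapping from the WHATWG Encoding spec, where a 0x8E lead byte followed by a trail byte in [A1,DF] decodes to a half-width katakana code point starting at U+FF61. The function name is hypothetical and this is not the ICU state table itself.

```python
def decode_8e_sequence(lead, trail):
    """Decode an EUC-JP 2-byte sequence that starts with 0x8E.

    Returns the decoded character, or None when the trail byte falls
    outside the tightened range [0xA1, 0xDF].
    (Hypothetical helper for illustration, not ICU code.)
    """
    if lead != 0x8E:
        return None
    if 0xA1 <= trail <= 0xDF:
        # Half-width katakana block: 0xA1 maps to U+FF61, 0xDF to U+FF9F.
        return chr(0xFF61 + (trail - 0xA1))
    # 0xE0..0xE2 were accepted by the old [A1,E2] range; now rejected.
    return None
```

With the old [A1,E2] range, trail bytes 0xE0 through 0xE2 would have been treated as part of a valid sequence; in this sketch they are rejected.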
      
      See https://codereview.chromium.org/1026453002 for a new blink test
      added ( fast/encoding/char-decoding-invalid-trail.html )
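The invalid-trail behavior described in item 2 can be sketched as a toy decoder. This is a minimal illustration, not the ICU converter: `TOY_INDEX` is a hypothetical one-entry stand-in for the real EUC-KR mapping table, and the function name is made up for this example.

```python
REPLACEMENT = "\ufffd"

# Hypothetical one-entry stand-in for the real EUC-KR index.
TOY_INDEX = {(0xB0, 0xA1): "\uac00"}

def decode_two_byte(data, index=TOY_INDEX):
    """Decode bytes, putting an ASCII trail byte back after an error."""
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            # Plain ASCII passes through unchanged.
            out.append(chr(lead))
            i += 1
            continue
        if not (0x81 <= lead <= 0xFE):
            # Not a valid lead byte at all.
            out.append(REPLACEMENT)
            i += 1
            continue
        if i + 1 >= len(data):
            # Truncated sequence at end of input.
            out.append(REPLACEMENT)
            break
        trail = data[i + 1]
        ch = index.get((lead, trail))
        if ch is not None:
            out.append(ch)
            i += 2
            continue
        out.append(REPLACEMENT)
        # Key step: an ASCII trail byte is not swallowed by the error;
        # it stays in the stream and is decoded on the next iteration.
        i += 1 if trail < 0x80 else 2
    return "".join(out)
```

Under these assumptions, `decode_two_byte(b"\xf3\x41")` yields U+FFFD followed by U+0041, matching the example in the commit message.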
      
      BUG=450312,430823
      TEST=Blink: fast/encoding/char-decoding-{truncated,invalid-trail}.html
      TEST=base_unittests --gtest_filter=*Conv*, browser_tests --gtest_filter=*ncoding*
      R=jsbell@chromium.org, mark@chromium.org
      
      Review URL: https://codereview.chromium.org/984233002
  2. Feb 17, 2015
  3. Feb 14, 2015
    • Move stubdata.c from icudata to icuuc · d158fece
      Jungshik Shin (jungshik at google) authored
      This fixes a linker error when linking icuuc.dll: the ICU_DATA_ENTRY (icudt54_dat) symbol is not found in a clean Windows build from scratch (component=shared_library).
      
      Move stubdata.c from the icudata target to the icuuc target. Also, make
      U_DATA_API (used for U_ICU_DATA_ENTRYPOINT in common/udata.cpp) expand to
      U_EXPORT instead of U_IMPORT when icu_use_data_file_flag = 1 or
      on Windows. On Windows, using icudt.dll (i.e. icu_use_data_file_flag=0) also requires this change.
      
      BUG=428145
      TEST=All trybots can build a target that requires ICU.
      R=mark@chromium.org, scottmg@chromium.org
      
      Review URL: https://codereview.chromium.org/926113004
  4. Jan 08, 2015
  5. Oct 30, 2014
  6. Sep 25, 2014
    • Turn on UCONFIG_NO_NON_HTML5_CONVERTER · 52e8245c
      jshin@chromium.org authored
      UCONFIG_NO_NON_HTML5_CONVERTER was added earlier to our copy of ICU, but
      it was never set to 1.  It's my oversight.
      
      1. Turn UCON..CONVERTER on in icu.gyp to drop all the encodings not
         required by the Encoding spec. Dropped encodings include
         UTF-7, BOCU, SCSU, CESU, ISCII, ISO-2022-{KR, CN*}, HZ-GB, and the
         ISO-2022-JP variants other than the original.
      
      2. A lot more sections of the ICU converter code are excluded when
         it's set to 1 including the code for LMB (Lotus Multibyte) encodings and
         X11 compound text encoding (icu common).
      
      3. Character encoding detection for the excluded encodings is also disabled.
         (icu i18n)
      
      4. ISO-2022-{KR, CN*} and HZ-GB can be dropped now because Blink treats them
         as the replacement encoding. The corresponding alias entries in
         convertrs.txt are also removed.
      
      5. ibm-874 was removed. We used to need it before Blink started, but not any
         more. We only need windows-874.
      
      6. A mistake in convertrs.txt was corrected: Big5-HKSCS was pointing to
         an old mapping table.
      
      7. Per ICU upstream's suggestion, use '-html' suffix instead of '-html5'
         for the encoding tables derived from the WHATWG's encoding spec (ibm866,
         shift_jis and euc-jp).
      
      The static 64-bit release build of Chrome on Linux went down from
      141,596,616 to 141,491,968 bytes (~ 100 kB reduction). Besides, the icu data
      size got smaller by ~ 19 kB ( 10,490,576 to 10,471,008 bytes).
      
      See http://bugs.icu-project.org/trac/ticket/11296 for an upstream bug
      I've filed on the issue.
      
      
      BUG=76328
      TEST=browser_tests --gtest_filter="*ncoding*"
      TEST=net_unittest --gtest_filter="*ilenameUtil*"
      TEST=base_unittests --gtest_filter="*Conv*"
      TEST=Blink: fast/encoding/*
      TEST=With shared library build, the following has no match.
        nm libicuuc.so | egrep  -i '(bocu|scsu|utf7|2022kr|2022cn|iscii)'
        nm libicui18n.so | egrep  -i '(2022kr|2022cn|ibm42)'
      TEST=With static library build, the following has no match.
        nm chrome | egrep -i '(bocu|scsu|utf7|2022kr|2022cn|iscii|ibm42)'
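The "no match" checks above can also be scripted. The sketch below filters text that has already been captured from `nm`, using the same case-insensitive pattern as the static-build check. The function name and the sample symbol lines are hypothetical, made up purely for illustration.

```python
import re

# Same alternatives as the egrep pattern in the TEST lines above.
FORBIDDEN = re.compile(
    r"(bocu|scsu|utf7|2022kr|2022cn|iscii|ibm42)", re.IGNORECASE
)

def find_forbidden_symbols(nm_output, pattern=FORBIDDEN):
    """Return the lines of `nm` output that mention a dropped converter."""
    return [line for line in nm_output.splitlines() if pattern.search(line)]

# Hypothetical sample lines, not taken from a real build.
clean_sample = "0000000000001000 T example_open\n0000000000001040 T example_close\n"
dirty_sample = clean_sample + "0000000000002000 t Example_SCSU_Open\n"
```

An empty result corresponds to the "has no match" expectation: `find_forbidden_symbols(clean_sample)` returns an empty list, while the second sample reports the one offending line.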
      
      R=jsbell@chromium.org, mark@chromium.org
      
      Review URL: https://codereview.chromium.org/587833004
      
      git-svn-id: http://src.chromium.org/svn/trunk/deps/third_party/icu52@292131 4ff67af0-8c30-449e-8e8b-ad334ec8d88c
  7. Sep 03, 2014
    • Remove invalid link_settings from target condition. · 89831135
      torne@chromium.org authored
      ICU fails to gyp when run with GYP_DEFINES="android_webview_build=1
      use_system_icu=0 use_system_stlport=1" which is a combination of
      settings we're trying to bring up temporarily as we migrate away from
      system libraries. It fails because it's not permitted to specify
      link_settings in a target_condition as the processing is too late.
      
      Remove the invalid link_settings, since we can't do this outside
      target_conditions as it's not valid to use -lgabi++ on the host build of
      ICU. The link dependency on gabi++ will have to be satisfied manually in
      the main libwebviewchromium target instead for this configuration.
      
      BUG=409851
      R=mkosiba@chromium.org
      TBR=jshin@chromium.org
      
      Review URL: https://codereview.chromium.org/527193003
      
      git-svn-id: http://src.chromium.org/svn/trunk/deps/third_party/icu52@291781 4ff67af0-8c30-449e-8e8b-ad334ec8d88c
  8. Aug 28, 2014
  9. Aug 21, 2014
  10. Aug 01, 2014
  11. Jul 24, 2014
  12. Jun 18, 2014
  13. Jun 10, 2014
  14. Apr 30, 2014
  15. Apr 18, 2014
    • Trim ICU data to reduce the download size/memory usage · 4e493261
      jshin@chromium.org authored
      Add a shell script to trim the ICU data further : trim_data.sh along with
      locale list files.  The script does the following:
      
      1. Remove the display names of languages NOT listed in Chrome's Accept-Language
         list. (800kB)
      2. Minimize the locale data for locales listed in the A-L list that are
         not a UI locale in Chrome. For those locales, exemplar characters,
         the display name in the native language and layout direction are included.
         (640kB)
      3. Filter the region data to drop numeric region display names other than 419
         (Latin-America). (50kB)
      4. Filter the currency data (display name and plurals) for historic currencies.
         (200kB)
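As a quick consistency check, the per-step estimates above can be summed and compared with the overall size change reported in the TEST line of this commit (12.4MB without the change, about 10.7MB with it). The figures are the commit's own rounded estimates. Treating 1 MB as 1000 kB is an assumption made here, and the dictionary keys are descriptive labels, not names from trim_data.sh.

```python
# Per-step savings in kB, as estimated in the list above.
savings_kb = {
    "language display names": 800,
    "non-UI locale data": 640,
    "numeric region names": 50,
    "historic currency data": 200,
}

total_kb = sum(savings_kb.values())

# Overall sizes quoted in the TEST line (uncompressed icudtl.dat).
before_mb, after_mb = 12.4, 10.7
reported_kb = round((before_mb - after_mb) * 1000)  # assumes 1 MB = 1000 kB

print(f"sum of steps: {total_kb} kB, reported reduction: {reported_kb} kB")
```

The two numbers agree to within rounding, which suggests the four steps account for essentially all of the reduction.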
      
      This CL also checks in icudtl.dat (source/data/in) and
      icudt_dat.S (mac and linux). Note that I dropped '52' (the version number)
      in the assembly source file name and icu.gyp was adjusted accordingly.
      
      With all these changes, icudtl.dat is ~ 800kB larger than that in ICU 4.6.
      The 7z compression (as used by the installer) makes the size difference
      go down to ~ 130kB.
      
      BUG=132145
      TEST=The icudtl.dat (uncompressed) is about 10.7MB instead of 12.4MB without this CL.
      R=mark@chromium.org
      
      Review URL: https://codereview.chromium.org/239543018
      
      git-svn-id: http://src.chromium.org/svn/trunk/deps/third_party/icu52@264811 4ff67af0-8c30-449e-8e8b-ad334ec8d88c
  16. Apr 01, 2014