Skip to content
Snippets Groups Projects
  1. Mar 19, 2015
    • Jungshik Shin (jungshik at google)'s avatar
      Update CJK converters and their generating scripts · dafa8443
      Jungshik Shin (jungshik at google) authored
      1. Update ucmlocal.mk and convertrs.txt to refer to euc-kr-html.ucm
      instead of windows-949.ucm
      
      2. Tighten up the valid code range for the following converters:
      
         EUC-KR, Shift_JIS, Big5
      
      This is to add back an ASCII range byte to the stream per
      the encoding spec when they're either illegal as a 'trail byte' or
      there's no assigned code point for a "lead + trail" sequence.
      For instance, with this change, '0xF3 0x41' in EUC-KR is converted to
      'U+FFFD U+0041' instead of 'U+FFFD'.
      
      This change requires adding 2 ~ 8 new states to the conversion
      table of each converter mentioned above leading to 6.5kB net increase
      in the final data size.
      
      3. Tighten the trail byte range for 2-byte sequences starting with 0x8E
      from [A1,E2] to [A1,DF] in EUC-JP and update the corresponding generating
      script.
      
      4. Change the substitution characters for EUC-JP and Shift_JIS to
      match other converters. i.e. make them produce U+FFFD when encountering
      an invalid input. Before this chaange, they emitted U+001A.
      
      5. Enable 'U_CHARSET_IS_UTF8' configuration flag.
      Chromium/Blink does not rely on ICU for the code conversion between
      the 'system native encoding' (if it's one of legacy encodings)
      and Unicode. With this configuration, we can cut down the code size
      a bit.
      
      6. Update the icudtl.dat (all platforms) and assembly files (mac,linux)
         and the icudata dll (windows)
      
      See https://codereview.chromium.org/1026453002 for a new blink test
      added ( fast/encoding/char-decoding-invalid-trail.html )
      
      BUG=450312,430823
      TEST=Blink: fast/encoding/char-decoding-{truncated,invalid-trail}.html
      TEST=base_unittests --gtest_filter=*Conv*, browser_tests --gtest_filter=*ncoding*
      R=jsbell@chromium.org, mark@chromium.org
      
      Review URL: https://codereview.chromium.org/984233002
      dafa8443
  2. Jan 21, 2015
Loading