java - Unknown bytes are returned by the getBytes() method -


    import java.io.UnsupportedEncodingException;
    import java.util.Arrays;

    public class Main {
        public static void main(String[] args) {
            try {
                String s = "s";
                // Encode the same one-character string with three charsets
                System.out.println(Arrays.toString(s.getBytes("UTF-8")));
                System.out.println(Arrays.toString(s.getBytes("UTF-16")));
                System.out.println(Arrays.toString(s.getBytes("UTF-32")));
            } catch (UnsupportedEncodingException e) {
                e.printStackTrace();
            }
        }
    }

Console:

    [115]
    [-2, -1, 0, 115]
    [0, 0, 0, 115]

What is this? Where do the leading [-2, -1] bytes in the UTF-16 output come from?

I also noticed the same thing with a different character:

    String s = new String(new char[]{'\u1251'});
    System.out.println(Arrays.toString(s.getBytes("UTF-8")));
    System.out.println(Arrays.toString(s.getBytes("UTF-16")));
    System.out.println(Arrays.toString(s.getBytes("UTF-32")));

Console:

    [-31, -119, -111]
    [-2, -1, 18, 81]
    [0, 0, 18, 81]

The -2, -1 pair is the byte order mark (BOM, U+FEFF). It indicates that the text that follows is encoded in UTF-16. Java bytes are signed, so 0xFE prints as -2 and 0xFF prints as -1.
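To make the BOM easier to recognize, the bytes can be printed as unsigned hex values instead of signed decimals. This is a minimal sketch; the class name `SignedBytesDemo` and the helper `toHex` are made up for illustration and are not part of the original question.

```java
import java.nio.charset.StandardCharsets;

class SignedBytesDemo {
    // Renders each byte as two unsigned hex digits, e.g. -2 becomes "FE".
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Masking with 0xFF converts the signed byte to its unsigned value (0..255).
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] utf16 = "s".getBytes(StandardCharsets.UTF_16);
        System.out.println(toHex(utf16)); // prints: FE FF 00 73
    }
}
```

Using `StandardCharsets` instead of a charset name string also avoids the checked `UnsupportedEncodingException`.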

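You are getting it because UTF-16 comes in two byte orders, UTF-16LE and UTF-16BE, which store the 2 bytes of each 16-bit value in little-endian or big-endian order, so the encoder writes a BOM to say which one it used. UTF-8 has no byte-order variants. UTF-32 also exists in little-endian and big-endian forms, but as the output above shows, Java's "UTF-32" encoder writes big-endian bytes without a BOM.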

Since the values arrive as 0xFE 0xFF, in that order, the encoding is UTF-16BE.
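If the BOM is not wanted, one option is to name the byte order explicitly. The sketch below assumes the standard `StandardCharsets.UTF_16BE` and `StandardCharsets.UTF_16LE` constants; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf16ByteOrderDemo {
    // Big-endian UTF-16: high byte first, no BOM written.
    static byte[] encodeBigEndian(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE);
    }

    // Little-endian UTF-16: low byte first, no BOM written.
    static byte[] encodeLittleEndian(String s) {
        return s.getBytes(StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        String s = "s";
        System.out.println(Arrays.toString(encodeBigEndian(s)));    // [0, 115]
        System.out.println(Arrays.toString(encodeLittleEndian(s))); // [115, 0]
        // Plain UTF-16 prepends the BOM, as in the question.
        System.out.println(Arrays.toString(s.getBytes(StandardCharsets.UTF_16))); // [-2, -1, 0, 115]
    }
}
```

When decoding, the same charset must be used on the other side, because without a BOM the reader has no way to detect the byte order from the data.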

