java - Unknown bytes are returned by the getBytes() method
import java.io.UnsupportedEncodingException;
import java.util.Arrays;

public class Main {
    public static void main(String[] args) {
        try {
            String s = "s";
            // Print the encoded bytes of the same string in three encodings
            System.out.println(Arrays.toString(s.getBytes("UTF-8")));
            System.out.println(Arrays.toString(s.getBytes("UTF-16")));
            System.out.println(Arrays.toString(s.getBytes("UTF-32")));
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }
    }
}
Console:
[115]
[-2, -1, 0, 115]
[0, 0, 0, 115]
What is this? Where do the leading [-2, -1] bytes in the UTF-16 output come from?
I also noticed the same thing with this:
String s = new String(new char[]{'\u1251'});
System.out.println(Arrays.toString(s.getBytes("UTF-8")));
System.out.println(Arrays.toString(s.getBytes("UTF-16")));
System.out.println(Arrays.toString(s.getBytes("UTF-32")));
Console:
[-31, -119, -111]
[-2, -1, 18, 81]
[0, 0, 18, 81]
The -2, -1 is the byte order mark (BOM, U+FEFF), which indicates that the following text is encoded in UTF-16. Java bytes are signed, so 0xFE prints as -2 and 0xFF prints as -1.
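To see the BOM in its usual hex form, the signed bytes can be masked to unsigned values before printing. This is only a minimal sketch: the class name BomBytes and the helper toHex are made up for illustration and are not part of the question's code.

```java
import java.nio.charset.StandardCharsets;

class BomBytes {
    // Hypothetical helper: renders each signed Java byte as two-digit unsigned hex.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // b & 0xFF widens the byte to an int in the range 0..255
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] encoded = "s".getBytes(StandardCharsets.UTF_16);
        System.out.println(toHex(encoded)); // FE FF 00 73
    }
}
```

The first two bytes, FE FF, are the same values that Arrays.toString shows as -2, -1.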
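You get it because, while UTF-8 has only one byte layout, UTF-16 has two: UTF-16LE and UTF-16BE, depending on whether the 2 bytes of each 16-bit value are stored in little-endian or big-endian order. (UTF-32 values can also be stored in either order, but as the output above shows, Java's UTF-32 encoder writes big-endian bytes with no BOM.)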
Since the values come in the order 0xFE 0xFF, the encoding is UTF-16BE. A little-endian BOM would be 0xFF 0xFE instead.
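If the BOM is unwanted, one option is to name the byte order explicitly, since the UTF-16BE and UTF-16LE charsets encode without writing a BOM. The sketch below assumes Java 7 or later for StandardCharsets, which also avoids the checked UnsupportedEncodingException. The class name Utf16Variants and the helper encode are invented for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf16Variants {
    // Hypothetical helper: encodes s with the given charset and formats the bytes.
    static String encode(String s, Charset charset) {
        return Arrays.toString(s.getBytes(charset));
    }

    public static void main(String[] args) {
        String s = "s";
        // Plain UTF-16: big-endian BOM followed by big-endian data
        System.out.println(encode(s, StandardCharsets.UTF_16));   // [-2, -1, 0, 115]
        // Explicit byte order: no BOM is written
        System.out.println(encode(s, StandardCharsets.UTF_16BE)); // [0, 115]
        System.out.println(encode(s, StandardCharsets.UTF_16LE)); // [115, 0]
    }
}
```

Whoever decodes the bytes then has to know the byte order in advance, because nothing in the data announces it.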