java 0x0f_Java - Bytes and Characters

Published: 2024/9/18 | java | 豆豆


1. Converting bytes to a hexadecimal string:

1) Using the Integer.toHexString() method

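public static String bytesToHexString(byte[] src) {
    // Null or empty input yields null (behavior kept from the original).
    if (src == null || src.length == 0) {
        return null;
    }
    StringBuilder hexResult = new StringBuilder();
    for (int i = 0; i < src.length; i++) {
        // Mask with 0xFF so that a negative byte is not sign-extended.
        String hex = Integer.toHexString(src[i] & 0xFF);
        // toHexString drops leading zeros, so pad to two digits.
        if (hex.length() < 2) {
            hexResult.append("0");
        }
        hexResult.append(hex);
    }
    return hexResult.toString();
}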

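Note that src[i] & 0xFF ANDs a byte with 0xFF, and the result is an int. Why is the AND with 0xFF needed? Can't we simply call Integer.toHexString(src[i]) and let the byte be widened to an int? The answer is no.

The reason: Java represents negative integers in two's complement. A byte is 8 bits and an int is 32 bits, so when a byte is widened to an int it is sign-extended, meaning the upper 24 bits are filled with copies of the sign bit.

For example, the byte -1 is 11111111 in two's complement; widened to an int it becomes 11111111 11111111 11111111 11111111, i.e. 0xFFFFFFFF.

The literal 0xFF is an int. In byte & 0xFF the byte is first widened to an int, and the AND then clears the upper 24 bits, so for -1 the result is 0xFF.

As an aside: the two's complement representation of -x is the bitwise inverse of x plus 1, and the idea behind it is that the carry out of the top bit is discarded. For example, 1 + (-1) = 0: in 8 bits, 00000001 + 11111111 = 1 00000000, and dropping the ninth bit leaves 00000000, i.e. 0.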
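To make the sign-extension argument concrete, here is a minimal sketch that compares the two calls for the byte -1. The class and method names (SignExtensionDemo, hexWithoutMask, hexWithMask) are made up for this illustration and are not part of the original code.

```java
final class SignExtensionDemo {
    // Widening without a mask: the sign bit is copied into the upper 24 bits.
    static String hexWithoutMask(byte b) {
        return Integer.toHexString(b);
    }

    // Masking with 0xFF keeps only the low 8 bits of the widened int.
    static String hexWithMask(byte b) {
        return Integer.toHexString(b & 0xFF);
    }

    public static void main(String[] args) {
        byte b = (byte) -1;
        System.out.println(hexWithoutMask(b)); // ffffffff
        System.out.println(hexWithMask(b));    // ff
        // 8-bit addition: 0x01 + 0xFF = 0x100; the cast to byte drops the carry.
        System.out.println((byte) (0x01 + 0xFF)); // 0
    }
}
```

The unmasked call prints eight hex digits for any negative byte, which is why bytesToHexString masks before converting.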

2) Extracting the high four bits and the low four bits of the byte and converting each separately

public static String byteToHexString(byte b) {
    String map = "0123456789ABCDEF";
    return map.charAt((b >> 4) & 0x0F) + "" + map.charAt(b & 0x0F);
}
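The same nibble lookup extends naturally to a whole byte array. The following is a sketch under that assumption; NibbleHexDemo and toHex are hypothetical names, and a char[] table is used instead of String.charAt purely as a design choice to avoid building intermediate strings.

```java
final class NibbleHexDemo {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // Convert each byte to two hex digits: high nibble first, then low nibble.
    static String toHex(byte[] src) {
        char[] out = new char[src.length * 2];
        for (int i = 0; i < src.length; i++) {
            // After >> 4 the sign-extended bits are still present, so mask with 0x0F.
            out[2 * i] = HEX[(src[i] >> 4) & 0x0F];
            out[2 * i + 1] = HEX[src[i] & 0x0F];
        }
        return new String(out);
    }

    public static void main(String[] args) {
        byte[] data = {0x00, 0x0F, (byte) 0xA5, (byte) 0xFF};
        System.out.println(toHex(data)); // 000FA5FF
    }
}
```

Unlike the Integer.toHexString() approach, no zero padding is needed, because every byte always produces exactly two digits.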

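2. Byte streams and character streams

Java represents characters in memory as Unicode (UTF-16 code units); the char type is 16 bits.

InputStream and OutputStream handle byte streams, in which the smallest unit of data is the byte (8 bits).

Reader and Writer handle character streams, and processing a character stream involves converting between character encodings.

A Reader converts the characters in an input stream from some other encoding to Unicode as it reads them into memory.

A Writer converts the Unicode characters in memory to some other encoding as it writes them to an output stream.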
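As a small illustration of that conversion, the sketch below pushes a string through an OutputStreamWriter and reads it back through an InputStreamReader, both with UTF-8 given explicitly. The class and method names (ReaderWriterDemo, encodeUtf8, decodeUtf8) are invented for this example, and in-memory streams stand in for files.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

final class ReaderWriterDemo {
    // Writer side: Unicode chars in memory -> UTF-8 bytes in the stream.
    static byte[] encodeUtf8(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(sink, StandardCharsets.UTF_8)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Reader side: UTF-8 bytes in the stream -> Unicode chars in memory.
    static String decodeUtf8(byte[] data) {
        StringBuilder chars = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            int c;
            while ((c = reader.read()) != -1) {
                chars.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return chars.toString();
    }

    public static void main(String[] args) {
        byte[] utf8 = encodeUtf8("\u4F60\u597D"); // the two characters of "你好"
        System.out.println(utf8.length);               // 6
        System.out.println(decodeUtf8(utf8).length()); // 2
    }
}
```

Two chars become six bytes on the way out and two chars again on the way in, which is the encoding conversion the Reader and Writer classes perform.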

import java.io.*;

public class Test {

    // Print every byte of buff as two hex digits.
    private static void readBuff(byte[] buff) throws IOException {
        ByteArrayInputStream in = new ByteArrayInputStream(buff);
        int data;
        while ((data = in.read()) != -1) {
            System.out.print(byteToHexString((byte) data) + " ");
        }
        System.out.println();
        in.close();
    }

    public static void main(String[] args) throws IOException {
        System.out.println("In memory (Unicode), low byte then high byte:");
        char c = '好';
        byte lowByte = (byte) (c & 0xFF);
        byte highByte = (byte) ((c >> 8) & 0xFF);
        System.out.println(byteToHexString(lowByte) + " " + byteToHexString(highByte));

        String s = "好";
        System.out.println("Platform default encoding:");
        readBuff(s.getBytes());
        System.out.println("GBK encoding:");
        readBuff(s.getBytes("GBK"));
        System.out.println("UTF-8 encoding:");
        readBuff(s.getBytes("UTF-8"));
    }

    private static String byteToHexString(byte b) {
        String map = "0123456789ABCDEF";
        return map.charAt((0xF0 & b) >> 4) + "" + map.charAt(0x0F & b);
    }
}


InputStreamReader demo:

import java.io.*;

public class Test {

    public static void main(String[] args) throws Exception {
        // Exceptions propagate instead of being swallowed by empty catch blocks.
        createFile("text", null);
        readFile("text", null);
        createFile("text_GBK", "GBK");
        readFile("text_GBK", "GBK");
        createFile("text_UTF8", "UTF-8");
        readFile("text_UTF8", "UTF-8");
    }

    private static String byteToHexString(byte b) {
        String map = "0123456789ABCDEF";
        return map.charAt((0xF0 & b) >> 4) + "" + map.charAt(0x0F & b);
    }

    private static void readFile(String fileName, String encoding) throws Exception {
        // Always pass a charset to InputStreamReader. Relying on the default is risky
        // because it can differ between machines and operating systems.
        // A null encoding falls back to GBK, as in the original.
        String charsetName = (encoding == null) ? "GBK" : encoding;
        try (InputStreamReader reader =
                new InputStreamReader(new FileInputStream(new File(fileName)), charsetName)) {
            System.out.println("file encoding:" + reader.getEncoding());
            int data;
            // Print each decoded char as a hex code unit.
            while ((data = reader.read()) != -1) {
                System.out.println(Integer.toHexString(data));
            }
            System.out.println();
        }
    }

    private static void createFile(String fileName, String type) throws Exception {
        String s = "你好";
        // FileOutputStream creates the file if it does not exist yet.
        try (FileOutputStream out = new FileOutputStream(new File(fileName))) {
            if (type == null) {
                // No encoding given: write each char's raw 16 bits, high byte first.
                for (int i = 0; i < s.length(); i++) {
                    char c = s.charAt(i);
                    out.write((byte) ((c >> 8) & 0xFF));
                    out.write((byte) (c & 0xFF));
                }
            } else if (type.equals("GBK") || type.equals("UTF-8")) {
                out.write(s.getBytes(type));
            }
        }
    }
}

Results, as seen by inspecting the contents of each file with the BZ tool:

text      : 4F 60 59 7D

text_GBK  : C4 E3 BA C3

text_UTF8 : E4 BD A0 E5 A5 BD

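After using Notepad++ to convert text_UTF8 to its UTF-8 encoding format, the file contents are EF BB BF E4 BD A0 E5 A5 BD (a byte order mark, EF BB BF, has been prepended), and the file is read back as feff 4f60 597d.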
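The leading feff appears because Java's UTF-8 decoder hands the byte order mark back as an ordinary character, U+FEFF, rather than discarding it. A minimal sketch of this behavior follows; BomDemo, decodeUtf8, and stripBom are hypothetical names, and it decodes with new String(...) rather than an InputStreamReader for brevity, which uses the same UTF-8 decoding.

```java
import java.nio.charset.StandardCharsets;

final class BomDemo {
    static String decodeUtf8(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    // Drop a leading U+FEFF if present; otherwise return the text unchanged.
    static String stripBom(String text) {
        return text.startsWith("\uFEFF") ? text.substring(1) : text;
    }

    public static void main(String[] args) {
        byte[] withBom = {
            (byte) 0xEF, (byte) 0xBB, (byte) 0xBF,   // UTF-8 byte order mark
            (byte) 0xE4, (byte) 0xBD, (byte) 0xA0,   // U+4F60
            (byte) 0xE5, (byte) 0xA5, (byte) 0xBD    // U+597D
        };
        String decoded = decodeUtf8(withBom);
        for (char c : decoded.toCharArray()) {
            System.out.println(Integer.toHexString(c)); // feff, 4f60, 597d
        }
        System.out.println(stripBom(decoded).length()); // 2
    }
}
```

Code that reads files produced by editors that add a BOM may want a check like stripBom before comparing or parsing the text.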

If text_UTF8 is read using GBK instead, the resulting character sequence is 6d63 72b2 30bd, i.e. garbled text.
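The garbling can be reproduced without any files by decoding the UTF-8 bytes with the wrong charset. This sketch assumes the running JVM ships the GBK charset (it is an extended charset, so a stripped-down runtime might not); MojibakeDemo and decode are names invented for the example.

```java
import java.nio.charset.Charset;

final class MojibakeDemo {
    static String decode(byte[] data, String charsetName) {
        // Charset.forName throws UnsupportedCharsetException if the JVM lacks the charset.
        return new String(data, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // UTF-8 encoding of U+4F60 U+597D, as stored in text_UTF8.
        byte[] utf8 = {
            (byte) 0xE4, (byte) 0xBD, (byte) 0xA0,
            (byte) 0xE5, (byte) 0xA5, (byte) 0xBD
        };
        // GBK pairs the six bytes up as E4BD, A0E5, A5BD: three unrelated characters.
        for (char c : decode(utf8, "GBK").toCharArray()) {
            System.out.println(Integer.toHexString(c)); // 6d63, 72b2, 30bd
        }
        System.out.println(decode(utf8, "UTF-8").length()); // 2
    }
}
```

The bytes are unchanged in both calls; only the charset used to interpret them differs, which is what makes this kind of bug easy to introduce.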

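When using InputStreamReader/OutputStreamWriter, remember to specify the character encoding, because the default encoding can differ between machines and operating systems. Even if you are certain that the default encoding is what you want, specify it explicitly.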

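Byte streams are the most basic kind: all subclasses of InputStream and OutputStream are mainly used for binary data and process it byte by byte.

In practice, however, a great deal of data is text, which is why character streams were introduced. They process data according to the JVM's encoding, which means a charset conversion takes place.

The two are bridged by InputStreamReader and OutputStreamWriter; in effect, the bridge is the conversion between byte[] and String.

The problems with Chinese characters that come up in real development are typically caused by inconsistent conversions between character streams and byte streams.

Converting a byte stream to a character stream amounts to converting byte[] to String: public String(byte bytes[], String charsetName)

Converting a character stream to a byte stream amounts to converting String to byte[]: byte[] String.getBytes(String charsetName)

If no charset is given, the default is the operating system's character encoding (from JDK 18 onward the default is UTF-8 regardless of the operating system, unless overridden).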
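The byte[]/String conversions above can be sketched as a round trip with the charset passed explicitly. The example uses the Charset overloads of the same two methods so that no checked exception is involved; CharsetRoundTripDemo, toBytes, and fromBytes are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetRoundTripDemo {
    // String -> byte[]: the direction a Writer takes.
    static byte[] toBytes(String text, Charset charset) {
        return text.getBytes(charset);
    }

    // byte[] -> String: the direction a Reader takes.
    static String fromBytes(byte[] data, Charset charset) {
        return new String(data, charset);
    }

    public static void main(String[] args) {
        String s = "\u4F60\u597D";
        // What a call without an explicit charset would use on this machine.
        System.out.println(Charset.defaultCharset());
        // Big-endian UTF-16 matches the raw bytes written to the "text" file.
        System.out.println(toBytes(s, StandardCharsets.UTF_16BE).length); // 4
        System.out.println(toBytes(s, StandardCharsets.UTF_8).length);    // 6
        byte[] utf8 = toBytes(s, StandardCharsets.UTF_8);
        System.out.println(fromBytes(utf8, StandardCharsets.UTF_8).equals(s)); // true
    }
}
```

The round trip only returns the original string when both directions use the same charset, which is the consistency requirement described above.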
