[Java] 문자열 바이트로 자르기 


전문통신을 하다보니


한글의 경우는


바이트수가 바뀌어서 


난감했던 경우가 있다.


예를들어 300바이트 짜리 전문인데


영문과 숫자만 있으면 길이가 300이지만


한글이 섞여있으면


한글 1글자당 길이가 1개씩 줄어든다.


한글은 2바이트를 차지하기 때문인데


이때 바이트 단위로 끊어내고 다시 문자열로 반환해 주는 코드 이다.





public  String getString(String str, int sPoint, int length) throws Exception{
String EncodingLang = "euc-kr";
byte[] bytes = str.getBytes("euc-kr");

byte[] value = new byte[length];

if(bytes.length < sPoint + length){
throw new Exception("Length of bytes is less. length : " + bytes.length + " sPoint : " + sPoint + " length : " + length);
}

for(int i = 0; i < length; i++){
value[i] = bytes[sPoint + i];
}

/* System.out.println("utf-8 -> euc-kr : " + new String(word.getBytes("utf-8"), "euc-kr"));
System.out.println("utf-8 -> ksc5601 : " + new String(word.getBytes("utf-8"), "ksc5601"));
System.out.println("utf-8 -> x-windows-949 : " + new String(word.getBytes("utf-8"), "x-windows-949"));
System.out.println("utf-8 -> iso-8859-1 : " + new String(word.getBytes("utf-8"), "iso-8859-1"));
System.out.println("iso-8859-1 -> euc-kr : " + new String(word.getBytes("iso-8859-1"), "euc-kr"));
System.out.println("iso-8859-1 -> ksc5601 : " + new String(word.getBytes("iso-8859-1"), "ksc5601"));
System.out.println("iso-8859-1 -> x-windows-949 : " + new String(word.getBytes("iso-8859-1"), "x-windows-949"));
System.out.println("iso-8859-1 -> utf-8 : " + new String(word.getBytes("iso-8859-1"), "utf-8"));
System.out.println("euc-kr -> utf-8 : " + new String(word.getBytes("euc-kr"), "utf-8"));
System.out.println("euc-kr -> ksc5601 : " + new String(word.getBytes("euc-kr"), "ksc5601"));
System.out.println("euc-kr -> x-windows-949 : " + new String(word.getBytes("euc-kr"), "x-windows-949"));
System.out.println("euc-kr -> iso-8859-1 : " + new String(word.getBytes("euc-kr"), "iso-8859-1"));
System.out.println("ksc5601 -> euc-kr : " + new String(word.getBytes("ksc5601"), "euc-kr"));
System.out.println("ksc5601 -> utf-8 : " + new String(word.getBytes("ksc5601"), "utf-8"));
System.out.println("ksc5601 -> x-windows-949 : " + new String(word.getBytes("ksc5601"), "x-windows-949"));
System.out.println("ksc5601 -> iso-8859-1 : " + new String(word.getBytes("ksc5601"), "iso-8859-1"));
System.out.println("x-windows-949 -> euc-kr : " + new String(word.getBytes("x-windows-949"), "euc-kr"));
System.out.println("x-windows-949 -> utf-8 : " + new String(word.getBytes("x-windows-949"), "utf-8"));
System.out.println("x-windows-949 -> ksc5601 : " + new String(word.getBytes("x-windows-949"), "ksc5601"));
System.out.println("x-windows-949 -> iso-8859-1 : " + new String(word.getBytes("x-windows-949"), "iso-8859-1"));*/




return new String(value, EncodingLang).trim();
}


+ Recent posts