ZHS16GBK的数据库导入到字符集为AL32UTF8的数据库
生活随笔
收集整理的這篇文章主要介紹了
ZHS16GBK的数据库导入到字符集为AL32UTF8的数据库
小編覺得挺不錯的,現在分享給大家,幫大家做個參考.
字符集為ZHS16GBK的數據庫導入到字符集為AL32UTF8的數據庫?
數據庫. 如有我沒想到的其他情況,請大家提建議,我繼續完善測試實驗。
測試環境:
目標庫:英杰的rac測試庫 由
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
sys@rac1>select userenv('language') from dual;
USERENV('LANGUAGE')
----------------------------------
AMERICAN_AMERICA.AL32UTF8
yang@rac1>select * from nls_database_parameters;
PARAMETER???????????????? VALUE
------------------------- ----------------------------------------
NLS_LANGUAGE????????????? AMERICAN
NLS_TERRITORY???????????? AMERICA
NLS_CURRENCY????????????? $
NLS_ISO_CURRENCY????????? AMERICA
NLS_NUMERIC_CHARACTERS??? .,
NLS_CHARACTERSET????????? AL32UTF8
NLS_CALENDAR????????????? GREGORIAN
NLS_DATE_FORMAT?????????? DD-MON-RR
NLS_DATE_LANGUAGE???????? AMERICAN
NLS_SORT????????????????? BINARY
NLS_TIME_FORMAT?????????? HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT????? DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT??????? HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT?? DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY???????? $
NLS_COMP????????????????? BINARY
NLS_LENGTH_SEMANTICS????? BYTE
NLS_NCHAR_CONV_EXCP?????? FALSE
NLS_NCHAR_CHARACTERSET??? AL16UTF16
NLS_RDBMS_VERSION???????? 11.2.0.1.0
20 rows selected.
源庫:
Release 11.1.0.6.0 - Production on Friday, 11 March, 2011 21:11:49
Conneted to: Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
源庫的nls_ 信息:
sys@ORACL> select * from nls_database_parameters;
PARAMETER???????????????? VALUE
------------------------- --------------------------------------------------
NLS_LANGUAGE????????????? AMERICAN
NLS_TERRITORY???????????? AMERICA
NLS_CURRENCY????????????? $
NLS_ISO_CURRENCY????????? AMERICA
NLS_NUMERIC_CHARACTERS??? .,
NLS_CHARACTERSET????????? ZHS16GBK
NLS_CALENDAR????????????? GREGORIAN
NLS_DATE_FORMAT?????????? DD-MON-RR
NLS_DATE_LANGUAGE???????? AMERICAN
NLS_SORT????????????????? BINARY
NLS_TIME_FORMAT?????????? HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT????? DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT??????? HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT?? DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY???????? $
NLS_COMP????????????????? BINARY
NLS_LENGTH_SEMANTICS????? BYTE
NLS_NCHAR_CONV_EXCP?????? FALSE
NLS_NCHAR_CHARACTERSET??? AL16UTF16
NLS_RDBMS_VERSION???????? 11.1.0.6.0
測試步驟:這里主要是測試含有 漢字的數據從ZHS16GBK 導入到 AL32UTF8 .?
在源庫創建含有漢字的表:
yang@ORACL> create table chart ( val varchar2(20));
表已創建。
======創建英文字符數據
yang@ORACL> insert into chart
? 2? select dbms_random.string('l',15)
? 3? from dual
? 4? connect by level <50;
已創建49行。
======創建中文漢字數據
yang@ORACL> insert into chart
? 2? select '阿里云' as val from
? 3? dual connect by level <50;
已創建49行。
yang@ORACL> select * from chart;
VAL
--------------------
阿里云
已選擇98行。
yang@ORACL> commit;
提交完成。
yang@ORACL> select dump('阿里云') from dual;
DUMP('阿里云')
-------------------------------------
Typ=96 Len=6: 176,162,192,239,212,198 ===三個 漢字 占用6個字節
===================導出操作===================================
Microsoft Windows [版本 6.1.7600]
版權所有 (c) 2009 Microsoft Corporation。保留所有權利。
C:\Users\aaaa>set nls_lang=american_america.al32utf8 這里后面測試不加也可以成功導入al32utf8
C:\Users\aaaa>expdp yang/yang tables=chart directory=dumpdir? dumpfile=zhs16gbk.dmp
Export: Release 11.1.0.6.0 - Production on Friday, 11 March, 2011 21:19:04
Copyright (c) 2003, 2007, Oracle.? All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Starting "YANG"."SYS_EXPORT_TABLE_01":? yang/******** tables=chart directory=dumpdir dumpfile=zhs16g
bk.dmp
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 64 KB
Processing object type TABLE_EXPORT/TABLE/TABLE
. . exported "YANG"."CHART"????????????????????????????? 6.484 KB????? 98 rows
Master table "YANG"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for YANG.SYS_EXPORT_TABLE_01 is:
? D:\DUMP\ZHS16GBK.DMP
Job "YANG"."SYS_EXPORT_TABLE_01" successfully completed at 21:19:20
==============================導入到目標庫=========================================
oracle@rac1:rac1 /tmp/dump>export NLS_LANG=AMERICAN_AMERICA.AL32UTF8 ==.bash_profile 文件里已經聲明了,這里是強調一下。
oracle@rac1:rac1 /tmp/dump>impdp yang/yang tables=chart directory=dumpdir dumpfile=zhs16gbk.dmp log=zhs16gbk_to_al32utf8.log
Import: Release 11.2.0.1.0 - Production on Fri Mar 11 21:23:13 2011
Copyright (c) 1982, 2009, Oracle and/or its affiliates.? All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
Legacy Mode Active due to the following parameters:
Legacy Mode Parameter: "log=zhs16gbk_to_al32utf8.log" Location: Command Line, Replaced with: "logfile=zhs16gbk_to_al32utf8.log"
Master table "YANG"."SYS_IMPORT_TABLE_01" successfully loaded/unloaded
Starting "YANG"."SYS_IMPORT_TABLE_01":? yang/******** tables=chart directory=dumpdir dumpfile=zhs16gbk.dmp?
logfile=zhs16gbk_to_al32utf8.log?
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
. . imported "YANG"."CHART"????????????????????????????? 6.484 KB????? 98 rows
Job "YANG"."SYS_IMPORT_TABLE_01" successfully completed at 21:23:17
==============================在目標庫中驗證=======================================
sys@rac1>select dump('阿里云') from dual;
DUMP('阿里云')
-------------------------------------------------
Typ=96 Len=9: 233,152,191,233,135,140,228,186,145? 阿里云 三個漢字 占用 9個字節。
yang@rac1>select table_name from user_tables;
TABLE_NAME
------------------------------
CHARSET
CHART
yang@rac1>select val from chart;
VAL
----------------
ogqodcrhyqaesks
98 rows selected.
小結:字符集子集向其超集轉換是可行的,如此例 ZHS16GBK轉換為AL32UTF8。
導出使用的字符集將會記錄在導出文件中,當文件導入時,將會檢查導出時使用的字符集設置,如果這個字符集不同于導入客戶端的NLS_LANG
設置,字符集將根據導入客戶端NLS_LANG設置進行轉換,如果必要,在數據插入數據庫之前會進行進一步轉換。
通常在導出時最好把客戶端字符集設置得和數據庫端相同,這樣可以避免在導出時發生不必要的數據轉換,導出文件將和數據庫具有相同的字符集。
即使將來會把導出文件導入到不同字符集的數據庫中,這樣做也可以把轉換延緩至導入時刻。
當進行數據導入時,主要存在以下兩種情況:
1.源數據庫和目標數據庫具有相同字符集設置
這時,只需要設置NLS_LANG等于數據庫字符集即可導入(前提是,導出使用的是和源數據庫相同字符集,即三者相同)
2.源數據庫和目標數據庫字符集不同
如果我們導出時候使用的NLS_LANG是和源數據庫相同的字符集,那么導入時就可以設置客戶端NLS_LANG等于導出時使用的字符集,這
樣轉換只發生在數據庫端,而且只發生一次。
例如:
如果進行從ZHS16GBK到UTF8的轉換
1)使用NLS_LANG=AMERICAN_AMERICA.ZHS16GBK導出數據庫。
這時創建的導出文件包含ZHS16GBK的數據
2)導入時使用NLS_LANG=AMERICAN_AMERICA.ZHS16GBK
這時轉換僅發生在insert數據到UTF8的數據庫中。
以上假設的轉換只在目標數據庫字符集是源數據庫字符集的超集時才能轉換。
參考文章:
http://www.itpub.net/thread-538197-1-1.html
http://www.itpub.net/viewthread.php?tid=276524&extra=page%3D1&page=1
http://www.eygle.com/archives/2004/09/nls_character_set_04.html 含有漢字的固定字符由ZHS16GBK數據庫導入到AL32UTF8的數據庫?
對于英文:
對英文,在al32utf8中仍然和zhs16gbk一樣用1個字節表示,因此導入固定長度英文字符數據時不會出錯。
對于中文:
例如在字符集為zhs16gbk 數據庫中創建表時指定字段 val char(15),該字段含有數據 ‘阿里云計算公司’在字符集為zhs16gbk 數據庫中占用14個字
節,而在字符集為al32utf8 數據庫中占用21個字節 大于 char(15)所指定的長度15.此時導入數據就會失敗。
下面對于上述情況給以說明。
=======================遷移 英文固定長度字符數據====================
=======================創建表,含有英文固定長度字符數據=============
yang@ORACL> create table fixed_char ( val char(15) );
表已創建。
yang@ORACL> insert into fixed_char select
? 2? dbms_random.string('l',15) from dual
? 3? connect by level <125;
已創建124行。
yang@ORACL> select val from fixed_char;
VAL
---------------
oxmjgwgzbjhthcr
elbdlcpkfajsmrc
jrwccslaywxpiwj
...............
ljkowkocmdqnkgj
idialvaxohrahah
zwrqynvtevfujao
funkxaokotsblww
snyetpafaneicjm
kgrcrpbwlvtotcv
knhcazjkgotzvmg
myqgvjqnsingmxv
klthqehltsyzrxe
voucbpykpnsbopx
vtvavjddyafwqxt
omcnkpvlhlxdvvg
ccfpttivbdvursz
已選擇124行。
yang@ORACL> spool d:\fixed_char.txt
yang@ORACL> insert into fixed_char select
? 2? dbms_random.string('l',15) from dual
? 3? connect by level <125;
已創建124行。
yang@ORACL> select val from fixed_char where rownum <10;
VAL
---------------
oxmjgwgzbjhthcr
elbdlcpkfajsmrc
jrwccslaywxpiwj
clizpikggppgfwy
yqqxljlscqaiqli
lrwoayaxyjzgdhy
rkqpujyupltmrqb
qvycepfgtwipwat
iccsgvrpfxwligq
已選擇9行。
yang@ORACL> commit;
提交完成。
=========================導出數據================================
C:\Users\aaaa>expdp yang/yang tables=fixed_char directory=dumpdir dumpfile=fixed_char_zhs16gbk.dmp l
ogfile=fixed_char.log
Export: Release 11.1.0.6.0 - Production on 星期六, 12 3月, 2011 12:25:55
Copyright (c) 2003, 2007, Oracle.? All rights reserved.
連接到: Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
啟動 "YANG"."SYS_EXPORT_TABLE_01":? yang/******** tables=fixed_char directory=dumpdir dumpfile=fixed
_char_zhs16gbk.dmp logfile=fixed_char.log
正在使用 BLOCKS 方法進行估計...
處理對象類型 TABLE_EXPORT/TABLE/TABLE_DATA
使用 BLOCKS 方法的總估計: 64 KB
處理對象類型 TABLE_EXPORT/TABLE/TABLE
. . 導出了 "YANG"."FIXED_CHAR"???????????????????????? 9.851 KB???? 248 行
已成功加載/卸載了主表 "YANG"."SYS_EXPORT_TABLE_01"
******************************************************************************
YANG.SYS_EXPORT_TABLE_01 的轉儲文件集為:
? D:\DUMP\FIXED_CHAR_ZHS16GBK.DMP
作業 "YANG"."SYS_EXPORT_TABLE_01" 已于 12:27:32 成功完成
=====================================導入數據=============================================
oracle@rac1:rac1 /tmp/dump>impdp yang/yang tables=fixed_char directory=dumpdir dumpfile=fixed_char_zhs16gbk.dmp?
log=fixedchar_zhs16gbk_to_al32utf8.log
Import: Release 11.2.0.1.0 - Production on Sat Mar 12 12:31:34 2011
Copyright (c) 1982, 2009, Oracle and/or its affiliates.? All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
Legacy Mode Active due to the following parameters:
Legacy Mode Parameter: "log=fixedchar_zhs16gbk_to_al32utf8.log" Location: Command Line, Replaced with:?
"logfile=fixedchar_zhs16gbk_to_al32utf8.log"
Master table "YANG"."SYS_IMPORT_TABLE_01" successfully loaded/unloaded
Starting "YANG"."SYS_IMPORT_TABLE_01":? yang/******** tables=fixed_char directory=dumpdir dumpfile=fixed_char_zhs16gbk.dmp?
logfile=fixedchar_zhs16gbk_to_al32utf8.log?
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
. . imported "YANG"."FIXED_CHAR"???????????????????????? 9.851 KB???? 248 rows
Job "YANG"."SYS_IMPORT_TABLE_01" successfully completed at 12:31:38
==================================在目標庫里驗證數據====================================
yang@rac1>select table_name from user_tables;
TABLE_NAME
------------------------------
CHARSET
CHART
TCHAR
FIXED_CHAR
yang@rac1>desc fixed_char
Name???????????????????? Null???? Type
----------------------- -------- -------
VAL????????????????????????????? CHAR(15)
yang@rac1>select val from fixed_char where rownum <12;
VAL
---------------
oxmjgwgzbjhthcr
elbdlcpkfajsmrc
jrwccslaywxpiwj
clizpikggppgfwy
yqqxljlscqaiqli
lrwoayaxyjzgdhy
rkqpujyupltmrqb
qvycepfgtwipwat
iccsgvrpfxwligq
zrtwmeviosnsgmv
chwsezhjfgnsjll
11 rows selected.
yang@rac1>select dump(val) from fixed_char where rownum <10;
DUMP(VAL)
--------------------------------------------------------------------------
Typ=96 Len=15: 111,120,109,106,103,119,103,122,98,106,104,116,104,99,114
Typ=96 Len=15: 101,108,98,100,108,99,112,107,102,97,106,115,109,114,99
Typ=96 Len=15: 106,114,119,99,99,115,108,97,121,119,120,112,105,119,106
Typ=96 Len=15: 99,108,105,122,112,105,107,103,103,112,112,103,102,119,121
Typ=96 Len=15: 121,113,113,120,108,106,108,115,99,113,97,105,113,108,105
Typ=96 Len=15: 108,114,119,111,97,121,97,120,121,106,122,103,100,104,121
Typ=96 Len=15: 114,107,113,112,117,106,121,117,112,108,116,109,114,113,98
Typ=96 Len=15: 113,118,121,99,101,112,102,103,116,119,105,112,119,97,116
Typ=96 Len=15: 105,99,99,115,103,118,114,112,102,120,119,108,105,103,113
9 rows selected.
對于英文字符可以實現由zhs16gbk 到 al32utf8的轉換。
解釋:用UTF-8,UNICODE的2字節字符用變長個(1-3個字節)表示:
1. 對英文,仍然和ASCII一樣用1個字節表示,這個字節的值小于128(\x80);
2. 擴展的ASCII字符(主要是西歐),第一字節用C2 - DF之間的范圍,雙字節表示。
3.對其他語言,比如亞洲語系,還有各種特殊符號,使用3個字節表示;
因此,在應用中程序處理過程中所有字符都是16位(雙字節),但在存取轉換成字節流時使用UTF-8格式轉換,對于英文字符來說和原來用ASCII方式存取
時相比大小仍然是一樣的,而對中文來說和原來的GB2312編碼方式相比,大小為:(3字節/2字節)=1.5倍,這也是下面導入數據失敗的原因。
==================================遷移含有漢字 固定字符數據=============================
=====================創建表,含有漢字的固定字符數據======================================
yang@ORACL> create table fixed( val1 char(15),val2 char(15) );
表已創建。
yang@ORACL> insert into fixed select
? 2? dbms_random.string('l',15) val1,
? 3? '阿里云算公司' as val2
? 4? from dual
? 5? connect by level <500;
已創建499行。
yang@ORACL> select dump('阿里云計算公司') from dual;
DUMP('阿里云計算公司')
----------------------------------------------------------------------
Typ=96 Len=14: 176,162,192,239,212,198,188,198,203,227,185,171,203,190
yang@ORACL> commit;
提交完成。
yang@ORACL> select count(*) from fixed;
? COUNT(*)
----------
?????? 499
yang@ORACL>
========================導出操作===========================================
C:\Users\aaaa>expdp yang/yang tables=fixed directory=dumpdir dumpfile=fixed_zhs16gbk.dmp logfile=fixed.log
Export: Release 11.1.0.6.0 - Production on 星期六, 12 3月, 2011 12:50:05
Copyright (c) 2003, 2007, Oracle.? All rights reserved.
連接到: Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
啟動 "YANG"."SYS_EXPORT_TABLE_01":? yang/******** tables=fixed directory=dumpdir dumpfile=fixed_zhs1
6gbk.dmp logfile=fixed.log
正在使用 BLOCKS 方法進行估計...
處理對象類型 TABLE_EXPORT/TABLE/TABLE_DATA
使用 BLOCKS 方法的總估計: 64 KB
處理對象類型 TABLE_EXPORT/TABLE/TABLE
. . 導出了 "YANG"."FIXED"????????????????????????????? 22.94 KB???? 499 行
已成功加載/卸載了主表 "YANG"."SYS_EXPORT_TABLE_01"
******************************************************************************
YANG.SYS_EXPORT_TABLE_01 的轉儲文件集為:
? D:\DUMP\FIXED_ZHS16GBK.DMP
作業 "YANG"."SYS_EXPORT_TABLE_01" 已于 12:50:53 成功完成
C:\Users\aaaa>
========================導入操作=====================================================================
oracle@rac1:rac1 /tmp/dump>impdp yang/yang tables=fixed directory=dumpdir dumpfile=fixed_zhs16gbk.dmp?
log=fixed_zhs16gbk_to_al32utf8.log
Import: Release 11.2.0.1.0 - Production on Sat Mar 12 12:52:38 2011
Copyright (c) 1982, 2009, Oracle and/or its affiliates.? All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
Legacy Mode Active due to the following parameters:
Legacy Mode Parameter: "log=fixed_zhs16gbk_to_al32utf8.log" Location: Command Line, Replaced with:?
"logfile=fixed_zhs16gbk_to_al32utf8.log"
Master table "YANG"."SYS_IMPORT_TABLE_01" successfully loaded/unloaded
Starting "YANG"."SYS_IMPORT_TABLE_01":? yang/******** tables=fixed directory=dumpdir dumpfile=fixed_zhs16gbk.dmp?
logfile=fixed_zhs16gbk_to_al32utf8.log?
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
ORA-02374: conversion error loading table "YANG"."FIXED"
ORA-12899: value too large for column VAL2 (actual: 21, maximum: 15)===> 需要21個字節但是 char(15)規定val2字段長度是15
ORA-02372: data for row: VAL2 : 0X'B0A2C0EFD4C6CBE3B9ABCBBE202020'
ORA-02374: conversion error loading table "YANG"."FIXED"
ORA-12899: value too large for column VAL2 (actual: 21, maximum: 15)
ORA-02372: data for row: VAL2 : 0X'B0A2C0EFD4C6CBE3B9ABCBBE202020'
ORA-02374: conversion error loading table "YANG"."FIXED"
ORA-12899: value too large for column VAL2 (actual: 21, maximum: 15)
ORA-02372: data for row: VAL2 : 0X'B0A2C0EFD4C6CBE3B9ABCBBE202020'
ORA-02374: conversion error loading table "YANG"."FIXED"
ORA-12899: value too large for column VAL2 (actual: 21, maximum: 15)
ORA-02372: data for row: VAL2 : 0X'B0A2C0EFD4C6CBE3B9ABCBBE202020'
ORA-02374: conversion error loading table "YANG"."FIXED"
ORA-12899: value too large for column VAL2 (actual: 21, maximum: 15) 五、試驗總結
1.當源端字符編碼為ZHS16GBK,目標端編碼為AL32UTF8,客戶端隨便為其中的一種編碼,遷移數據不會出現亂碼,但是會出現列長度不夠現象。反過來不行,因為utf8中的部分字符轉換到gbk中肯定會不支持
2.設置了源端客戶端編碼,僅僅是導出來的dmp文件頭部有編碼字符標示不一樣,存儲數據還是按照服務端存儲
3.打破神話,exp/imp導入要不亂碼,導出和導入的客戶端編碼要一致 五、原因分析,解決建議
在導入過程中,最多會發生三次編碼轉換:
1、執行exp時,數據庫中數據的編碼會轉換為導出客戶端編碼
2、執行imp時,dmp文件的編碼轉換為導入客戶端編碼
3、導入客戶端編碼轉換為目標端數據庫的數據庫編碼
在exp/imp操作的過程中,經常出現亂碼的原因就是編碼的相互轉換的過程中出現了丟失或者相互不能轉換導致。要解決這個問題,最好的辦法就是通過NLS_LANG的靈活設置,減少編碼轉換的次數(如果相鄰的轉換操作編碼一致,那么不會發生編碼轉換,如試驗中的ZHS16GBK編碼測試,就沒有轉換發生),或者使得相互的轉換能夠兼容,可以最大程度的減少亂碼的出現。
如果已經有了exp導出的dmp文件,然后在導入的過程中,出現亂碼,一般的處理建議是nls_lang的編碼設置和dmp文件的一致,讓轉換發生在導入客戶端和數據庫服務器間(要求:編碼可以相互轉換)
?
?
相信大家都對字符集有相當的了解了,廢話就不多說了!直接步入正題:這里主要是測試含有 漢字的數據從ZHS16GBK的數據庫導入到字符集為AL32UTF8?數據庫. 如有我沒想到的其他情況,請大家提建議,我繼續完善測試實驗。
測試環境:
目標庫:英杰的rac測試庫 由
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
sys@rac1>select userenv('language') from dual;
USERENV('LANGUAGE')
----------------------------------
AMERICAN_AMERICA.AL32UTF8
yang@rac1>select * from nls_database_parameters;
PARAMETER???????????????? VALUE
------------------------- ----------------------------------------
NLS_LANGUAGE????????????? AMERICAN
NLS_TERRITORY???????????? AMERICA
NLS_CURRENCY????????????? $
NLS_ISO_CURRENCY????????? AMERICA
NLS_NUMERIC_CHARACTERS??? .,
NLS_CHARACTERSET????????? AL32UTF8
NLS_CALENDAR????????????? GREGORIAN
NLS_DATE_FORMAT?????????? DD-MON-RR
NLS_DATE_LANGUAGE???????? AMERICAN
NLS_SORT????????????????? BINARY
NLS_TIME_FORMAT?????????? HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT????? DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT??????? HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT?? DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY???????? $
NLS_COMP????????????????? BINARY
NLS_LENGTH_SEMANTICS????? BYTE
NLS_NCHAR_CONV_EXCP?????? FALSE
NLS_NCHAR_CHARACTERSET??? AL16UTF16
NLS_RDBMS_VERSION???????? 11.2.0.1.0
20 rows selected.
源庫:
Release 11.1.0.6.0 - Production on Friday, 11 March, 2011 21:11:49
Conneted to: Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
源庫的nls_ 信息:
sys@ORACL> select * from nls_database_parameters;
PARAMETER???????????????? VALUE
------------------------- --------------------------------------------------
NLS_LANGUAGE????????????? AMERICAN
NLS_TERRITORY???????????? AMERICA
NLS_CURRENCY????????????? $
NLS_ISO_CURRENCY????????? AMERICA
NLS_NUMERIC_CHARACTERS??? .,
NLS_CHARACTERSET????????? ZHS16GBK
NLS_CALENDAR????????????? GREGORIAN
NLS_DATE_FORMAT?????????? DD-MON-RR
NLS_DATE_LANGUAGE???????? AMERICAN
NLS_SORT????????????????? BINARY
NLS_TIME_FORMAT?????????? HH.MI.SSXFF AM
NLS_TIMESTAMP_FORMAT????? DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT??????? HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT?? DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY???????? $
NLS_COMP????????????????? BINARY
NLS_LENGTH_SEMANTICS????? BYTE
NLS_NCHAR_CONV_EXCP?????? FALSE
NLS_NCHAR_CHARACTERSET??? AL16UTF16
NLS_RDBMS_VERSION???????? 11.1.0.6.0
測試步驟:這里主要是測試含有 漢字的數據從ZHS16GBK 導入到 AL32UTF8 .?
在源庫創建含有漢字的表:
yang@ORACL> create table chart ( val varchar2(20));
表已創建。
======創建英文字符數據
yang@ORACL> insert into chart
? 2? select dbms_random.string('l',15)
? 3? from dual
? 4? connect by level <50;
已創建49行。
======創建中文漢字數據
yang@ORACL> insert into chart
? 2? select '阿里云' as val from
? 3? dual connect by level <50;
已創建49行。
yang@ORACL> select * from chart;
VAL
--------------------
阿里云
已選擇98行。
yang@ORACL> commit;
提交完成。
yang@ORACL> select dump('阿里云') from dual;
DUMP('阿里云')
-------------------------------------
Typ=96 Len=6: 176,162,192,239,212,198 ===三個 漢字 占用6個字節
===================導出操作===================================
Microsoft Windows [版本 6.1.7600]
版權所有 (c) 2009 Microsoft Corporation。保留所有權利。
C:\Users\aaaa>set nls_lang=american_america.al32utf8 這里后面測試不加也可以成功導入al32utf8
C:\Users\aaaa>expdp yang/yang tables=chart directory=dumpdir? dumpfile=zhs16gbk.dmp
Export: Release 11.1.0.6.0 - Production on Friday, 11 March, 2011 21:19:04
Copyright (c) 2003, 2007, Oracle.? All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Starting "YANG"."SYS_EXPORT_TABLE_01":? yang/******** tables=chart directory=dumpdir dumpfile=zhs16g
bk.dmp
Estimate in progress using BLOCKS method...
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 64 KB
Processing object type TABLE_EXPORT/TABLE/TABLE
. . exported "YANG"."CHART"????????????????????????????? 6.484 KB????? 98 rows
Master table "YANG"."SYS_EXPORT_TABLE_01" successfully loaded/unloaded
******************************************************************************
Dump file set for YANG.SYS_EXPORT_TABLE_01 is:
? D:\DUMP\ZHS16GBK.DMP
Job "YANG"."SYS_EXPORT_TABLE_01" successfully completed at 21:19:20
==============================導入到目標庫=========================================
oracle@rac1:rac1 /tmp/dump>export NLS_LANG=AMERICAN_AMERICA.AL32UTF8 ==.bash_profile 文件里已經聲明了,這里是強調一下。
oracle@rac1:rac1 /tmp/dump>impdp yang/yang tables=chart directory=dumpdir dumpfile=zhs16gbk.dmp log=zhs16gbk_to_al32utf8.log
Import: Release 11.2.0.1.0 - Production on Fri Mar 11 21:23:13 2011
Copyright (c) 1982, 2009, Oracle and/or its affiliates.? All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
Legacy Mode Active due to the following parameters:
Legacy Mode Parameter: "log=zhs16gbk_to_al32utf8.log" Location: Command Line, Replaced with: "logfile=zhs16gbk_to_al32utf8.log"
Master table "YANG"."SYS_IMPORT_TABLE_01" successfully loaded/unloaded
Starting "YANG"."SYS_IMPORT_TABLE_01":? yang/******** tables=chart directory=dumpdir dumpfile=zhs16gbk.dmp?
logfile=zhs16gbk_to_al32utf8.log?
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
. . imported "YANG"."CHART"????????????????????????????? 6.484 KB????? 98 rows
Job "YANG"."SYS_IMPORT_TABLE_01" successfully completed at 21:23:17
==============================在目標庫中驗證=======================================
sys@rac1>select dump('阿里云') from dual;
DUMP('阿里云')
-------------------------------------------------
Typ=96 Len=9: 233,152,191,233,135,140,228,186,145? 阿里云 三個漢字 占用 9個字節。
yang@rac1>select table_name from user_tables;
TABLE_NAME
------------------------------
CHARSET
CHART
yang@rac1>select val from chart;
VAL
----------------
ogqodcrhyqaesks
98 rows selected.
小結:字符集子集向其超集轉換是可行的,如此例 ZHS16GBK轉換為AL32UTF8。
導出使用的字符集將會記錄在導出文件中,當文件導入時,將會檢查導出時使用的字符集設置,如果這個字符集不同于導入客戶端的NLS_LANG
設置,字符集將根據導入客戶端NLS_LANG設置進行轉換,如果必要,在數據插入數據庫之前會進行進一步轉換。
通常在導出時最好把客戶端字符集設置得和數據庫端相同,這樣可以避免在導出時發生不必要的數據轉換,導出文件將和數據庫具有相同的字符集。
即使將來會把導出文件導入到不同字符集的數據庫中,這樣做也可以把轉換延緩至導入時刻。
當進行數據導入時,主要存在以下兩種情況:
1.源數據庫和目標數據庫具有相同字符集設置
這時,只需要設置NLS_LANG等于數據庫字符集即可導入(前提是,導出使用的是和源數據庫相同字符集,即三者相同)
2.源數據庫和目標數據庫字符集不同
如果我們導出時候使用的NLS_LANG是和源數據庫相同的字符集,那么導入時就可以設置客戶端NLS_LANG等于導出時使用的字符集,這
樣轉換只發生在數據庫端,而且只發生一次。
例如:
如果進行從ZHS16GBK到UTF8的轉換
1)使用NLS_LANG=AMERICAN_AMERICA.ZHS16GBK導出數據庫。
這時創建的導出文件包含ZHS16GBK的數據
2)導入時使用NLS_LANG=AMERICAN_AMERICA.ZHS16GBK
這時轉換僅發生在insert數據到UTF8的數據庫中。
以上假設的轉換只在目標數據庫字符集是源數據庫字符集的超集時才能轉換。
參考文章:
http://www.itpub.net/thread-538197-1-1.html
http://www.itpub.net/viewthread.php?tid=276524&extra=page%3D1&page=1
http://www.eygle.com/archives/2004/09/nls_character_set_04.html 含有漢字的固定字符由ZHS16GBK數據庫導入到AL32UTF8的數據庫?
?
此文章是對于上一個實驗的補充,上一次實驗僅僅考慮的 varchar2 的情況。這次考慮到對于char類型的含有中文數據的情況。對于英文:
對英文,在al32utf8中仍然和zhs16gbk一樣用1個字節表示,因此導入固定長度英文字符數據時不會出錯。
對于中文:
例如在字符集為zhs16gbk 數據庫中創建表時指定字段 val char(15),該字段含有數據 ‘阿里云計算公司’在字符集為zhs16gbk 數據庫中占用14個字
節,而在字符集為al32utf8 數據庫中占用21個字節 大于 char(15)所指定的長度15.此時導入數據就會失敗。
下面對于上述情況給以說明。
=======================遷移 英文固定長度字符數據====================
=======================創建表,含有英文固定長度字符數據=============
yang@ORACL> create table fixed_char ( val char(15) );
表已創建。
yang@ORACL> insert into fixed_char select
? 2? dbms_random.string('l',15) from dual
? 3? connect by level <125;
已創建124行。
yang@ORACL> select val from fixed_char;
VAL
---------------
oxmjgwgzbjhthcr
elbdlcpkfajsmrc
jrwccslaywxpiwj
...............
ljkowkocmdqnkgj
idialvaxohrahah
zwrqynvtevfujao
funkxaokotsblww
snyetpafaneicjm
kgrcrpbwlvtotcv
knhcazjkgotzvmg
myqgvjqnsingmxv
klthqehltsyzrxe
voucbpykpnsbopx
vtvavjddyafwqxt
omcnkpvlhlxdvvg
ccfpttivbdvursz
已選擇124行。
yang@ORACL> spool d:\fixed_char.txt
yang@ORACL> insert into fixed_char select
? 2? dbms_random.string('l',15) from dual
? 3? connect by level <125;
已創建124行。
yang@ORACL> select val from fixed_char where rownum <10;
VAL
---------------
oxmjgwgzbjhthcr
elbdlcpkfajsmrc
jrwccslaywxpiwj
clizpikggppgfwy
yqqxljlscqaiqli
lrwoayaxyjzgdhy
rkqpujyupltmrqb
qvycepfgtwipwat
iccsgvrpfxwligq
已選擇9行。
yang@ORACL> commit;
提交完成。
=========================導出數據================================
C:\Users\aaaa>expdp yang/yang tables=fixed_char directory=dumpdir dumpfile=fixed_char_zhs16gbk.dmp l
ogfile=fixed_char.log
Export: Release 11.1.0.6.0 - Production on 星期六, 12 3月, 2011 12:25:55
Copyright (c) 2003, 2007, Oracle.? All rights reserved.
連接到: Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
啟動 "YANG"."SYS_EXPORT_TABLE_01":? yang/******** tables=fixed_char directory=dumpdir dumpfile=fixed
_char_zhs16gbk.dmp logfile=fixed_char.log
正在使用 BLOCKS 方法進行估計...
處理對象類型 TABLE_EXPORT/TABLE/TABLE_DATA
使用 BLOCKS 方法的總估計: 64 KB
處理對象類型 TABLE_EXPORT/TABLE/TABLE
. . 導出了 "YANG"."FIXED_CHAR"???????????????????????? 9.851 KB???? 248 行
已成功加載/卸載了主表 "YANG"."SYS_EXPORT_TABLE_01"
******************************************************************************
YANG.SYS_EXPORT_TABLE_01 的轉儲文件集為:
? D:\DUMP\FIXED_CHAR_ZHS16GBK.DMP
作業 "YANG"."SYS_EXPORT_TABLE_01" 已于 12:27:32 成功完成
=====================================導入數據=============================================
oracle@rac1:rac1 /tmp/dump>impdp yang/yang tables=fixed_char directory=dumpdir dumpfile=fixed_char_zhs16gbk.dmp?
log=fixedchar_zhs16gbk_to_al32utf8.log
Import: Release 11.2.0.1.0 - Production on Sat Mar 12 12:31:34 2011
Copyright (c) 1982, 2009, Oracle and/or its affiliates.? All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
Legacy Mode Active due to the following parameters:
Legacy Mode Parameter: "log=fixedchar_zhs16gbk_to_al32utf8.log" Location: Command Line, Replaced with:?
"logfile=fixedchar_zhs16gbk_to_al32utf8.log"
Master table "YANG"."SYS_IMPORT_TABLE_01" successfully loaded/unloaded
Starting "YANG"."SYS_IMPORT_TABLE_01":? yang/******** tables=fixed_char directory=dumpdir dumpfile=fixed_char_zhs16gbk.dmp?
logfile=fixedchar_zhs16gbk_to_al32utf8.log?
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
. . imported "YANG"."FIXED_CHAR"???????????????????????? 9.851 KB???? 248 rows
Job "YANG"."SYS_IMPORT_TABLE_01" successfully completed at 12:31:38
==================================在目標庫里驗證數據====================================
yang@rac1>select table_name from user_tables;
TABLE_NAME
------------------------------
CHARSET
CHART
TCHAR
FIXED_CHAR
yang@rac1>desc fixed_char
Name???????????????????? Null???? Type
----------------------- -------- -------
VAL????????????????????????????? CHAR(15)
yang@rac1>select val from fixed_char where rownum <12;
VAL
---------------
oxmjgwgzbjhthcr
elbdlcpkfajsmrc
jrwccslaywxpiwj
clizpikggppgfwy
yqqxljlscqaiqli
lrwoayaxyjzgdhy
rkqpujyupltmrqb
qvycepfgtwipwat
iccsgvrpfxwligq
zrtwmeviosnsgmv
chwsezhjfgnsjll
11 rows selected.
yang@rac1>select dump(val) from fixed_char where rownum <10;
DUMP(VAL)
--------------------------------------------------------------------------
Typ=96 Len=15: 111,120,109,106,103,119,103,122,98,106,104,116,104,99,114
Typ=96 Len=15: 101,108,98,100,108,99,112,107,102,97,106,115,109,114,99
Typ=96 Len=15: 106,114,119,99,99,115,108,97,121,119,120,112,105,119,106
Typ=96 Len=15: 99,108,105,122,112,105,107,103,103,112,112,103,102,119,121
Typ=96 Len=15: 121,113,113,120,108,106,108,115,99,113,97,105,113,108,105
Typ=96 Len=15: 108,114,119,111,97,121,97,120,121,106,122,103,100,104,121
Typ=96 Len=15: 114,107,113,112,117,106,121,117,112,108,116,109,114,113,98
Typ=96 Len=15: 113,118,121,99,101,112,102,103,116,119,105,112,119,97,116
Typ=96 Len=15: 105,99,99,115,103,118,114,112,102,120,119,108,105,103,113
9 rows selected.
對于英文字符可以實現由zhs16gbk 到 al32utf8的轉換。
解釋:用UTF-8,UNICODE的2字節字符用變長個(1-3個字節)表示:
1. 對英文,仍然和ASCII一樣用1個字節表示,這個字節的值小于128(\x80);
2. 擴展的ASCII字符(主要是西歐),第一字節用C2 - DF之間的范圍,雙字節表示。
3.對其他語言,比如亞洲語系,還有各種特殊符號,使用3個字節表示;
因此,在應用中程序處理過程中所有字符都是16位(雙字節),但在存取轉換成字節流時使用UTF-8格式轉換,對于英文字符來說和原來用ASCII方式存取
時相比大小仍然是一樣的,而對中文來說和原來的GB2312編碼方式相比,大小為:(3字節/2字節)=1.5倍,這也是下面導入數據失敗的原因。
==================================遷移含有漢字 固定字符數據=============================
=====================創建表,含有漢字的固定字符數據======================================
yang@ORACL> create table fixed( val1 char(15),val2 char(15) );
表已創建。
yang@ORACL> insert into fixed select
? 2? dbms_random.string('l',15) val1,
? 3? '阿里云算公司' as val2
? 4? from dual
? 5? connect by level <500;
已創建499行。
yang@ORACL> select dump('阿里云計算公司') from dual;
DUMP('阿里云計算公司')
----------------------------------------------------------------------
Typ=96 Len=14: 176,162,192,239,212,198,188,198,203,227,185,171,203,190
yang@ORACL> commit;
提交完成。
yang@ORACL> select count(*) from fixed;
? COUNT(*)
----------
?????? 499
yang@ORACL>
========================導出操作===========================================
C:\Users\aaaa>expdp yang/yang tables=fixed directory=dumpdir dumpfile=fixed_zhs16gbk.dmp logfile=fixed.log
Export: Release 11.1.0.6.0 - Production on 星期六, 12 3月, 2011 12:50:05
Copyright (c) 2003, 2007, Oracle.? All rights reserved.
連接到: Oracle Database 11g Enterprise Edition Release 11.1.0.6.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
啟動 "YANG"."SYS_EXPORT_TABLE_01":? yang/******** tables=fixed directory=dumpdir dumpfile=fixed_zhs1
6gbk.dmp logfile=fixed.log
正在使用 BLOCKS 方法進行估計...
處理對象類型 TABLE_EXPORT/TABLE/TABLE_DATA
使用 BLOCKS 方法的總估計: 64 KB
處理對象類型 TABLE_EXPORT/TABLE/TABLE
. . 導出了 "YANG"."FIXED"????????????????????????????? 22.94 KB???? 499 行
已成功加載/卸載了主表 "YANG"."SYS_EXPORT_TABLE_01"
******************************************************************************
YANG.SYS_EXPORT_TABLE_01 的轉儲文件集為:
? D:\DUMP\FIXED_ZHS16GBK.DMP
作業 "YANG"."SYS_EXPORT_TABLE_01" 已于 12:50:53 成功完成
C:\Users\aaaa>
========================導入操作=====================================================================
oracle@rac1:rac1 /tmp/dump>impdp yang/yang tables=fixed directory=dumpdir dumpfile=fixed_zhs16gbk.dmp?
log=fixed_zhs16gbk_to_al32utf8.log
Import: Release 11.2.0.1.0 - Production on Sat Mar 12 12:52:38 2011
Copyright (c) 1982, 2009, Oracle and/or its affiliates.? All rights reserved.
Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Data Mining and Real Application Testing options
Legacy Mode Active due to the following parameters:
Legacy Mode Parameter: "log=fixed_zhs16gbk_to_al32utf8.log" Location: Command Line, Replaced with:?
"logfile=fixed_zhs16gbk_to_al32utf8.log"
Master table "YANG"."SYS_IMPORT_TABLE_01" successfully loaded/unloaded
Starting "YANG"."SYS_IMPORT_TABLE_01":? yang/******** tables=fixed directory=dumpdir dumpfile=fixed_zhs16gbk.dmp?
logfile=fixed_zhs16gbk_to_al32utf8.log?
Processing object type TABLE_EXPORT/TABLE/TABLE
Processing object type TABLE_EXPORT/TABLE/TABLE_DATA
ORA-02374: conversion error loading table "YANG"."FIXED"
ORA-12899: value too large for column VAL2 (actual: 21, maximum: 15)===> 需要21個字節但是 char(15)規定val2字段長度是15
ORA-02372: data for row: VAL2 : 0X'B0A2C0EFD4C6CBE3B9ABCBBE202020'
ORA-02374: conversion error loading table "YANG"."FIXED"
ORA-12899: value too large for column VAL2 (actual: 21, maximum: 15)
ORA-02372: data for row: VAL2 : 0X'B0A2C0EFD4C6CBE3B9ABCBBE202020'
ORA-02374: conversion error loading table "YANG"."FIXED"
ORA-12899: value too large for column VAL2 (actual: 21, maximum: 15)
ORA-02372: data for row: VAL2 : 0X'B0A2C0EFD4C6CBE3B9ABCBBE202020'
ORA-02374: conversion error loading table "YANG"."FIXED"
ORA-12899: value too large for column VAL2 (actual: 21, maximum: 15)
ORA-02372: data for row: VAL2 : 0X'B0A2C0EFD4C6CBE3B9ABCBBE202020'
ORA-02374: conversion error loading table "YANG"."FIXED"
ORA-12899: value too large for column VAL2 (actual: 21, maximum: 15) 五、試驗總結
1.當源端字符編碼為ZHS16GBK,目標端編碼為AL32UTF8,客戶端隨便為其中的一種編碼,遷移數據不會出現亂碼,但是會出現列長度不夠現象。反過來不行,因為utf8中的部分字符轉換到gbk中肯定會不支持
2.設置了源端客戶端編碼,僅僅是導出來的dmp文件頭部有編碼字符標示不一樣,存儲數據還是按照服務端存儲
3.打破神話,exp/imp導入要不亂碼,導出和導入的客戶端編碼要一致 五、原因分析,解決建議
在導入過程中,最多會發生三次編碼轉換:
1、執行exp時,數據庫中數據的編碼會轉換為導出客戶端編碼
2、執行imp時,dmp文件的編碼轉換為導入客戶端編碼
3、導入客戶端編碼轉換為目標端數據庫的數據庫編碼
在exp/imp操作的過程中,經常出現亂碼的原因就是編碼的相互轉換的過程中出現了丟失或者相互不能轉換導致。要解決這個問題,最好的辦法就是通過NLS_LANG的靈活設置,減少編碼轉換的次數(如果相鄰的轉換操作編碼一致,那么不會發生編碼轉換,如試驗中的ZHS16GBK編碼測試,就沒有轉換發生),或者使得相互的轉換能夠兼容,可以最大程度的減少亂碼的出現。
如果已經有了exp導出的dmp文件,然后在導入的過程中,出現亂碼,一般的處理建議是nls_lang的編碼設置和dmp文件的一致,讓轉換發生在導入客戶端和數據庫服務器間(要求:編碼可以相互轉換)
轉載于:https://www.cnblogs.com/zfox2017/p/9374306.html
創作挑戰賽新人創作獎勵來咯,堅持創作打卡瓜分現金大獎總結
以上是生活随笔為你收集整理的ZHS16GBK的数据库导入到字符集为AL32UTF8的数据库的全部內容,希望文章能夠幫你解決所遇到的問題。
- 上一篇: 碧蓝航线心情怎么恢复
- 下一篇: spring 入门 1