Sunday 15 August 2010

sql - ORA-29275: partial multibyte character


I have input data from a flat file in which one column contains English, Japanese, and Chinese characters. I am loading these values into a staging table column defined as VARCHAR2(250 CHAR). The corresponding column in the main table is defined as VARCHAR2(250), which I cannot change. Therefore, I am doing a SUBSTR on this column. After loading the main table, when I do a SELECT * from it, I get this error:

ORA-29275: partial multibyte character

If I select other columns then there is no problem.

To fit your 250 CHAR column into your 250 byte column, you should use SUBSTRB. This function only outputs whole characters (you will not get incomplete Unicode characters):

    SQL> select substrb('中华人', 1, 9) ch9,
      2         substrb('中华人', 1, 8) ch8,
      3         substrb('中华人', 1, 7) ch7,
      4         substrb('中华人', 1, 6) ch6,
      5         substrb('中华人', 1, 5) ch5
      6  from dual;

    CH9       CH8      CH7     CH6    CH5
    --------- -------- ------- ------ -----
    中华人    中华     中华    中华   中
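To make the byte arithmetic behind that output concrete, here is a minimal Python sketch. It is not Oracle code: `whole_chars_within` is a hypothetical helper that only models the "keep the characters that fit entirely within N bytes" behaviour seen above, assuming UTF-8 (AL32UTF8) storage where each of these Chinese characters takes 3 bytes.

```python
def whole_chars_within(text: str, max_bytes: int, encoding: str = "utf-8") -> str:
    """Return the longest prefix of `text` whose encoded form fits in `max_bytes`."""
    kept = []
    used = 0
    for ch in text:
        size = len(ch.encode(encoding))  # 3 bytes for each character in the sample
        if used + size > max_bytes:
            break  # this character would be cut in half, so stop before it
        kept.append(ch)
        used += size
    return "".join(kept)


if __name__ == "__main__":
    # Same byte limits as the SUBSTRB query: 9, 8, 7, 6 and 5 bytes
    for limit in (9, 8, 7, 6, 5):
        print(limit, whole_chars_within("中华人", limit))
```

With limits of 8, 7 and 6 bytes only two 3-byte characters fit, which matches the CH8, CH7 and CH6 columns.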

Edit:

An interesting comment was made about the actual length of the resulting string, and whether the result could contain an invalid sequence of bytes. Consider the following on an AL32UTF8 DB:

    SQL> select lengthb('ÏÏÏ'),
      2         lengthb(substrb('ÏÏÏ', 1, 5)),
      3         dump('ÏÏÏ'),
      4         dump(substrb('ÏÏÏ', 1, 5))
      5  from dual;

    LE LE DUMP('ÏÏÏ')                             DUMP(SUBSTRB('ÏÏÏ',1,5))
    -- -- --------------------------------------- --------------------------------
     6  5 Typ=96 Len=6: 195,143,195,143,195,143   Typ=1 Len=5: 195,143,195,143,32

As you can see, the last byte of the SUBSTRB result is not a partial character but a valid one: 32, which encodes the ' ' space character (in this character set, the first 128 characters are the same as in the US7ASCII character set). You can remove it with RTRIM, as recommended in another answer.
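The padding behaviour can be mimicked outside the database. The following Python sketch is only a model of what the dump shows, not Oracle's implementation: `substrb_model` is a hypothetical name, and the rule "replace the cut-off trailing character with spaces" is inferred from the single example above.

```python
def substrb_model(text: str, max_bytes: int) -> bytes:
    """Cut the UTF-8 bytes at `max_bytes`; pad a cut-off trailing character with spaces."""
    raw = text.encode("utf-8")[:max_bytes]
    # errors="ignore" drops the incomplete sequence left at the end by the cut
    whole = raw.decode("utf-8", errors="ignore").encode("utf-8")
    return whole + b" " * (len(raw) - len(whole))


if __name__ == "__main__":
    result = substrb_model("ÏÏÏ", 5)
    print(list(result))                         # byte values, comparable to DUMP
    print(repr(result.decode("utf-8").rstrip(" ")))  # trailing pad removed, like RTRIM
```

Note that trimming also removes trailing spaces that were genuinely part of the data, which is equally true of RTRIM.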

In addition, here is what I found with the AL16UTF16 character set:

    SQL> select lengthb(n'ĈĈ'),
      2         dump(n'ĈĈ') dump,
      3         lengthb(substrb(n'ĈĈ', 1, 3)) length_substr,
      4         dump(substrb(n'ĈĈ', 1, 3)) dump_substr
      5  from dual;

            LE DUMP                    LENGTH_SUBSTR DUMP_SUBSTR
    ---------- ----------------------- ------------- -----------------
             4 Typ=96 Len=4: 1,8,1,8               2 Typ=1 Len=2: 1,8

In this case Oracle has chosen to cut the string after the second byte, because there is no legal one-byte character in the AL16UTF16 character set. As a result, the string is only 2 bytes long instead of 3.
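The same truncation can be reproduced in a few lines of Python, as a rough model only. `cut_utf16be` is a hypothetical helper; it assumes big-endian UTF-16 (which matches the 1,8 byte order in the dump) and characters from the Basic Multilingual Plane, so surrogate pairs are not handled.

```python
def cut_utf16be(text: str, max_bytes: int) -> bytes:
    """Cut UTF-16BE bytes at `max_bytes`, then drop a dangling odd byte."""
    raw = text.encode("utf-16-be")[:max_bytes]
    # Every BMP character is exactly 2 bytes, so an odd length means a half character
    return raw[: len(raw) - len(raw) % 2]


if __name__ == "__main__":
    print(list("ĈĈ".encode("utf-16-be")))   # full string as byte values
    shortened = cut_utf16be("ĈĈ", 3)
    print(list(shortened), shortened.decode("utf-16-be"))
```

No padding is possible here because a single leftover byte cannot hold any character in a two-byte encoding, so the result is shorter than the requested length.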

This would need more testing and is by no means a rigorous proof, but I still stand by my first hunch that SUBSTRB returns a valid sequence of bytes that encodes a valid character string.

