[mule-ja-2009:09610] Re: char-after returns different results in Emacs22 and Emacs23
- To: Takashi Masuda <masutaka@xxxxxxxxx>
- Cc: mule-ja-2009@xxxxxxxx
- From: Kenichi Handa <handa@xxxxxxxx>
- Date: Sat, 11 Jul 2009 14:56:09 +0900
In article <20090711.144725.243103832.masutaka@xxxxxxxxx>, Takashi Masuda <masutaka@xxxxxxxxx> writes:
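> I looked into this a little, and it seems that up through Emacs22
> characters are held internally in a private code of the form
> 0xC000 + (JIS high byte * 128) + JIS low byte
> # The JIS code of the character "次" is 0x3C21, so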
> # 0xC000 + (0x3C * 0x80) + 0x21 = 0xDE21 -> 56865
To be precise, character codes in Emacs 22 were laid out as follows
(excerpt from src/charset.h).

Emacs uses 19 bits for a character code. The bits are divided into
3 fields: FIELD1(5bits):FIELD2(7bits):FIELD3(7bits).

A character code of DIMENSION1 character uses FIELD2 to hold charset
and FIELD3 to hold POSITION-CODE-1. A character code of DIMENSION2
character uses FIELD1 to hold charset, FIELD2 and FIELD3 to hold
POSITION-CODE-1 and POSITION-CODE-2 respectively.

More precisely...

FIELD2 of DIMENSION1 character (except for ascii, eight-bit-control,
and eight-bit-graphic) is "charset - 0x70". This is to make all
character codes except for ASCII and 8-bit codes greater than 256.
So, the range of FIELD2 of DIMENSION1 character is 0, 1, or
0x11..0x7F.

FIELD1 of DIMENSION2 character is "charset - 0x8F" for official
charset and "charset - 0xE0" for private charset. So, the range of
FIELD1 of DIMENSION2 character is 0x01..0x1E.

-----------------------------------------------------------------------------
charset            FIELD1 (5-bit)  FIELD2 (7-bit)   FIELD3 (7-bit)
-----------------------------------------------------------------------------
ascii              0               0                0x00..0x7F
eight-bit-control  0               1                0x00..0x1F
eight-bit-graphic  0               1                0x20..0x7F
DIMENSION1         0               charset - 0x70   POSITION-CODE-1
DIMENSION2(o)      charset - 0x8F  POSITION-CODE-1  POSITION-CODE-2
DIMENSION2(p)      charset - 0xE0  POSITION-CODE-1  POSITION-CODE-2
-----------------------------------------------------------------------------
"(o)": official, "(p)": private
-----------------------------------------------------------------------------
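
The field layout in this excerpt can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not code from Emacs: the function names are hypothetical, and the charset ID 0x92 for japanese-jisx0208 is an assumption (it is the value that reproduces the 0xDE21 result quoted above for "次").

```python
# Sketch of the Emacs 22 character-code layout:
# FIELD1 (5 bits) : FIELD2 (7 bits) : FIELD3 (7 bits), 19 bits in total.

# Assumption: charset ID of japanese-jisx0208 in Emacs 22.
CHARSET_JAPANESE_JISX0208 = 0x92


def make_dimension2_char(charset_id, code1, code2, private=False):
    """Pack a DIMENSION2 character into a 19-bit character code.

    FIELD1 is "charset - 0x8F" for official charsets and
    "charset - 0xE0" for private ones; FIELD2 and FIELD3 hold
    POSITION-CODE-1 and POSITION-CODE-2.
    """
    field1 = charset_id - (0xE0 if private else 0x8F)
    if not 0x01 <= field1 <= 0x1E:
        raise ValueError("FIELD1 must be in 0x01..0x1E")
    return (field1 << 14) | ((code1 & 0x7F) << 7) | (code2 & 0x7F)


def split_fields(char_code):
    """Split a 19-bit character code into (FIELD1, FIELD2, FIELD3)."""
    return (char_code >> 14) & 0x1F, (char_code >> 7) & 0x7F, char_code & 0x7F


# "次" has JIS code 0x3C21: POSITION-CODE-1 = 0x3C, POSITION-CODE-2 = 0x21.
code = make_dimension2_char(CHARSET_JAPANESE_JISX0208, 0x3C, 0x21)
print(hex(code), code)     # 0xde21 56865
print(split_fields(code))  # (3, 60, 33)
```

With FIELD1 = 0x92 - 0x8F = 3, the term FIELD1 << 14 equals 0xC000, which would explain the constant in the formula quoted at the top of this message; other DIMENSION2 charsets would get a different leading term.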
---
Handa@AIST