Re: Mule-UCS 0.84 (KOUGETSUDAI) release.
- To: mule@xxxxxxxx
- Subject: Re: Mule-UCS 0.84 (KOUGETSUDAI) release.
- From: Dave Love <d.love@xxxxxxxx>
- Date: 06 Dec 2002 16:45:45 +0000
- List-help: <mailto:mule-ctl@m17n.org?body=help>
- List-id: mule.m17n.org
- List-owner: <mailto:mule-admin@m17n.org>
- List-post: <mailto:mule@m17n.org>
- List-software: fml [fml 4.0.1]
- List-unsubscribe: <mailto:mule-ctl@m17n.org?body=unsubscribe>
- References: <20020724.192706.74753392.05@tats.iris.ne.jp> <20021104.103826.71116326.05@tats.iris.ne.jp> <rzqk7jpgoph.fsf@albion.dl.ac.uk> <20021108.073705.74740027.05@tats.iris.ne.jp>
- Reply-to: mule@xxxxxxxx
- User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2
[Sorry for the late responses. This and the following messages got
lost somewhere.]
Tatsuya Kinoshita <tats@xxxxxxxxxxxxxx> writes:
> On November 7, 2002, [mule:03316],
> Dave Love <d.love@xxxxxxxx> wrote:
>
> > Instead of using a constant replacement character for data which can't
> > be decoded, it would be much better to use the same technique as Emacs
> > 21, and maintain the byte sequence of the input data using eight-bit
> > charsets. As it is, Mule-UCS can corrupt correct utf-8 data. (I
> > realize this is a fair amount of work.)
>
> Actually, Mule-UCS corrupts a correct utf-8 character if its code is
> larger than 24 bits. I think this is a limitation of the current
> Mule-UCS.
That's one issue, but what I meant is that Unicode characters which
aren't representable in the Emacs built-in charsets all get decoded
into the same single replacement character. That loses information:
the original character can no longer be recovered from the decoded
text. In contrast, the Emacs 21 mule-utf-8 coding system preserves the
original utf-8 byte sequence.
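
To illustrate the difference, here is a minimal Python sketch. It is
not Emacs or Mule-UCS code, and every name in it is hypothetical. In
particular, representable() is an assumed stand-in for "has an Emacs
built-in charset" and simply treats anything outside the BMP as
unrepresentable. One decoder substitutes a constant replacement
character; the other keeps the raw bytes of each unrepresentable
character, so that re-encoding gives back the input.

```python
REPLACEMENT = "\ufffd"  # constant replacement character


def representable(ch: str) -> bool:
    # Assumption for illustration only: pretend that only BMP
    # characters have a built-in charset.
    return ord(ch) <= 0xFFFF


def decode_lossy(data: bytes) -> list:
    # Every unrepresentable character collapses to the same replacement.
    return [ch if representable(ch) else REPLACEMENT
            for ch in data.decode("utf-8")]


def decode_preserving(data: bytes) -> list:
    # Unrepresentable characters are kept as their raw utf-8 bytes,
    # one single-byte item per input byte.
    out = []
    for ch in data.decode("utf-8"):
        if representable(ch):
            out.append(ch)
        else:
            out.extend(bytes([b]) for b in ch.encode("utf-8"))
    return out


def encode(items: list) -> bytes:
    # Raw byte items are written back unchanged; characters are encoded.
    return b"".join(item if isinstance(item, bytes) else item.encode("utf-8")
                    for item in items)


original = "a\U00010400b".encode("utf-8")
lossy_roundtrip = encode(decode_lossy(original))
preserved_roundtrip = encode(decode_preserving(original))
```

Under these assumptions, preserved_roundtrip is identical to original,
whereas lossy_roundtrip is not: the replacement character carries no
trace of which character it replaced.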