
Re: Mule-UCS 0.84 (KOUGETSUDAI) release.



[Sorry for the late responses.  This and following messages got lost
somewhere.]

Tatsuya Kinoshita <tats@xxxxxxxxxxxxxx> writes:

> On November 7, 2002, [mule:03316],
> Dave Love <d.love@xxxxxxxx> wrote:
> 
> > Instead of using a constant replacement character for data which can't
> > be decoded, it would be much better to use the same technique as Emacs
> > 21, and maintain the byte sequence of the input data using eight-bit
> > charsets.  As it is, Mule-UCS can corrupt correct utf-8 data.  (I
> > realize this is a fair amount of work.)
> 
> Actually, Mule-UCS corrupts correct utf-8 character if a code is
> larger than 24bit.  I think this is current Mule-UCS's limitation.

That's one issue, but what I meant is that Unicode characters which
aren't representable in the Emacs built-in charsets get decoded into a
single replacement character.  That loses information.  In contrast,
the Emacs 21 mule-utf-8 coding system preserves the original utf-8
byte sequence.
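
To illustrate the difference, here is a minimal Python sketch of the
two strategies.  It is only an analogy, not how Mule-UCS or Emacs 21
implement decoding.  The names `is_representable`, `decode_lossy` and
`decode_preserving` are hypothetical, and the rule "anything above
U+FFFF is unrepresentable" is an assumption chosen purely for the
demonstration.  Python's `surrogateescape` error handler stands in for
the eight-bit charsets: each undecodable byte is carried through as a
placeholder and restored on encoding.

```python
def is_representable(ch):
    # Hypothetical stand-in for "a built-in charset covers this character".
    # Assumption for this sketch: only code points up to U+FFFF are covered.
    return ord(ch) <= 0xFFFF


def decode_lossy(data):
    # Replace every unrepresentable character with U+FFFD.
    # The original bytes cannot be recovered afterwards.
    text = data.decode("utf-8")
    return "".join(ch if is_representable(ch) else "\ufffd" for ch in text)


def decode_preserving(data):
    # Keep the raw UTF-8 bytes of each unrepresentable character, one
    # placeholder per byte, using the same U+DC80..U+DCFF mapping that
    # the surrogateescape error handler uses.
    text = data.decode("utf-8")
    out = []
    for ch in text:
        if is_representable(ch):
            out.append(ch)
        else:
            out.extend(chr(0xDC00 + b) for b in ch.encode("utf-8"))
    return "".join(out)


def encode(text):
    # surrogateescape turns each placeholder back into its original byte.
    return text.encode("utf-8", errors="surrogateescape")


data = "a\U0001F600b".encode("utf-8")
lossy_roundtrip = encode(decode_lossy(data))
preserved_roundtrip = encode(decode_preserving(data))
```

With the lossy decoder, writing the text back out produces the bytes
of U+FFFD in place of the original character, so the data has changed.
With the preserving decoder, the round trip reproduces the input bytes
exactly.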