MOO-cows Mailing List Archive


Re: Support for alt. language

> From: moo-cows-errors
> To: Kondratiev Dima
> Cc: moo-cows
> Subject: Re: Support for alt. language
> Date: 8 February 1996 . 9:57
> Kondratiev Dima writes:
> > What I am looking for is a 'clean' way to support alternative character sets
> > in strings, Cyrillic for example.
> > I need to support an alternative language in string properties such as
> > object names and descriptions.
> > As I understand the problem now (please correct me if I am wrong), the parser
> > should pass extended ASCII chars in strings. Do I need to hack the parser code
> > to achieve this?
> Actually, it has nothing to do with the parser; it has to do with the network
> module's input handler.  Right now, for non-binary connections, it discards any
> input character that isn't an ASCII printing character, space, or tab.  You
> could change that test to allow other characters and things would work fine.
> For example, you might use
>         if ((c & 0x7F) >= 0x20  &&  (c & 0x7F) != 0x7F) {...}
> if you wanted to allow all of the eight-bit non-control characters, such as all
> of those in the Latin-1 character set.  Of course, the interpretation of those
> bytes as characters is partially fixed by the MOO language, which believes that
> it knows the identity of most codes corresponding to printing ASCII characters.
> Also, your client program is going to need to understand the `proper'
> interpretation of the codes.
> However, if your intended character set only *adds* to ASCII, presumably by
> giving meaning to the codes in the upper half of the one-byte space, then you
> should be in good shape.  NOTE: you *must not* allow input of the character
> codes 0x00 (the null character) or 0x0A (the line-feed/newline character); if
> either of those appears in a MOO string, you will definitely lose.
> Now, if you're trying to handle multi-byte character sets, like Japanese, then
> you have even more problems, like support for discovering the true length (in
> characters, not bytes) of strings, etc.
>         Pavel
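
For reference, the test in the quoted reply can be tried out as a standalone C
function. This is only a sketch: is_acceptable_input() and filter_input() are
hypothetical names, not functions from the LambdaMOO source, and the real input
handler is structured differently. Tab is accepted separately because the
quoted one-line test on its own would reject it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not LambdaMOO code): should input byte c be kept
 * on a non-binary connection? */
bool is_acceptable_input(unsigned char c)
{
    if (c == '\t')
        return true;    /* tab is allowed by the existing handler */
    /* The test from the reply: ignore the high bit, then require a
     * non-control, non-DEL code.  Accepts 0x20-0x7E and 0xA0-0xFE, so
     * 0x00 and 0x0A can never get through. */
    return (c & 0x7F) >= 0x20 && (c & 0x7F) != 0x7F;
}

/* Copy the acceptable bytes of in[0..n) to out (which needs room for n
 * bytes) and return how many were kept. */
size_t filter_input(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (is_acceptable_input(in[i]))
            out[kept++] = in[i];
    }
    return kept;
}
```

Note that this test still rejects 0x80-0x9F and 0xFF. Some 8-bit Cyrillic code
pages assign letters there (0xFF is a letter in both KOI8-R and Windows-1251),
so the range may need widening for them, while still keeping 0x00 and 0x0A out.
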
Thanks for the detailed explanation.
I don't want to hack the network input handler myself, because the resulting
code would be incompatible with the main Lambda release.
Maybe the next Lambda release will support extended ASCII sets (one octet per character)?
Supporting Unicode, which is 2 octets per character, would require much more work.
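
To illustrate why multi-byte sets need more work: once a character can occupy
more than one byte, the byte count and the character count of a string diverge.
A minimal sketch, assuming UTF-8 purely as an example of a multi-byte encoding
(the thread does not name one); utf8_char_count() is a hypothetical helper:

```c
#include <stddef.h>

/* Count characters (code points) in a NUL-terminated UTF-8 string.
 * Continuation bytes have the form 10xxxxxx; every other byte starts a
 * new character, so only those are counted.  Assumes valid UTF-8. */
size_t utf8_char_count(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

For three Cyrillic letters encoded as UTF-8, strlen() reports 6 bytes while
this function reports 3 characters. A server that indexes strings by byte
would need every string operation adjusted in the same way.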

What do you think about adding this support using only the means that MOO itself provides?
For example, using the following approach:
Add a property to the generic player object: a flag that says which language
is currently selected. Add new verbs to the generic player object, such as @lat and @alt,
that switch the current language selection. Rewrite all verbs on the generic
player that deal with string properties so that, if the language flag
is set to "alt", string values are stored in "alt" string properties.
Of course, the verbs that retrieve string values would also have to be
rewritten to get the values from the "alt" string properties in that case.
Provided that the player has a special client program that can switch to the
"alt" language and back, and can encode alt-language characters into the standard ASCII table
and back, maybe using special escape sequences such as '/x', there would be
no need to rewrite the Lambda source code.
This approach, if it is possible at all, has the advantage of giving a bilingual MOO.
The obvious limitation: no mixed-language strings.
The main question here is how much MOO code I would have to rewrite
to go that way.
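
The client-side encoding could look something like the following C sketch. The
exact escape format is an assumption, since only '/x' is suggested above: here
every byte of 0x80 or above becomes "/xHH" (two hex digits), and a literal '/'
is doubled so that decoding stays unambiguous. esc_encode(), esc_decode() and
hex_val() are hypothetical names.

```c
#include <stddef.h>

/* Encode in[0..n) as 7-bit ASCII into out, NUL-terminated.
 * out needs room for 4*n + 1 bytes.  Returns the encoded length. */
size_t esc_encode(const unsigned char *in, size_t n, char *out)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t o = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned char c = in[i];
        if (c >= 0x80) {                /* alt-language byte: "/xHH" */
            out[o++] = '/';
            out[o++] = 'x';
            out[o++] = hex[c >> 4];
            out[o++] = hex[c & 0x0F];
        } else if (c == '/') {          /* literal slash: "//" */
            out[o++] = '/';
            out[o++] = '/';
        } else {
            out[o++] = (char)c;         /* plain ASCII passes through */
        }
    }
    out[o] = '\0';
    return o;
}

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decode a NUL-terminated escaped string into out (room for strlen(in)
 * bytes is enough).  Returns the decoded length, or (size_t)-1 if the
 * input contains a malformed escape. */
size_t esc_decode(const char *in, unsigned char *out)
{
    size_t o = 0;
    while (*in != '\0') {
        if (*in != '/') {
            out[o++] = (unsigned char)*in++;
        } else if (in[1] == '/') {
            out[o++] = '/';
            in += 2;
        } else if (in[1] == 'x') {
            int hi = hex_val(in[2]);
            int lo = (hi < 0) ? -1 : hex_val(in[3]);
            if (lo < 0)
                return (size_t)-1;
            out[o++] = (unsigned char)(hi * 16 + lo);
            in += 4;
        } else {
            return (size_t)-1;
        }
    }
    return o;
}
```

With this scheme the server only ever sees printing ASCII, so it needs no
changes. The cost is that each alt-language character occupies four characters
of a MOO string, so any MOO code that measures or indexes those strings works
on the encoded form, not on what the player sees.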


