Is there any way Absolute Telnet could do an internal translation from UTF-8 2-byte characters to 1-byte (0:255) values? We still strongly prefer to be able to store a character in a byte.
I believe you may have a misunderstanding about the terminal's role in your application's design. Absolute doesn't care how you store, sort, or convert your data. It simply takes the data in any of the supported formats (your choice) and displays it. If your database only supports single-byte characters, then you're stuck with one of the single-byte legacy encodings, as we discussed before. That approach will probably require you to write a custom sorting algorithm, because sorting by binary values will give the wrong order. So be it. That may just be work you can't avoid. The downside is that as you add languages, you'll have to support additional legacy encodings, special sorts, and so on. This is the kind of work that Unicode was designed to help you avoid.
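To illustrate why a binary sort goes wrong with a single-byte encoding, here is a minimal Python sketch. It uses ISO 8859-1 and a few Latin words only because the result is easy to check by eye; the `WEIGHTS` table and `sort_key` function are hypothetical names invented for this example, and a real table for your encoding would need an entry for every character you care about.

```python
# Sketch: byte-order sort vs. a hand-written collation table for a
# single-byte encoding. The encoding (ISO 8859-1) and the WEIGHTS table
# are illustrative assumptions, not part of any terminal or database.

words = ["zebra", "école", "apple", "eagle"]
encoded = [w.encode("latin-1") for w in words]

# Plain binary sort: the byte for 'é' (0xE9) is larger than every ASCII
# letter, so "école" lands after "zebra".
binary_order = [b.decode("latin-1") for b in sorted(encoded)]

# Hypothetical collation table: one sort weight per byte value.
# By default a byte sorts as itself; accented letters are remapped so
# they sort next to their base letter.
WEIGHTS = {byte: byte for byte in range(256)}
WEIGHTS[0xE9] = ord("e")  # é sorts like e
WEIGHTS[0xC9] = ord("e")  # É sorts like e

def sort_key(raw):
    """Translate each stored byte to its collation weight."""
    return [WEIGHTS[b] for b in raw]

custom_order = [b.decode("latin-1") for b in sorted(encoded, key=sort_key)]

print(binary_order)
print(custom_order)
```

Each additional legacy encoding would need its own table of this kind, which is the recurring cost described above.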
There isn't any algorithm that can take 36 Unicode characters and store them in 36 bytes unless you convert them to some single-byte legacy encoding. Then you're back to square one.
Typically, a Unicode application does not store data internally in UTF-8. UTF-8 is *not* a 2-byte encoding. It is a variable-length multi-byte encoding that stores a character in 1, 2, 3, or even 4 bytes! This variability makes it a poor choice for fixed-width data storage, because it is hard to tell how many bytes you need to hold a given number of characters, or how many characters will fit in your 36-byte field. Applications tend to store data in UCS-2, where every Unicode character it can represent (the Basic Multilingual Plane, which includes Arabic) takes exactly two bytes. Of course, this requires quite a bit of application modification and extra storage, as you said. However, once this work is done, adding new languages is trivial.
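The byte counts are easy to check for yourself. The following Python sketch is only an illustration: the sample characters and the `byte_len` helper are chosen for this example, and `utf-16-le` is used as a stand-in for UCS-2 because the two produce the same bytes for characters in the Basic Multilingual Plane.

```python
# Sketch: how many bytes one character needs in different encodings.
# Sample characters and the helper name are assumptions for illustration.

def byte_len(text, encoding):
    """Number of bytes needed to store text in the given encoding."""
    return len(text.encode(encoding))

samples = {
    "latin_a": "A",               # U+0041
    "e_acute": "\u00e9",          # U+00E9
    "arabic_beh": "\u0628",       # U+0628
    "euro_sign": "\u20ac",        # U+20AC
    "emoji": "\U0001F600",        # U+1F600, outside the BMP
}

utf8_sizes = {name: byte_len(ch, "utf-8") for name, ch in samples.items()}
utf16_sizes = {name: byte_len(ch, "utf-16-le") for name, ch in samples.items()}

# A 36-character Arabic string in three encodings.
field = "\u0628" * 36
field_sizes = {
    "cp1256": byte_len(field, "cp1256"),        # single-byte legacy encoding
    "utf-8": byte_len(field, "utf-8"),
    "utf-16-le": byte_len(field, "utf-16-le"),
}

print(utf8_sizes)
print(utf16_sizes)
print(field_sizes)
```

Only the legacy code page fits 36 Arabic characters into 36 bytes; both Unicode encodings need 72 here. The emoji also shows the limit of the two-byte assumption: it takes four bytes in UTF-16 and cannot be represented in strict UCS-2 at all.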
Regardless of how you store it, when you send the data to the terminal, it has to be in one of the encodings the terminal supports (Win1256, ISO8859-6, etc.).
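As a rough sketch of that last step, assuming the application keeps its strings as two-byte Unicode and the terminal session is set to Windows-1256, the conversion on output could look like the following in Python. The `to_terminal_bytes` function is a hypothetical helper for this example; `cp1256` and `iso8859_6` are the Python codec names for the two encodings mentioned above.

```python
# Sketch: convert internally stored Unicode to the terminal's encoding
# just before sending. The function name and the choice of "replace"
# error handling are assumptions made for illustration.

def to_terminal_bytes(stored, stored_encoding="utf-16-le",
                      terminal_encoding="cp1256"):
    """Decode stored bytes to text, then re-encode for the terminal.

    Characters the terminal encoding cannot represent become '?'.
    """
    text = stored.decode(stored_encoding)
    return text.encode(terminal_encoding, errors="replace")

# Four Arabic letters, stored as two bytes each.
word = "\u0633\u0644\u0627\u0645"
stored = word.encode("utf-16-le")

for_win1256 = to_terminal_bytes(stored, terminal_encoding="cp1256")
for_iso8859_6 = to_terminal_bytes(stored, terminal_encoding="iso8859_6")

# A character that neither Arabic code page contains.
unsupported = to_terminal_bytes("\u4e2d".encode("utf-16-le"))

print(len(stored), len(for_win1256), len(for_iso8859_6), unsupported)
```

The storage format and the wire format are independent: the same stored data can be sent in either encoding, and the two results generally contain different byte values for the same letters.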