Re: UTF-8 editing
On 2 Oct, spam2011m@... typed:
> In message <0861d8f654.boase@...>
> on 21 Aug 2015 Bernard Boase wrote:
>> Have I missed a vital extra setting, or is UTF-8 support in MPro's
>> Editor precluded by that lack of UTF-8 support in the Wimp?
> I don't have any experience with a recent enough version of EmailEdit,
> but it's not a problem fundamentally to do with the Wimp. There is
> nothing stopping a custom editor like EmailEdit displaying Unicode
> correctly. You might have trouble typing characters if your alphabet
> is still set to Latin 1, but you'd probably have trouble typing the
> Unicode characters anyway.
> You didn't exactly say how it couldn't cope. Does it display
> gibberish in the editor window if you quote the message when replying?
> Or is it displaying OK but cannot be edited?
My EmailEdit is version 2.02 (14-Feb-2015).
The incoming email had the Greek word 'laoutzikos' shown correctly in
Greek characters. If I reply quoting the message, the word is
displayed in EmailEdit as a sequence of 10 character pairs with the
first of each pair being a capital I circumflex (&CE) or I diaeresis
(&CF) as you'd probably expect. Thus the initial lambda is &CEBB.
Pressing F8 to put the email into Edit shows the lambda as =CE=BB,
presumably MPro's internal way of representing UTF-8.
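
For anyone wanting to check the byte values, here is a minimal Python
sketch (not anything MPro or EmailEdit actually runs). It shows that
lambda encodes to &CE &BB in UTF-8, that reading those two bytes as
Latin-1 gives an I circumflex followed by another character, and that
the =CE=BB form has the same shape as the standard quoted-printable
encoding of those bytes. Whether MPro uses quoted-printable internally
is an assumption on my part.

```python
import quopri

lam = "\u03bb"                      # GREEK SMALL LETTER LAMDA (lambda)
raw = lam.encode("utf-8")           # the two UTF-8 bytes
print(raw.hex().upper())            # CEBB

# The same two bytes misread as ISO 8859-1 (Latin-1): one glyph per byte.
as_latin1 = raw.decode("latin-1")
print([hex(ord(c)) for c in as_latin1])   # ['0xce', '0xbb']

# "=CE=BB" decoded as quoted-printable yields the same two bytes.
from_qp = quopri.decodestring(b"=CE=BB")
print(from_qp == raw)               # True
```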
In the same email there was a pair of sexed double quotes which
displayed correctly but, in EmailEdit's reply, became the separate
characters &E2 &80 &9C and &E2 &80 &9D, plausibly correct UTF-8 when
strung together.
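
A quick way to confirm that, sketched in Python; the helper name
decode_hex_run is made up for this illustration and just reads the
&xx notation used above.

```python
import unicodedata

def decode_hex_run(text):
    """Decode a run such as '&E2 &80 &9C' as a single UTF-8 sequence."""
    raw = bytes(int(token.lstrip("&"), 16) for token in text.split())
    return raw.decode("utf-8")

left = decode_hex_run("&E2 &80 &9C")
right = decode_hex_run("&E2 &80 &9D")
print(unicodedata.name(left))    # LEFT DOUBLE QUOTATION MARK
print(unicodedata.name(right))   # RIGHT DOUBLE QUOTATION MARK
```

So each triple is one character, U+201C and U+201D respectively.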
I don't wish to add any non-ASCII characters (yet!) but would like the
quoted message to remain in UTF-8.
Rob Sprowson, speaking at ROUGOL last month, indicated that support
for UTF-8 already exists in RISC OS 5 but there is work to do in the
printing system and some applications such as Edit and the text areas
in Draw. He added that setting *Alphabet UTF8 already enables anything
using the Wimp for text input to work correctly.
So I tried that, and then Reply to sender rendered the UTF-8
characters correctly, with an exception: tau (&CF84) produced a black
lozenge question mark (font DejaVu's 'unknown'). Is this because &84
(among others) is undefined in Acorn's Latin1 ISO 8859/1?
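
To make the question concrete, a small Python sketch of what is
special about &84. It only checks standard ISO 8859-1 and Unicode
facts; it says nothing about what Acorn's Latin1 defines at &84 or
about how the Wimp or font manager treats the byte, so it does not
settle the question.

```python
import unicodedata

# &CF &84 is a well-formed two-byte UTF-8 sequence for tau.
tau = bytes([0xCF, 0x84]).decode("utf-8")
print(unicodedata.name(tau))            # GREEK SMALL LETTER TAU

second = 0x84
# Every UTF-8 continuation byte lies in 0x80-0xBF.
is_continuation = 0x80 <= second <= 0xBF
# In standard ISO 8859-1, 0x80-0x9F are C1 control codes, not glyphs.
in_c1_range = 0x80 <= second <= 0x9F
print(is_continuation, in_c1_range)     # True True

# By contrast, the &BB in lambda's &CE &BB is outside the C1 range.
print(0x80 <= 0xBB <= 0x9F)             # False
```

If something in the rendering path still looks the second byte up in
the Latin1 table, a byte in the &80 to &9F range would be the likeliest
to go wrong, which is consistent with the guess above.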
And, as expected, I can't actually edit in Unicode until a version of
Chars or XChars appears from which to pick Unicode characters.
There is a corresponding issue with UTF-8 in headers, but that is
addressed in a different thread here about the Subject line.
--
Bernard
______________________________________________________________________
This message was sent via the messenger-l mailing list
To unsubscribe, mail messenger-l+unsubscribe@...