Weird result decoding KOI8-R in CDO 1.21, Ex2003 store sync sink
I have a very strange problem with a non-Western charset (KOI8-R,
Russian Cyrillic) in a store Save event sink with CDOEX 1.2.1. It
occurs in both MIME and "plain" RFC822 messages.
The ADO record passed to the message is certainly correct: I dumped its
stream, the header indicates charset=koi8-r, and the text inside is
correctly encoded. (By the way, when I change that stream's Charset
property, the dumped file changes accordingly; I am saving the stream
data myself rather than using the SaveToFile method.) Also, after the
sink runs (I do not modify the message), the message can be read
perfectly with Outlook on the client machine. Finally, I am sure I
did not mess up the CDO libraries on the server box.
I create a CDO Message and then Open() it, passing the ADO record as a
parameter. After that I can access all properties and the whole
hierarchy of subobjects of the CDO message, and save its body stream
(or those of the MIME parts). The streams come out fine for ASCII
messages, but the KOI8-R ones come out garbled in a strange way.
First, when I obtain a stream from CDO, its Charset property is set to
"koi8-r". If I save the stream without touching any property, all
characters are decoded incorrectly. I worked out the transformation :),
but that did not make things even a tiny bit clearer. Here it is (hold
on to your cerebella, folks):
- Take the encoded KOI8-R message;
- decode it correctly into CP1251 (why 1251???);
- then run it through the same table once again (!!!!!).
When I set the CDO part stream to binary mode and save it, the message
is indeed decoded into 8-bit code points of CP1251. By experimenting,
I found that the behind-the-scenes machinery works as if:
- The message part were kept in the CDO object already translated
into code page 1251;
- But the CDO stream of the part still considered it to be KOI8-R,
and applied another translation to the code points *from KOI8-R*
to whatever charset I specify in the stream's Charset property.
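To make the hypothesis above concrete, here is a minimal Python sketch
that reproduces the same double translation outside of CDO. It is only
a model of the behavior I observe, not what CDO actually does
internally; the function names are made up for illustration, and the
sample text is limited to letters that exist in both code pages.

```python
def koi8r_to_cp1251(data: bytes) -> bytes:
    """Translate KOI8-R bytes into CP1251 bytes (the 'table' in question)."""
    return data.decode("koi8_r").encode("cp1251")


def simulate_stream_read(koi8_message: bytes, target_charset: str) -> bytes:
    """Model of the suspected behavior (an assumption, not CDO's real code).

    1. The part is held internally already translated to CP1251.
    2. The stream still treats those bytes as KOI8-R and translates
       them again, this time into the requested target charset.
    """
    stored = koi8r_to_cp1251(koi8_message)   # what binary mode appears to return
    mislabeled = stored.decode("koi8_r")     # CP1251 bytes misread as KOI8-R
    return mislabeled.encode(target_charset)


def undo_double_translation(garbled: bytes, target_charset: str) -> str:
    """Reverse the second (wrong) translation and recover the text."""
    stored = garbled.decode(target_charset).encode("koi8_r")
    return stored.decode("cp1251")


if __name__ == "__main__":
    original = "Привет"
    raw = original.encode("koi8_r")
    garbled = simulate_stream_read(raw, "cp1251")
    print(garbled.decode("cp1251"))                  # scrambled Cyrillic
    print(undo_double_translation(garbled, "cp1251"))  # original text again
```

If the model is right, the garbled output should be reversible this
way for any target charset that can represent the misread characters,
which would at least give a workaround besides the binary stream.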
So the best result I have obtained so far comes from using the binary
stream: correct 8-bit text in code page 1251. But I cannot even
understand where that charset comes from! Yes, both KOI8-R and
ANSI 1251 are Cyrillic code pages, but nowhere in the properties of any
object could I see a reference to code page 1251.
Can anyone suggest what I am doing wrong, or what else to try?
-kkm
date: Wed, 30 Nov 2005 13:44:10 -0800
author: Kirill 'Big K' Katsnelson