date: Fri, 6 Jun 2008 09:51:01 -0700
group: microsoft.public.win32.programmer.international
How to find only one invalid char in src buffer with MultiByteToWi
Hi all,
I am filling a char buffer with characters in the 0-127 range plus an é
character, and then the MultiByteToWideChar API fails. If I include the é
character twice, it succeeds. Please see the code snippet below, and please
correct me if I am wrong. My system settings are United States, English. VC++
6.0, Windows XP.
CHAR szData[100] = {0};
strcpy(szData, "1é2345");
INT nWideCharBufferLen = MultiByteToWideChar(CP_UTF8, MB_PRECOMPOSED,
szData, -1, 0, 0 );
// The issue is here.
I think the MultiByteToWideChar API should return 0, but it always returns 5.
How?
The API should not be able to convert the "é" character, because it is
already encoded.
One more thing: if the é character appears several times (for example
"1éééé2345"), it returns zero, so I can tell that the source string contains
invalid characters.
I also need to detect the case where the é character appears only once.
How?
Thanks in advance.
--
Thanks & Regards,
Bill.
date: Fri, 6 Jun 2008 09:51:01 -0700
author: Bill
Re: How to find only one invalid char in src buffer with MultiByteToWi
Why on earth do you expect consistent and correct results from an API that
you feed with incorrect data? Why do you tell the API that you are giving it
a UTF-8 string even though it's not UTF-8? Adding more éééé's just modifies
the byte stream. Sometimes it happens to be valid UTF-8, sometimes it doesn't.
Let me copy/paste my answer to your previous post:
*** Your literal ANSI string is encoded using the default codepage, not
UTF8. ***
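To see why a single é from the default codepage cannot be valid UTF-8, here
is a minimal portable sketch of a well-formedness check. It assumes the é in
the literal is stored as the single byte 0xE9 (as in Windows-1252), and
firstInvalidUtf8 is a hypothetical helper name, not a Win32 function. In
UTF-8, 0xE9 announces a three-byte sequence, so it must be followed by two
bytes in the range 0x80-0xBF; '2' and '3' are not.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of Win32): returns the index of the first
// byte that starts an ill-formed UTF-8 sequence, or -1 if the whole buffer
// is well-formed UTF-8.
long firstInvalidUtf8(const unsigned char* s, std::size_t n)
{
    assert(s != 0 || n == 0);
    std::size_t i = 0;
    while (i < n) {
        unsigned char b = s[i];
        if (b < 0x80) { ++i; continue; }   // plain ASCII byte

        std::size_t len;       // total bytes in this sequence
        unsigned long cp;      // code point being decoded
        unsigned long minCp;   // smallest code point allowed for this length
        if ((b & 0xE0) == 0xC0)      { len = 2; cp = b & 0x1F; minCp = 0x80; }
        else if ((b & 0xF0) == 0xE0) { len = 3; cp = b & 0x0F; minCp = 0x800; }
        else if ((b & 0xF8) == 0xF0) { len = 4; cp = b & 0x07; minCp = 0x10000; }
        else return (long)i;   // stray continuation byte or invalid lead byte

        if (n - i < len) return (long)i;   // sequence is truncated

        for (std::size_t k = 1; k < len; ++k) {
            unsigned char c = s[i + k];
            if ((c & 0xC0) != 0x80) return (long)i;   // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }

        // Reject overlong forms, surrogates, and values beyond U+10FFFF.
        if (cp < minCp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return (long)i;

        i += len;
    }
    return -1;
}
```

Called on the bytes of "1é2345" as stored in the default codepage, this
reports index 1, whether the é appears once or several times; the same text
encoded as real UTF-8 (é as 0xC3 0xA9) passes. On the Win32 side, the
documented way to make MultiByteToWideChar itself reject such input is to
pass MB_ERR_INVALID_CHARS in dwFlags (MB_PRECOMPOSED is not a documented flag
for CP_UTF8); how strictly Windows XP enforces this should be verified on the
target system. If the data really is in the default codepage, converting with
CP_ACP instead of CP_UTF8 avoids the problem altogether.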
HTH,
Serge.
http://www.apptranslator.com - Localization tool for your C++ applications
date: Mon, 9 Jun 2008 13:13:45 +0200
author: Serge Wautier