A Taiwanese user of ReactOS came into the IRC channel to report a bug. If
you set ReactOS's code page to 936, the function RtlMultiByteToUnicodeSize
crashes during startup.
I can't code a fix for ReactOS myself, but I can describe how the function
should behave. The algorithm should work like this:
- If the code page is not DBCS, don't bother, and just set *UnicodeSize to
MbSize * sizeof(WCHAR). This is already done.
- Begin counting with a length of 0.
- While MbSize is not zero:
-- Grab a byte and decrement MbSize.
-- Determine whether it is a DBCS lead byte for the code page.
-- If it is a lead byte:
--- If MbSize is now zero, the string ends in the middle of a character.
Increment length, set *UnicodeSize to length * sizeof(WCHAR) and return
STATUS_SUCCESS. The broken half-character is counted as one character.
--- Otherwise, decrement MbSize again and increment length. Two DBCS bytes
just became a single Unicode character. The value of the second byte is
ignored.
-- If it is not:
--- Increment length.
- Set *UnicodeSize to length * sizeof(WCHAR) and return STATUS_SUCCESS.
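Here is a minimal sketch of the steps above in standalone C. It is only an illustration, not a patch against the ReactOS tree. The names SketchWchar, SketchIsLeadByte and SketchMbToUnicodeSize are hypothetical, and the lead-byte test is hard-coded to the 0x81..0xFE range used by code page 936 as an assumption. A real implementation would consult the lead-byte information of the loaded NLS table, write the result through *UnicodeSize, and return STATUS_SUCCESS instead of returning the size directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for WCHAR: one 16-bit UTF-16 code unit. */
typedef uint16_t SketchWchar;

/* Hypothetical lead-byte test, hard-coded for code page 936 (assumption).
 * The real routine would look this up in the active code page's table. */
static int SketchIsLeadByte(unsigned char Byte)
{
    return Byte >= 0x81 && Byte <= 0xFE;
}

/* Returns the number of bytes the converted UTF-16 string would need. */
static size_t SketchMbToUnicodeSize(const char *MbString, size_t MbSize)
{
    size_t Length = 0; /* counted in characters, not bytes */

    assert(MbString != NULL || MbSize == 0);

    while (MbSize != 0) {
        /* Grab a byte and decrement MbSize. */
        unsigned char Byte = (unsigned char)*MbString++;
        MbSize--;

        if (SketchIsLeadByte(Byte)) {
            if (MbSize == 0) {
                /* Lead byte with no trail byte: count the broken
                 * half-character and stop. */
                Length++;
                break;
            }
            /* Skip the trail byte; its value is ignored. */
            MbString++;
            MbSize--;
        }

        /* One single byte, or one lead/trail pair, is one character. */
        Length++;
    }

    return Length * sizeof(SketchWchar);
}
```

With this sketch, the three bytes of "abc" give a size of 6, a four-byte string made of two lead/trail pairs gives 4, and a string that ends on a lone lead byte still counts that byte as one character.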
Is it possible for a DBCS character's mapping to be a UTF-16 surrogate? If
so, the routine becomes more complicated.
I personally think ReactOS should support UTF-8 as a default code page, but
I doubt that others agree. This function is one of the many that would have
to change...
Melissa