C++ character encoding UTF-8



I've got code that converts percent-encoded Unicode input to the appropriate character. For example, when the user enters úsername, the browser returns %FAsername and the code converts it to úsername.

However, when the browser encoding is set to UTF-8, the value passed to the code is %C3%BAsername, which is converted to the wrong value instead of the úsername expected for authentication. How can I modify the code to make it UTF-8 compatible?

No complete answer, but a few remarks.

There are a couple of things wrong. ú has the Unicode code point U+00FA, or as developers say: 0x00FA. Unicode has room for 17 x 2^16 code points (U+0000 to U+10FFFF), so UTF-8 uses multi-byte sequences. For pure 7-bit ASCII, Unicode = ASCII and one byte per character is enough. For U+00FA more than 1 byte is needed.
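To make the byte counts concrete, here is a minimal sketch of how a single code point maps to UTF-8 bytes. The function name `utf8_encode` is a hypothetical helper chosen for illustration, not part of the asker's code.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: encodes one Unicode code point as UTF-8 bytes.
// Surrogates and values above U+10FFFF yield an empty string.
std::string utf8_encode(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {                        // 1 byte:  0xxxxxxx (plain ASCII)
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {                // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {              // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        if (cp >= 0xD800 && cp <= 0xDFFF) return out;  // surrogates are not characters
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {            // 4 bytes: 11110xxx followed by three 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

With this sketch, `utf8_encode(0x00FA)` yields the two bytes C3 BA, which is exactly what the browser sends as %C3%BA.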

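%C3%BA seems correct: each %XX is one URL-encoded byte, and C3 BA is the UTF-8 encoding of U+00FA. A single byte such as %FA only works in a single-byte encoding like ISO-8859-1. For U+0109, ĉ, a single byte will not do at all.

For decoding UTF-8 to a wide char string and encoding it back, there exist sufficient code snippets.

I am afraid the handling has to change.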


Normal procedure

One receives a URL-encoded string, with bytes written as %XX.

char* url_decode(const char* encoded); // translates each %XX into one char (byte)

Now you have a byte stream that arrived as UTF-8: a multi-byte UTF-8 string.

wchar_t* utf8_decode(const char* bytes); // translates the bytes into text

This resolves the multi-byte sequences into a string of UTF-16 characters (assuming a 16-bit wchar_t, as on Windows).
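A minimal sketch of the URL-decoding step could look like the following. It uses `std::string` instead of the `char*` signature above, which is a choice made here for memory safety, and `hex_value` is a hypothetical helper. It deliberately does not treat `+` as a space, and it copies malformed escapes through unchanged.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: value of a hex digit, or -1 if c is not a hex digit.
static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Translates each %XX escape into one raw byte. The result is a byte string;
// no character-set interpretation happens at this stage.
std::string url_decode(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '%' && i + 2 < in.size()) {
            int hi = hex_value(in[i + 1]);
            int lo = hex_value(in[i + 2]);
            if (hi >= 0 && lo >= 0) {
                out += static_cast<char>(hi * 16 + lo);
                i += 2;                     // skip the two hex digits just consumed
                continue;
            }
        }
        out += in[i];                       // ordinary character or malformed escape
    }
    return out;
}
```

Decoding `%c3%basername` this way gives the bytes C3 BA followed by `sername`; the bytes still have to be interpreted as UTF-8 afterwards.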
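The UTF-8 decoding step might be sketched as below. This version returns `std::u16string` rather than `wchar_t*`, an assumption made here so that the UTF-16 result does not depend on the platform's `wchar_t` size. Replacing malformed input with U+FFFD is also a choice of this sketch; for authentication, rejecting such input outright may be preferable.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Sketch: decodes UTF-8 bytes into UTF-16 code units.
// Malformed input is replaced with U+FFFD instead of being rejected.
std::u16string utf8_decode(const std::string& bytes) {
    const char16_t replacement = 0xFFFD;
    // Smallest code point allowed for each sequence length (rejects overlong forms).
    static const std::uint32_t min_cp[] = {0, 0x80, 0x800, 0x10000};
    std::u16string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char b = static_cast<unsigned char>(bytes[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;              // continuation bytes expected after the lead byte
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else { out += replacement; ++i; continue; }   // stray continuation or invalid lead byte
        ++i;
        bool ok = true;
        for (std::size_t k = 0; k < extra; ++k) {
            if (i >= bytes.size() ||
                (static_cast<unsigned char>(bytes[i]) & 0xC0) != 0x80) {
                ok = false;                 // truncated or broken sequence
                break;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i]) & 0x3F);
            ++i;
        }
        if (!ok || cp < min_cp[extra] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF)) {
            out += replacement;
            continue;
        }
        if (cp < 0x10000) {
            out += static_cast<char16_t>(cp);
        } else {                            // outside the BMP: emit a surrogate pair
            cp -= 0x10000;
            out += static_cast<char16_t>(0xD800 + (cp >> 10));
            out += static_cast<char16_t>(0xDC00 + (cp & 0x3FF));
        }
    }
    return out;
}
```

Given the bytes C3 BA followed by `sername`, this produces U+00FA followed by `sername`, i.e. úsername. A lone FA byte, as sent by a browser using a single-byte encoding, is not valid UTF-8 and comes out as U+FFFD, which is why the two inputs cannot be handled by the same decoder without knowing the encoding.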

