C# - How to read string from HttpRequest form data in correct encoding
Today I built a service to receive emails from SendGrid, and I sent an email with the text "at long last", the first time I used a non-English language during testing. Unfortunately, the encoding has become a problem that I cannot fix.
In a ServiceStack service I have a string property (in the input object posted to the service by SendGrid) that is in an encoding other than UTF-8 or Unicode (KOI8-R in my case).
    public class SengGridEmail : IReturn<SengGridEmailResponse>
    {
        public string Text { get; set; }
    }

When I try to convert the string to UTF-8, I get ????s, because by the time I access the Text property it has already been converted to Unicode (.NET's internal string representation). This question and its answer illustrate the issue.
My question is: how do I get the original KOI8-R bytes within a ServiceStack service or an ASP.NET MVC controller, and convert them to UTF-8 text?
Update:
Accessing base.Request.FormData["text"] doesn't help:

    var originalEncoding = Encoding.GetEncoding("koi8-r");
    var originalBytes = originalEncoding.GetBytes(base.Request.FormData["text"]);

But if I take the Base64 string from the original sent mail, convert it to a byte[], and then convert those bytes to a UTF-8 string, it works. So either base.Request.FormData["text"] is already in the .NET Unicode string format, or (less likely) the problem is on SendGrid's side.
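To make the failure mode concrete, here is a minimal sketch of why re-encoding the already-decoded string cannot recover the original bytes. It assumes the framework decoded the KOI8-R bytes as UTF-8 before handing over the string. The class and method names are hypothetical, and the RegisterProvider call is only needed on .NET Core / .NET 5+ (on .NET Framework, koi8-r should be available without it).

```csharp
using System;
using System.Text;

public static class LossyDecodeDemo
{
    // KOI8-R bytes for "наконец-то\r\n" (the same data as Base64 "zsHLz87Fwy3Uzw0K").
    public static readonly byte[] KoiBytes =
        { 0xCE, 0xC1, 0xCB, 0xCF, 0xCE, 0xC5, 0xC3, 0x2D, 0xD4, 0xCF, 0x0D, 0x0A };

    static LossyDecodeDemo()
    {
        // Makes legacy code pages such as koi8-r available on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Assumption: this mirrors what happens to the form field before we see it.
    public static string DecodeAsUtf8(byte[] bytes) => Encoding.UTF8.GetString(bytes);

    // U+FFFD is what the UTF-8 decoder substitutes for byte sequences it cannot decode.
    public static bool ContainsReplacementChar(string s) => s.IndexOf('\uFFFD') >= 0;

    // Wrongly decode as UTF-8, then try to get the KOI8-R bytes back.
    public static byte[] RoundTripThroughWrongDecode(byte[] koiBytes)
    {
        var wrongString = DecodeAsUtf8(koiBytes);
        return Encoding.GetEncoding("koi8-r").GetBytes(wrongString);
    }
}
```

Once the invalid byte sequences have been replaced during decoding, the original bytes are gone, so calling GetBytes on the resulting string can only produce placeholder characters. That would explain the ????s.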
Update 2: Here is a unit test that shows what is happening:
    // Requires: using System; using System.Text; using NUnit.Framework;
    [Test]
    public void EncodingTest()
    {
        const string originalString = "наконец-то\r\n";
        const string base64Koi = "zsHLz87Fwy3Uzw0K";
        const string charset = "koi8-r";

        var originalBytes = Convert.FromBase64String(base64Koi); // KOI8-R bytes
        var originalEncoding = Encoding.GetEncoding(charset); // KOI8-R encoding
        var originalText = originalEncoding.GetString(originalBytes);
        // the initial string is correctly converted to the .NET representation
        Assert.AreEqual(originalString, originalText);

        var unicodeEncoding = Encoding.UTF8;
        // this is how the KOI8-R string is represented in .NET; it equals base.Request.FormData["text"]
        var originalWrongString = unicodeEncoding.GetString(originalBytes);
        var originalWrongBytes = originalEncoding.GetBytes(originalWrongString);

        var unicodeBytes = Encoding.Convert(originalEncoding, unicodeEncoding, originalBytes);
        var result = unicodeEncoding.GetString(unicodeBytes);

        var unicodeWrongBytes = Encoding.Convert(originalEncoding, unicodeEncoding, originalWrongBytes);
        var wrongResult = unicodeEncoding.GetString(unicodeWrongBytes); // this is what I see in the DB

        Assert.AreEqual(originalString, result);
        Assert.AreEqual(originalString, wrongResult); // I want this to pass!
    }
I discovered two underlying problems behind this one.
The first is SendGrid: it posts multi-part data without specifying a Content-Type for non-Unicode elements.
The second is ServiceStack: it doesn't support encodings other than UTF-8 for multi-part data.
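For illustration, this is roughly what honoring a per-part charset could look like when a part does carry a Content-Type header. It is only a sketch: PartDecoder, ResolveEncoding and DecodePart are hypothetical names, not ServiceStack or SendGrid APIs, and the fallback encoding is whatever the caller chooses. The header parsing uses System.Net.Mime.ContentType from the base class library.

```csharp
using System;
using System.Net.Mime;
using System.Text;

public static class PartDecoder
{
    static PartDecoder()
    {
        // Makes legacy code pages such as koi8-r available on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Picks the encoding named by the part's Content-Type header, or the fallback
    // when the header is missing, malformed, or names an unknown charset.
    public static Encoding ResolveEncoding(string contentTypeHeader, Encoding fallback)
    {
        if (string.IsNullOrWhiteSpace(contentTypeHeader)) return fallback;
        try
        {
            var charset = new ContentType(contentTypeHeader).CharSet;
            return string.IsNullOrEmpty(charset) ? fallback : Encoding.GetEncoding(charset);
        }
        catch (FormatException) { return fallback; }   // header could not be parsed
        catch (ArgumentException) { return fallback; } // charset name not recognized
    }

    // Decodes the raw bytes of one multi-part element.
    public static string DecodePart(byte[] rawBytes, string contentTypeHeader, Encoding fallback)
    {
        return ResolveEncoding(contentTypeHeader, fallback).GetString(rawBytes);
    }
}
```

With a header such as "text/plain; charset=koi8-r" the KOI8-R bytes decode correctly. With no header at all (the SendGrid case described above) the decoder can only guess, which is how the text ends up decoded as UTF-8.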
Update:
SendGrid's helpdesk promised to look into the issue, and ServiceStack now supports custom charsets in multi-part data.
As for the initial question itself, one can access the buffered stream in ServiceStack as described here: Can ServiceStack Runner Get Request Body?
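Once the raw bytes are available, the conversion itself is straightforward. The sketch below is framework-agnostic and makes two assumptions: the Stream contains only the bytes of the text field (extracting one element from a multi-part body is not shown), and the charset name is known by some other means. RawBodyDecoder and its methods are hypothetical names.

```csharp
using System.IO;
using System.Text;

public static class RawBodyDecoder
{
    static RawBodyDecoder()
    {
        // Makes legacy code pages such as koi8-r available on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Copies the whole stream into memory so the bytes can be decoded explicitly.
    public static byte[] ReadAllBytes(Stream body)
    {
        using (var buffer = new MemoryStream())
        {
            body.CopyTo(buffer);
            return buffer.ToArray();
        }
    }

    // Decodes the raw bytes with the given charset into a .NET string.
    public static string DecodeBody(Stream body, string charset)
    {
        return Encoding.GetEncoding(charset).GetString(ReadAllBytes(body));
    }

    // Transcodes the raw bytes from the given charset to UTF-8 bytes.
    public static byte[] ToUtf8Bytes(Stream body, string charset)
    {
        return Encoding.Convert(Encoding.GetEncoding(charset), Encoding.UTF8, ReadAllBytes(body));
    }
}
```

The key point is that decoding has to start from bytes that have never been turned into a string. Any string obtained from the already-parsed form data has lost the information needed to recover them.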