[webkit-qt] Unable to enforce charset for decoding an Html document

Sharma, Ashish ashish.sharma3 at hp.com
Thu Nov 17 05:40:33 PST 2011


Hi,

I use the QtWebKit API to convert HTML files to PDF.

The problem I am facing is that Southeast Asian characters come out corrupted in the output.

Right now I am manually setting the character set for the HTML files in the following way:

	// QWebPage::settings() returns a pointer to the page's settings object
	QWebSettings *objWebSettings = objQWebPage.settings();
	objWebSettings->setDefaultTextEncoding("GB18030");

However, the above code fails for HTML files such as the following:

<html>
<head>
</head>
<body class='hmmessage'><div dir='ltr'>
<br><br><div><hr id="stopSpelling">From: sunbeam_is_me at hotmail.com<br>To: sunbeam0606 at gmail.com<br>Subject: <br>Date: Thu, 10 Nov 2011 14:53:17 +0800<br><br>

<meta http-equiv="Content-Type" content="text/html; charset=unicode">
<meta name="Generator" content="Microsoft SafeHTML">
<style>
.ExternalClass .ecxhmmessage P
{padding:0px;}
.ExternalClass body.ecxhmmessage
{font-size:10pt;font-family:Tahoma;}

</style>

<div dir="ltr">
逆势大;你是我的阿</div></div></div></body>
</html>

To me it looks like the WebKit engine gives precedence to the 'meta' tag that specifies the charset (here "unicode") and ignores the default text encoding that I set.

Is there a way I can enforce my encoding on the WebKit engine?
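
One workaround that might be worth trying, assuming the diagnosis above is right, is to strip the charset declaration from the HTML bytes before they reach WebKit, so that only the default text encoding is left to apply. The sketch below uses plain standard C++; stripMetaCharset is a hypothetical helper, not part of Qt or WebKit, and it assumes the file is stored in an ASCII-compatible encoding (such as GB18030 or UTF-8) so that the tag itself can be matched byte by byte.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of Qt or WebKit): removes every <meta ...>
// tag that carries a charset declaration and leaves all other markup
// untouched. It works on the raw bytes, before any decoding takes place.
std::string stripMetaCharset(const std::string &html)
{
    // Matches <meta http-equiv="Content-Type" content="...; charset=X">
    // as well as the shorter <meta charset="X"> form, case-insensitively.
    static const std::regex metaCharset(
        R"(<meta\b[^>]*charset\s*=[^>]*>)",
        std::regex::ECMAScript | std::regex::icase);
    return std::regex_replace(html, metaCharset, "");
}
```

The cleaned bytes would then be loaded into the page as before, with the default text encoding still set to "GB18030". Whether WebKit then honours that default for this kind of file is an assumption I have not verified.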

Thanks
Ashish

