UTF-8 issue when trying to create a DOM document #18

I fetched a page with cURL. Its charset is windows-1250, and its doctype is:

`<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">`

I convert the string to UTF-8, check the detected encoding, and replace the charset in the meta tag:

`$html = str_replace('windows-1250', 'UTF-8', mb_convert_encoding($result, 'UTF-8'));
var_dump(mb_detect_encoding($html, "UTF-8, ASCII, ISO-8859-1, windows-1250"));
$Doc = \phpQuery::newDocumentHTML($html, 'UTF-8');
echo pq($Doc)->html();`
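A side note on the conversion step: when `mb_convert_encoding` is called without a source encoding, it falls back to the configured internal or default charset rather than windows-1250, so it may be worth ruling that out. Below is a minimal sketch that names the source charset explicitly. It uses `iconv` instead of `mb_convert_encoding` (a swapped-in technique, on the assumption that the local iconv build knows Windows-1250); the `toUtf8` helper and the sample `$result` string are hypothetical stand-ins, not part of phpQuery or of the real page.

```php
<?php
// Sketch: convert bytes from a known source charset to UTF-8 with iconv.
// toUtf8() is a hypothetical helper, not a phpQuery or PHP built-in.
function toUtf8(string $bytes, string $sourceCharset): string
{
    $converted = iconv($sourceCharset, 'UTF-8', $bytes);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert from {$sourceCharset}");
    }
    return $converted;
}

// Hypothetical stand-in for the cURL response: "árvíztűrő" as Windows-1250 bytes.
$result = '<meta http-equiv="Content-Type" content="text/html; charset=windows-1250" />'
    . "<p>\xE1rv\xEDzt\xFBr\xF5</p>";

// Convert first, then rewrite the declared charset so it matches the new bytes.
$html = str_replace('windows-1250', 'UTF-8', toUtf8($result, 'Windows-1250'));

// preg_match with the /u flag returns 1 only for a well-formed UTF-8 subject.
var_dump(preg_match('//u', $html) === 1);
```

If the check fails here, the problem is in the conversion rather than in the DOM layer.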

All the non-ASCII characters come out garbled. `var_dump` says it's UTF-8, and the content type is `content-type="text/plain; charset=UTF-8"`.

When I `var_dump($Doc);`, I see that the DOMDocument `encoding` and `xmlEncoding` values are null.

But if I use:

`$Dom = new \DOMDocument();
$Dom->loadHTML($html);` 

and var_dump it, then everything is fine and the characters are OK.
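One possible explanation for why the plain `DOMDocument` path works is that `loadHTML` takes its input encoding from the charset declared in the markup, which now says UTF-8 after the `str_replace`. The following is a minimal sketch of that behavior under stated assumptions: the `firstParagraphText` helper and the sample markup are hypothetical, and this says nothing about what phpQuery does internally before it hands the markup to the parser.

```php
<?php
// Collect parser warnings quietly instead of printing them.
libxml_use_internal_errors(true);

// Hypothetical helper: parse markup and return the text of the first <p>.
function firstParagraphText(string $html): string
{
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    return $dom->getElementsByTagName('p')->item(0)->textContent;
}

// "árvíztűrő" written as explicit UTF-8 bytes.
$text = "\xC3\xA1rv\xC3\xADzt\xC5\xB1r\xC5\x91";

// Sample markup that declares its charset in a meta tag, like the fetched page.
$withMeta = '<html><head>'
    . '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />'
    . '</head><body><p>' . $text . '</p></body></html>';

// True when the parser honored the declared charset and the text round-trips.
var_dump(firstParagraphText($withMeta) === $text);
```

Comparing this against the markup that phpQuery actually passes to the parser (visible in the debug output) may show whether the charset declaration is still present at that point.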

I've checked `createDocumentWrapper`, and the `$contentType` there is OK.

If I set the static `$debug` to true, I get this:

`string 'Load markup for content type text/html;charset=utf-8' (length=52)

string 'Loading HTML, content type 'text/html;charset=utf-8'' (length=52)

string 'Full markup load (HTML): 
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head id="Head1"><meta http-equiv="Content-Language" content="hu" />' (length=275)

string 'DOC: UTF-8 REQ: UTF-8' (length=21)

string 'Full markup load (HTML), documentCreate('utf-8')' (length=48)

string 'Selecting document '52280a0c077ec7c5fb2f2350db12f22c' as default one' (length=68)`

