  1. POPHP HTML encoding
    <p>I'm trying to parse an HTML page, but the encoding is corrupting my results. After some research I found a very popular solution using <code>utf8_encode()</code> and <code>utf8_decode()</code>, but it doesn't change anything. My code and its output are shown in the following sections.</p>
    <h2>Code</h2>
    <pre><code>$str_html = $this-&gt;curlHelper-&gt;file_get_contents_curl($page);
$str_html = utf8_encode($str_html);

$dom = new DOMDocument();
$dom-&gt;resolveExternals = true;
$dom-&gt;substituteEntities = false;
@$dom-&gt;loadHTML($str_html);

$xpath = new DomXpath($dom);
(...)
$profile = array();
for ($index = 0; $index &lt; $table_lines-&gt;length; $index++) {
    $desc = utf8_decode($table_lines-&gt;item($index)-&gt;firstChild-&gt;nodeValue);
}
</code></pre>
    <h2>Output</h2>
    <pre><code>Testar é bom
</code></pre>
    <p><strong>Should be</strong></p>
    <pre><code>Testar é bom
</code></pre>
    <h2>What I've tried</h2>
    <ul>
    <li><p>htmlentities():</p>
    <p><code>htmlentities($table_lines-&gt;item($index)-&gt;lastChild-&gt;nodeValue, ENT_NOQUOTES, ini_get('ISO-8859-1'), false);</code></p></li>
    <li><p>htmlspecialchars():</p>
    <p><code>htmlspecialchars($table_lines-&gt;item($index)-&gt;lastChild-&gt;nodeValue, ENT_NOQUOTES, 'ISO-8859-1', false);</code></p></li>
    <li><p>Changing my file's charset as described <a href="https://stackoverflow.com/a/6306027/1488993">here</a>.</p></li>
    </ul>
    <h2>Some more information</h2>
    <ul>
    <li>Website encoding: <code>&lt;meta http-equiv="content-type" content="text/html; charset=ISO-8859-1" /&gt;</code></li>
    </ul>
    <p>Thanks in advance!</p>