PHP - DOM class - numbered entities and encodings problem
<p>I'm having some difficulty with the PHP DOM class.</p>
<p>I am making a sitemap script, and I need the output of $doc->saveXML() to look like this:</p>
<pre><code>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;root&gt;
  &lt;url&gt;
    &lt;loc&gt;http://www.somesite.com/servi&amp;#xE7;os/redesign&lt;/loc&gt;
  &lt;/url&gt;
&lt;/root&gt;
</code></pre>
<p>or</p>
<pre><code>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;root&gt;
  &lt;url&gt;
    &lt;loc&gt;http://www.somesite.com/servi&amp;#231;os/redesign&lt;/loc&gt;
  &lt;/url&gt;
&lt;/root&gt;
</code></pre>
<p>but I am getting:</p>
<pre><code>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;root&gt;
  &lt;url&gt;
    &lt;loc&gt;http://www.somesite.com/servi&amp;amp;#xE7;os/redesign&lt;/loc&gt;
  &lt;/url&gt;
&lt;/root&gt;
</code></pre>
<p>This is the closest I could get, using a function that replaces named entities with numbered ones.</p>
<p>I was also able to get</p>
<pre><code>&lt;?xml version="1.0" ?&gt;
&lt;root&gt;
  &lt;url&gt;
    &lt;loc&gt;http://www.somesite.com/servi&amp;amp;#xE7;os/redesign&lt;/loc&gt;
  &lt;/url&gt;
&lt;/root&gt;
</code></pre>
<p>but without the encoding specified.</p>
<p>The best solution (the way I think the code should be written) would be:</p>
<pre><code>&lt;?php
$myArray = array();
// do some stuff to populate the array with URL strings

$doc = new DOMDocument('1.0', 'UTF-8');
// here we modify some property. Maybe this is the answer I am looking for...

$urlset = $doc-&gt;createElement("urlset");
$urlset = $doc-&gt;appendChild($urlset);

foreach ($myArray as $address) {
    $url = $doc-&gt;createElement("url");
    $url = $urlset-&gt;appendChild($url);

    $loc = $doc-&gt;createElement("loc");
    $loc = $url-&gt;appendChild($loc);

    // wrap the URL string in a text node and attach it to &lt;loc&gt;
    $valueContent = $doc-&gt;createTextNode($address);
    $valueContent = $loc-&gt;appendChild($valueContent);
}

echo $doc-&gt;saveXML();
?&gt;
</code></pre>
<p>Notes:</p>
<ul>
<li>The server response header declares the charset as UTF-8;</li>
<li>The PHP script is saved in UTF-8;</li>
<li>The URLs read are UTF-8 strings;</li>
<li>The script above passes the encoding declaration to the DOMDocument constructor and does not use any conversion functions, such as htmlentities, urlencode, utf8_encode...</li>
</ul>
<p>I've tried changing the values of the DOMDocument properties <em>DOMDocument::$resolveExternals</em> and <em>DOMDocument::$substituteEntities</em>. No combination worked.</p>
<p>And yes, I know I can do the whole process without specifying the character set in the DOMDocument constructor, dump the string content into a variable, and do a very simple substitution with string replace functions. This works. But I would like to know where I am slipping up, how this can be done using native APIs and settings, or even whether it is possible.</p>
<p>Thanks in advance.</p>
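The three outputs discussed in the question can be reproduced in isolation with a minimal sketch like the one below. The helper name `buildLocXml` and the sample strings are illustrative assumptions, not part of the original script. The sketch assumes a libxml2-based PHP build, where a document with no declared encoding is commonly serialized with numeric character references for non-ASCII characters.

```php
<?php
// Sketch only: buildLocXml() is a hypothetical helper for this illustration.
function buildLocXml(string $text, ?string $encoding): string
{
    // Create the document with or without a declared encoding.
    $doc = $encoding === null
        ? new DOMDocument('1.0')
        : new DOMDocument('1.0', $encoding);

    $loc = $doc->createElement('loc');
    // createTextNode() takes literal text; any "&" in it is escaped on output.
    $loc->appendChild($doc->createTextNode($text));
    $doc->appendChild($loc);

    return $doc->saveXML();
}

// The "ç" is written as raw UTF-8 bytes so the sketch does not depend on
// how this file itself is saved.
$utf8Url = "http://www.somesite.com/servi\xC3\xA7os/redesign";

// 1. Pre-converted entity plus a UTF-8 declaration:
//    the ampersand of the entity is escaped a second time.
$doubleEscaped = buildLocXml('http://www.somesite.com/servi&#xE7;os/redesign', 'UTF-8');

// 2. Raw UTF-8 text plus a UTF-8 declaration:
//    the character is written out unchanged.
$rawUtf8 = buildLocXml($utf8Url, 'UTF-8');

// 3. Raw UTF-8 text with no declared encoding:
//    the serializer is expected to fall back to a numeric reference.
$numericRef = buildLocXml($utf8Url, null);

echo $doubleEscaped, $rawUtf8, $numericRef;
```

Case 1 suggests where the extra `&amp;` comes from: the entity string is treated as plain text, not as markup. Cases 2 and 3 suggest that the declared encoding, not the input string, decides whether the serializer emits the raw character or a numeric reference.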
 
