How can I process dirty HTML and extract a URL using XPath (or another method)?
<p>I have been working with WordPress 3.x, trying to build a custom display of content already stored in the WP MySQL database. I need to parse each post's content for its MP3 URL so I can reuse it elsewhere in my code. WordPress has a built-in function, get_the_content(), which returns the post content for use in my code.</p>
<p>I have reduced my code to two lines (I am trying to be as efficient as possible, because all of this runs inside a while loop), plus the echo statement that displays the data captured in the array created by xpath. The problem is that I keep getting a <em>PHP Fatal error: Call to a member function xpath() on a non-object</em>, and the echo call returns no results. The first error shows on the shortcode [be-linked-title-info] and then again on the MP3 URL from the href.</p>
<p>This makes me think the post content is "dirty". Amazon Cloud inserts a "+" for every space when files are uploaded. I cannot count on the filenames being free of spaces, because a team of people uploads the content, so I need to allow for the fact that some uploaded files will have names that include spaces (which are subsequently converted to + symbols).</p>
<p>As mentioned, the error occurs on the shortcode as well as on the href I am after, which could be because of the special characters. All I am after is the MP3 URL in the href and nothing more. How can I clean this data so it parses properly (if that is in fact the issue)? Or could I exclude everything from parsing except the &lt;a&gt; tag and its attribute? Can someone advise me on what I am doing incorrectly?</p>
<p>This is an example of what every post looks like in terms of format &amp; contents (this is exactly what is returned by the get_the_content() function):</p>
<pre><code>&lt;img class="myclass" title="mytitle" src="http://www.mydomain.com/myfolder/mypic.jpg" alt="myalt" width="552" height="414" /&gt;
[be-linked-title-info]
&lt;a title="mytitle" href="https://s3.amazonaws.com/myfolder/published/RJD2+-+SEVEN+LIGHT+YEARS+(INSTRUMENTAL).mp3"&gt;Song Name and Artist&lt;/a&gt;
The written plain text post entry describing this music track goes here and says blah blah blah
</code></pre>
<p>This is the code I am using that returns the error:</p>
<pre><code>$xml = simplexml_load_string(get_the_content());
$list = $xml-&gt;xpath("//a[contains(@href,'mp3')]/@href");
</code></pre>
<p>And later on I want to use this to return the URL captured from the href:</p>
<pre><code>&lt;?php $list[0]; ?&gt;
</code></pre>
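<p>One possible direction, offered as a sketch under assumptions rather than a confirmed fix: simplexml_load_string() returns false when its input is not well-formed XML. That would explain the "non-object" error, since the post content has no single root element. PHP's DOMDocument::loadHTML() is more tolerant of HTML fragments. In the sketch below, $content is a hypothetical stand-in for the value returned by get_the_content(), with the markup shortened from the example post, and $mp3Url is an illustrative name.</p>
<pre><code>&lt;?php
// Hypothetical stand-in for get_the_content(); markup shortened from the example post.
$content = '&lt;img src="http://www.mydomain.com/myfolder/mypic.jpg" alt="myalt" /&gt; '
         . '[be-linked-title-info] '
         . '&lt;a title="mytitle" href="https://s3.amazonaws.com/myfolder/published/RJD2+-+SEVEN+LIGHT+YEARS+(INSTRUMENTAL).mp3"&gt;Song Name and Artist&lt;/a&gt; '
         . 'Plain text post entry';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc-&gt;loadHTML($content);   // HTML parser; tolerates fragments without a root element
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$nodes = $xpath-&gt;query("//a[contains(@href, '.mp3')]/@href");

// query() returns a DOMNodeList; take the first match if there is one.
$mp3Url = $nodes-&gt;length &gt; 0 ? $nodes-&gt;item(0)-&gt;nodeValue : null;

echo $mp3Url;
</code></pre>
<p>Checking the node list length before reading the first item avoids another fatal error for posts that contain no MP3 link.</p>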