Reading <content:encoded> tags using BeautifulSoup 4
<p>I'm using BeautifulSoup 4 (bs4) to read an XML RSS feed and have come across the following entry. I'm trying to read the content enclosed in the <code>&lt;content:encoded&gt;&lt;![CDATA[...]]&gt;&lt;/content:encoded&gt;</code> tag:</p>
<pre><code>&lt;item&gt;
  &lt;title&gt;Foobartitle&lt;/title&gt;
  &lt;link&gt;http://www.acme.com/blah/blah.html&lt;/link&gt;
  &lt;category&gt;&lt;![CDATA[mycategory]]&gt;&lt;/category&gt;
  &lt;description&gt;&lt;![CDATA[The quick brown fox jumps over the lazy dog]]&gt;&lt;/description&gt;
  &lt;content:encoded&gt;
    &lt;![CDATA[&lt;p&gt;&lt;img class="feature" src="http://www.acme.com/images/image.jpg" alt="" /&gt;&lt;/p&gt;]]&gt;
  &lt;/content:encoded&gt;
&lt;/item&gt;
</code></pre>
<p>As I understand it, this format is part of the <a href="https://developer.mozilla.org/en-US/docs/RSS/Article/Why_RSS_Content_Module_is_Popular_-_Including_HTML_Contents" rel="nofollow noreferrer">RSS content module</a> and is pretty common.</p>
<p>I'd like to isolate the <code>&lt;content:encoded&gt;</code> tag and then read the CDATA contents. For the avoidance of doubt, the result would be <code>&lt;p&gt;&lt;img class="feature" src="http://www.acme.com/images/image.jpg" alt="" /&gt;&lt;/p&gt;</code>.</p>
<p>I've looked at <a href="https://stackoverflow.com/questions/2032172/how-can-i-grab-cdata-out-of-beautifulsoup/2032252#2032252">this</a>, <a href="https://stackoverflow.com/questions/2032172/how-can-i-grab-cdata-out-of-beautifulsoup">this</a>, and <a href="https://stackoverflow.com/questions/13961831/why-is-beautifulsoup-unable-to-correctly-read-parse-this-rss-xml-document">this</a> Stack Overflow post, but I haven't been able to figure out how to get the job done, since none of them applies directly to my case.</p>
<p>I am using the <a href="http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser" rel="nofollow noreferrer">lxml XML</a> parser with bs4.</p>
<p>Any suggestions? Thanks!</p>
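One possible approach, offered as a minimal sketch rather than a definitive answer: with the lxml XML parser, bs4 appears to expose the element under its prefixed name, so it can be looked up as `content:encoded` and its CDATA read as ordinary text. The sketch assumes the real feed declares the content namespace on its root element (the `<rss>` wrapper and `xmlns:content` declaration below are added for illustration, since the question shows only the `<item>`), and the helper name `read_encoded_content` is hypothetical.

```python
from bs4 import BeautifulSoup  # requires the lxml package for the "xml" parser

# The item from the question, wrapped in an <rss> root that declares the
# content namespace. The wrapper is an assumption; the question shows only
# the <item> fragment.
FEED = """<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <item>
      <title>Foobartitle</title>
      <link>http://www.acme.com/blah/blah.html</link>
      <category><![CDATA[mycategory]]></category>
      <description><![CDATA[The quick brown fox jumps over the lazy dog]]></description>
      <content:encoded>
        <![CDATA[<p><img class="feature" src="http://www.acme.com/images/image.jpg" alt="" /></p>]]>
      </content:encoded>
    </item>
  </channel>
</rss>"""


def read_encoded_content(feed_xml):
    """Return the stripped CDATA text of each item's <content:encoded> tag."""
    soup = BeautifulSoup(feed_xml, "xml")  # "xml" selects the lxml XML parser
    results = []
    for item in soup.find_all("item"):
        encoded = item.find("content:encoded")  # look up by prefixed name
        if encoded is not None:
            # The CDATA section arrives as plain text; strip the whitespace
            # that surrounds it inside the tag.
            results.append(encoded.get_text().strip())
    return results


contents = read_encoded_content(FEED)
print(contents[0])
```

The value returned is the HTML markup as a plain string. If the `src` of the image is what is ultimately needed, that string can be handed to a second `BeautifulSoup(..., "html.parser")` call and queried like any other HTML document.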