Parse HTML with Python and BeautifulSoup - get text both inside and outside the <a> tags
<p>I have HTML with a number of tags, plus text that sits outside those tags. The text I'm trying to get ends up inside the &lt;br&gt; tags, except the first instance, which I guess is just part of the &lt;td&gt; tag. But if I try to get the text of the tag (with td.text or something like that), it also gives me all the text in all the &lt;a&gt; and &lt;br&gt; tags.</p>
<pre><code>&lt;td align="left"&gt;
 &lt;a class="playerLink" href="http://bbroto.baseball.cbssports.com/players/playerpage/1740935"&gt;
  Garcia, Leury
 &lt;/a&gt;
 SS CHW - Traded from Royal Disappointments
 &lt;br&gt;
  &lt;a class="playerLink" href="http://bbroto.baseball.cbssports.com/players/playerpage/1813191"&gt;
   Almonte, Abraham
  &lt;/a&gt;
  OF SEA - Traded from Royal Disappointments
  &lt;br&gt;
   &lt;a class="playerLink" href="http://bbroto.baseball.cbssports.com/players/playerpage/2046044"&gt;
    Pillar, Kevin
   &lt;/a&gt;
   OF TOR - Traded from Royal Disappointments
   &lt;br&gt;
    &lt;a class="playerLink" href="http://bbroto.baseball.cbssports.com/players/playerpage/1666824"&gt;
     Sierra, Moises
    &lt;/a&gt;
    LF TOR - Traded from Royal Disappointments
    &lt;br&gt;
     &lt;a class="playerLink" href="http://bbroto.baseball.cbssports.com/players/playerpage/580599"&gt;
      Paulino, Felipe
     &lt;/a&gt;
     SP KC
     &lt;span title="Felipe Paulino off 60-day DL"&gt;
      &lt;a class="playerLink" href="http://bbroto.baseball.cbssports.com/players/playerpage/580599" subtab="Update"&gt;
       &lt;img border="0" height="10" src="http://sports.cbsimg.net/images/news-note-recent.gif" width="10"/&gt;
      &lt;/a&gt;
     &lt;/span&gt;
     - Traded from Royal Disappointments
    &lt;/br&gt;
   &lt;/br&gt;
  &lt;/br&gt;
 &lt;/br&gt;
&lt;/td&gt;
</code></pre>
<p>Basically, I want each text inside an &lt;a&gt; tag, followed by the text outside that &lt;a&gt; tag, as separate values. So the end result would be:</p>
<p>Garcia, Leury</p>
<p>SS CHW - Traded from Royal Disappointments</p>
<p>Almonte, Abraham</p>
<p>OF SEA - Traded from Royal Disappointments</p>
<p>Pillar, Kevin</p>
<p>OF TOR - Traded from Royal Disappointments</p>
<p>Sierra, Moises</p>
<p>LF TOR - Traded from Royal Disappointments</p>
<p>Paulino, Felipe</p>
<p>SP KC - Traded from Royal Disappointments</p>
<p>So far I only have the code for the text from the &lt;a&gt; tags:</p>
<pre><code>pl = psoup.findAll('a', {'class': 'playerLink'})
for a in pl:
    print(a.text)
</code></pre>
<p>I really have no idea how to approach the rest of it.</p>
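One possible way to approach the rest, offered as a sketch rather than a definitive answer: walk every text node under the cell and decide, node by node, whether it sits inside a playerLink anchor (a name) or outside one (the trailing description). The sketch assumes the `bs4` package with the built-in `html.parser` backend; the helper name `split_players` and the shortened two-player sample are made up for illustration. Because it walks descendants instead of direct children, it should not matter whether the parser nests the `<br>` tags or leaves them flat.

```python
from bs4 import BeautifulSoup, NavigableString

# Shortened sample based on the markup in the question (two players only).
html = """
<td align="left">
  <a class="playerLink" href="http://bbroto.baseball.cbssports.com/players/playerpage/1740935"> Garcia, Leury </a>
  SS CHW - Traded from Royal Disappointments
  <br>
  <a class="playerLink" href="http://bbroto.baseball.cbssports.com/players/playerpage/580599"> Paulino, Felipe </a>
  SP KC
  <span title="Felipe Paulino off 60-day DL">
    <a class="playerLink" href="http://bbroto.baseball.cbssports.com/players/playerpage/580599" subtab="Update">
      <img border="0" height="10" src="http://sports.cbsimg.net/images/news-note-recent.gif" width="10"/>
    </a>
  </span>
  - Traded from Royal Disappointments
</td>
"""


def split_players(td):
    """Return (name, description) pairs for one table cell.

    Hypothetical helper: text inside an <a class="playerLink"> starts a new
    record; any other non-blank text is appended to the current record.
    """
    rows = []
    for node in td.descendants:
        if not isinstance(node, NavigableString):
            continue  # only text nodes carry the values we want
        text = " ".join(node.split())  # collapse runs of whitespace
        if not text:
            continue  # skip whitespace-only nodes (e.g. around the <img>)
        if node.find_parent("a", class_="playerLink") is not None:
            rows.append([text, []])  # text inside a player link: a name
        elif rows:
            rows[-1][1].append(text)  # text outside: belongs to last name
    return [(name, " ".join(parts)) for name, parts in rows]


soup = BeautifulSoup(html, "html.parser")
players = split_players(soup.find("td"))
for name, description in players:
    print(name)
    print(description)
```

The icon-only link inside the `<span>` contributes no text, so it never starts a new record, and the two description fragments around it are joined into one value.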