StackOverflow2013


plurals

POUsing regex in python to remove blank line in an XML?
primarykey
Id
14427863
data
AcceptedAnswerId
14428914
AnswerCount
2
ClosedDate
CommentCount
1
CommunityOwnedDate
CreationDate
2013-01-20T18:37:19.943
FavoriteCount
0
LastActivityDate
2013-01-20T20:46:48.893
LastEditDate
LastEditorUserId
0
OwnerUserId
1995132
ParentId
0
PostTypeId
1
Score
0
ViewCount
1145
LastEditorDisplayName
text
Body
Sorry if this has been asked before, but I cannot find the answer anywhere. I am trying to use regex to extract element values, but the XML being pulled contains a blank line and this seems to be causing errors. Here is one of the elements in the XML:
<pre><code><entry>
<id>http://feeds.rasset.ie/rteavgen/player/videos/show/?id=10103822</id>
<showid>10103822</showid>
<platform>iptv</platform>
<published>2013-01-19T21:45:00+00:00</published>
<updated>2013-01-19T23:41:00+00:00</updated>
<title type="text">The Saturday Night Show</title>
<content type="text">Chat show, presented by journalist and broadcaster Brendan O'Connor, featuring comedy, celebrity guests and live musical performances.</content>
<category term="RTÉ One" rte:type="channel"/>
<category term="Entertainment" rte:type="genre"/>
<category term="None" rte:type="series"/>
<category term="None" rte:type="episode"/>
<category term="None" rte:type="ranking"/>
<category term="1024" rte:type="genrelist"/>
<category term="None" rte:type="keywordlist"/>
<category term="1668" rte:type="progid"/>
<link rel="self" type="application/atom+xml" href="http://feeds.rasset.ie/rteavgen/player/playlist?showId=10103822"/>

<link rel="alternate" type="text/html" href="http://www.rte.ie/player/#v=10103822"/>
<rte:valid start="2013-01-19T21:52:12+00:00" end="2013-02-09T21:52:12+00:00"/>
<rte:duration ms="4201061" formatted="1:10"/>
<rte:statistics views="194"/>
<media:title type="plain">The Saturday Night Show</media:title>
<media:description type="plain">Chat show, presented by journalist and broadcaster Brendan O'Connor, featuring comedy, celebrity guests and live musical performances.</media:description>
<media:player url="http://feeds.rasset.ie/rteavgen/player/player/?id=" width="400" height="300"/>
<media:thumbnail url="http://img.rasset.ie/0006e56a.jpg" time="00:00:00+00:00"/>
<media:restriction relationship="allow" type="country"/>
<media:restriction relationship="disallow" type="country"/>
<media:copyright>RTÉ</media:copyright>
</entry>
</code></pre>
You can see that there is a blank line between the two "link rel=" elements. When I try to use this regex command it throws the Timeout! error, as I'm not handling this properly (excuse me also, as my regex knowledge is almost zero).
<pre><code>links = (re.compile ('<showid>(.+?)</showid>\n ' \
    '<platform>.+?</platform>\n ' \
    '<published>(.+?)</published>\n ' \
    '<updated>.+?</updated>\n ' \
    '<title type="text">(.+?)</title>\n ' \
    '<content type="text">(.+?)</content>\n ' \
    '<category term="(.+?)" rte:type="channel"/>\n ' \
    '<category term=".+?" rte:type="genre"/>\n ' \
    '<category term=".+?" rte:type="series"/>\n ' \
    '<category term=".+?" rte:type="episode"/>\n ' \
    '<category term=".+?" rte:type="ranking"/>\n ' \
    '<category term=".+?" rte:type="genrelist"/>\n ' \
    '<category term=".+?" rte:type="keywordlist"/>\n ' \
    '<category term=".+?" rte:type="progid"/>\n ' \
    '<link rel="self" type=".+?" href=".+?" />\n ' \
    '<link rel="alternate" type=".+?" href=".+?" />').findall(data))
</code></pre>
I only actually want a few of the fields, but I can't seem to find a regex command that lets me select just the element names I want; it makes me enter each one in sequence (again, my lack of regex knowledge is the issue). I also need fields beyond the second "link rel=" element, but as it keeps falling over on this one I have left them out for now. Does anyone know what regex command I need to skip the blank line, and perhaps also how to tidy up the expression so that it only extracts the elements I require? Thanks for your help folks, I hope...
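For illustration only, here is a minimal sketch of one way to pick out individual fields with regex without matching every element in sequence. It assumes each wanted element sits on one line inside an `<entry>` block; the `data` sample is a trimmed, hypothetical copy of the entry above, and the names `FIELD_PATTERNS` and `extract_fields` are made up for this example. Because every field is searched for on its own, a blank line between elements no longer matters.

```python
import re

# Hypothetical sample: a trimmed copy of the entry above, keeping the blank
# line between the two <link> elements.
data = """<entry>
<showid>10103822</showid>
<published>2013-01-19T21:45:00+00:00</published>
<title type="text">The Saturday Night Show</title>
<category term="RTÉ One" rte:type="channel"/>
<link rel="self" type="application/atom+xml" href="http://feeds.rasset.ie/rteavgen/player/playlist?showId=10103822"/>

<link rel="alternate" type="text/html" href="http://www.rte.ie/player/#v=10103822"/>
<rte:duration ms="4201061" formatted="1:10"/>
</entry>"""

# One small pattern per wanted field (field names are illustrative).
# \s* before "/>" tolerates both 'href="..."/>' and 'href="..." />'.
FIELD_PATTERNS = {
    "showid": re.compile(r"<showid>(.+?)</showid>"),
    "published": re.compile(r"<published>(.+?)</published>"),
    "title": re.compile(r'<title type="text">(.+?)</title>'),
    "channel": re.compile(r'<category term="(.+?)" rte:type="channel"\s*/>'),
    "duration": re.compile(r'<rte:duration ms=".+?" formatted="(.+?)"\s*/>'),
}


def extract_fields(entry_text):
    """Search for each field independently; missing fields become None."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(entry_text)
        result[name] = match.group(1) if match else None
    return result


# Split the feed into <entry> blocks first; re.DOTALL lets "." span newlines,
# so blank lines inside an entry are harmless.
entry_blocks = re.findall(r"<entry>.*?</entry>", data, re.DOTALL)
entries = [extract_fields(block) for block in entry_blocks]
print(entries)
```

This is still fragile against changes in attribute order or quoting, which is the usual argument for a real XML parser over regex.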
Tags
<python><xml><regex><spaces>
Title
Using regex in python to remove blank line in an XML?
singulars
PostAcceptedAnswerId
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. This table or related slice is empty.
UserOwnerUserId
1. USmcquaim
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
2. PO
 singulars
 PostTypePostTypeId
 PTAnswer
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 POUsing regex in python to remove blank line in an XML?
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTDownMod
2. VO
 singulars
 PostPostId
 POUsing regex in python to remove blank line in an XML?
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
CommentsPostId
1. COis there any reason you aren't using a library to parse the XML like expat or elementtree?
 singulars
 PostPostId
 POUsing regex in python to remove blank line in an XML?
 UserUserId
 USmgoffin
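The comment's suggestion of a parsing library could look roughly like the sketch below, using `xml.etree.ElementTree` from the standard library. Several things here are assumptions rather than facts about the real feed: that the entries sit under a root element in the standard Atom namespace, that the `rte` prefix maps to the placeholder URI shown (the real one has to be read from the feed's root element), and the `sample` document itself, which is a trimmed, hypothetical copy of the entry from the question. The function name `parse_entries` is also made up for this example.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs. The Atom URI is the standard one; the "rte" URI is
# a placeholder and must be replaced with the value declared in the real feed.
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "rte": "http://example.com/rte-placeholder",
}

# Hypothetical sample feed wrapping a trimmed copy of the entry, including
# the blank line between the two <link> elements.
sample = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:rte="http://example.com/rte-placeholder">
<entry>
<showid>10103822</showid>
<published>2013-01-19T21:45:00+00:00</published>
<title type="text">The Saturday Night Show</title>
<category term="RTÉ One" rte:type="channel"/>
<link rel="self" type="application/atom+xml" href="http://feeds.rasset.ie/rteavgen/player/playlist?showId=10103822"/>

<link rel="alternate" type="text/html" href="http://www.rte.ie/player/#v=10103822"/>
<rte:duration ms="4201061" formatted="1:10"/>
</entry>
</feed>"""


def parse_entries(xml_text):
    """Return one dict per <entry>, holding only the fields of interest."""
    root = ET.fromstring(xml_text)
    shows = []
    for entry in root.findall("atom:entry", NS):
        # Namespaced attributes are keyed as "{namespace-uri}name".
        channel = None
        for category in entry.findall("atom:category", NS):
            if category.get("{%s}type" % NS["rte"]) == "channel":
                channel = category.get("term")

        duration = entry.find("rte:duration", NS)
        alternate = entry.find("atom:link[@rel='alternate']", NS)

        shows.append({
            "showid": entry.findtext("atom:showid", namespaces=NS),
            "published": entry.findtext("atom:published", namespaces=NS),
            "title": entry.findtext("atom:title", namespaces=NS),
            "channel": channel,
            "duration": duration.get("formatted") if duration is not None else None,
            "url": alternate.get("href") if alternate is not None else None,
        })
    return shows


shows = parse_entries(sample)
print(shows)
```

A parser ignores whitespace between elements, so the blank line stops being a problem, and each field is looked up by name instead of by position.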
