StackOverflow2013

plurals

POHow to exclude table rows that contain specific strings between start and end tag from matching?
primarykey
Id
5174579
data
AcceptedAnswerId
5174917
AnswerCount
2
ClosedDate
CommentCount
2
CommunityOwnedDate
CreationDate
2011-03-02T22:53:24.303
FavoriteCount
0
LastActivityDate
2011-03-03T01:25:40.377
LastEditDate
2017-05-23T11:55:24.110
LastEditorUserId
-1
OwnerUserId
529778
ParentId
0
PostTypeId
1
Score
0
ViewCount
393
LastEditorDisplayName
text
Body
Context
The case is screen scraping web content using QuotaXML SDK 1.6, to finally display the data on the dashboard and on the iPhone. This QuotaXML tool offers only regex for extracting table data. QuotaXML parses HTML tables using a three-step approach:
1. First it identifies the table, for example using "<code>(?si)<table.*?>(.*?)</table></code>".
2. Second, within this parsed table it identifies rows, like "<code>(?si)<tr.*?>(.*?)</tr></code>".
3. Third, within this row scope, individual cells are identified, like "<code>(?si)<td.*?>(.*?)</td></code>".
The problem
The source HTML contains some rows that are not relevant data, like lines or images that span the full table width using a colspan. Or tables contain data cells that are not relevant to the data lines needed, like call detail records that also contain calls to freephone numbers, which are not subtracted from the minutes in your plan; in this case 0800 and 00800 numbers. In other words, <code>(.*?)</code> must not match ' colspan="', nor '>0800', nor '>00800'.
In code:
<pre><code>exclude:<tr><td colspan="2"></td></tr>
include:<tr><td>Date</td><td>Time</td></tr>
exclude:<tr><td>05-01-2011</td><td>08004913</td></tr>
include:<tr><td>05-01-2011</td><td>0123456789</td></tr>
</code></pre>
Homework done
Even my first (start simple) attempts, which only try to exclude colspan, all fail:
<ol>
<li><code>(?si)<tr.*?>(?!colspan)(.*?)</tr></code></li>
<li><code>(?si)<tr.*?>(.*?)(?!colspan)</tr></code></li>
<li><code>(?si)<tr.*?>.*?[^colspan].*?</tr></code></li>
<li><code>(?si)<tr(\s[^>]*)?>.*?(?!colspan).*?</tr></code></li>
<li><code>(?si)<tr(\s[^>]*)?>.*?(!colspan).*?</tr></code></li>
<li><code>(?si)<tr(\s[^>]*)?>(.*?)(?!colspan)</tr></code></li>
<li><code>(?si)<tr.*?>^(?!.*?colspan=").*?</tr></code> <a href="https://stackoverflow.com/questions/1240275/how-to-negate-specific-word-in-regex">How to negate specific word in regex?</a> seems related, though these suggestions don't result in a match at all.</li>
<li><code>(?si)<tr.*?>(.(?<!colspan))*?</tr></code></li>
<li><code>(?si)<tr.*?>(?!.*colspan).*</tr></code> The explanation of positive and negative lookarounds at <a href="http://www.regular-expressions.info/lookaround.html" rel="nofollow noreferrer">http://www.regular-expressions.info/lookaround.html</a> did not give me the clue either.</li>
</ol>
How should I correctly write this regex?
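To make the requirement concrete, here is a minimal sketch of the three-step extraction in Python, with a row pattern that skips rows containing any of the unwanted strings. It assumes QuotaXML's regex flavor supports negative lookahead the way Python's `re` module does, which the question does not confirm. The names `TABLE_RE`, `ROW_RE`, `CELL_RE`, `extract_rows` and `SAMPLE` are hypothetical and used only for illustration. A negative lookahead is zero-width and only tests the position where it stands, so the sketch repeats it before every character consumed inside the row.

```python
import re

# Step 1: capture the contents of each table.
TABLE_RE = re.compile(r'(?si)<table\b[^>]*>(.*?)</table>')

# Step 2: capture a row only if none of the unwanted strings occurs
# before its closing tag. The lookahead is re-checked at every character,
# and </tr> is listed too so that a match cannot run into the next row.
ROW_RE = re.compile(
    r'(?si)<tr\b[^>]*>((?:(?!</tr>|colspan="|>0800|>00800).)*)</tr>'
)

# Step 3: capture the contents of each cell within a row.
CELL_RE = re.compile(r'(?si)<td\b[^>]*>(.*?)</td>')


def extract_rows(html):
    """Return a list of rows, each a list of cell strings."""
    rows = []
    for table_body in TABLE_RE.findall(html):
        for row_body in ROW_RE.findall(table_body):
            rows.append(CELL_RE.findall(row_body))
    return rows


# The four sample rows from the question, wrapped in a table.
SAMPLE = (
    '<table>'
    '<tr><td colspan="2"></td></tr>'
    '<tr><td>Date</td><td>Time</td></tr>'
    '<tr><td>05-01-2011</td><td>08004913</td></tr>'
    '<tr><td>05-01-2011</td><td>0123456789</td></tr>'
    '</table>'
)

print(extract_rows(SAMPLE))
# [['Date', 'Time'], ['05-01-2011', '0123456789']]
```

Only the row pattern differs from the tool's defaults described above. Whether QuotaXML accepts it unchanged depends on its regex engine, so treat it as a starting point to test rather than a confirmed solution.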
Tags
<html><regex>
Title
How to exclude table rows that contain specific strings between start and end tag from matching?
singulars
PostAcceptedAnswerId
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. USCommunity
UserOwnerUserId
1. USPro Backup
plurals
PostLinksPostIdRelatedPostId
1. PL
 singulars
 LinkTypeLinkTypeId
 LTLinked
2. PL
 singulars
 LinkTypeLinkTypeId
 LTLinked
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
2. PO
 singulars
 PostTypePostTypeId
 PTAnswer
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 POHow to exclude table rows that contain specific strings between start and end tag from matching?
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTApproveEditSuggestion
CommentsPostId
1. COJust **don't** parse HTML with regexp: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags
 singulars
 PostPostId
 POHow to exclude table rows that contain specific strings between start and end tag from matching?
 UserUserId
 USOndrej Tucny
2. COI am aware that I shouldn't parse HTML with a regex. As explained in the question the tool does not give other options than using regex.
 singulars
 PostPostId
 POHow to exclude table rows that contain specific strings between start and end tag from matching?
 UserUserId
 USPro Backup
