StackOverflow2013

plurals

POCorrectly parsing string literals with python's re module
primarykey
Id
14366401
data
AcceptedAnswerId
14366532
AnswerCount
3
ClosedDate
CommentCount
2
CommunityOwnedDate
CreationDate
2013-01-16T19:38:03.653
FavoriteCount
0
LastActivityDate
2016-06-21T03:58:50.660
LastEditDate
2013-01-16T20:31:05.910
LastEditorUserId
677283
OwnerUserId
677283
ParentId
0
PostTypeId
1
Score
4
ViewCount
2268
LastEditorDisplayName
text
Body
I'm trying to add some light markdown support for a JavaScript preprocessor which I'm writing in Python. For the most part it's working, but sometimes the regex I'm using acts a little odd, and I think it has something to do with raw strings and escape sequences. The regex is: <code>(?<!\\)\"[^\"]+\"</code> Yes, I am aware that it only matches strings beginning with a <code>"</code> character. However, this project is born out of curiosity more than anything, so I can live with it for now. To break it down:
<pre><code>(?<!\\)\"   # The group should begin with a quotation mark that is not escaped
[^\"]+      # and match one or more characters that are not quotation marks (this is the biggest problem, I know)
\"          # and end at the first quotation mark it finds
</code></pre>
That being said, I (obviously) start hitting problems with things like this: <code>"This is a string with an \"escaped quote\" inside it"</code> I'm not really sure how to say "Everything but a quotation mark, unless that mark is escaped". I tried:
<pre><code>([^\"]|\\\")+   # a group of anything but a quote or an escaped quote
</code></pre>
but that led to very strange results. I'm fully prepared to hear that I'm going about this all wrong. For the sake of simplicity, let's say that the string will always start and end with double quotes (<code>"</code>) to avoid adding another element to the mix. I really want to understand what I have so far. Thanks for any assistance. EDIT As a test for the regex, I'm trying to find all string literals in the minified jQuery script with the following code (using unutbu's pattern below):
<pre><code>import re

STRLIT = r'''(?x)   # verbose mode
    (?<!\\)    # not preceded by a backslash
    "          # a literal double-quote
    .*?        # non-greedy 0-or-more characters
    (?<!\\)    # not preceded by a backslash
    "          # a literal double-quote
    '''

with open("jquery.min.js", "r") as f:
    jq = f.read()

literals = re.findall(STRLIT, jq)
</code></pre>
The answer below fixes almost all issues. The ones that remain occur within jQuery's own regular expressions, which is very much an edge case. The solution no longer misidentifies valid JavaScript as markdown links, which was really the goal.
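For readers wondering how "everything but a quotation mark, unless that mark is escaped" can be written down, here is a minimal sketch. It is not the accepted answer's pattern, and the name <code>STRING_LITERAL</code> and the sample inputs are made up for illustration. One likely reason the <code>([^\"]|\\\")+</code> attempt misbehaves is that <code>[^\"]</code> also matches a backslash, so the backslash is consumed on its own and the following quote ends the match. Excluding the backslash from the character class and consuming each backslash together with the character after it avoids that, and it also copes with an escaped backslash directly before the closing quote, which a lookbehind on a single backslash would misread.

```python
import re

# Hypothetical name, chosen for this sketch only.
# Opening quote, then any run of either
#   - one character that is neither a double quote nor a backslash, or
#   - a backslash followed by any single character (an escape pair),
# then the closing quote.
STRING_LITERAL = re.compile(r'"(?:[^"\\]|\\.)*"', re.DOTALL)

# Assumed sample input, based on the example string in the question.
sample = r'var s = "This is a string with an \"escaped quote\" inside it"; var t = "plain";'
literals = STRING_LITERAL.findall(sample)

# An escaped backslash right before the closing quote.
tricky = r'x = "a\\" + "b"'
tricky_literals = STRING_LITERAL.findall(tricky)

for lit in literals + tricky_literals:
    print(lit)
```

Because the two alternatives can never match the same character, the pattern should not backtrack badly on long inputs. It still knows nothing about single-quoted strings, comments, or JavaScript regex literals, so those remain edge cases.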
Tags
<python><regex>
Title
Correctly parsing string literals with python's re module
singulars
PostAcceptedAnswerId
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
PostParentId
1. This table or related slice is empty.
PostTypePostTypeId
1. PTQuestion
UserLastEditorUserId
1. USTom Thorogood
UserOwnerUserId
1. USTom Thorogood
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. PO
 singulars
 PostTypePostTypeId
 PTAnswer
2. PO
 singulars
 PostTypePostTypeId
 PTAnswer
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 POCorrectly parsing string literals with python's re module
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
2. VO
 singulars
 PostPostId
 POCorrectly parsing string literals with python's re module
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
3. VO
 singulars
 PostPostId
 POCorrectly parsing string literals with python's re module
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
CommentsPostId
1. COIs there a reason you're trying to write your own Markdown parser instead of using one that's already debugged?
 singulars
 PostPostId
 POCorrectly parsing string literals with python's re module
 UserUserId
 USkindall
2. COBecause I want to learn.
 singulars
 PostPostId
 POCorrectly parsing string literals with python's re module
 UserUserId
 USTom Thorogood
