StackOverflow2013

plurals

PO
primarykey
Id
2635094
data
AcceptedAnswerId
0
AnswerCount
0
ClosedDate
CommentCount
2
CommunityOwnedDate
CreationDate
2010-04-14T05:31:54.960
FavoriteCount
0
LastActivityDate
2010-04-14T06:26:31.013
LastEditDate
2010-04-14T06:26:31.013
LastEditorUserId
276101
OwnerUserId
276101
ParentId
2635082
PostTypeId
2
Score
9
ViewCount
0
LastEditorDisplayName
text
Body
Using <code>split</code> to count isn't the most efficient approach, but if you insist on doing that, the proper way is this: <pre><code>haystack.split(needle, -1).length - 1</code></pre> If you don't set <code>limit</code> to <code>-1</code>, <code>split</code> defaults to a limit of <code>0</code>, which discards trailing empty strings and throws off your count. From <a href="http://java.sun.com/javase/6/docs/api/java/lang/String.html#split%28java.lang.String,%20int%29" rel="noreferrer">the API</a>: <blockquote> The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. [...] If <code>n</code> is zero then [...] trailing empty strings will be discarded. </blockquote> You also need to subtract 1 from the <code>length</code> of the array, because <code>N</code> occurrences of the delimiter split the string into <code>N+1</code> parts. <hr> As for the regex itself (i.e. the <code>needle</code>), you can put the word boundary anchor <code>\b</code> on both sides of the <code>word</code>. If you allow <code>word</code> to contain metacharacters (e.g. count occurrences of <code>"$US"</code>), you may want to <a href="http://java.sun.com/javase/6/docs/api/java/util/regex/Pattern.html#quote%28java.lang.String%29" rel="noreferrer"><code>Pattern.quote</code></a> it. <hr> <blockquote> I've come up with this: <pre><code>numThe += line.split("[^a-zA-Z][Tt]he[^a-zA-Z]", -1).length - 1; </code></pre> Though still getting some strange numbers. I was able to get an accurate general count (without the regular expression), now my issue is with the regexp. </blockquote> Now the issue is that you're not counting <code>[Tt]he</code> that appears as the first or last word, because the regex says that it has to be preceded/followed by some character, something that matches <code>[^a-zA-Z]</code> (that is, your match must be of length 5!). You're not allowing the case where there isn't a character at all!
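Both effects described so far can be seen in a short sketch. The class name, helper names, and sample strings here are made up for illustration; only <code>String.split</code> itself comes from the discussion above.

```java
class SplitCountDemo {
    // Counts regex matches via split; limit -1 keeps trailing empty strings.
    static int countBySplit(String haystack, String regex) {
        return haystack.split(regex, -1).length - 1;
    }

    // Same idea with the one-argument split (limit 0), which drops trailing empty strings.
    static int countBySplitDefaultLimit(String haystack, String regex) {
        return haystack.split(regex).length - 1;
    }

    public static void main(String[] args) {
        String csv = "a,b,,"; // hypothetical input containing three commas
        System.out.println(countBySplit(csv, ","));             // all three commas counted
        System.out.println(countBySplitDefaultLimit(csv, ",")); // trailing empties dropped, count too low

        String line = "The dog saw the cat chase the"; // hypothetical sample line
        // Only the middle "the" has a non-letter on both sides, so the first and last are missed.
        System.out.println(countBySplit(line, "[^a-zA-Z][Tt]he[^a-zA-Z]"));
    }
}
```

On the sample line the pattern finds one match out of three, which is the undercount described above.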
You can try something like this instead: <pre><code>"(^|[^a-zA-Z])[Tt]he([^a-zA-Z]|$)" </code></pre> This isn't the most concise solution, but it works. Something like this (using <a href="http://www.regular-expressions.info/lookaround.html" rel="noreferrer">negative lookarounds</a>) also works: <pre><code>"(?<![a-zA-Z])[Tt]he(?![a-zA-Z])" </code></pre> This has the benefit of matching just <code>[Tt]he</code>, without any extra characters around it like your previous solution did. This is relevant in case you actually want to process the tokens returned by <code>split</code>, because the delimiter in this case isn't "stealing" anything from the tokens. <hr> <h3>Non-<code>split</code></h3> Though using <code>split</code> to count is rather convenient, it isn't the most efficient approach (e.g. it does all kinds of work to build strings that you then discard). Because, as you said, you're counting line by line, the pattern would also be recompiled and thrown away for every line. A more efficient way would be to use the same regex as before with the usual <code>Pattern.compile</code> and <code>while (matcher.find()) count++;</code>
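A minimal sketch of the compile-once, <code>matcher.find()</code> approach might look like the following. The class and method names are hypothetical, and the pattern assumes the intended lookarounds are "not preceded by a letter" and "not followed by a letter", i.e. <code>(?&lt;![a-zA-Z])</code> and <code>(?![a-zA-Z])</code>.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordCounter {
    // Compiled once and reused for every line instead of being rebuilt per call.
    static final Pattern THE = Pattern.compile("(?<![a-zA-Z])[Tt]he(?![a-zA-Z])");

    // Counts non-overlapping matches of the pattern in one line.
    static int count(Pattern pattern, String line) {
        Matcher matcher = pattern.matcher(line);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    // Builds a pattern for an arbitrary literal word; Pattern.quote
    // neutralises metacharacters such as '$'.
    static Pattern wholeWord(String word) {
        return Pattern.compile("(?<![a-zA-Z])" + Pattern.quote(word) + "(?![a-zA-Z])");
    }

    public static void main(String[] args) {
        String[] lines = { "The dog saw the cat chase the", "then the other" }; // hypothetical input
        int numThe = 0;
        for (String line : lines) {
            numThe += count(THE, line);
        }
        System.out.println(numThe);
        System.out.println(count(wholeWord("$US"), "$US 5 and $US"));
    }
}
```

Because the lookarounds are zero-width, each match covers only the word itself, so adjacent occurrences such as "the the" are both counted.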
Tags
Title
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. POJava Counting # of occurrences of a word in a string
 singulars
 PostTypePostTypeId
 PTQuestion
PostTypePostTypeId
1. PTAnswer
UserLastEditorUserId
1. USpolygenelubricants
UserOwnerUserId
1. USpolygenelubricants
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. POJava Counting # of occurrences of a word in a string
 singulars
 PostTypePostTypeId
 PTQuestion
PostsParentIdCreationDate
1. This table or related slice is empty.
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
2. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTAcceptedByOriginator
3. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
CommentsPostId
1. COReading through the documents and then spitting out my results line by line to see where it's not seeing my words, my search of "[^a-zA-Z][Tt]he[^a-zA-Z]" does not count any 'the' that starts at the beginning of the string. Is there a reason why?
 singulars
 PostPostId
 PO
 UserUserId
 USDoug
2. CO@Doug: see my edit.
 singulars
 PostPostId
 PO
 UserUserId
 USpolygenelubricants
