StackOverflow2013


plurals

PO
primarykey
Id
6227712
data
AcceptedAnswerId
0
AnswerCount
0
ClosedDate
CommentCount
1
CommunityOwnedDate
CreationDate
2011-06-03T13:27:58.403
FavoriteCount
0
LastActivityDate
2011-06-03T13:27:58.403
LastEditDate
LastEditorUserId
0
OwnerUserId
2140998
ParentId
6226840
PostTypeId
2
Score
0
ViewCount
0
LastEditorDisplayName
text
Body
Identifying the gaps is an interesting problem. The best approach will depend on the size of the gaps, but here is another way to tackle it, and one which might be better if the gaps are reasonably large compared to the number of records you have. Use a MySQL aggregate function in a query to count the number of records in each of a set of buckets. The buckets need to be similar in size to the kinds of gaps you are interested in. Assuming you're interested in gaps of roughly a day, I'd do something like this: <pre><code>SELECT TO_DAYS(my_timestamp), COUNT(*) FROM my_table GROUP BY TO_DAYS(my_timestamp) </code></pre> This will return one row per day, pairing the day number with the count of timestamps that fall on it. I'd do the rest in a language like Perl or Java (or even R, see later) where I can process the data. The technique I'd use is a test of the difference between the observed frequency (the count) and the expected frequency, which is the total number of records divided by the number of days in the range. The expected frequency for each day would be something like: <pre><code>SELECT (SELECT COUNT(*) FROM my_table) / ((SELECT TO_DAYS(MAX(my_timestamp)) FROM my_table) - (SELECT TO_DAYS(MIN(my_timestamp)) FROM my_table) + 1) </code></pre> Now, for each bucket, you can use a statistical test, the chi square test, to estimate the probability of the deviation being due to chance (for details, see: <a href="http://en.wikipedia.org/wiki/Pearson%27s_chi-square_test" rel="nofollow">http://en.wikipedia.org/wiki/Pearson%27s_chi-square_test</a>). Remember that completely missing days are simply absent from the first result rather than returned with a count of zero, so you need to treat them as if their count is zero. The calculation for each bucket is, basically, ((expected - observed)^2 / expected). The larger this value, the less likely it is that the deviation is due to chance. If you need to work out which buckets are low in samples, set a reasonable threshold on this calculated value, and look for buckets where the observed count is below the expected frequency and the value exceeds the threshold. 
It may take a little experimentation to devise an appropriate value, but this is a sound way of determining gaps. 
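As a rough illustration of the post-processing step, here is a minimal sketch in Python (not one of the languages mentioned above, just a convenient one). It assumes the result of the first query has already been loaded into a dict mapping each TO_DAYS() day number to its count; the function name find_sparse_days, the threshold of 5.0, and the sample counts are all made up for the example.

```python
def find_sparse_days(day_counts, threshold):
    """Return (day, observed, statistic) for days with unusually few records.

    day_counts: non-empty dict mapping a TO_DAYS() day number to COUNT(*),
                as returned by the GROUP BY query (missing days are absent).
    threshold:  hypothetical cut-off on the per-day chi-square term.
    """
    first_day, last_day = min(day_counts), max(day_counts)
    n_days = last_day - first_day + 1

    # Expected frequency: total records divided by the number of days in range.
    expected = sum(day_counts.values()) / n_days

    sparse = []
    for day in range(first_day, last_day + 1):
        # Days missing from the query result are treated as a count of zero.
        observed = day_counts.get(day, 0)
        statistic = (expected - observed) ** 2 / expected
        # Keep only days that are low, not days with a surplus of records.
        if observed < expected and statistic > threshold:
            sparse.append((day, observed, statistic))
    return sparse


# Made-up counts: day numbers 734656 and 734657 have no records at all.
counts = {734655: 10, 734658: 9, 734659: 11, 734660: 10}
gaps = find_sparse_days(counts, threshold=5.0)
gap_days = [day for day, _, _ in gaps]
```

With these sample counts only the two absent days are flagged. On real data the threshold would need the experimentation described above.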
Tags
Title
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. POMySQL Datetime - Gap identification
 singulars
 PostTypePostTypeId
 PTQuestion
PostTypePostTypeId
1. PTAnswer
UserLastEditorUserId
1. This table or related slice is empty.
UserOwnerUserId
1. USStuart Watt
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. This table or related slice is empty.
PostsParentIdCreationDate
1. This table or related slice is empty.
VotesPostIdCreationDate
1. This table or related slice is empty.
CommentsPostId
1. COWhy R? Well, it has some very good built-in stuff for statistical calculations like the chi square tests
 singulars
 PostPostId
 PO
 UserUserId
 USStuart Watt
