StackOverflow2013

Note that there are some explanatory texts on larger screens.

plurals

PO
primarykey
Id
1410855
data
AcceptedAnswerId
0
AnswerCount
0
ClosedDate
CommentCount
3
CommunityOwnedDate
CreationDate
2009-09-11T13:26:01.623
FavoriteCount
0
LastActivityDate
2009-09-11T14:49:16.077
LastEditDate
2009-09-11T14:49:16.077
LastEditorUserId
2192597
OwnerUserId
2192597
ParentId
1410822
PostTypeId
2
Score
9
ViewCount
0
LastEditorDisplayName
text
Body
I would start here: <a href="http://en.wikipedia.org/wiki/Longest_common_substring_problem" rel="noreferrer">http://en.wikipedia.org/wiki/Longest_common_substring_problem</a> There are links to supplemental information in the external links, including Perl implementations of the two algorithms explained in the article. Edited to add: Based on the discussion, I still think Longest Common Substring could be at the heart of this problem. Even in the Journal example you reference in your comment, the defining characteristic of that set is the substring 'Journal'. I would first consider what defines a set as separate from the other sets. That gives you your partition to divide up the data, and then the problem is in measuring how much commonality exists within a set. If the defining characteristic is a common substring, then Longest Common Substring would be a logical starting point. To automate the process of set detection, in general, you will need a pairwise measure of commonality which you can use to measure the 'difference' between all possible pairs. Then you need an algorithm to compute the partition that results in the overall lowest total difference. If the difference measure is not Longest Common Substring, that's fine, but then you need to determine what it will be. Obviously it needs to be something concrete that you can measure. Bear in mind also that the properties of your difference measurement will bear on the algorithms that can be used to make the partition. For example, assume diff(X,Y) gives the measure of difference between X and Y. Then it would probably be useful if your measure of distance was such that diff(A,C) <= diff(A,B) + diff(B,C). And obviously diff(A,C) should be the same as diff(C,A). In thinking about this, I also begin to wonder whether we could conceive of the 'difference' as a distance between any two strings, and, with a rigorous definition of the distance, could we then attempt some kind of <a href="http://en.wikipedia.org/wiki/Cluster_analysis" rel="noreferrer">cluster analysis</a> on the input strings. Just a thought.
Tags
Title
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. POHow can I detect common substrings in a list of strings
 singulars
 PostTypePostTypeId
 PTQuestion
PostTypePostTypeId
1. PTAnswer
UserLastEditorUserId
1. USJeremy Bourque
UserOwnerUserId
1. USJeremy Bourque
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. POHow can I detect common substrings in a list of strings
 singulars
 PostTypePostTypeId
 PTQuestion
PostsParentIdCreationDate
1. This table or related slice is empty.
VotesPostIdCreationDate
1. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
2. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTUpMod
3. VO
 singulars
 PostPostId
 PO
 UserUserId
 This table or related slice is empty.
 VoteTypeVoteTypeId
 VTAcceptedByOriginator
CommentsPostId

Querying!

Guidance

A row detail

Detail views are divided into sections. All the information in the data section comes from columns in the selected row. The other sections display data from other, related rows.

Related data can be related in a to-one or a to-many fashion. Captions of data related in a to-many fashion link to a list view showing a filtered view of the table.

Try moving around until you find a non-empty to-many entry and click on the label to get to one. You can move back to the root by clicking on the database name in the header.