StackOverflow2013

Note that there are some explanatory texts on larger screens.

plurals

PO
primarykey
Id
11481384
data
AcceptedAnswerId
0
AnswerCount
0
ClosedDate
CommentCount
1
CommunityOwnedDate
CreationDate
2012-07-14T05:42:07.247
FavoriteCount
0
LastActivityDate
2012-07-14T06:01:13.370
LastEditDate
2012-07-14T06:01:13.370
LastEditorUserId
1470293
OwnerUserId
1470293
ParentId
11370247
PostTypeId
2
Score
11
ViewCount
0
LastEditorDisplayName
text
Body
<p>There is an unsupervized boot-strapping approach that was explained to me to do this.</p> <p>There are different ways of applying this approach, and variants, but here's a simplified version. </p> <h2>Concept:</h2> <p>Start by a assuming that if two words are synonyms, then in your corpus they will appear in similar settings. (eating grapes, eating sandwich, etc.)</p> <p>(In this variant I will use co-occurence as the setting).</p> <h2>Boot-Strapping Algorithm:</h2> <p>We have two lists, </p> <ul> <li>one list will contain the words that co-occur with food items</li> <li>one list will contain the words that are food items</li> </ul> <h3>Supervized Part</h3> <p>Start by seeding one of the lists, for instance I might write the word Apple on the food items list.</p> <p>Now let the computer take over.</p> <h3>Unsupervized Parts</h3> <p>It will first find all words in the corpus that appear just before Apple, and sort them in order of most occuring. </p> <p>Take the top two (or however many you want) and add them into the co-occur with food items list. For example, perhaps "eating" and "Delicious" are the top two.</p> <p>Now use that list to find the next two top food words by ranking the words that appear to the right of each word in the list.</p> <p>Continue this process expanding each list until you are happy with the results. </p> <h3>Once that's done</h3> <p>(you may need to manually remove some things from the lists as you go which are clearly wrong.)</p> <h3>Variants</h3> <p>This procedure can be made quite effective if you take into account the grammatical setting of the keywords. </p> <pre><code>Subj ate NounPhrase NounPhrase are/is Moldy The workers harvested the Apples. subj verb Apples That might imply harvested is an important verb for distinguishing foods. Then look for other occurrences of subj harvested nounPhrase </code></pre> <p>You can expand this process to move words into categories, instead of a single category at each step.</p> <h3>My Source</h3> <p>This approach was used in a system developed at the University of Utah a few years back which was successful at compiling a decent list of weapon words, victim words, and place words by just looking at news articles.</p> <p>An interesting approach, and had good results.</p> <p>Not a neural network approach, but an intriguing methodology.</p> <h3>Edit:</h3> <p>the system at the University of Utah was called AutoSlog-TS, and a short slide about it can be seen <a href="http://www.eng.utah.edu/~cs5340/slides/event_ie.4.pdf">here</a> towards the end of the presentation. And a link to a paper about it <a href="http://www.cs.utah.edu/research/techreports/2004/pdf/UUCS-04-015.pdf">here</a></p>
Tags
Title
singulars
PostAcceptedAnswerId
1. This table or related slice is empty.
PostParentId
1. POMethods for automated synonym detection
  singulars
  PostTypePostTypeId
  PTQuestion
PostTypePostTypeId
1. PTAnswer
UserLastEditorUserId
1. USXantix
UserOwnerUserId
1. USXantix
plurals
PostLinksPostIdRelatedPostId
1. This table or related slice is empty.
PostLinksRelatedPostIdPostId
1. This table or related slice is empty.
PostsAcceptedAnswerId
1. POMethods for automated synonym detection
  singulars
  PostTypePostTypeId
  PTQuestion
PostsParentIdCreationDate
1. This table or related slice is empty.
VotesPostIdCreationDate
1. VO
  singulars
  PostPostId
  PO
  UserUserId
  This table or related slice is empty.
  VoteTypeVoteTypeId
  VTUpMod
2. VO
  singulars
  PostPostId
  PO
  UserUserId
  This table or related slice is empty.
  VoteTypeVoteTypeId
  VTAcceptedByOriginator
3. VO
  singulars
  PostPostId
  PO
  UserUserId
  This table or related slice is empty.
  VoteTypeVoteTypeId
  VTBountyClose
CommentsPostId
1. COWell, this method isn't explicitly for a neural net, but the general method of finding similar words is the first method suggested here that is actually directly applicable to my problem. Again, this is definitely made for a larger corpus, but I think it should work if I make each seeding tree essentially an incrementation tree instead where a hit within a tree increments the link between the documents and the lack of a hit decrements it.
  singulars
  PostPostId
  PO
  UserUserId
  USSlater Victoroff

Querying!

Guidance

A row detail

Detail views are divided into sections. All the information in the data section comes from columns in the selected row. The other sections display data from other, related rows.

Related data can be related in a to-one or a to-many fashion. Captions of data related in a to-many fashion link to a list view showing a filtered view of the table.

Try moving around until you find a non-empty to-many entry and click on the label to get to one. You can move back to the root by clicking on the database name in the header.