<p>Hiya. I'd first look to <a href="http://www.opencalais.com/" rel="nofollow noreferrer">OpenCalais</a> for finding entities within texts or input. It's great (from the Reuters guys), and I've used it plenty myself.</p>
<p>After that you can analyze the text further, creating associations between entities and words. I'd probably look them up in something like <a href="http://en.wikipedia.org/wiki/WordNet" rel="nofollow noreferrer">WordNet</a> and try to typify them, or even auto-generate some ontology that matches the domain you're trying to map.</p>
<p>As to how to pull it all together, there are many things you can do: the above, or two- or three-pass models that try to figure out what words are and what they mean. Or, if you control the input, make up a format that is easier to parse, or go down the <a href="http://en.wikipedia.org/wiki/Natural_language_processing" rel="nofollow noreferrer">murky path of NLP</a> (which is a lot of fun).</p>
<p>Or you could look to something like <a href="http://jena.sourceforge.net/" rel="nofollow noreferrer">Jena</a> for parsing arbitrary RDF snippets, although I don't like the RDF premise myself (I'm a Topic Mapper). I've written stuff that looks up words, phrases, or names in Wikipedia and rates each hit based on the semantics found in the Wikipedia pages, i.e. number of links, number of SeeAlso entries, amount of text, how big the discussion page is, etc. (I could tell you more of the details if requested, but isn't it more fun to work it out yourself and come up with something better than mine? :)</p>
<p>I've written tons of stuff over the years (even in PHP and Perl; look to <a href="http://search.cpan.org/~drrho/" rel="nofollow noreferrer">Robert Barta's Topic Maps stuff on CPAN</a>, especially the TM modules for some kick-ass stuff), from engines to parsers to something weird in the middle: associative arrays that break words and phrases apart, building cumulative histograms to sort their components out, and so forth. It's all fun stuff, but as to shrink-wrapped tools, I'm not so sure. Everyone's goals and needs seem to be different. It depends on how complex and sophisticated you want to become.</p>
<p>Anyway, hope this helps a little. Cheers! :)</p>
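<p>For what it's worth, here is one possible reading of the "associative arrays and cumulative histograms" idea, as a minimal sketch only. It is not the code described above: the function names (<code>tokenize</code>, <code>phrase_counts</code>, <code>cumulative_histogram</code>), the regex tokenizer, and the two-word phrase limit are all assumptions made for illustration. It breaks text into words and short phrases, counts them in a dictionary, and ranks them with a running share of the total.</p>

```python
import re
from collections import Counter
from itertools import accumulate


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def phrase_counts(text, max_len=2):
    """Count every run of 1..max_len consecutive words in the text."""
    words = tokenize(text)
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts


def cumulative_histogram(counts):
    """Return (phrase, count, cumulative share) rows, most frequent first."""
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    total = sum(counts.values())
    running = accumulate(count for _, count in ranked)
    return [(phrase, count, cum / total)
            for (phrase, count), cum in zip(ranked, running)]


if __name__ == "__main__":
    sample = "topic maps link topics; topic maps also type topics"
    for phrase, count, share in cumulative_histogram(phrase_counts(sample))[:5]:
        print(f"{phrase!r}: {count} (cumulative {share:.2f})")
```

<p>The phrases near the top of the ranking are candidates to look up in something like WordNet or Wikipedia; where to cut the list off depends on your domain.</p>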
 
