How to use Stanford CoreNLP with a non-English parse model?
<p>I'm trying to detect whether a sentence is in <a href="https://stackoverflow.com/questions/19495967/getting-additional-information-active-passive-tenses-from-a-tagger">active or passive</a> voice. For that, I am using Stanford CoreNLP and looking for the dependencies 'nsubj' (=active) or 'nsubjpass' (=passive).</p>
<p>This works perfectly for English (<a href="http://pastebin.com/M1V1dEPd" rel="nofollow noreferrer">code is here</a>, if you are interested) with the following output:</p>
<p><strong>Output:</strong></p>
<pre><code>Adding annotator tokenize
Adding annotator ssplit
Adding annotator pos
Reading POS tagger model from lib/stanford-postagger-full-2013-06-20/models/english-left3words-distsim.tagger ... done [1,2 sec].
Adding annotator lemma
Adding annotator parse
Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ... done [1,1 sec].
reln: det
reln: nsubjpass &lt;-- yeah! All I want. Passive sentence detected!
reln: auxpass
reln: root
reln: det
reln: prep_for
</code></pre>
<p>However, I now also want to support German, so I changed the following lines:</p>
<pre><code>Properties props = new Properties();
props.put("parse.flags", "");
props.put("pos.model", "lib/stanford-postagger-full-2013-06-20/models/german-fast.tagger");
props.put("annotators", "tokenize, ssplit, pos, lemma, parse");
props.put("parse.model", "edu/stanford/nlp/models/lexparser/germanPCFG.ser.gz"); // &lt;--- not there
</code></pre>
<p>This fails because there is no parsing model file "germanPCFG.ser.gz" in the jar (stanford-corenlp-3.2.0-models.jar), only the English one. There are German parsing models on the web which I could include (<a href="http://lisa.spinfo.uni-koeln.de/trac/tesla/browser/tesla/trunk/tesla.component.stanfordparser/resources/germanFactored.ser.gz" rel="nofollow noreferrer">see this one, for example</a>), but then I get a massive stack trace.</p>
<pre><code>Loading parser from serialized file lib/stanford-postagger-full-2013-06-20/germanFactored.ser.gz ...
java.lang.NullPointerException
    at edu.stanford.nlp.parser.lexparser.BinaryGrammar.init(BinaryGrammar.java:224)
    at edu.stanford.nlp.parser.lexparser.BinaryGrammar.readObject(BinaryGrammar.java:211)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:969)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1848)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
    at edu.stanford.nlp.parser.lexparser.LexicalizedParser.loadModel(LexicalizedParser.java:172)
    at edu.stanford.nlp.parser.lexparser.LexicalizedParser.getParserFromSerializedFile(LexicalizedParser.java:607)
    at edu.stanford.nlp.parser.lexparser.LexicalizedParser.getParserFromFile(LexicalizedParser.java:401)
    at edu.stanford.nlp.parser.lexparser.LexicalizedParser.loadModel(LexicalizedParser.java:158)
    at edu.stanford.nlp.parser.lexparser.LexicalizedParser.loadModel(LexicalizedParser.java:144)
    at edu.stanford.nlp.pipeline.ParserAnnotator.loadModel(ParserAnnotator.java:177)
    at edu.stanford.nlp.pipeline.ParserAnnotator.&lt;init&gt;(ParserAnnotator.java:107)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP$12.create(StanfordCoreNLP.java:736)
    at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:81)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:260)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.&lt;init&gt;(StanfordCoreNLP.java:127)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.&lt;init&gt;(StanfordCoreNLP.java:123)
    at nlp.Tagger.parse(Tagger.java:83)
    at nlp.GUI$5.doInBackground(GUI.java:474)
    at nlp.GUI$5.doInBackground(GUI.java:468)
    at javax.swing.SwingWorker$1.call(SwingWorker.java:277)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at javax.swing.SwingWorker.run(SwingWorker.java:316)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Loading parser from text file lib/stanford-postagger-full-2013-06-20/germanFactored.ser.gz
java.lang.RuntimeException: lib/stanford-postagger-full-2013-06-20/germanFactored.ser.gz: expecting BEGIN block; got ��
</code></pre>
<p>If I just use the English parse model (englishPCFG.ser.gz) for the German input, the German passive sentence is not detected correctly. <strong>Any advice on how to continue?</strong></p>
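<p>For reference, here is a minimal sketch of the two checks this question revolves around, written without any CoreNLP dependency so it runs on its own. The class name <code>VoiceCheck</code>, the enum <code>Voice</code>, and both method names are hypothetical and are not part of CoreNLP. The first method applies the 'nsubj'/'nsubjpass' rule to a list of relation names like the ones printed in the English output; the second only tests whether a model path can be found as a file or on the classpath before it is handed to the pipeline. It assumes those are the places a model would be looked up, and it does not verify that the file is compatible with the installed CoreNLP version.</p>

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper, not part of CoreNLP.
final class VoiceCheck {

    enum Voice { ACTIVE, PASSIVE, UNKNOWN }

    // Classifies a sentence from the dependency relation names of its parse.
    // Assumption: 'nsubjpass' takes precedence if both labels occur.
    static Voice classify(List<String> relations) {
        boolean sawActiveSubject = false;
        for (String reln : relations) {
            if (reln.equals("nsubjpass")) {
                return Voice.PASSIVE;
            }
            if (reln.equals("nsubj")) {
                sawActiveSubject = true;
            }
        }
        return sawActiveSubject ? Voice.ACTIVE : Voice.UNKNOWN;
    }

    // Returns true if the path is a readable file or a classpath resource.
    // This only checks presence, not whether the model can be deserialized.
    static boolean modelAvailable(String path) {
        return Files.isReadable(Paths.get(path))
                || ClassLoader.getSystemResource(path) != null;
    }

    public static void main(String[] args) {
        // Relation names taken from the English output in the question.
        List<String> relns =
                List.of("det", "nsubjpass", "auxpass", "root", "det", "prep_for");
        System.out.println(classify(relns)); // prints PASSIVE

        String model = "edu/stanford/nlp/models/lexparser/germanPCFG.ser.gz";
        System.out.println(model + " available: " + modelAvailable(model));
    }
}
```

<p>A presence check like this would catch the missing "germanPCFG.ser.gz" case early, but not the second failure shown above, where the file exists and still cannot be read by the parser.</p>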