Lucene: IndexSearcher.search() causes Java heap space error on very large database
    <p>I have a very large database (approximately 30 million records, each with at least 26 fields) which I have indexed with Apache Lucene Java.</p>
    <p>I am constructing a query from two fields. Each search term could appear in any one of nine fields, and I want my query to return a Document if both of the search terms appear in any of the relevant fields in the Document. The query is structured like so:</p>
    <pre><code>// Parses one search term against one field; all words in the term must match.
private Query CreateQuery(String theSearchTerm, String theField) throws ParseException {
    StandardAnalyzer theAnalyzer = new StandardAnalyzer(Version.LUCENE_35);
    Query q;
    QueryParser qp = new QueryParser(Version.LUCENE_35, theField, theAnalyzer);
    qp.setDefaultOperator(QueryParser.Operator.AND);
    qp.setAllowLeadingWildcard(true);
    q = qp.parse(theSearchTerm);
    return q;
}

// param1..param18 are the names of the indexed fields being searched.
public ScoreDoc[] RunTheQuery(String searchTerm1, String searchTerm2) throws ParseException, IOException {
    Directory theIndex = new SimpleFSDirectory(new File("C:\\MyDirectory"));
    IndexSearcher theSearcher = new IndexSearcher(IndexReader.open(theIndex));

    BooleanQuery theTopLevelBooleanQuery = new BooleanQuery();
    BooleanQuery fields1 = new BooleanQuery();
    BooleanQuery fields2 = new BooleanQuery();
    BooleanQuery fields3 = new BooleanQuery();
    BooleanQuery fields4 = new BooleanQuery();
    BooleanQuery fields5 = new BooleanQuery();
    BooleanQuery fields6 = new BooleanQuery();
    BooleanQuery fields7 = new BooleanQuery();
    BooleanQuery fields8 = new BooleanQuery();
    BooleanQuery fields9 = new BooleanQuery();
    BooleanQuery innerQuery = new BooleanQuery();

    // Each fieldsN query requires both terms, each in its own field.
    fields1.add(CreateQuery(searchTerm1, param1), BooleanClause.Occur.MUST);
    fields1.add(CreateQuery(searchTerm2, param2), BooleanClause.Occur.MUST);
    fields2.add(CreateQuery(searchTerm1, param3), BooleanClause.Occur.MUST);
    fields2.add(CreateQuery(searchTerm2, param4), BooleanClause.Occur.MUST);
    fields3.add(CreateQuery(searchTerm1, param5), BooleanClause.Occur.MUST);
    fields3.add(CreateQuery(searchTerm2, param6), BooleanClause.Occur.MUST);
    fields4.add(CreateQuery(searchTerm1, param7), BooleanClause.Occur.MUST);
    fields4.add(CreateQuery(searchTerm2, param8), BooleanClause.Occur.MUST);
    fields5.add(CreateQuery(searchTerm1, param9), BooleanClause.Occur.MUST);
    fields5.add(CreateQuery(searchTerm2, param10), BooleanClause.Occur.MUST);
    fields6.add(CreateQuery(searchTerm1, param11), BooleanClause.Occur.MUST);
    fields6.add(CreateQuery(searchTerm2, param12), BooleanClause.Occur.MUST);
    fields7.add(CreateQuery(searchTerm1, param13), BooleanClause.Occur.MUST);
    fields7.add(CreateQuery(searchTerm2, param14), BooleanClause.Occur.MUST);
    fields8.add(CreateQuery(searchTerm1, param15), BooleanClause.Occur.MUST);
    fields8.add(CreateQuery(searchTerm2, param16), BooleanClause.Occur.MUST);
    fields9.add(CreateQuery(searchTerm1, param17), BooleanClause.Occur.MUST);
    fields9.add(CreateQuery(searchTerm2, param18), BooleanClause.Occur.MUST);

    // A document matches if any one of the nine field pairs matches.
    innerQuery.add(fields1, BooleanClause.Occur.SHOULD);
    innerQuery.add(fields2, BooleanClause.Occur.SHOULD);
    innerQuery.add(fields3, BooleanClause.Occur.SHOULD);
    innerQuery.add(fields4, BooleanClause.Occur.SHOULD);
    innerQuery.add(fields5, BooleanClause.Occur.SHOULD);
    innerQuery.add(fields6, BooleanClause.Occur.SHOULD);
    innerQuery.add(fields7, BooleanClause.Occur.SHOULD);
    innerQuery.add(fields8, BooleanClause.Occur.SHOULD);
    innerQuery.add(fields9, BooleanClause.Occur.SHOULD);
    theTopLevelBooleanQuery.add(innerQuery, BooleanClause.Occur.MUST);

    // Keep only the 200 highest-scoring hits.
    TopScoreDocCollector collector = TopScoreDocCollector.create(200, true);

    // Heap space error occurs here
    theSearcher.search(theTopLevelBooleanQuery, collector);

    ScoreDoc[] hits = collector.topDocs().scoreDocs;
    return hits;
}
</code></pre>
    <p>My problem is that when I call the IndexSearcher.search() method, the java.exe process on the server (Windows Server 2003 R2) consumes more than 540 MB, which causes a Java heap space error. For completeness, the Java app is running on a web server (currently Oracle Glassfish, although I'm looking to move to Apache Tomcat).</p>
    <p>Does anyone have an idea for how to stop this heap space error? A Stack Overflow post (http://stackoverflow.com/questions/7259736/cant-open-lucene-index-java-heap-space) seems to address a similar problem, but doesn't really give a detailed answer.</p>
    <p>Is the only answer to increase the amount of memory that the Java process can use? Or is the only answer to write a new searcher? If so, can anyone recommend a good article about lightweight searchers?</p>
    <p>Is there a way of solving this issue by modifying the above code?</p>
    <p>Any help would be gratefully received. Thanks, Rik</p>