
# Uploading a RAMDirectory to AzureCloud creates EOF exceptions

I'm currently trying to get Lucene to work with Azure Blob Storage. I created a new `Directory` and, to avoid too much latency, I use a `RAMDirectory` as a cache (this might not be the best solution, but it seemed easy to do; I'm open to suggestions). Anyway, everything seems to work quite well except when I write `.nrm` files to the cloud, which always raise an `EOFException` when I upload them to the blob.

Let me explain quickly how the directory works, because it will help to understand the problem. I created a new `IndexOutput` called `BlobOutputStream`. It pretty much encapsulates a `RAMOutputStream`, except that when it is closed it uploads everything to Azure Blob Storage. Here is how that is done:

```java
// Flush the cached output, record its length, then upload the
// cached file to the blob with an explicit content length.
String fname = name;
output.flush();
long length = output.length();
output.close();
System.out.println("Size of the upload: " + length);
InputStream bStream = directory.openCachedInputAsStream(fname);
System.out.println("Uploading cache version of: " + fname);
blob.upload(bStream, length);
System.out.println("PUT finished for: " + fname);
```

`blob` is a `CloudBlockBlob` and `output` is a `RAMOutputStream`. `directory.openCachedInputAsStream` opens a new `InputStream` on an `IndexInput`.

So everything works most of the time, except with `.nrm` files, which always raise an `EOFException` while they are being uploaded. I checked them: with only one document in the index they are 5 bytes long and contain the header "NRM-1" followed by the norm for that document.

I don't really understand why Azure tries to upload more bytes than exist in the file when I've specified the size of the stream in the upload call.

I'm sorry if I'm not being clear; it's quite challenging to explain.
Please tell me if you need more code; I'll make everything accessible on GitHub or something.

Thanks for your answers.

# EDIT

So maybe the code of my `InputStream` shows the problem:

```java
import java.io.IOException;
import java.io.InputStream;

import org.apache.lucene.store.IndexInput;

// Adapts a Lucene IndexInput to a java.io.InputStream so it can be
// handed to the Azure blob upload call.
public class StreamInput extends InputStream {
    public IndexInput input;

    public StreamInput(IndexInput openInput) {
        input = openInput;
    }

    @Override
    public int read() throws IOException {
        System.out.println("Attempt to read byte: " + input.getFilePointer());
        int b = input.readByte();
        System.out.println(b);
        return b;
    }
}
```

And here are the traces I get:

```
Size of the upload: 5
Uploading cache version of: _0.nrm
Attempt to read byte: 0
78
Attempt to read byte: 1
82
Attempt to read byte: 2
77
Attempt to read byte: 3
-1
Attempt to read byte: 4
114
Attempt to read byte: 5
Attempt to read byte: 1029
java.io.EOFException: read past EOF: RAMInputStream(name=_0.nrm)
    at org.apache.lucene.store.RAMInputStream.switchCurrentBuffer(RAMInputStream.java:100)
    at org.apache.lucene.store.RAMInputStream.readByte(RAMInputStream.java:73)
    at org.lahab.clucene.core.StreamInput.read(StreamInput.java:18)
    at java.io.InputStream.read(InputStream.java:151)
    at com.microsoft.windowsazure.services.core.storage.utils.Utility.writeToOutputStream(Utility.java:1024)
    at com.microsoft.windowsazure.services.blob.client.BlobOutputStream.write(BlobOutputStream.java:560)
    at com.microsoft.windowsazure.services.blob.client.CloudBlockBlob.upload(CloudBlockBlob.java:455)
    at com.microsoft.windowsazure.services.blob.client.CloudBlockBlob.upload(CloudBlockBlob.java:374)
    at org.lahab.clucene.core.BlobOutputStream.close(BlobOutputStream.java:92)
    at org.apache.lucene.util.IOUtils.close(IOUtils.java:141)
    at org.apache.lucene.index.NormsWriter.flush(NormsWriter.java:172)
    at org.apache.lucene.index.DocInverter.flush(DocInverter.java:71)
    at org.apache.lucene.index.DocFieldProcessor.flush(DocFieldProcessor.java:60)
    at org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:581)
    at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3587)
    at org.apache.lucene.index.IndexWriter.prepareCommit(IndexWriter.java:3376)
    at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3485)
    at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3467)
    at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:3451)
    at org.lahab.clucene.server.IndexerNode.addDocuments(IndexerNode.java:139)
```

It really seems like the upload just reads too far past the end of the stream...
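For reference, `java.io.InputStream.read()` is specified to return an unsigned value in the range 0–255, and -1 only to signal end of stream, while Lucene's `IndexInput.readByte()` returns a signed byte; that would make the 0xFF byte in the "NRM-1" header come back as the -1 visible at position 3 of the trace. Below is a minimal sketch of a contract-compliant adapter; the class name `StreamInputSketch` is made up for illustration, and that this mismatch is what derails the Azure upload is an assumption, not something the trace proves.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

import org.apache.lucene.store.IndexInput;

// Sketch of a read() that honours the InputStream contract:
// return 0-255 for data bytes, and -1 only at end of stream.
public class StreamInputSketch extends InputStream {
    private final IndexInput input;

    public StreamInputSketch(IndexInput openInput) {
        input = openInput;
    }

    @Override
    public int read() throws IOException {
        try {
            // Mask to unsigned: byte 0xFF becomes 255 instead of -1,
            // so it cannot be mistaken for end-of-stream.
            return input.readByte() & 0xFF;
        } catch (EOFException e) {
            // Lucene signals "read past EOF" with an exception (see the
            // trace above); InputStream expects -1 instead.
            return -1;
        }
    }
}
```

An alternative to catching the exception would be to compare `input.getFilePointer()` against `input.length()` before reading and return -1 once the end is reached.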
 
