Converting Docx to image using Docx4j and PdfBox causes OutOfMemoryError
<p>I'm converting the first page of a docx file to an image in two steps using docx4j and pdfbox, but I'm currently getting an <code>OutOfMemoryError</code> every time.</p>
<p>I've determined that the exception is thrown on the very last step of this process, while the <code>convertToImage</code> method is being called. However, I've been using the second step of this process to convert PDFs for some time now without issue, so I am at a loss as to what might be the cause, unless perhaps docx4j is encoding the PDF in a way which I have not yet tested, or the PDF is corrupt.</p>
<p>I've tried replacing the <code>ByteArrayOutputStream</code> with a <code>FileOutputStream</code>, and the PDF seems to render correctly and is not any larger than I would expect.</p>
<p>This is the code I am using:</p>
<pre><code>// Step 1: docx -&gt; PDF (docx4j, via XSL-FO)
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(file);
org.docx4j.convert.out.pdf.PdfConversion c = new org.docx4j.convert.out.pdf.viaXSLFO.Conversion(wordMLPackage);
((org.docx4j.convert.out.pdf.viaXSLFO.Conversion)c).setSaveFO(File.createTempFile("fonts", ".fo"));
ByteArrayOutputStream os = new ByteArrayOutputStream();
c.output(os, new PdfSettings());
byte[] bytes = os.toByteArray();
os.close();

// Step 2: first PDF page -&gt; image (pdfbox)
ByteArrayInputStream is = new ByteArrayInputStream(bytes);
PDDocument document = PDDocument.load(is);
PDPage page = (PDPage) document.getDocumentCatalog().getAllPages().get(0);
BufferedImage image = page.convertToImage(BufferedImage.TYPE_INT_RGB, 96); // OutOfMemoryError is thrown here
is.close();
document.close();
</code></pre>
<p><strong>Edit</strong> To give more context on this situation, this code is being run in a Grails web application. I have tried several different variants of this code, including nulling out everything once it is no longer needed, and using <code>FileInputStream</code> and <code>FileOutputStream</code> both to conserve memory and to inspect the output of docx4j and pdfbox, each of which seems to work correctly.</p>
<p>I'm using docx4j 2.8.1 and pdfbox 0.7.3. I have also tried pdf-renderer, but I still get an <code>OutOfMemoryError</code>. My suspicion is that docx4j is using too much memory, but that the error does not surface until the PDF-to-image conversion.</p>
<p>I would gladly accept an alternative way of converting a docx file to a PDF or directly to an image as an answer; however, I am currently trying to replace jodconverter, which has been problematic to run on a server.</p>
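One way to test the suspicion that the memory is consumed before the PDF-to-image step is to log heap usage around each stage. The following is only a minimal sketch using the standard `Runtime` API. The class `HeapProbe` and its methods are hypothetical helpers, not part of docx4j or pdfbox, and the readings are approximate because garbage collection timing affects them. `rgbImageBytes` estimates the raster size of a `TYPE_INT_RGB` image, assuming the page is scaled from 72 points per inch and stored at 4 bytes per pixel.

```java
import java.util.concurrent.Callable;

public class HeapProbe {
    /** Heap currently in use, in bytes (approximate; depends on GC timing). */
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Runs one step, prints how much the used heap changed, and returns the step's result. */
    public static <T> T measure(String label, Callable<T> step) throws Exception {
        long before = usedHeapBytes();
        T result = step.call();
        long after = usedHeapBytes();
        System.out.printf("%s: %+d KiB (max heap %d MiB)%n",
                label,
                (after - before) / 1024,
                Runtime.getRuntime().maxMemory() / (1024 * 1024));
        return result;
    }

    /** Rough size of an uncompressed TYPE_INT_RGB raster for a page given in PDF points. */
    public static long rgbImageBytes(double widthPt, double heightPt, int dpi) {
        long widthPx = Math.round(widthPt / 72.0 * dpi);   // assumes 72 points per inch
        long heightPx = Math.round(heightPt / 72.0 * dpi);
        return widthPx * heightPx * 4L;                    // one 32-bit int per pixel
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a real conversion step such as the docx4j output call.
        byte[] buffer = measure("allocate 8 MiB", () -> new byte[8 * 1024 * 1024]);
        System.out.println("buffer length: " + buffer.length);

        // US Letter page (612 x 792 pt) rendered at 96 dpi.
        System.out.println("estimated image bytes: " + rgbImageBytes(612, 792, 96));
    }
}
```

Wrapping the docx4j step and the `convertToImage` call each in `measure(...)` would show which stage grows the heap. If the estimated image size is only a few megabytes while the heap is already near its maximum before rendering, the first stage is the more likely culprit.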