    <p>Yes, there is. Check out the <code>text.pdf.parser</code> package, specifically <code>LocationTextExtractionStrategy</code>. Actually, that might not do the trick either. You'll probably want to write your own <code>TextExtractionStrategy</code> to feed into <code>PdfTextExtractor</code>:</p>
<pre><code>MyTexExStrat strat = new MyTexExStrat();
PdfTextExtractor.getTextFromPage(reader, pageNum, strat);
// get the strings-n-rects from strat.

public class MyTexExStrat implements TextExtractionStrategy {
    public void beginTextBlock() {}
    public void endTextBlock() {}
    public void renderImage(ImageRenderInfo info) {}
    public void renderText(TextRenderInfo info) {
        // track text and location here.
    }
    public String getResultantText() {
        // required by the interface; return whatever text you collected.
        return "";
    }
}
</code></pre>
    <p>You'll probably want to look at the source for <code>LocationTextExtractionStrategy</code> to see how it combines text that shares a baseline. You might even just modify LTES to store parallel arrays of strings and rects.</p>
    <p>PS: to build the rects, you can just get the AscentLine &amp; DescentLine and use their coordinates as the bottom-left and top-right corners:</p>
<pre><code>Vector bottomLeft = info.getDescentLine().getStartPoint();
Vector topRight = info.getAscentLine().getEndPoint();
Rectangle rect = new Rectangle(bottomLeft.get(Vector.I1),
                               bottomLeft.get(Vector.I2),
                               topRight.get(Vector.I1),
                               topRight.get(Vector.I2));
</code></pre>
    <p>Warning: The above code ass-u-mes that the text is horizontal and proceeds from left to right. Rotated text will screw it up, as will vertical text or right-to-left (Arabic, Hebrew) text. For most applications, the above should be fine, but know its limits.</p>
    <p>Good hunting.</p>
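If rotated or right-to-left text matters to you, one possible workaround is to take the minimum and maximum over all four line endpoints instead of trusting that the descent start is the bottom-left corner. The sketch below is plain Java with no iText dependency. `TextBounds` and `boundingBox` are hypothetical names made up for illustration, and it assumes you have already copied the x/y values of the start and end points of both the descent line and the ascent line into simple float pairs. The result is an axis-aligned box, so for rotated text it will be looser than the glyphs themselves.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of iText): axis-aligned bounds of a text chunk. */
final class TextBounds {

    /**
     * Returns {llx, lly, urx, ury} covering every point in corners.
     * Each point is {x, y}. Pass the start and end points of both the
     * descent line and the ascent line (four points in total).
     */
    static float[] boundingBox(float[][] corners) {
        float llx = Float.POSITIVE_INFINITY;
        float lly = Float.POSITIVE_INFINITY;
        float urx = Float.NEGATIVE_INFINITY;
        float ury = Float.NEGATIVE_INFINITY;
        for (float[] p : corners) {
            llx = Math.min(llx, p[0]);
            lly = Math.min(lly, p[1]);
            urx = Math.max(urx, p[0]);
            ury = Math.max(ury, p[1]);
        }
        return new float[] {llx, lly, urx, ury};
    }

    public static void main(String[] args) {
        // Made-up coordinates for text rotated 90 degrees:
        // descent line from (20,10) to (20,40), ascent line from (12,10) to (12,40).
        float[][] rotated = {{20f, 10f}, {20f, 40f}, {12f, 10f}, {12f, 40f}};
        System.out.println(Arrays.toString(boundingBox(rotated)));
    }
}
```

The four returned values are in the same order as the arguments passed to the `Rectangle` constructor above, so they could be handed to it directly.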
 
