public final class PdfTextExtractor extends Object
| Modifier and Type | Method and Description |
|---|---|
| static String | `getTextFromPage(PdfReader reader, int pageNumber)` Extract text from a specified page using the default strategy. |
| static String | `getTextFromPage(PdfReader reader, int pageNumber, TextExtractionStrategy strategy)` Extract text from a specified page using an extraction strategy. |
| static String | `getTextFromPage(PdfReader reader, int pageNumber, TextExtractionStrategy strategy, Map<String,ContentOperator> additionalContentOperators)` Extract text from a specified page using an extraction strategy. |
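
All three overloads extract one page per call, so reading a whole document means looping over page numbers. The following is a minimal sketch of such a loop, not part of the library: `PageTextCollector` and `PageExtractor` are hypothetical names introduced here, and the sketch assumes page numbers start at 1 and that the caller supplies the page count (for example from `PdfReader.getNumberOfPages()`).

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of iText): collects text page by page.
final class PageTextCollector {

    // Hypothetical callback wrapping one getTextFromPage(...) call.
    @FunctionalInterface
    interface PageExtractor {
        String extract(int pageNumber) throws IOException;
    }

    // Calls the extractor once per page, assuming 1-based page numbers,
    // and returns the extracted strings in page order.
    static List<String> collect(int pageCount, PageExtractor extractor) throws IOException {
        List<String> pages = new ArrayList<>();
        for (int page = 1; page <= pageCount; page++) {
            pages.add(extractor.extract(page));
        }
        return pages;
    }
}
```

With an open `PdfReader reader`, a call might look like `PageTextCollector.collect(reader.getNumberOfPages(), p -> PdfTextExtractor.getTextFromPage(reader, p))`. To pin a strategy instead of relying on the default, the lambda can call the three-argument overload, for example with `new SimpleTextExtractionStrategy()`. Creating a fresh strategy object for each page is the cautious choice here, on the assumption that a strategy instance may accumulate text across calls.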
public static String getTextFromPage(PdfReader reader, int pageNumber, TextExtractionStrategy strategy, Map<String,ContentOperator> additionalContentOperators) throws IOException
Extract text from a specified page using an extraction strategy.
Parameters:
reader - the reader to extract text from
pageNumber - the page to extract text from
strategy - the strategy to use for extracting text
additionalContentOperators - an optional map of custom ContentOperators for rendering instructions
Throws:
IOException - if any operation fails while reading from the provided PdfReader

public static String getTextFromPage(PdfReader reader, int pageNumber, TextExtractionStrategy strategy) throws IOException
Extract text from a specified page using an extraction strategy.
Parameters:
reader - the reader to extract text from
pageNumber - the page to extract text from
strategy - the strategy to use for extracting text
Throws:
IOException - if any operation fails while reading from the provided PdfReader

public static String getTextFromPage(PdfReader reader, int pageNumber) throws IOException
Extract text from a specified page using the default strategy.
Note: the default strategy is subject to change. If using a specific strategy is important, use getTextFromPage(PdfReader, int, TextExtractionStrategy).
Parameters:
reader - the reader to extract text from
pageNumber - the page to extract text from
Throws:
IOException - if any operation fails while reading from the provided PdfReader

Copyright © 2022. All rights reserved.