Getting the page count, text content, and thumbnails of a PDF in Java
1. Add the Maven dependencies

    <dependency>
        <groupId>org.apache.pdfbox</groupId>
        <artifactId>pdfbox</artifactId>
        <version>2.0.8</version>
    </dependency>
    <dependency>
        <groupId>com.github.jai-imageio</groupId>
        <artifactId>jai-imageio-jpeg2000</artifactId>
        <version>1.3.0</version>
    </dependency>

2. Utility class
    import java.awt.image.BufferedImage;
    import java.io.File;
    import java.io.IOException;
    import java.io.OutputStream;

    import javax.imageio.ImageIO;

    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.rendering.ImageType;
    import org.apache.pdfbox.rendering.PDFRenderer;
    import org.apache.pdfbox.text.PDFTextStripper;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class PdfUtil {

        private static final Logger logger = LoggerFactory.getLogger(PdfUtil.class);

        /**
         * Returns the total number of pages in a PDF, using PDFBox.
         *
         * @param filePath path to the PDF file
         * @return the page count
         * @throws IOException if the file cannot be read or parsed
         */
        public static int getNumberOfPages(String filePath) throws IOException {
            // try-with-resources closes the document even if an exception is thrown
            try (PDDocument pdDocument = PDDocument.load(new File(filePath))) {
                return pdDocument.getNumberOfPages();
            }
        }

        /**
         * Extracts the text content of a PDF, using PDFBox.
         *
         * @param filePath path to the PDF file
         * @return the extracted text of the whole document
         * @throws IOException if the file cannot be read or parsed
         */
        public static String getContent(String filePath) throws IOException {
            // Text extraction only reads the file, so it is not opened in "rw" mode
            try (PDDocument pdDocument = PDDocument.load(new File(filePath))) {
                return new PDFTextStripper().getText(pdDocument);
            }
        }

        /**
         * Generates a thumbnail of the first page of a PDF, using PDFBox.
         *
         * @param filePath path to the PDF file
         * @param outPath  path of the PNG image to write
         * @throws IOException if the PDF cannot be read or the image cannot be written
         */
        public static void getThumbnails(String filePath, String outPath) throws IOException {
            try (PDDocument pdDocument = PDDocument.load(new File(filePath))) {
                PDFRenderer renderer = new PDFRenderer(pdDocument);
                // Render the first page (index 0) at a low resolution of 30 DPI
                BufferedImage imgTemp = renderer.renderImageWithDPI(0, 30, ImageType.RGB);
                // Write the image as PNG; ImageIO.write returns false if no writer is found
                if (!ImageIO.write(imgTemp, "png", new File(outPath))) {
                    throw new IOException("No PNG image writer available");
                }
                imgTemp.flush();
            }
        }

        /**
         * Converts a single page of a PDF to an image.
         *
         * @param sos     stream the image is written to (the caller closes it)
         * @param fileUrl path to the PDF file
         * @param page    zero-based page index
         * @param imgType output image format, e.g. jpg or png
         * @throws IOException declared for callers; failures are currently logged, not rethrown
         */
        public static void PDFToImg(OutputStream sos, String fileUrl, int page, String imgType) throws IOException {
            // A higher DPI gives a sharper image but a slower conversion
            int dpi = 100;
            try (PDDocument pdDocument = getPDDocument(fileUrl)) {
                PDFRenderer renderer = new PDFRenderer(pdDocument);
                int pages = pdDocument.getNumberOfPages();
                // Page indexes are zero-based, so the valid range is 0 to pages - 1
                if (page >= 0 && page < pages) {
                    BufferedImage image = renderer.renderImageWithDPI(page, dpi);
                    ImageIO.write(image, imgType, sos);
                }
            } catch (Exception e) {
                logger.error("Failed to convert page {} of {}", page, fileUrl, e);
            }
        }

        private static PDDocument getPDDocument(String fileUrl) throws IOException {
            // Loading from a File avoids leaving an unclosed FileInputStream behind
            return PDDocument.load(new File(fileUrl));
        }
    }

3. Test
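    @Test
    public void testPdf() throws IOException {
        String filePath = "/Users/apple/Desktop/學習書籍/Docker從入門到實踐.pdf";
        int numberOfPages = PdfUtil.getNumberOfPages(filePath);
        System.out.println("Total pages in this PDF: " + numberOfPages);
        // Convert the first three pages to images (or all pages, if there are fewer than three)
        for (int i = 0; i < 3 && i < numberOfPages; i++) {
            File out = new File("/Users/apple/Desktop/學習書籍/Docker從入門到實踐" + i + ".png");
            // try-with-resources closes the output stream after each page
            try (FileOutputStream fos = new FileOutputStream(out)) {
                PdfUtil.PDFToImg(fos, filePath, i, "PNG");
            }
        }
    }

Console output:

    Total pages in this PDF: 370

Three images, Docker從入門到實踐0.png through Docker從入門到實踐2.png, are generated in the same folder.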
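The dpi values used above (30 for the thumbnail, 100 for page images) control how large the rendered image is. PDF page sizes are measured in points, with 72 points to the inch, so the pixel size is roughly the size in points multiplied by dpi / 72. The sketch below only illustrates that arithmetic and does not call PDFBox. The class and method names are hypothetical, and rounding down is an assumption; the size PDFBox actually produces may differ by a pixel.

```java
/** Sketch: estimate the pixel size of a PDF page rendered at a given DPI. */
class RenderSizeSketch {

    // PDF page sizes are expressed in points; one point is 1/72 inch.
    static final float POINTS_PER_INCH = 72f;

    /**
     * Returns {widthPx, heightPx} for a page whose size is given in points.
     * Rounding down is an assumption of this sketch, not a documented guarantee.
     */
    static int[] pixelSize(float widthPt, float heightPt, float dpi) {
        float scale = dpi / POINTS_PER_INCH;
        int widthPx = Math.max((int) Math.floor(widthPt * scale), 1);
        int heightPx = Math.max((int) Math.floor(heightPt * scale), 1);
        return new int[] {widthPx, heightPx};
    }

    public static void main(String[] args) {
        // An A4 page is roughly 595 x 842 points.
        for (float dpi : new float[] {30f, 100f, 300f}) {
            int[] size = pixelSize(595f, 842f, dpi);
            System.out.println(dpi + " dpi: " + size[0] + " x " + size[1] + " px");
        }
    }
}
```

Because both dimensions scale with the dpi, the pixel count grows with its square, which is why a higher dpi is noticeably slower and produces larger files.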