Apache POI 详解 - Java 操作 Excel/Word/PPT
Apache POI 详解 - Java 操作 Excel/Word/PPT
Apache POI(Poor Obfuscation Implementation)是 Apache 提供的 Java 操作 Microsoft Office 文档(Excel、Word、PPT) 的开源库。它支持 .xls
(HSSF)、.xlsx
(XSSF)、.doc
(HWPF)、.docx
(XWPF)、.ppt
(HSLF)、.pptx
(XSLF)等格式。
1. POI 核心组件
组件 | 功能 | 适用格式 |
---|---|---|
HSSF | 操作 Excel 97-2003(.xls ) | .xls |
XSSF | 操作 Excel 2007+(.xlsx ) | .xlsx |
SXSSF | 流式处理大 Excel(.xlsx ) | .xlsx |
HWPF | 操作 Word 97-2003(.doc ) | .doc |
XWPF | 操作 Word 2007+(.docx ) | .docx |
HSLF | 操作 PowerPoint 97-2003(.ppt ) | .ppt |
XSLF | 操作 PowerPoint 2007+(.pptx ) | .pptx |
2. POI 依赖
Maven 依赖
<!-- Excel 操作(HSSF/XSSF/SXSSF) -->
<dependency><groupId>org.apache.poi</groupId><artifactId>poi</artifactId><version>5.2.3</version>
</dependency>
<dependency><groupId>org.apache.poi</groupId><artifactId>poi-ooxml</artifactId><version>5.2.3</version>
</dependency><!-- Word 操作(XWPF) -->
<dependency><groupId>org.apache.poi</groupId><artifactId>poi-scratchpad</artifactId><version>5.2.3</version>
</dependency>
3. POI 操作 Excel(XSSF/HSSF)
(1)写入 Excel(XSSF - .xlsx
)
import org.apache.poi.xssf.usermodel.*;
import java.io.FileOutputStream;public class ExcelWriter {public static void main(String[] args) throws Exception {// 1. 创建工作簿XSSFWorkbook workbook = new XSSFWorkbook();// 2. 创建工作表XSSFSheet sheet = workbook.createSheet("Sheet1");// 3. 创建行(第 0 行)XSSFRow row = sheet.createRow(0);// 4. 创建单元格并写入数据row.createCell(0).setCellValue("姓名");row.createCell(1).setCellValue("年龄");// 5. 写入数据行XSSFRow dataRow = sheet.createRow(1);dataRow.createCell(0).setCellValue("张三");dataRow.createCell(1).setCellValue(25);// 6. 保存到文件try (FileOutputStream fos = new FileOutputStream("output.xlsx")) {workbook.write(fos);}System.out.println("Excel 文件生成成功!");}
}
(2)读取 Excel(XSSF - .xlsx
)
import org.apache.poi.xssf.usermodel.*;
import org.apache.poi.ss.usermodel.*;
import java.io.FileInputStream;public class ExcelReader {public static void main(String[] args) throws Exception {// 1. 加载 Excel 文件FileInputStream fis = new FileInputStream("output.xlsx");XSSFWorkbook workbook = new XSSFWorkbook(fis);// 2. 获取工作表XSSFSheet sheet = workbook.getSheetAt(0);// 3. 遍历行和单元格for (Row row : sheet) {for (Cell cell : row) {switch (cell.getCellType()) {case STRING:System.out.print(cell.getStringCellValue() + "\t");break;case NUMERIC:System.out.print(cell.getNumericCellValue() + "\t");break;default:System.out.print("UNKNOWN\t");}}System.out.println();}workbook.close();}
}
(3)SXSSF(流式处理大 Excel)
适用于大数据量(百万行):
import org.apache.poi.xssf.streaming.SXSSFWorkbook;
import org.apache.poi.xssf.streaming.SXSSFSheet;
import org.apache.poi.xssf.streaming.SXSSFRow;
import java.io.FileOutputStream;public class BigExcelWriter {public static void main(String[] args) throws Exception {// 1. 创建 SXSSFWorkbook(默认 100 行缓存)SXSSFWorkbook workbook = new SXSSFWorkbook();SXSSFSheet sheet = workbook.createSheet("大数据");// 2. 写入 100 万行数据for (int i = 0; i < 1_000_000; i++) {SXSSFRow row = sheet.createRow(i);row.createCell(0).setCellValue("数据-" + i);row.createCell(1).setCellValue(i);}// 3. 保存到文件try (FileOutputStream fos = new FileOutputStream("big-data.xlsx")) {workbook.write(fos);}workbook.dispose(); // 清理临时文件System.out.println("大 Excel 文件生成成功!");}
}
4. POI 操作 Word(XWPF)
(1)写入 Word(.docx
)
import org.apache.poi.xwpf.usermodel.*;
import java.io.FileOutputStream;public class WordWriter {public static void main(String[] args) throws Exception {// 1. 创建文档XWPFDocument document = new XWPFDocument();// 2. 创建段落XWPFParagraph paragraph = document.createParagraph();XWPFRun run = paragraph.createRun();run.setText("Hello, POI Word!");run.setBold(true);run.setFontSize(16);// 3. 保存到文件try (FileOutputStream fos = new FileOutputStream("output.docx")) {document.write(fos);}System.out.println("Word 文件生成成功!");}
}
(2)读取 Word(.docx
)
import org.apache.poi.xwpf.usermodel.*;
import java.io.FileInputStream;public class WordReader {public static void main(String[] args) throws Exception {// 1. 加载 Word 文件FileInputStream fis = new FileInputStream("output.docx");XWPFDocument document = new XWPFDocument(fis);// 2. 读取所有段落for (XWPFParagraph paragraph : document.getParagraphs()) {System.out.println(paragraph.getText());}document.close();}
}
5. POI 操作 PPT(XSLF)
(1)写入 PPT(.pptx
)
import org.apache.poi.xslf.usermodel.*;
import java.io.FileOutputStream;public class PPTWriter {public static void main(String[] args) throws Exception {// 1. 创建 PPTXMLSlideShow ppt = new XMLSlideShow();// 2. 创建幻灯片XSLFSlide slide = ppt.createSlide();// 3. 添加标题XSLFTextBox shape = slide.createTextBox();XSLFTextRun textRun = shape.addNewTextParagraph().addNewTextRun();textRun.setText("Hello, POI PPT!");textRun.setFontSize(24.0);// 4. 保存到文件try (FileOutputStream fos = new FileOutputStream("output.pptx")) {ppt.write(fos);}System.out.println("PPT 文件生成成功!");}
}
(2)读取 PPT(.pptx
)
import org.apache.poi.xslf.usermodel.*;
import java.io.FileInputStream;public class PPTReader {public static void main(String[] args) throws Exception {// 1. 加载 PPT 文件FileInputStream fis = new FileInputStream("output.pptx");XMLSlideShow ppt = new XMLSlideShow(fis);// 2. 遍历所有幻灯片for (XSLFSlide slide : ppt.getSlides()) {for (XSLFShape shape : slide.getShapes()) {if (shape instanceof XSLFTextShape) {System.out.println(((XSLFTextShape) shape).getText());}}}ppt.close();}
}
6. 常见问题
(1)POI 如何处理日期格式?
CellStyle style = workbook.createCellStyle();
style.setDataFormat(workbook.createDataFormat().getFormat("yyyy-MM-dd"));
cell.setCellValue(new Date());
cell.setCellStyle(style);
(2)POI 如何设置单元格样式?
CellStyle style = workbook.createCellStyle();
style.setFillForegroundColor(IndexedColors.YELLOW.getIndex());
style.setFillPattern(FillPatternType.SOLID_FOREGROUND);
cell.setCellStyle(style);
(3)POI 如何处理大文件?
- 使用 SXSSF(Excel)或 Event API(SAX 解析)减少内存占用。
7. 总结
功能 | API | 适用格式 |
---|---|---|
Excel 操作 | HSSF/XSSF/SXSSF | .xls /.xlsx |
Word 操作 | HWPF/XWPF | .doc /.docx |
PPT 操作 | HSLF/XSLF | .ppt /.pptx |
POI 是 Java 操作 Office 文档的 最主流库,适用于:
- 报表导出(Excel)
- 合同生成(Word)
- 幻灯片处理(PPT)
建议:
- 小文件 →
XSSF
/XWPF
/XSLF
- 大文件 →
SXSSF
(Excel)或SAX 解析
(避免 OOM)