How to Convert Excel to PDF in Java
Excel, a widely used spreadsheet application, is a powerful tool for data analysis and management. However, sharing Excel files can be challenging due to compatibility issues and the potential for formatting changes. Conversely, PDF is a more universal file format that preserves document formatting and can be easily shared across different platforms. In this article, we’ll explore how to convert Excel to PDF in Java using Apache POI and iText.
Introduction
iText is a widely used library for working with PDF files and offers powerful features for PDF generation and manipulation. Apache POI is an open-source library that provides comprehensive tools for reading and writing Microsoft Office files, including Excel spreadsheets. It provides a comprehensive set of features for working with Excel files, including reading and manipulating cell styles. If we specifically need to extract detailed styling information, including font colour, background colour, font styles, and other formatting details from an Excel file, Apache POI is a more suitable library.
Implementation
Step 1: Add Apache POI Dependency
Firstly, to utilize Apache POI for Excel file handling and iText for PDF generation, include the necessary dependencies in your Maven project’s pom.xml
file:
<dependencies>
<!-- Apache POI for Excel handling -->
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-ooxml</artifactId>
<version>5.2.5</version>
</dependency>
<!-- iText for PDF generation -->
<dependency>
<groupId>com.itextpdf</groupId>
<artifactId>itextpdf</artifactId>
<version>5.5.13.3</version>
</dependency>
</dependencies>
Step 2: Load the Excel File
Load the Excel file using Apache POI’s XSSFWorkbook
for .xlsx
files:
public static XSSFWorkbook readExcelFile(String excelFilePath) throws IOException {
FileInputStream inputStream = new FileInputStream(excelFilePath);
XSSFWorkbook workbook = new XSSFWorkbook(inputStream);
inputStream.close();
return workbook;
}
This code snippet supports reading Excel files in the XSSF format. For older Excel formats (.xls), we need to use the HSSFWorkbook class instead.
Step 3: Create a PDF Document
Create a new PDF document using iText’s Document
class:
private static Document createPDFDocument(String pdfFilePath) throws IOException, DocumentException {
Document document = new Document();
PdfWriter.getInstance(document, new FileOutputStream(pdfFilePath));
document.open();
return document;
}
Step 4: Parse Excel Data and Convert to PDF Elements
Next, let’s iterate through the Excel sheets and convert the extracted data into corresponding PDF elements:
private static void createAndAddTable(XSSFSheet worksheet, Document document) throws DocumentException {
PdfPTable table = new PdfPTable(worksheet.getRow(0).getPhysicalNumberOfCells());
table.setWidthPercentage(100);
addTableHeader(worksheet, table);
addTableData(worksheet, table);
document.add(table);
}
private static void addTableHeader(XSSFSheet worksheet, PdfPTable table) {
Row headerRow = worksheet.getRow(0);
for (int i = 0; i < headerRow.getPhysicalNumberOfCells(); i++) {
Cell cell = headerRow.getCell(i);
String headerText = cell.getCellType() == CellType.STRING ? cell.getStringCellValue() : String.valueOf(cell.getNumericCellValue());
PdfPCell headerCell = new PdfPCell(new Phrase(headerText, new Font(Font.FontFamily.HELVETICA, 12, Font.BOLD)));
headerCell.setHorizontalAlignment(Element.ALIGN_CENTER);
table.addCell(headerCell);
}
}
private static void addTableData(XSSFSheet worksheet, PdfPTable table) {
Iterator<Row> rowIterator = worksheet.iterator();
while (rowIterator.hasNext()) {
Row row = rowIterator.next();
if (row.getRowNum() == 0) {
continue;
}
for (int i = 0; i < row.getPhysicalNumberOfCells(); i++) {
Cell cell = row.getCell(i);
String cellValue;
if (cell != null) {
if (cell.getCellType() == CellType.STRING) {
cellValue = cell.getStringCellValue();
} else if (cell.getCellType() == CellType.NUMERIC) {
cellValue = String.valueOf(cell.getNumericCellValue());
} else {
cellValue = "";
}
} else {
cellValue = "";
}
PdfPCell cellPdf = new PdfPCell(new Phrase(cellValue));
cellPdf.setHorizontalAlignment(Element.ALIGN_CENTER);
table.addCell(cellPdf);
}
}
}
The provided code snippet utilizes several PDF elements from the iText library to add table headers to a PDF document.
PdfPTable: This is the main element used for creating the table. It represents a table with rows and columns. The code snippet creates a new PdfPTable
object and subsequently adds header cells to it.
PdfPCell: This element represents a single cell within the PdfPTable
. Each header cell is created using a new PdfPCell
object, which takes an Phrase
object as its content.
Phrase: This element represents a formatted text string. The code snippet creates a new Phrase
object for each header cell, setting its text and font properties.
Font: This element represents the font used to display the text within the Phrase
.
Step 5: Save the Generated PDF Document
Lastly, we use iText’s PdfWriter
class to write the generated PDF content to an output file:
public static void convertExcelToPDF(String excelFilePath, String pdfFilePath) throws IOException, DocumentException {
XSSFWorkbook workbook = readExcelFile(excelFilePath);
Document document = createPDFDocument(pdfFilePath);
// Assuming you have only one sheet in the workbook
XSSFSheet worksheet = workbook.getSheetAt(0);
createAndAddTable(worksheet, document);
document.close();
workbook.close();
}
Testing
We’re ready to test the program. This code snippet demonstrates how to run the convertExcelToPDF()
method and convert an Excel spreadsheet to a PDF document.
public static void main(String[] args) throws DocumentException, IOException {
String excelFilePath = "/path/to/your/order_list.xlsx";
String pdfFilePath = "/path/to/your/order_list.pdf";
convertExcelToPDF(excelFilePath, pdfFilePath);
}
Once we have updated the file paths and imported the libraries, we can run the code snippet to convert your Excel file to PDF.
Once the code finishes executing, check if the PDF file has been created at the specified pdfFilePath
. Open the PDF to confirm that it contains all the data from our Excel spreadsheet.
Advanced Features
Preserving Excel Formatting and Styles
Preserve Excel cell formatting and apply it to the corresponding PDF cell. This includes text alignment and font styles (bold and italic).
Background Color
The setBackgroundColor()
function retrieves the background colour index from an Excel cell and applies it to the corresponding PDF cell.
private static void setBackgroundColor(Cell cell, PdfPCell cellPdf) {
// Set background color
short bgColorIndex = cell.getCellStyle()
.getFillForegroundColor();
if (bgColorIndex != IndexedColors.AUTOMATIC.getIndex()) {
XSSFColor bgColor = (XSSFColor) cell.getCellStyle()
.getFillForegroundColorColor();
if (bgColor != null) {
byte[] rgb = bgColor.getRGB();
if (rgb != null && rgb.length == 3) {
cellPdf.setBackgroundColor(new BaseColor(rgb[0] & 0xFF, rgb[1] & 0xFF, rgb[2] & 0xFF));
}
}
}
}
Text Alignment
The setCellAlignment()
method examines the Excel cell’s horizontal and vertical alignment settings and translates them to their corresponding iText7 Element alignment constants.
private static void setCellAlignment(Cell cell, PdfPCell cellPdf) {
CellStyle cellStyle = cell.getCellStyle();
HorizontalAlignment horizontalAlignment = cellStyle.getAlignment();
VerticalAlignment verticalAlignment = cellStyle.getVerticalAlignment();
switch (horizontalAlignment) {
case LEFT:
cellPdf.setHorizontalAlignment(Element.ALIGN_LEFT);
break;
case CENTER:
cellPdf.setHorizontalAlignment(Element.ALIGN_CENTER);
break;
case JUSTIFY:
case FILL:
cellPdf.setVerticalAlignment(Element.ALIGN_JUSTIFIED);
break;
case RIGHT:
cellPdf.setHorizontalAlignment(Element.ALIGN_RIGHT);
break;
}
switch (verticalAlignment) {
case TOP:
cellPdf.setVerticalAlignment(Element.ALIGN_TOP);
break;
case CENTER:
cellPdf.setVerticalAlignment(Element.ALIGN_MIDDLE);
break;
case JUSTIFY:
cellPdf.setVerticalAlignment(Element.ALIGN_JUSTIFIED);
break;
case BOTTOM:
cellPdf.setVerticalAlignment(Element.ALIGN_BOTTOM);
break;
}
}
Font Style
The getCellStyle()
method extracts various font properties from the Excel cell and translates them to iText7 Font settings. It handles font color, italics, strikethrough, underline, size, bold, and family name.
private static Font getCellStyle(Cell cell) throws DocumentException, IOException {
Font font = new Font();
CellStyle cellStyle = cell.getCellStyle();
org.apache.poi.ss.usermodel.Font cellFont = cell.getSheet()
.getWorkbook()
.getFontAt(cellStyle.getFontIndexAsInt());
short fontColorIndex = cellFont.getColor();
if (fontColorIndex != IndexedColors.AUTOMATIC.getIndex() && cellFont instanceof XSSFFont) {
XSSFColor fontColor = ((XSSFFont) cellFont).getXSSFColor();
if (fontColor != null) {
byte[] rgb = fontColor.getRGB();
if (rgb != null && rgb.length == 3) {
font.setColor(new BaseColor(rgb[0] & 0xFF, rgb[1] & 0xFF, rgb[2] & 0xFF));
}
}
}
if (cellFont.getItalic()) {
font.setStyle(Font.ITALIC);
}
if (cellFont.getStrikeout()) {
font.setStyle(Font.STRIKETHRU);
}
if (cellFont.getUnderline() == 1) {
font.setStyle(Font.UNDERLINE);
}
short fontSize = cellFont.getFontHeightInPoints();
font.setSize(fontSize);
if (cellFont.getBold()) {
font.setStyle(Font.BOLD);
}
String fontName = cellFont.getFontName();
if (FontFactory.isRegistered(fontName)) {
font.setFamily(fontName); // Use extracted font family if supported by iText
} else {
logger.warn("Unsupported font type: {}", fontName);
// - Use a fallback font (e.g., Helvetica)
font.setFamily("Helvetica");
}
return font;
}
Generating Multiple PDF Pages from a Single Excel File
If the Excel file contains multiple sheets or sections, we can generate multiple PDF pages by processing each sheet or section separately:
public static void convertExcelToPDF(String excelFilePath, String pdfFilePath) throws IOException, DocumentException {
XSSFWorkbook workbook = readExcelFile(excelFilePath);
Document document = createPDFDocument(pdfFilePath);
for (int i = 0; i < workbook.getNumberOfSheets(); i++) {
XSSFSheet worksheet = workbook.getSheetAt(i);
createAndAddTable(worksheet, document);
// Add a new page for each sheet (except the last one)
if (i < workbook.getNumberOfSheets() - 1) {
document.newPage();
}
}
document.close();
workbook.close();
}
Embedding Images and Other Embedded Objects
If the Excel file contains embedded images, extract and embed them into the PDF document:
// Add header with sheet name as title
Paragraph title = new Paragraph(worksheet.getSheetName(), new Font(Font.FontFamily.HELVETICA, 18, Font.BOLD));
title.setSpacingAfter(20f);
title.setAlignment(Element.ALIGN_CENTER);
document.add(title);
// Add logo
Image logo = Image.getInstance("/path/to/your/logo.png");
logo.setAlignment(Element.ALIGN_CENTER);
logo.scaleToFit(100, 100);
document.add(logo);
Conclusion
In conclusion, this article has provided a step-by-step guide on converting Excel files to PDF in Java using Apache POI and iText. By combining the capabilities of Apache POI for Excel handling and iText for PDF generation, you can seamlessly preserve formatting, apply styles, and embed images from Excel to PDF. This approach ensures a smooth transition between these two widely used file formats, facilitating efficient data sharing and distribution.
Download the full source code at GitHub.
Share this content:
Leave a Comment