Have a look at PdfConverter #357

ylussaud · 2019-06-26T10:56:34Z

If it works fine, it would be nice to generate a docx or a pdf according to the file extension of the output document.
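A minimal sketch of the extension-based dispatch suggested above, assuming a hypothetical `OutputFormats` helper and `OutputFormat` enum (neither is existing M2Doc API). It only decides which format to produce; the actual DOCX or PDF writing is left out.

```java
import java.util.Locale;

/** Hypothetical helper: picks the output format from the output file name. */
public final class OutputFormats {

    /** Formats the generator could emit; PDF is the proposed addition. */
    public enum OutputFormat { DOCX, PDF }

    private OutputFormats() {
        // static helper, not meant to be instantiated
    }

    /**
     * Returns PDF when the file name ends with ".pdf" (case-insensitive),
     * and DOCX for any other or missing extension.
     */
    public static OutputFormat fromFileName(String fileName) {
        // Drop any directory part so a dot in a folder name is not mistaken for an extension.
        int lastSeparator = Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\'));
        String name = fileName.substring(lastSeparator + 1);

        int dot = name.lastIndexOf('.');
        String extension = dot < 0 ? "" : name.substring(dot + 1).toLowerCase(Locale.ROOT);

        return "pdf".equals(extension) ? OutputFormat.PDF : OutputFormat.DOCX;
    }
}
```

Falling back to DOCX for unknown extensions keeps the current behaviour as the default, so only an explicit `.pdf` output name would trigger the conversion.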

ylussaud · 2019-10-02T13:26:23Z

It is not part of the POI project and needs new dependencies:

groupId: fr.opensagres.xdocreport
artifactId: fr.opensagres.poi.xwpf.converter.core
version: 2.0.2

ylussaud · 2019-10-29T10:10:12Z

The converter uses iText, which is LGPL; that may be another problem.

ejuliot · 2020-12-14T09:04:56Z

POI already has built-in support for DOCX to PDF conversion. Look at https://stackoverflow.com/questions/43363624/converting-docx-into-pdf-in-java (org.apache.poi.xwpf.converter.pdf.PdfConverter)

ylussaud · 2020-12-22T09:20:23Z

As stated above, PdfConverter is not part of Apache POI but of fr.opensagres.poi.xwpf.converter.core, which supports Apache POI 4.0.1. M2Doc is using Apache POI 4.1.0 and will move to later versions.

ylussaud · 2024-09-05T09:49:27Z

The LGPL licence is not an issue: there is LGPL code in the Orbit update site. At the moment both M2Doc and fr.opensagres.poi.xwpf.converter.pdf 2.0.0 depend on POI 5.2.3, so I was able to test the PDF conversion.

There are the following issues:

when a table is present it sometimes throws an NPE:

fr.opensagres.poi.xwpf.converter.core.XWPFConverterException: java.lang.NullPointerException: Cannot invoke "org.openxmlformats.schemas.wordprocessingml.x2006.main.CTTblGrid.getGridColList()" because "grid" is null
	at fr.opensagres.poi.xwpf.converter.pdf.PdfConverter.doConvert(PdfConverter.java:71)
	at fr.opensagres.poi.xwpf.converter.pdf.PdfConverter.doConvert(PdfConverter.java:39)
	at fr.opensagres.poi.xwpf.converter.core.AbstractXWPFConverter.convert(AbstractXWPFConverter.java:42)

To solve this issue we need to add a CTTblGrid to the created XWPFTable. This implies knowing the width of each column. A width has been added to the MCell (see #472), but I'm not sure we will be able to compute a width when importing from HTML. I'm opening issue #525 for this.
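As a rough illustration of the width problem, here is a sketch of one possible fallback when no column width is known (for instance after an HTML import): split a given table width evenly across the columns. The `TableGridWidths` class, the even-split policy, and the choice of twips as the unit are assumptions for illustration, not M2Doc code. The resulting values would then have to be written into the table grid, presumably through the ooxml-schemas calls `getCTTbl().addNewTblGrid()` and `addNewGridCol().setW(...)`; that part is not shown and has not been tested here.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: computes fallback grid column widths for a table. */
public final class TableGridWidths {

    private TableGridWidths() {
        // static helper, not meant to be instantiated
    }

    /**
     * Splits tableWidthTwips evenly across columnCount columns.
     * Any remainder is spread one twip at a time over the first columns,
     * so the widths always sum to the requested table width.
     */
    public static List<BigInteger> evenGrid(int columnCount, int tableWidthTwips) {
        if (columnCount <= 0 || tableWidthTwips < 0) {
            throw new IllegalArgumentException("columnCount must be > 0 and width >= 0");
        }
        int base = tableWidthTwips / columnCount;
        int remainder = tableWidthTwips % columnCount;

        List<BigInteger> widths = new ArrayList<>(columnCount);
        for (int i = 0; i < columnCount; i++) {
            widths.add(BigInteger.valueOf(base + (i < remainder ? 1 : 0)));
        }
        return widths;
    }
}
```

An even split ignores the cell contents, so it would only avoid the null grid; columns with a known MCell width would still need to take precedence.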

with the asImage test:

java.lang.StackOverflowError
	at java.base/java.lang.StringBuffer.<init>(StringBuffer.java:133)
	at com.lowagie.text.pdf.BidiLine.createArrayOfPdfChunks(Unknown Source)
	at com.lowagie.text.pdf.BidiLine.createArrayOfPdfChunks(Unknown Source)
	at com.lowagie.text.pdf.BidiLine.processLine(Unknown Source)
	at com.lowagie.text.pdf.ColumnText.go(Unknown Source)
	at com.lowagie.text.pdf.ColumnText.goComposite(Unknown Source)
	at com.lowagie.text.pdf.ColumnText.go(Unknown Source)
	at com.lowagie.text.pdf.ColumnText.go(Unknown Source)
	at com.lowagie.text.pdf.PdfPRow.writeCells(Unknown Source)
	at com.lowagie.text.pdf.PdfPTable.writeSelectedRows(Unknown Source)
	at com.lowagie.text.pdf.PdfPTable.writeSelectedRows(Unknown Source)
	at com.lowagie.text.pdf.PdfPTable.writeSelectedRows(Unknown Source)
	at com.lowagie.text.pdf.ColumnText.goComposite(Unknown Source)
	at com.lowagie.text.pdf.ColumnText.go(Unknown Source)
	at com.lowagie.text.pdf.ColumnText.go(Unknown Source)
	at com.lowagie.text.pdf.PdfDocument.addPTable(Unknown Source)
	at com.lowagie.text.pdf.PdfDocument.add(Unknown Source)
	at com.lowagie.text.Document.add(Unknown Source)
	at fr.opensagres.xdocreport.itext.extension.ExtendedDocument.add(ExtendedDocument.java:114)
	at fr.opensagres.poi.xwpf.converter.pdf.internal.elements.StylableDocument.flushTable(StylableDocument.java:374)
	at fr.opensagres.poi.xwpf.converter.pdf.internal.elements.StylableDocument.pageBreak(StylableDocument.java:141)
	at fr.opensagres.poi.xwpf.converter.pdf.internal.elements.StylableDocument.columnBreak(StylableDocument.java:120)
	at fr.opensagres.poi.xwpf.converter.pdf.internal.elements.StylableDocument.simulateText(StylableDocument.java:230)
	at fr.opensagres.poi.xwpf.converter.pdf.internal.elements.StylableDocument.pageBreak(StylableDocument.java:160)
	at fr.opensagres.poi.xwpf.converter.pdf.internal.elements.StylableDocument.columnBreak(StylableDocument.java:120)

some differences between the word document and the pdf document:
- some bullets from bullet list are missing (HTML ul test)
- ...

Overall the output PDF is pretty close to the Word document if it doesn't use MTable.

ylussaud added feature POI labels Jun 26, 2019

ylussaud added this to the 3.1.0 milestone Dec 13, 2019

ylussaud modified the milestones: 3.1.0, 3.1.1 Jun 29, 2020

ylussaud mentioned this issue Dec 14, 2020

Add PDF support #403

Closed

ylussaud modified the milestones: 3.1.1, 3.1.2 Jan 6, 2021

ylussaud modified the milestones: 3.2.0, 4.0.0 Apr 16, 2021

ylussaud modified the milestones: 3.2.2, 3.2.3 Sep 22, 2022

ylussaud modified the milestones: 3.3.0, 3.3.1 May 2, 2023

ylussaud modified the milestones: 3.3.1, 3.3.2 Sep 19, 2023

ylussaud modified the milestones: 3.3.2, 3.3.3 Dec 4, 2023

ylussaud mentioned this issue Sep 5, 2024

When creating a MTable from HTML we should compute each column width. #525

Open

ylussaud removed this from the 3.3.3 milestone Sep 11, 2024
