Friday, November 22, 2013

Export Journal Content as PDF in Liferay

Nagul Meera Mahankali

Objective:

Export Journal Content (Web Content) as PDF.

Liferay already supports exporting journal content as a PDF document, but that feature requires OpenOffice to be configured.

Here we do it without any OpenOffice configuration.

With the help of JTidy and Flying Saucer, we export journal content as PDF with zero configuration.

Download the Export Journal Content portlet from the following location.

Both the source and the war file are available there:

https://sourceforge.net/projects/meeralferay/files/LiferayJournalContentExporterPortlet/

Note:

The portlet was developed on Liferay 6.1 GA2 EE.

To deploy it on the CE version, you only need to change liferay-plugin-package.properties. The entries that differ are module-group-id, licenses, and liferay-versions, as shown below.

Liferay 6.1 EE version

name=ExportJournalContentAsPDF
module-group-id=liferay-ee
module-incremental-version=1
tags=
short-description=
change-log=
page-url=http://www.liferay.com
author=Liferay, Inc.
licenses=EE
portal-dependency-jars=\
    jstl-api.jar,\
    jstl-impl.jar
portal-dependency-tlds=c.tld
liferay-versions=6.1.20

Liferay 6.1 CE version

name=ExportJournalContentAsPDF
module-group-id=liferay
module-incremental-version=1
tags=
short-description=
change-log=
page-url=http://www.liferay.com
author=Liferay, Inc.
licenses=LGPL
portal-dependency-jars=\
    jstl-api.jar,\
    jstl-impl.jar
portal-dependency-tlds=c.tld
liferay-versions=6.1.1
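To double-check which entries differ between the two files, a small sketch like the following can load both variants with java.util.Properties and list the keys whose values differ. The class PluginPropertiesDiff and its methods are made up for this illustration and are not part of the portlet; the property keys and values are taken from the two files above, shortened to a few entries.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of the portlet.
class PluginPropertiesDiff {

    // Shortened EE variant of liferay-plugin-package.properties.
    static final String EE =
        "name= ExportJournalContentAsPDF\n"
        + "module-group-id=liferay-ee\n"
        + "licenses=EE\n"
        + "portal-dependency-jars=\\\n"
        + "    jstl-api.jar,\\\n"
        + "    jstl-impl.jar\n"
        + "liferay-versions=6.1.20\n";

    // Shortened CE variant of the same file.
    static final String CE =
        "name = ExportJournalContentAsPDF\n"
        + "module-group-id=liferay\n"
        + "licenses=LGPL\n"
        + "portal-dependency-jars=\\\n"
        + "    jstl-api.jar,\\\n"
        + "    jstl-impl.jar\n"
        + "liferay-versions=6.1.1\n";

    // Parses properties text; whitespace around '=' is ignored and a
    // trailing backslash continues the value on the next line.
    static Properties parse(String text) {
        try {
            Properties props = new Properties();
            props.load(new StringReader(text));
            return props;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the keys whose values differ between the two sets, sorted.
    static Set<String> changedKeys(Properties a, Properties b) {
        Set<String> keys = new TreeSet<String>();
        keys.addAll(a.stringPropertyNames());
        keys.addAll(b.stringPropertyNames());
        Set<String> changed = new TreeSet<String>();
        for (String key : keys) {
            String left = a.getProperty(key);
            String right = b.getProperty(key);
            if (left == null ? right != null : !left.equals(right)) {
                changed.add(key);
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        // prints [licenses, liferay-versions, module-group-id]
        System.out.println(changedKeys(parse(EE), parse(CE)));
    }
}
```

The name and portal-dependency-jars entries do not show up in the result: both files give them the same value once whitespace and line continuations are resolved.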

Procedure for deploy portlet:

You can drop the war file directly into your portal deploy folder and test it, or you can deploy the portlet from source.

Once the portlet has been deployed successfully, you can find it in the Sample category under the name

Export Jorinal Content.

JTidy:

JTidy is a Java library for cleaning up malformed and faulty HTML. It also provides a DOM interface to the document being processed, so you can use JTidy as a DOM parser for real-world HTML.

See the following link for more information about JTidy:

http://jtidy.sourceforge.net/

Flying Saucer:

Flying Saucer is a Java library that generates PDF from HTML and XML. It makes even complex PDFs easy to produce: design the content in HTML, and the same content can be rendered as PDF.

See the following link for more information about Flying Saucer:

https://code.google.com/p/flying-saucer/

Why are we using JTidy with Flying Saucer?

The HTML passed to Flying Saucer must be well formed. If it contains syntax errors or other malformed markup, it will not be exported as PDF.

In practice the HTML we render is usually generated dynamically, so we run it through JTidy first to correct syntax errors and clean up malformed markup.
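To make the "well formed" requirement concrete, the sketch below feeds two snippets to the XML parser that ships with the JDK. This is only a stand-in that shows what a strict XML parser accepts and rejects; it does not call Flying Saucer or JTidy, and the class name WellFormedCheck is invented for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;

import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only.
class WellFormedCheck {

    // Returns true only if the markup parses as well-formed XML.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Replace the default error handler, which prints to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(
                markup.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            // The parser reports malformed markup as a SAXException.
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Typical browser-tolerated HTML: <br> and <p> are never closed.
        String typical =
            "<html><body><p>Line one<br>Line two</body></html>";
        // The same content with every element closed.
        String cleaned =
            "<html><body><p>Line one<br/>Line two</p></body></html>";

        System.out.println(isWellFormed(typical)); // prints false
        System.out.println(isWellFormed(cleaned)); // prints true
    }
}
```

Markup like the first string is common in content typed into a rich-text editor, which is why a cleanup pass such as JTidy is useful before rendering.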

Steps to generate PDF

Get the HTML content from any source.
Convert the HTML data to an InputStream.
Apply JTidy to clean the HTML and produce a well-formed W3C document.
Pass the W3C document to Flying Saucer to generate the PDF.

Get the HTML content from any source:

First we need the HTML data, which is simply a string of HTML tags. We can prepare it manually as a string or fetch it from any URL.

Example:

Fetch the HTML from a URL:

String html = HttpUtil.URLtoString("http://www.liferaysavvy.com/2013/11/liferay-document-conversion-portlet.html");

Or prepare it manually:

String html = "<!DOCTYPE HTML><html><head><title>First parse</title></head>"
    + "<body>This is test HTML.</body></html>";

Convert the HTML data to an InputStream:

Once we have the HTML content as a String, we convert it to an input stream.

String html = "<!DOCTYPE HTML><html><head><title>First parse</title></head>"
    + "<body>This is test HTML.</body></html>";
InputStream is = new ByteArrayInputStream(html.getBytes());
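Calling getBytes() with no argument encodes the string with the platform default charset, which can differ between servers. A minimal sketch of the same conversion with an explicit charset follows; the class HtmlStreamSketch and its method names are invented for the illustration, and UTF-8 is an assumption about what the rest of the pipeline expects.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
class HtmlStreamSketch {

    // Encodes the HTML as UTF-8 regardless of the platform default.
    static InputStream toUtf8Stream(String html) {
        return new ByteArrayInputStream(
            html.getBytes(StandardCharsets.UTF_8));
    }

    // Reads the whole stream back and decodes it as UTF-8.
    static String readUtf8(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // \u00e9 is a non-ASCII character (e with acute accent).
        String html = "<html><body>Caf\u00e9</body></html>";
        String roundTrip = readUtf8(toUtf8Stream(html));
        System.out.println(html.equals(roundTrip)); // prints true
    }
}
```

Whatever reads the stream has to decode it with the same charset. Recent JTidy releases have a setInputEncoding method for this, but check the JTidy version bundled with your portlet before relying on it.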

Apply JTidy to clean the HTML and make it a well-formed W3C document:

String html = "<!DOCTYPE HTML><html><head><title>First parse</title></head>"
    + "<body>This is test HTML.</body></html>";
InputStream is = new ByteArrayInputStream(html.getBytes());
Tidy tidy = new Tidy();
org.w3c.dom.Document doc = tidy.parseDOM(is, null);

Pass the W3C document to Flying Saucer to generate the PDF:

OutputStream outputStream = resourceResponse.getPortletOutputStream();
ITextRenderer renderer = new ITextRenderer();
renderer.setDocument(doc, null);
renderer.layout();
renderer.createPDF(outputStream);

Note:

This process can be applied anywhere you need to generate PDF from HTML using Flying Saucer.

The following is a complete example that exports journal content as PDF:

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import javax.portlet.PortletException;
import javax.portlet.ResourceRequest;
import javax.portlet.ResourceResponse;

import com.liferay.portal.kernel.language.LanguageUtil;
import com.liferay.portal.kernel.util.ParamUtil;
import com.liferay.portal.kernel.util.StringPool;
import com.liferay.portal.kernel.util.WebKeys;
import com.liferay.portal.theme.ThemeDisplay;
import com.liferay.portlet.journal.model.JournalArticleDisplay;
import com.liferay.portlet.journalcontent.util.JournalContentUtil;
import com.liferay.util.bridges.mvc.MVCPortlet;
import com.lowagie.text.DocumentException;

import org.w3c.dom.Document;
import org.w3c.tidy.Tidy;
import org.xhtmlrenderer.pdf.ITextRenderer;

public class ExportJorinalContentAction extends MVCPortlet {

    @Override
    public void serveResource(
            ResourceRequest resourceRequest, ResourceResponse resourceResponse)
        throws IOException, PortletException {

        // The selected article and site come from the portlet's form fields
        String articleId = ParamUtil.getString(resourceRequest, "webContentSelectBox");
        ThemeDisplay themeDisplay = (ThemeDisplay) resourceRequest.getAttribute(WebKeys.THEME_DISPLAY);
        long groupId = ParamUtil.getLong(resourceRequest, "sitesSelectBox");

        try {
            // Get the journal article
            JournalArticleDisplay articleDisplay = JournalContentUtil.getDisplay(
                groupId, articleId, "", LanguageUtil.getLanguageId(resourceRequest), themeDisplay);

            // Set up the response to deliver a PDF download
            resourceResponse.reset();
            resourceResponse.setContentType("application/pdf");
            resourceResponse.setProperty("Content-disposition", "attachment; filename=\""
                + articleDisplay.getTitle().concat(StringPool.PERIOD).concat("pdf") + "\"");
            OutputStream outputStream = resourceResponse.getPortletOutputStream();

            // Wrap the article content in a complete HTML document
            String articleHtml = "<!DOCTYPE HTML><html><body>" + articleDisplay.getContent() + "</body></html>";

            // Prepend the portal URL to relative Document Library URLs
            articleHtml = articleHtml.replaceAll("src=\"/documents", "src=\"" + themeDisplay.getPortalURL() + "/documents");

            // Create an input stream for JTidy to parse
            Tidy tidy = new Tidy();
            InputStream is = new ByteArrayInputStream(articleHtml.getBytes());

            // Create a well-formed XML document from the cleaned HTML
            Document doc = tidy.parseDOM(is, null);

            // Render the PDF
            ITextRenderer renderer = new ITextRenderer();
            renderer.setDocument(doc, null);
            renderer.layout();
            renderer.createPDF(outputStream);
        } catch (DocumentException e) {
            // PDF rendering failed; report it to the portlet container
            throw new PortletException(e);
        }
    }
}
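Two string operations in the complete example above are worth isolating: prefixing relative /documents URLs with the portal URL, and building the download file name from the article title. The sketch below shows one way to write them as small helpers. The class ExportHelpersSketch and both method names are invented for this illustration, the fallback name "article" and the set of replaced characters are assumptions, and the sample URL and title are made-up inputs.

```java
// Hypothetical helpers for illustration only; not part of the portlet.
class ExportHelpersSketch {

    // Prefixes root-relative Document Library image URLs with the portal
    // URL. String.replace treats both arguments literally, so characters
    // in the portal URL are never interpreted as regex syntax.
    static String absolutizeDocumentUrls(String html, String portalUrl) {
        return html.replace(
            "src=\"/documents", "src=\"" + portalUrl + "/documents");
    }

    // Builds a file name that is safe inside a quoted Content-disposition
    // value: quotes, path separators, other characters that are awkward
    // in file names, and control characters become underscores.
    static String safePdfFileName(String title) {
        String base = (title == null) ? "" : title.trim();
        base = base.replaceAll("[\\\\/:*?\"<>|\\p{Cntrl}]", "_");
        if (base.isEmpty()) {
            base = "article"; // assumed fallback for untitled articles
        }
        return base + ".pdf";
    }

    public static void main(String[] args) {
        String html = "<img src=\"/documents/10180/0/logo.png\"/>";
        // prints <img src="http://localhost:8080/documents/10180/0/logo.png"/>
        System.out.println(
            absolutizeDocumentUrls(html, "http://localhost:8080"));

        // prints My _Report__ Q1_Q2.pdf
        System.out.println(safePdfFileName("My \"Report\": Q1/Q2"));
    }
}
```

In serveResource, helpers like these would take the place of the inline replaceAll call and the inline file-name concatenation. Without some form of escaping, a title that contains a double quote would cut the filename value short in the response header.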

Important points

With the help of JTidy and Flying Saucer, we can generate PDF from HTML.
We can also apply CSS to the HTML that Flying Saucer renders.
Without configuring OpenOffice in Liferay, we can export journal content as PDF with the help of Flying Saucer.
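As a sketch of the CSS point, the wrapper document that surrounds the article content can carry a style block in its head. The helper below only builds the HTML string; the class StyledHtmlSketch, the method name, and the sample rules are assumptions for the illustration, and which CSS properties are honored depends on the Flying Saucer version in use.

```java
// Hypothetical helper for illustration only; not part of the portlet.
class StyledHtmlSketch {

    // Wraps body markup in a complete document with an embedded stylesheet.
    static String wrapWithStyles(String bodyHtml, String css) {
        return "<!DOCTYPE HTML><html><head><style type=\"text/css\">"
            + css
            + "</style></head><body>"
            + bodyHtml
            + "</body></html>";
    }

    public static void main(String[] args) {
        // An @page rule sets the paper size and margins of the PDF pages;
        // the body rule sets the default font.
        String css = "@page { size: A4; margin: 2cm; } "
            + "body { font-family: sans-serif; font-size: 11pt; }";

        System.out.println(wrapWithStyles("<p>Hello PDF</p>", css));
    }
}
```

The resulting string can then go through the same steps as before: convert it to an input stream, clean it with JTidy, and pass the document to Flying Saucer.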

I have also written an article about Flying Saucer:

http://www.liferaysavvy.com/2013/09/liferay-pdf-generation-from-html-using.html

Note:

The example in that article does not use JTidy to clean the HTML, but it is better to clean the HTML with JTidy before passing it to Flying Saucer so that the PDF is generated without problems.

Screens:

Journal Content Export Portlet

Example Journal Content for Export

Example PDF after Export

Reference Links:

http://jtidy.sourceforge.net/

https://code.google.com/p/flying-saucer/

Author

Meera Prince
