Friday, November 22, 2013

Export Journal Content as PDF in Liferay

 

Objective:

Export Journal Content (Web Content) as PDF.

Liferay already supports exporting journal content as a PDF document, but that feature requires OpenOffice to be configured.

Here we do it without any OpenOffice configuration.

With the help of JTidy and Flying Saucer, we will export journal content as PDF with zero configuration.

Download the Export Jorinal Content portlet from the following location.

You can find the source and the war file there.


Note: 

The portlet was developed on Liferay 6.1 GA2 EE.
To deploy it on the CE version, change module-group-id, licenses and liferay-versions in liferay-plugin-package.properties as shown below.

Liferay 6.1 EE version

name=ExportJournalContentAsPDF
module-group-id=liferay-ee
module-incremental-version=1
tags=
short-description=
change-log=
page-url=http://www.liferay.com
author=Liferay, Inc.
licenses=EE
portal-dependency-jars=\
    jstl-api.jar,\
    jstl-impl.jar
portal-dependency-tlds=c.tld
liferay-versions=6.1.20


Liferay 6.1 CE version

name=ExportJournalContentAsPDF
module-group-id=liferay
module-incremental-version=1
tags=
short-description=
change-log=
page-url=http://www.liferay.com
author=Liferay, Inc.
licenses=LGPL
portal-dependency-jars=\
    jstl-api.jar,\
    jstl-impl.jar
portal-dependency-tlds=c.tld
liferay-versions=6.1.1

Procedure to deploy the portlet:

You can place the war file directly in your portal's deploy folder and test it, or you can build and deploy the portlet from the source.

Once the portlet is deployed successfully, you can find it in the Sample category under the name
Export Jorinal Content.

JTidy:

JTidy is a Java library for cleaning up malformed and faulty HTML. It also provides a DOM interface to the document being processed, so you can use JTidy as a DOM parser for real-world HTML.

See the following link for more information about JTidy.


Flying Saucer:

Flying Saucer is a Java library that generates PDF from HTML and XML. It makes PDF generation easy, even for complex documents: whatever we can design in HTML, we can generate as PDF.

See the following link for more information about Flying Saucer.


Why are we using JTidy with Flying Saucer?

The HTML we pass to Flying Saucer must be well formed. If it contains syntax errors or other malformed markup, it will not be exported as PDF.

When we generate PDF with Flying Saucer, the HTML usually comes from a dynamic source, so we use JTidy to clean it, that is, to correct syntax errors and repair malformed markup.
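
To make "well formed" concrete, here is a minimal sketch that checks a string with the XML parser that ships with the JDK. It is only a stand-in for illustration: Flying Saucer's own parsing may differ in detail, and the class name WellFormedCheck and the method isWellFormed are hypothetical, not part of the portlet source.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only; not part of the portlet.
final class WellFormedCheck {

    // Returns true when the markup parses as well-formed XML.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(
                markup.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            // Unclosed tag, bad nesting, stray '<' and similar problems.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A typical piece of editor-generated HTML such as a paragraph containing an unclosed <br> fails this check, which is the kind of input JTidy is meant to repair before rendering.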

Steps to generate PDF

  1. Get the HTML content from any source.
  2. Convert the HTML data to an InputStream.
  3. Apply JTidy to clean the HTML data and turn it into a well-formed W3C document.
  4. Pass the W3C document to Flying Saucer to generate the PDF.


Get HTML content from any source:

First we need the HTML data, which is simply a string of HTML tags. It can come from any source, for example a URL.

Example:

We can prepare the HTML as a string by hand, or fetch it from a URL.






String html = "<!DOCTYPE HTML><html><head><title>First parse</title></head>"
  + "<body><p>This is test HTML.</p></body></html>";
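
For the URL case, a minimal sketch could read the whole response into a String as below. The class name HtmlSource and the method readHtml are hypothetical, and the charset has to be chosen by the caller because it depends on what the server sends.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

// Hypothetical helper for illustration only; not part of the portlet.
final class HtmlSource {

    // Reads the stream to the end, closes it, and decodes the bytes.
    static String readHtml(InputStream in, Charset charset) {
        try (InputStream source = in) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = source.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

To fetch a page, pass the stream returned by java.net.URL.openStream() for the address you want to export, together with the charset that page uses.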


Convert HTML data to an InputStream:

Once we have the HTML content as a String, we convert it to an input stream.


String html = "<!DOCTYPE HTML><html><head><title>First parse</title></head>"
  + "<body><p>This is test HTML.</p></body></html>";
InputStream is = new ByteArrayInputStream(html.getBytes());
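
One detail worth knowing: getBytes() without an argument uses the platform default charset, so the same HTML can turn into different bytes on different servers. The sketch below names the charset explicitly. The class HtmlStreams and the method toUtf8Stream are hypothetical, and whatever reads the stream afterwards has to expect the same encoding.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of the portlet.
final class HtmlStreams {

    // Encodes the HTML as UTF-8 regardless of the server's default charset.
    static ByteArrayInputStream toUtf8Stream(String html) {
        return new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
    }
}
```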



Apply JTidy to clean the HTML data and make it a well-formed W3C document:


String html = "<!DOCTYPE HTML><html><head><title>First parse</title></head>"
  + "<body><p>This is test HTML.</p></body></html>";
InputStream is = new ByteArrayInputStream(html.getBytes());
Tidy tidy = new Tidy();
org.w3c.dom.Document doc = tidy.parseDOM(is, null);



Pass the W3C document to Flying Saucer to generate the PDF:



OutputStream outputStream = resourceResponse.getPortletOutputStream();
ITextRenderer renderer = new ITextRenderer();
renderer.setDocument(doc, null);
renderer.layout();
renderer.createPDF(outputStream);



Note:

This process can be applied anywhere you need to generate a PDF from HTML using Flying Saucer.

The following is a complete example that exports journal content as PDF.


import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import javax.portlet.PortletException;
import javax.portlet.ResourceRequest;
import javax.portlet.ResourceResponse;

import com.liferay.portal.kernel.language.LanguageUtil;
import com.liferay.portal.kernel.util.ParamUtil;
import com.liferay.portal.kernel.util.StringPool;
import com.liferay.portal.kernel.util.WebKeys;
import com.liferay.portal.theme.ThemeDisplay;
import com.liferay.portlet.journal.model.JournalArticleDisplay;
import com.liferay.portlet.journalcontent.util.JournalContentUtil;
import com.liferay.util.bridges.mvc.MVCPortlet;
import com.lowagie.text.DocumentException;

import org.w3c.dom.Document;
import org.w3c.tidy.Tidy;
import org.xhtmlrenderer.pdf.ITextRenderer;

public class ExportJorinalContentAction extends MVCPortlet {

    @Override
    public void serveResource(
            ResourceRequest resourceRequest, ResourceResponse resourceResponse)
        throws IOException, PortletException {

        // Read the selected article and site from the request parameters
        String articleId = ParamUtil.getString(resourceRequest, "webContentSelectBox");
        long groupId = ParamUtil.getLong(resourceRequest, "sitesSelectBox");
        ThemeDisplay themeDisplay =
            (ThemeDisplay) resourceRequest.getAttribute(WebKeys.THEME_DISPLAY);

        try {
            // Get the rendered journal article
            JournalArticleDisplay articleDisplay = JournalContentUtil.getDisplay(
                groupId, articleId, "",
                LanguageUtil.getLanguageId(resourceRequest), themeDisplay);

            // Set up the response to deliver a PDF download
            resourceResponse.reset();
            resourceResponse.setContentType("application/pdf");
            resourceResponse.setProperty(
                "Content-disposition",
                "attachment; filename=\""
                    + articleDisplay.getTitle().concat(StringPool.PERIOD).concat("pdf")
                    + "\"");
            OutputStream outputStream = resourceResponse.getPortletOutputStream();

            // Wrap the article content in a minimal HTML page
            String articleHtml = "<!DOCTYPE HTML><html><body>"
                + articleDisplay.getContent() + "</body></html>";

            // Prepend the portal URL to relative document library URLs
            articleHtml = articleHtml.replaceAll(
                "src=\"/documents",
                "src=\"" + themeDisplay.getPortalURL() + "/documents");

            // Clean the HTML with JTidy and build a W3C document
            Tidy tidy = new Tidy();
            InputStream is = new ByteArrayInputStream(articleHtml.getBytes());
            Document doc = tidy.parseDOM(is, null);

            // Render the PDF with Flying Saucer
            ITextRenderer renderer = new ITextRenderer();
            renderer.setDocument(doc, null);
            renderer.layout();
            renderer.createPDF(outputStream);
        } catch (DocumentException e) {
            // Report PDF rendering failures to the portlet container
            throw new PortletException(e);
        }
    }
}
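
Two steps in the listing above are plain string handling and can be tried outside the portal: prefixing document library URLs with the portal URL, and building the download file name from the article title. The sketch below is one possible way to do both with only the JDK. The class ExportHelpers, its method names and the fallback name "article" are assumptions made for illustration, not part of the portlet source.

```java
// Hypothetical helpers for illustration only; not part of the portlet.
final class ExportHelpers {

    // Prefixes root-relative document library image URLs with the portal URL.
    static String absolutizeDocumentUrls(String html, String portalUrl) {
        // replace() treats both arguments literally, whereas replaceAll()
        // would read '$' or '\' in the portal URL as replacement syntax.
        return html.replace(
            "src=\"/documents", "src=\"" + portalUrl + "/documents");
    }

    // Builds a download file name from the title, replacing characters
    // that are awkward in a quoted filename or on common file systems.
    static String toPdfFileName(String title) {
        String safe = title.replaceAll("[\\\\/:*?\"<>|\\r\\n]+", "_").trim();
        return (safe.isEmpty() ? "article" : safe) + ".pdf";
    }
}
```

The file name helper matters mainly when an article title contains a double quote, since the title is placed inside a quoted value in the Content-disposition header.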


Important points

  • With the help of JTidy and Flying Saucer we can generate PDF from HTML.
  • We can also apply CSS to the HTML in Flying Saucer.
  • Without configuring OpenOffice in Liferay, we can export Journal Content as PDF with the help of Flying Saucer.
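
As a small illustration of the CSS point, the sketch below wraps article content in an HTML page with an embedded style block, which can then go through the same JTidy and Flying Saucer steps. The class StyledHtml, the method wrap and the sample rules are assumptions for illustration. Flying Saucer targets CSS 2.1 with paged-media additions such as @page, so not every browser rule will carry over.

```java
// Hypothetical helper for illustration only; not part of the portlet.
final class StyledHtml {

    // Wraps body markup in a page whose head carries the given CSS rules.
    static String wrap(String bodyHtml, String css) {
        return "<!DOCTYPE HTML><html><head><style type=\"text/css\">"
            + css
            + "</style></head><body>" + bodyHtml + "</body></html>";
    }

    // Example rules: A4 pages with a 2cm margin and a readable body font.
    static final String SAMPLE_CSS =
        "@page { size: A4; margin: 2cm; } "
        + "body { font-family: sans-serif; font-size: 11pt; }";
}
```

For example, StyledHtml.wrap(articleDisplay.getContent(), StyledHtml.SAMPLE_CSS) could replace the hand-built page string in the complete example.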


I have also written an article about Flying Saucer.


Note:

In the example in that article I did not use JTidy to clean the HTML, but it is better to clean the HTML with JTidy before passing it to Flying Saucer so that the PDF is generated without problems.

Screens:

Journal Content Export Portlet




Example Journal Content for Export




Example PDF after Export



