Replace CfDocument with Flying Saucer Xhtml renderer

Posted By Andrea Campolonghi | Posted in railo , cfml , coldfusion | Posted on Dec 23 2009

Following up on my previous post about Railo's XHTML capabilities, here are my first experiences using Flying Saucer as an enhancement to the cfdocument tag.

Now, cfdocument is not bad, but its rendering capabilities are quite limited, and when I discovered that cfdocument, up to CF9, does not support justified text alignment, I decided to give Flying Saucer a try.

The library takes your XHTML and renders it into a PDF, applying your CSS.

What is good is that CSS 2 and parts of CSS 3, such as paged media, are processed correctly, so, for example, you can create a page header like this:

            div.header {
                display: block;
                position: running(header);
            }
            @page {
                size: 8.5in 11in;
                margin: 18% 0%;
                @top-left {
                    content: element(header);
                }
            }

The element in the body matched by .header is not rendered where you would normally expect; instead, it is rendered in the header of every page.

Of course, the same works for a footer. See more about CSS @page here.
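
To see how the running header and the page rule fit together with the markup, here is a minimal sketch in Java that assembles a complete test page and writes it to a temporary file. The class name SamplePage, the div.footer rule and the @bottom-left margin box are assumptions made for illustration, mirroring the header rules above; they are not taken from my original setup.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class SamplePage {

    // Hypothetical helper: returns a small XHTML page with a running header
    // and, by analogy (an assumption), a running footer.
    static String build() {
        return String.join("\n",
            "<html xmlns=\"http://www.w3.org/1999/xhtml\">",
            "<head>",
            "<title>Sample</title>",
            "<style type=\"text/css\">",
            "div.header { display: block; position: running(header); }",
            "div.footer { display: block; position: running(footer); }",
            "@page {",
            "    size: 8.5in 11in;",
            "    margin: 18% 0%;",
            "    @top-left { content: element(header); }",
            "    @bottom-left { content: element(footer); }",
            "}",
            "</style>",
            "</head>",
            "<body>",
            "<div class=\"header\">Header text</div>",
            "<div class=\"footer\">Footer text</div>",
            "<p>Body text</p>",
            "</body>",
            "</html>");
    }

    public static void main(String[] args) throws IOException {
        // Write the page to a temporary file and print where it ended up.
        Path out = Files.createTempFile("sample", ".xhtml");
        Files.write(out, build().getBytes(StandardCharsets.UTF_8));
        System.out.println(out);
    }
}
```

The resulting file is only an input for the renderer; whether the footer ends up where you expect should be checked against your Flying Saucer version.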

Creating a page break is a breeze:

            div.break{
                page-break-after:always;
            }

Any div with the class break forces a new page to start right after that div.

I also discovered that a PDF needs fonts to be embedded into the file itself.

This is not covered by the CSS spec, but Flying Saucer adds some extra CSS 'helper' properties, so embedding a font is as easy as this:

            @font-face {
                src: url(file:///TradeGothicLTStd.ttf);
                -fs-pdf-font-embed: embed;
            }

So far the possibilities look immense, and I think they are.

I have created a Java class that takes the absolute path of the XHTML you need to convert and the path where you want your PDF to be placed, and does the job for you.

package com.andreacfm.utils;

import com.lowagie.text.DocumentException;
import org.xhtmlrenderer.pdf.ITextRenderer;
import org.xhtmlrenderer.resource.XMLResource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class PDFReader {

    // url: location of the XHTML source; pdf: path of the PDF file to write.
    public void createPDF(String url, String pdf)
        throws IOException, DocumentException {
        OutputStream os = null;

        try {
            os = new FileOutputStream(pdf);

            ITextRenderer renderer = new ITextRenderer();
            Document doc = XMLResource.load(new InputSource(url)).getDocument();

            // The second argument is the base URL used to resolve relative resources.
            renderer.setDocument(doc, url);

            renderer.layout();
            renderer.createPDF(os);

            os.close();
            os = null;
        } finally {
            // Close the stream if rendering failed before the regular close.
            if (os != null) {
                try {
                    os.close();
                } catch (IOException e) {
                    // ignore
                }
            }
        }
    }
}

You can call it like this:

<cfset objPDFReader = createObject('java','com.andreacfm.utils.PDFReader').init()>
<cfset objPDFReader.createPDF(xhtmlPath,pdfPath)>
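
In the class above, the first argument is used both as the system identifier of the InputSource and as the base URL passed to setDocument. Assuming the XHTML lives on the local file system, a hedged way to build that argument is to turn the path into a file: URL first, which also escapes characters such as spaces. The helper name FileUrls.toFileUrl is hypothetical and the sample path is only an illustration.

```java
import java.io.File;

class FileUrls {

    // Hypothetical helper: converts a file system path into a file: URL string.
    // File.toURI() escapes characters that are not legal in a URL, e.g. spaces.
    static String toFileUrl(String path) {
        return new File(path).getAbsoluteFile().toURI().toString();
    }

    public static void main(String[] args) {
        // Illustrative path only; pass your own XHTML location instead.
        System.out.println(toFileUrl("/tmp/my report.xhtml"));
    }
}
```

The returned string could then be passed as xhtmlPath in the call above. This is a sketch under the assumption that your parser prefers a fully resolved URL over a raw path.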

To install the library, download the source.

Copy:
core-renderer.jar
iText-2.0.8.jar
andreacfm.jar (if you also want to load my util class)

into the WEB-INF/lib folder

At this point Flying Saucer is ready to go, but I found a few more details that needed fixing.

 

DO NOT PERFORM THESE STEPS IN A PRODUCTION ENVIRONMENT BEFORE TESTING YOUR APPLICATIONS.


For CF8 users: I was forced to replace the Xalan library that ships with CF8 with the latest Xalan release, due to a bug that prevents correct namespace evaluation.
I downloaded the Xalan source from here. I then copied serializer.jar, xml-apis.jar, xalan.jar, xsltc.jar and xercesImpl.jar into

WEB-INF/cfusion/lib/updates

This way, when you restart the CF server, the new Xalan version is loaded (the updates folder is at the top of the JRun classpath). This worked for me, but I cannot be 100% sure that it will not affect other XML functionality of the CF server.

For Railo users: Flying Saucer works fine with iText 2.0.8. Version 2.1.2, which ships with Railo, breaks it. You will need to downgrade to 2.0.8 to make it work correctly. Because Railo is often deployed on different application servers, this can be done in different ways. Contact me if you want to share your experience on this point.
I personally added a folder to the Tomcat classpath.
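
After swapping jars like this, it can help to confirm which jar the server actually loads a class from. The following is a minimal sketch using only the JDK; the class name JarLocator is hypothetical, and the two example class names (org.apache.xalan.Version and com.lowagie.text.Document) are assumed to be present in the Xalan and iText jars discussed above.

```java
import java.security.CodeSource;

class JarLocator {

    // Hypothetical helper: reports the location a class was loaded from.
    static String locate(String className) {
        try {
            // "false" avoids running the class's static initializers.
            Class<?> type = Class.forName(className, false, JarLocator.class.getClassLoader());
            CodeSource source = type.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                return className + ": no code source (typically a JDK class)";
            }
            return className + ": " + source.getLocation();
        } catch (ClassNotFoundException e) {
            return className + ": not found on the classpath";
        }
    }

    public static void main(String[] args) {
        // Example class names are assumptions; pass your own as arguments.
        String[] names = args.length > 0 ? args
                : new String[] {"org.apache.xalan.Version", "com.lowagie.text.Document"};
        for (String name : names) {
            System.out.println(locate(name));
        }
    }
}
```

To be meaningful, the check would have to run inside the server itself (for example from a jar in WEB-INF/lib, called through createObject), since a standalone JVM sees a different classpath.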

11 responses to “Replace CfDocument with Flying Saucer Xhtml renderer”

Andrea, this is a really interesting post. As I've been so disappointed with text justification (I reported it as a bug 4.5 years ago!) I'm keen to try something else to render PDFs. I have to stick with CF8 for now. I was wondering if your Java class is available for others to use (who don't know how to compile classes) and what needs to be done to get Flying Saucer working with CF8, i.e. which jar files are needed and where do they go? Do any files have to be 'registered'? It's encouraging that you've managed to escape the grip of cfdocument! A nice enhancement would be if your class could take XHTML from a CF variable rather than a file, so the XHTML doc can be generated dynamically without having to commit it to disk.
Andrea, I've been working on this as well, but ran into some issues. A question - why not use JavaLoader to get around the issue on CF8?
@Ray I am not sure that JavaLoader can do the trick. You need to overwrite the Xalan library (and its dependencies) with a new version and, as far as I know, that is possible only by working on the classpath. @Gary I will add some more details to the post to better explain how to do the trick.
interesting! Thanks!
@Andrea, Very interesting stuff. I started playing around with some Flying Saucer stuff after Peter Ferrel raved about its pixel-precision. I used Groovy, specifically CFGroovy, to load in the JAR files. I also used TagSoup to clean the HTML to XHTML (I was planning on using it to convert HTTP-based web pages to PDF). As an "out of the box" attempt, I actually found CFDocument to render the web page nicer than flying saucer. I didn't play around with it too long, but none of the background images seemed to render by default. I am sure this is a setting (just as I think it is a setting on CFDocument as well). Anyway, thanks for sharing. @Ray, When going through Groovy, it uses a separate class loader as you are hinting at.
@Ben In my opinion FS increases the possibilities of cfdocument by at least 70-80%. Just consider that cfdocument does not support justified text... and you already lose the possibility of using it for any kind of "formal" output like contracts etc. I found the rendering not so good when using fonts embedded via the CSS but, as far as I could see, that is not an FS limitation; rather, Adobe maps fonts in a specific manner in PDF docs, so the reader sometimes has trouble rendering an 'embedded' font. Btw, I could do things with FS that are absolutely unimaginable with cfdocument (and sometimes also in HTML). About images, I think the trick is to use absolute paths rather than relative ones. Try that out and let me know.
@Andrea, No doubt Flying Saucer has some awesome stuff in it. I was merely pointing out that you have to get your hands a bit dirty to get it to do some of the things that CFDocument does so cleanly out of the box. I have every intention to keep looking into FS :)
@Ben I agree with you about the images in CSS. It looks like FS only has the option to read an absolute path, and that can be tricky. Keep me posted about your tests. It would be nice to share experiences and to package something more complete to reuse.
Should "into folder WEB-INF/lib" read "WEB-INF/cfusion/lib" for installing the library?
Can you change the process to read an XHTML string instead of a file?
@Mike No, the jar must be loaded from WEB-INF/lib. I think reading a string instead of a file can be done, but I am not planning to do that in the near future.
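
For readers who want to try the string-based variant requested above, here is a minimal sketch, not a finished implementation. It parses an XHTML string into a DOM document using only the JDK; the class name XhtmlStringParser is hypothetical. The sample string deliberately has no DOCTYPE, because a plain JAXP parser may try to download the referenced DTD.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class XhtmlStringParser {

    // Hypothetical helper: parses an XHTML string into a DOM document.
    // Throws SAXException if the markup is not well-formed XML.
    static Document parse(String xhtml)
            throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xhtml)));
    }

    public static void main(String[] args) throws Exception {
        String xhtml =
            "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
          + "<head><title>Demo</title></head>"
          + "<body><p>Hello</p></body>"
          + "</html>";
        Document doc = parse(xhtml);
        System.out.println(doc.getDocumentElement().getLocalName());
        // With Flying Saucer on the classpath, the DOM could then be handed to
        // the renderer as in createPDF: renderer.setDocument(doc, baseUrl);
    }
}
```

The base URL passed to setDocument would still need to point somewhere sensible, since relative images and stylesheets are resolved against it.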
