Many web applications need to give the user access to some of their content as a PDF. In applications such as e-commerce stores, these PDFs have to be generated from dynamic data and be available to the user immediately.
Starting With HTML and CSS
Our web application is probably already generating an HTML document from the details that will go into the PDF. In the case of an invoice, the user might view the details online and then click to download a PDF for their records. You might be creating packing slips; again, the information is already in the system. You want to format it nicely for downloading and printing. A good place to start, therefore, is to consider whether the existing HTML and CSS can be used to create the PDF version.
CSS has a specification that deals with printed output: the Paged Media module. My article "Designing for Print with CSS" gives an overview of this specification, and many book publishers use CSS for all of their print output. So, since CSS itself has a specification for printed materials, surely we should be able to use it?
The simplest way for a user to generate a PDF is from their browser: choosing to print to PDF rather than to a printer produces one. Unfortunately, this PDF is usually not entirely satisfactory. To begin with, it will include the headers and footers that the browser adds automatically when you print a web page. It will also be formatted according to your print stylesheet, if you have one.
The problem we run into here is the poor browser support for the fragmentation specification, which can mean that content breaks between pages in unexpected ways. Support for fragmentation is patchy, as I discovered when I researched my article "Breaking Boxes with CSS Fragmentation." This means you may not be able to prevent suboptimal breaks, such as a heading left as the last element on a page.
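To make the fragmentation properties concrete, here is a minimal sketch of a print stylesheet that asks for sensible breaks, wrapped in a small Python helper of the kind a web application might use when building its HTML. The class names (.totals, .packing-slip) and the helper name are my own assumptions for illustration, and browsers may honor these rules only partially.

```python
from html import escape

# Print rules using CSS Fragmentation properties.
# The selectors .totals and .packing-slip are hypothetical examples.
FRAGMENTATION_CSS = """
@media print {
  /* Keep a heading on the same page as the content that follows it. */
  h1, h2, h3 { break-after: avoid; }

  /* Do not split a table row or the totals box across two pages. */
  tr, .totals { break-inside: avoid; }

  /* Avoid leaving one or two lines of a paragraph stranded. */
  p { orphans: 3; widows: 3; }

  /* Start each packing slip on a new page. */
  .packing-slip { break-before: page; }
}
"""


def with_print_styles(body_html: str, title: str) -> str:
    """Wrap an HTML fragment in a full page that carries the print rules."""
    return (
        "<!doctype html>\n"
        f"<html><head><meta charset='utf-8'><title>{escape(title)}</title>\n"
        f"<style>{FRAGMENTATION_CSS}</style></head>\n"
        f"<body>{body_html}</body></html>"
    )
```

The rules are requests rather than guarantees: a rendering engine that does not implement a property will simply ignore it.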
Furthermore, we have no way to control the content of the page margin boxes, for example to add a header of our choosing to every page, or page numbers that show how many pages a long invoice has. These things are part of the Paged Media specification but, at the time of writing, have not been implemented in any browser.
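For reference, this is roughly what the Paged Media rules for margin boxes look like. The sketch below builds an @page rule with a running header and a "Page X of Y" footer; the function names and the margin values are assumptions for illustration, and the output is only useful to a user agent that implements page margin boxes.

```python
def css_string(text: str) -> str:
    """Quote text as a CSS string literal (newlines are not handled)."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def page_rules(header_text: str, size: str = "A4") -> str:
    """Build an @page rule with a header and a page-number footer."""
    return f"""@page {{
  size: {size};
  margin: 25mm 20mm;
  @top-left {{ content: {css_string(header_text)}; }}
  @bottom-right {{ content: "Page " counter(page) " of " counter(pages); }}
}}"""
```

Calling page_rules("Invoice A-17") gives a rule you could place in the print stylesheet for each generated document.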
Use Browser Rendering Engines
There are ways to print to PDF using browser rendering engines without going through the browser's print menu and ending up with headers and footers, as if the user had printed the document themselves. In response to my tweet, the most popular choices were wkhtmltopdf, and printing using headless Chrome and Puppeteer.
WKHTMLTOPDF
A solution that was mentioned on Twitter many times is a command-line tool named wkhtmltopdf. This tool takes an HTML file or several files, along with a stylesheet, and turns them into a PDF. It does so using the WebKit rendering engine.
Essentially, therefore, this tool does the same thing as printing from the browser, except that you will not get the automatically inserted headers and footers. On the positive side, if you have a working print stylesheet for your content, it should also output nicely to PDF with this tool, so a simple layout may well print very well.
Unfortunately, because you are still printing with a browser rendering engine, you will run into the same problems as when printing directly from the web browser: lack of support for the Paged Media specification and the fragmentation properties. There are some flags you can pass to wkhtmltopdf to add back some of the functionality that the Paged Media specification would give you by default. However, this does mean some extra work on top of writing good HTML and CSS.
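As a rough illustration of those flags, the sketch below builds a wkhtmltopdf command line that sets the page size and margins and adds a header and a page-number footer, then runs it only if the tool is installed. The function names and the specific margin values are my own choices; the flags shown are ones I believe the tool documents, but check them against your installed version.

```python
import shutil
import subprocess
from typing import List


def wkhtmltopdf_command(source: str, output: str, header_text: str) -> List[str]:
    """Build the argument list for a wkhtmltopdf run."""
    return [
        "wkhtmltopdf",
        "--print-media-type",            # apply print CSS instead of screen CSS
        "--page-size", "A4",
        "--margin-top", "25mm",
        "--margin-bottom", "25mm",
        "--header-left", header_text,    # stands in for an @top-left margin box
        "--footer-right", "Page [page] of [topage]",  # tool-specific placeholders
        source,
        output,
    ]


def render_with_wkhtmltopdf(source: str, output: str, header_text: str) -> bool:
    """Run wkhtmltopdf if it is on the PATH; return False when it is missing."""
    if shutil.which("wkhtmltopdf") is None:
        return False
    subprocess.run(wkhtmltopdf_command(source, output, header_text), check=True)
    return True
```

Passing the arguments as a list, rather than as one shell string, avoids quoting problems when the header text comes from user data.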
HEADLESS CHROME
Another interesting possibility is to use headless Chrome with Puppeteer to print to PDF.
Once again, however, you are limited by browser support for Paged Media and fragmentation. There are some options that can be passed to the page.pdf() function. As with wkhtmltopdf, these add back some of the functionality that would be possible from CSS if browser support existed.
It could well be that one of these solutions will do what you need. If you find yourself fighting something of a battle, though, you have probably hit the limits of what is possible with current browser rendering engines, and you will need to look for a better solution.
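To show the kind of options involved, here is a sketch in Python. Note the substitution: it uses Playwright's Python API rather than Puppeteer itself, because Playwright offers a very similar page.pdf() call that drives headless Chromium. The function names, margins, and inline styles are assumptions for illustration, and Playwright is a third-party package that has to be installed separately.

```python
from html import escape


def pdf_options(path: str, title: str) -> dict:
    """Options for page.pdf() that add a header and a page-number footer."""
    header = f'<div style="font-size:9px; padding-left:20mm;">{escape(title)}</div>'
    footer = (
        '<div style="font-size:9px; width:100%; text-align:right; padding-right:20mm;">'
        'Page <span class="pageNumber"></span> of <span class="totalPages"></span>'
        "</div>"
    )
    return {
        "path": path,
        "format": "A4",
        "print_background": True,
        "display_header_footer": True,   # turns the two templates on
        "header_template": header,
        "footer_template": footer,
        "margin": {"top": "25mm", "bottom": "25mm", "left": "20mm", "right": "20mm"},
    }


def html_to_pdf(html: str, path: str, title: str) -> None:
    """Render an HTML string to a PDF file with headless Chromium."""
    # Imported lazily so the module loads even where Playwright is not installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.set_content(html)
        page.pdf(**pdf_options(path, title))
        browser.close()
```

The header and footer here are HTML templates filled in by the browser, not CSS margin boxes, which is exactly the kind of workaround the missing Paged Media support forces on us.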
JAVASCRIPT POLYFILLS FOR PAGED MEDIA
There are a few attempts to reproduce the Paged Media specification in the browser using JavaScript, essentially creating a Paged Media polyfill. This could give you Paged Media support while using Puppeteer. Take a look at paged.js and Vivliostyle.
Using A Print User Agent
If you want to stay with an HTML and CSS solution, you need to look at a User Agent (UA) designed for printing from HTML and CSS, one that has an API for generating the PDF output from your files. These user agents implement the Paged Media specification and have much better support for the CSS Fragmentation properties, which gives you more control over the output. Leading options include:
· Prince
A print UA formats documents using CSS, much as a web browser does. As with browser support for CSS, you need to check the documentation of these UAs to find out what they support. For example, at the time of writing, Prince (the one I am most familiar with) supports Flexbox but not CSS Grid Layout. When you send your pages to the tool you are using, you will usually do so with a separate stylesheet for print. As with a regular print stylesheet, not all of the CSS you use on your website will be appropriate for the PDF version.
Creating a stylesheet for these tools is much like creating a regular print stylesheet: you make choices about what to show or hide, and perhaps use a different font size or different colors. You can then take advantage of the features in the Paged Media specification, such as footnotes and page numbers.
To use these tools from your web application, you need to install them on your server (having obtained a license to do so, of course). The main problem with these tools is that they are expensive. That said, given how easily you can then produce printed documents with them, they may well pay for themselves in developer time saved.
Prince can also be used on a pay-per-document basis via an API, through a service called DocRaptor. This would be a good place for many applications to start: if hosting your own installation later looked more cost-effective, the development cost of switching would be minimal.
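As a sketch of what calling such a tool from a web application might look like, the helper below writes generated HTML to a temporary file, runs the Prince command-line tool on it, and returns the PDF bytes ready to send to the user. It assumes the prince executable is installed and licensed on the server; the function names and file names are my own.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import List, Optional


def prince_command(source: str, output: str, stylesheet: Optional[str] = None) -> List[str]:
    """Build the argument list for a Prince run."""
    cmd = ["prince"]
    if stylesheet:
        cmd += ["-s", stylesheet]        # extra print stylesheet to apply
    cmd += [source, "-o", output]
    return cmd


def html_to_pdf_bytes(html: str, stylesheet: Optional[str] = None) -> bytes:
    """Convert an HTML string to PDF bytes using a locally installed Prince."""
    if shutil.which("prince") is None:
        raise RuntimeError("prince is not installed on this server")
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "document.html"
        output = Path(tmp) / "document.pdf"
        source.write_text(html, encoding="utf-8")
        subprocess.run(prince_command(str(source), str(output), stylesheet), check=True)
        return output.read_bytes()
```

Keeping the command construction separate from the subprocess call makes it easy to swap in a different print UA later without touching the rest of the application.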
WeasyPrint is a free alternative. It is not as full-featured as the tools above, but it may well produce the results you need. Although it does not implement all of Paged Media, it implements more of it than a browser engine does.
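Because WeasyPrint is a Python library, trying it from a web application is straightforward. The sketch below builds a small invoice from dynamic data, includes an @page rule with page numbers, and writes the PDF. The invoice structure and function names are assumptions for illustration, and WeasyPrint has to be installed separately.

```python
from html import escape
from typing import List, Tuple

# Paged Media rules; WeasyPrint is expected to honor the margin box.
PAGE_CSS = """
@page {
  size: A4;
  margin: 20mm;
  @bottom-center { content: counter(page) " / " counter(pages); }
}
tr { break-inside: avoid; }
"""


def invoice_html(number: str, lines: List[Tuple[str, int, float]]) -> str:
    """Build invoice HTML from (item name, quantity, unit price) tuples."""
    rows = "".join(
        f"<tr><td>{escape(name)}</td><td>{qty}</td><td>{qty * price:.2f}</td></tr>"
        for name, qty, price in lines
    )
    return (
        "<!doctype html><html><head><meta charset='utf-8'>"
        f"<style>{PAGE_CSS}</style></head><body>"
        f"<h1>Invoice {escape(number)}</h1>"
        f"<table>{rows}</table></body></html>"
    )


def write_invoice_pdf(number: str, lines: List[Tuple[str, int, float]], path: str) -> None:
    """Render the invoice to a PDF file with WeasyPrint."""
    # Imported lazily so the module loads even where WeasyPrint is not installed.
    from weasyprint import HTML

    HTML(string=invoice_html(number, lines)).write_pdf(path)
```

For example, write_invoice_pdf("A-17", [("Widget", 2, 9.50)], "invoice.pdf") would produce a one-page invoice with "1 / 1" in the bottom margin, assuming the margin box is rendered as expected.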
This article was contributed by Ujjainee, who is currently pursuing a Bachelor of Technology in Computer Science. What she likes most about computer engineering is that it allows her to learn and be creative. She believes in sharing knowledge and so enjoys writing technical content. Coding, analyzing, and blogging are things she can keep on doing!