WJBlog


Why I Published “Easy PHP Websites with the Zend Framework” Using Docbook and Open Source Tools

Published On: March 17, 2011 @ 10:45:59am


After several months of rather hard work, on March 11 I published the long-awaited update to Easy PHP Websites with the Zend Framework. “Easy PHP Websites with the Zend Framework” is the first book of three I’ve self-published in the past two years, after having written four books for Apress (with whom I happen to still have a great relationship thanks to the longevity of Beginning PHP and MySQL, now in its fourth edition).

This was a particularly stressful project, not only because I had pretty ambitious aspirations for the content of the update (including PHPUnit tests at the conclusion of almost every chapter, a new chapter covering Capistrano-based deployment, and a huge companion project complete with all source code), but also because I was determined to write and produce the book using open source tools.

If you’ve ever worked with a traditional book publisher, it’s practically a certainty you’ll be asked to write the book using Microsoft Word (OpenOffice is also commonly supported). There’s good reason for the requirement; publishers have spent an enormous amount of time and money creating processes which allow camera-ready PDF documents to be created from stylized Word documents in a fairly straightforward way. All the author has to do is use Word to write each chapter, mark up the paragraphs, code and other elements using the provided styles, and everybody is happy.

Well, not quite everybody. While Word, OpenOffice, and similar products may be perfectly suitable for writing the next great American novel, they are horrifically maladapted for writing programming books. For starters, repeatedly copying code out of an IDE and into Word quickly becomes a tiresome and error-prone process, particularly after your technical reviewer has pointed out problems which require you to modify and test the code anew. For that matter, asking a technical reviewer to test code by repeatedly copying it from a Word document and into their development environment is completely unreasonable and will result in the reviewer not properly testing any of it, instead quietly opting for the “eyeball test”. Finally, most developers spend the majority of their day operating within a highly customized IDE, one which will undoubtedly play an integral part in the book writing process. So why should the developer be forced to suddenly spend evenings working within an unfamiliar writing environment? I could go on and on.

Yet familiarity is a funny thing, and so even after having decided to self-publish “Easy PHP Websites with the Zend Framework” the first time around, I stuck it out with Word. And all of the aforementioned problems made the entire process a fairly miserable one. In the end, however, the book wound up being a pretty successful first self-publishing effort, so when I set out to start work on the update there was little doubt I had to take a different route.

The Project Requirements

Having the advantage of working with a clean slate, I spent some time defining the requirements for my ideal writing environment:

  • I must be able to write the book in text format, preferably in my IDE (which happens to be Eclipse running on Ubuntu) but at the very least in a lightweight text editor such as gedit.
  • I must be able to manage the book in version control (Git), allowing me to easily push changes to a private GitHub repository.
  • I must be able to easily convert the book from the source format into PDF, ePub, and HTML formats. By easily, I mean avoiding having to use a layout program such as Adobe InDesign.
  • I must be able to externally manage book assets, notably code examples and images, and then integrate those assets into the book source at build time (“build” defined as the point at which I create the PDF, ePub, and HTML formats).
  • I would prefer to manage many of the build-related tools via a unified build system such as Phing.
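
One way to satisfy the fourth requirement with Docbook is XInclude, which pulls an external file into the document when it is processed. The following is only a minimal sketch of the idea and not necessarily how the book's own build handles it. It uses Python's standard xml.etree.ElementInclude module; the file name listing.php, the chapter markup, and the helper names make_loader and build_chapter are hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from xml.etree import ElementInclude

# Hypothetical chapter source: the code listing lives in its own file.
CHAPTER = """<chapter xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>Hypothetical chapter</title>
  <programlisting><xi:include href="listing.php" parse="text"/></programlisting>
</chapter>"""


def make_loader(base_dir):
    """Return an XInclude loader that resolves hrefs relative to base_dir."""
    def loader(href, parse, encoding=None):
        path = Path(base_dir) / href
        if parse == "xml":
            return ET.parse(path).getroot()
        # parse="text": return the raw file contents as a string.
        return path.read_text(encoding=encoding or "utf-8")
    return loader


def build_chapter(source, base_dir):
    """Parse Docbook source and replace each xi:include with the file it names."""
    root = ET.fromstring(source)
    ElementInclude.include(root, loader=make_loader(base_dir))
    return root
```

With a listing.php file sitting in a code/ directory, build_chapter(CHAPTER, "code") returns a tree whose programlisting element contains the file's text, so the tested code and the printed code never drift apart. When transforming with xsltproc, the --xinclude option resolves the same elements.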

The Solution

Because I’m acting as both author and publisher, meeting the above requirements involves quite a few moving parts, all of which I’m happy to report were pretty easily handled after some investigation.

The Format

I quickly settled upon using one of three candidates: Docbook, HTML, or Markdown. Docbook was my immediate choice, because I had read that a pretty mature set of XSL tools and transformation recipes was available. HTML was a consideration because I’m a huge fan of Mark Pilgrim’s work, and if it’s good enough for Mark, then it’s good enough for me. Markdown was also a consideration because I had the honor of editing the first edition of The Definitive Guide to Django: Web Development Done Right (Apress, 2007), authored by Adrian Holovaty and Jacob Kaplan-Moss, and marveled over how effectively they used it throughout the writing process.

I wound up going with Docbook based on the aforementioned toolset and the considerable number of online resources. While the XML tags are easy enough to use, in hindsight it turned out that transforming the Docbook files to a PDF which suited my stylistic desires was far more difficult than I imagined. Again in hindsight though, the difficulties were all related to the fact that I was a newbie. The trouble was well worth it, if for no other reason than that I can build a camera-ready PDF and Kindle-ready ePub document from these source Docbook files in literally minutes.

Incidentally, after quickly tiring of simultaneously writing and marking up chapters using the Docbook styles, I stuck with exclusively writing and revising each chapter, and then marking it up afterwards. This allowed me to focus on writing the best possible book rather than constantly tweaking formatting directives, and resulted in a much better book.

Git

If you follow my Developer.com contributions, then you already know I’m a huge Git fan. It is by far the easiest and most natural way to manage source code that I’ve ever encountered (and I’ve tried all of them). I use it for managing practically everything, all of my writing included (I’ve even written an article about using Git to streamline writing projects).

Given the above, it’s not a surprise I have been using Git to manage my book throughout the course of the entire project. From the version control standpoint, writing is quite different from coding in the sense that you probably won’t be reverting changes or anything like that; however, Git has proven to be extremely useful for stashing changes, and for branching in order to start work on a new chapter that I’m considering for a future update.

Due to a long history of laptop abuse I try to vigilantly back up material on a regular basis but hate dealing with DVDs and USB keys, so I use GitHub to regularly back up all of the chapters, code and images.

Building the ePub

Although I’ve historically sold a fair number of books through WJGilmore.com, it would be foolhardy to ignore the industry’s 800-pound gorilla, and so making a version available for the Amazon Kindle was an immediate priority. Amazon supports a proprietary book packaging format known as AZW; however, you can upload books into their system (http://kdp.amazon.com) in a variety of formats, ePub and MOBI included, and Amazon will convert the book for you.

To convert the Docbook-formatted book into ePub, I used the standard ePub stylesheet bundled with the Docbook distribution. Converting the book was as simple as executing the following command:

$ xsltproc /usr/share/xml/docbook/stylesheet/docbook-xsl/epub/docbook.xsl docbook.docbook

Executing this command creates two directories, OEBPS and META-INF, which form part of the ePub book. I stress part because you’re not quite done yet. At a minimum you also need to create a mimetype file and copy your images into the OEBPS directory. To my knowledge, the only raster image formats ePub supports are GIF, JPEG, and PNG, which caused a bit of a problem for me because all of my screenshots were saved in TIFF format. Some ImageMagick magic converted all of my images in no time flat, however. Maybe in a future post I’ll talk more about this process, but for the moment if you are interested I suggest reading the excellent IBM developerWorks tutorial authored by Liza Daly.
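
The remaining packaging steps can be sketched in a few lines. This is an illustration of the ePub container layout rather than the exact procedure used for the book: it assumes the xsltproc output (META-INF and OEBPS, with images already copied in) sits under one build directory, and the function name package_epub and the file name book.epub are placeholders. The ePub container format expects the mimetype entry to come first in the archive and to be stored uncompressed.

```python
import zipfile
from pathlib import Path

MIMETYPE = "application/epub+zip"


def package_epub(build_dir, epub_path):
    """Zip the META-INF and OEBPS directories under build_dir into epub_path."""
    build_dir = Path(build_dir)
    with zipfile.ZipFile(epub_path, "w") as archive:
        # The mimetype entry is written first and left uncompressed.
        archive.writestr("mimetype", MIMETYPE, compress_type=zipfile.ZIP_STORED)
        for folder in ("META-INF", "OEBPS"):
            for path in sorted((build_dir / folder).rglob("*")):
                if path.is_file():
                    # Store each file under its path relative to build_dir.
                    archive.write(
                        path,
                        path.relative_to(build_dir).as_posix(),
                        compress_type=zipfile.ZIP_DEFLATED,
                    )
    return epub_path
```

Calling package_epub("build", "book.epub") would then produce a single file that ePub readers and conversion tools can open.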

Because Amazon’s KDP service keeps spitting out cryptic conversion errors when I try to upload ePub documents, I use Calibre to convert the ePub to MOBI format. For whatever reason Amazon is able to process the MOBI version without problems. Within 24 hours of uploading, the book was available for sale. Using Barnes & Noble’s PubIt! service is similarly easy, the only difference being that it was able to process the original ePub document without problems. As with Amazon, the Nook version was available for sale within 24 hours.

Other Useful Tools

I’ve also taken advantage of a number of other useful tools throughout this process, including:

  • Aspell: Aspell is an open source spell checker. All you have to do is point Aspell to the document you want to spell check, and a streamlined interface appears which allows you to buzz through and correct potentially misspelled words in a very convenient fashion.
  • ImageMagick: I use ImageMagick to convert and resize screenshots.
  • pdftk: I use pdftk to merge my book cover and PDF.
  • GIMP: I use GIMP to touch up screenshots, including notably placing borders around them when applicable.
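
For reference, typical invocations of three of these tools can be sketched as small Python helpers that build each command line. The file names are placeholders and the helper names are hypothetical; ImageMagick 7 installs the converter as magick rather than convert, so adjust to your installation.

```python
import shutil
import subprocess


def tiff_to_png(src, dest, width=None):
    """ImageMagick: convert a TIFF screenshot to PNG, optionally resizing to a width."""
    cmd = ["convert", src]
    if width is not None:
        # A bare width keeps the aspect ratio and picks the height automatically.
        cmd += ["-resize", str(width)]
    return cmd + [dest]


def merge_cover(cover_pdf, book_pdf, out_pdf):
    """pdftk: concatenate the cover and the book body into one PDF."""
    return ["pdftk", cover_pdf, book_pdf, "cat", "output", out_pdf]


def spell_check(path):
    """Aspell: open the interactive checker on one file."""
    return ["aspell", "check", path]


def run(cmd):
    """Run cmd if its program is on PATH; return True when it ran successfully."""
    if shutil.which(cmd[0]) is None:
        print("skipping, not installed:", cmd[0])
        return False
    return subprocess.run(cmd).returncode == 0
```

For example, run(merge_cover("cover.pdf", "book.pdf", "final.pdf")) would prepend the cover to the book, provided pdftk is installed.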

The Unified Build System

This is the most recent, and I think coolest, addition to the process. I’ve lately been promoting the importance of setting up an automated website deployment process, having given several talks on the topic at various user groups. If you’re interested you can download the slides from GitHub (Automating Deployments and Other Annoying Tasks with Phing, Capistrano and Liquibase). I wanted to create a similarly effective solution for building books, and so created a Phing build file which unifies the various commonly used commands. While not strictly necessary, it’s a very simple and convenient way to manage the various tasks. I’ve included a screenshot below:

Figure 1. Using Phing to manage book builds
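
The idea behind such a build file, named targets that depend on one another and each wrap a few commands, can be sketched outside Phing as well. The following Python is a stand-in for illustration only and is not the actual build file: the target names, book.epub, and book.mobi are hypothetical, and the zip step assumes a mimetype file already exists in the working directory.

```python
import subprocess

XSL = "/usr/share/xml/docbook/stylesheet/docbook-xsl/epub/docbook.xsl"

# Each target lists the targets it needs first and the commands it runs.
TARGETS = {
    "transform": {"needs": [], "run": [["xsltproc", XSL, "docbook.docbook"]]},
    "epub": {
        "needs": ["transform"],
        "run": [
            ["zip", "-X0", "book.epub", "mimetype"],  # stored, first entry
            ["zip", "-Xr9D", "book.epub", "META-INF", "OEBPS"],
        ],
    },
    "mobi": {"needs": ["epub"], "run": [["ebook-convert", "book.epub", "book.mobi"]]},
}


def build_order(target, done=None):
    """Return the targets to run, dependencies first (no cycle detection)."""
    done = [] if done is None else done
    if target in done:
        return done
    for dep in TARGETS[target]["needs"]:
        build_order(dep, done)
    done.append(target)
    return done


def build(target, runner=subprocess.check_call):
    """Run every command of every target that the requested target depends on."""
    for name in build_order(target):
        for cmd in TARGETS[name]["run"]:
            runner(cmd)
```

Asking for build("mobi") would run the transform, package the ePub, and then hand it to Calibre's ebook-convert, which is the same convenience a Phing target chain provides.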

Lessons Learned

I learned so many lessons throughout this process that it’s hard to distill them down into a few concluding paragraphs. Getting to this point wasn’t anywhere near as easy as I had expected, and there was no shortage of frustrating days along the way, but the hard work has definitely paid off: the system will allow me to write books at a speed never before imaginable. Writing has become a much more natural part of my daily workflow rather than something which requires special effort to settle into, and with one book under my belt I really think the sky is the limit.

But the most enjoyable part of this process has come in recent days. In the days following the release, several readers e-mailed me pointing out various formatting-related issues. Each time I would fix the issue, rebuild the PDF and ePub versions, and deploy those versions to WJGilmore.com, Amazon.com, and BN.com in literally minutes. This ability to respond to reader feedback in mere minutes is extraordinary; at traditional publishing houses it doesn’t happen in days or weeks, and often not at all.

Questions?

I love talking about this stuff. E-mail me at wjATwjgilmore.com with your questions.