
Chapter 14  Suggestions for Future Work

14.1  Introduction

In this chapter, I discuss several ways in which Canthology might be improved. If you have the time and skills to make any of these improvements, then please do so.

14.2  A Wider Selection of Title-page Templates

As discussed in Section 7.4, Canthology provides three template files that make it easy to create the title page of a book. It would be nice to provide a more extensive range of template title pages with future versions of Canthology. If you would like to contribute to this, then you might find some inspiration from Peter Wilson’s extensive collection of title pages [13].

Also, I do not claim to be especially gifted at designing title pages, even when borrowing ideas from Peter Wilson’s document. If you feel the existing title pages can be tweaked to improve their layout, then please do so.

14.3  Background Graphics for Title Pages

I like having a white background for the title pages of books and manuals that I write. However, many people prefer a title page to have a colourful background picture. For this reason, it might be nice to ship a collection of high-resolution, royalty-free pictures with Canthology that people could use on the title page of documents.

Having said that, I would not like a distribution of Canthology to consist of, say, 1 MB of software, a few more MB of manuals and 400 MB of high-resolution pictures. Perhaps a better idea would be to have a small distribution of Canthology containing just a few sample pictures (so users can play with the \thisPageBackgroundImage command), and a separate website that users can browse to access a large collection of pictures. I know such websites already exist, but they are filled mostly with landscape-oriented images, while book covers require portrait-oriented images.

14.4  A Blog-to-LaTeX Converter

Consider the following scenario. Fred has a popular blog, and has been posting articles to it on a regular basis for several years. He would like to create a “best of” collection of his blog articles, and publish it in book format. He decides Canthology would be a good tool to help him do this. Unfortunately, his blog articles are written using one markup language, while Canthology uses a different markup language (LaTeX), so he has to do the tedious work of converting blog postings from one markup language to another.

A useful addition to Canthology would be a utility that can convert documents from various blog markup formats into LaTeX format.
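
To give a feel for what such a utility involves, the following is a minimal sketch rather than a proposed design. It assumes the blog articles are written in Markdown (the source format is not specified above), and it handles only bold text, emphasised text and three of the characters that are special in LaTeX. The function name md_to_latex is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: convert a very small subset of Markdown to LaTeX.
# Reads Markdown on standard input and writes LaTeX to standard output.
md_to_latex() {
    # 1. Escape &, % and #, which are special characters in LaTeX.
    # 2. Turn **text** into \textbf{text}.
    # 3. Turn *text* into \emph{text}.
    sed -E \
        -e 's/([&%#])/\\\1/g' \
        -e 's/\*\*([^*]+)\*\*/\\textbf{\1}/g' \
        -e 's/\*([^*]+)\*/\\emph{\1}/g'
}

# Example: prints  Fish \& chips: \textbf{100\%} \emph{tasty}
printf '%s\n' 'Fish & chips: **100%** *tasty*' | md_to_latex
```

A real converter would also have to deal with headings, links, lists, images and quotations, which is why a dedicated utility, rather than a handful of sed expressions, would be a worthwhile addition.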

14.5  Improve the Quality of Generated HTML

Unfortunately, the HTML generated by HeVeA is not 100% compliant with standards. The Canthology Makefile that runs hevea and hacha tries to improve the quality of the generated HTML by passing it through the tidy utility. However, there is still room for improvement. Thus, one way to improve Canthology is to improve HeVeA so it generates better-quality HTML.

14.6  Installers for Various Operating Systems

Installing Canthology is a simple, albeit multi-step, process, which can be described as, “unzip the distribution and set a few environment variables”. It would be nice to have platform-specific installers that turn the multi-step process into a single-step process.

14.7  Generate ebooks

Canthology can generate books in PDF and HTML formats. Would it be possible for Canthology to also generate books in popular ebook formats? I spent a few weeks investigating this possibility and discovered that it presents some challenges.

14.7.1  Different ebook File Formats

There are approximately 20 different ebook file formats. In some of the popular formats, an ebook is represented as an archive, such as a ZIP file, that holds the following: (1) XML or HTML files that contain the main text of the book; (2) a ".css" file that defines the visual layout of the text; (3) image files for diagrams and pictures used within the book; and (4) metadata to specify information—such as the title, author and publisher of the book—and impose an organisational structure upon the collection of files in the archive.
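
As a concrete illustration of item (4), the sketch below creates the fixed parts of an unpacked EPUB archive. The mimetype file and META-INF/container.xml are required by the EPUB container format. The directory name OEBPS, the file name content.opf and the function name make_epub_skeleton are choices made for this sketch only.

```shell
#!/bin/sh
# Create the fixed parts of an unpacked EPUB in the directory given as $1.
# OEBPS and content.opf are names chosen for this sketch; container.xml
# is the file that tells an ebook reader where to find them.
make_epub_skeleton() {
    dir=$1
    mkdir -p "$dir/META-INF" "$dir/OEBPS" || return 1

    # The mimetype file holds exactly this string, with no trailing newline.
    printf 'application/epub+zip' > "$dir/mimetype"

    # container.xml names the package document, which carries the metadata
    # (title, author, publisher) and lists the other files in the book.
    cat > "$dir/META-INF/container.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
EOF
}
```

Running make_epub_skeleton my-anthology-epub would create the skeleton in a directory called my-anthology-epub. The text, stylesheet, images and package document would then be added under OEBPS, and the whole directory packed into a ZIP file in which mimetype is the first entry and is stored uncompressed.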

I suspect the wide variety of ebook formats is due mainly to each manufacturer of ebook readers wanting to create a proprietary file format for its devices, in the hope of gaining a monopoly on the market for ebooks. However, as a back-up plan (in case a monopoly is not achieved), ebook readers tend to support a few open or competing file formats too. The result is that almost all ebook readers can be used to view books in the EPUB format. The most notable exception to this is the Amazon Kindle, which does not support EPUB. However, the Kindle supports the Mobipocket file format, which is supported on some other ebook readers too. Thus, if you want to make a book viewable on all ebook readers, a pragmatic approach is to produce EPUB and Mobipocket versions of your book, and ignore all the other ebook file formats.

14.7.2  Using HeVeA and Calibre

Calibre is an open-source application for managing collections of ebooks. Calibre provides the ability to convert an HTML document into any of many ebook formats, including EPUB and Mobipocket. This raises the possibility of using HeVeA to convert a LaTeX document into HTML format, and then using Calibre to convert that into an ebook format.

I briefly experimented with this approach. Unfortunately, I was not very happy with the results it produced. Most of the text in my sample document was displayed appropriately in the ebook reader application provided by Calibre. However, the formatting of some text, such as poems, was messed up badly. Initially, I thought there might be a bug in Calibre, but when I started to investigate what might have gone wrong, I learned that both the EPUB and Mobipocket formats support only a subset of HTML and CSS. Perhaps the HTML generated by HeVeA did not fall into that supported subset, so Calibre had to modify the HTML when converting it into an ebook format, and these modifications resulted in the ebook version of my sample document displaying in a way I had not intended.

14.7.3  Tailoring HeVeA to Better Support the Generation of ebooks

Perhaps it would be possible to write two sets of ".hva" files for HeVeA. One set would instruct HeVeA to generate HTML that takes advantage of all the capabilities of the HTML standard and displays nicely in web browsers. The other set would instruct HeVeA to generate only the subset of HTML elements and tags that are supported in EPUB and Mobipocket.

The two sets of ".hva" files would reside in different directories, and the "-I directory" command-line option could instruct HeVeA to look for ".hva" files in one of those directories. For example, if the two directories are called browser-support-files and ebook-support-files, then executing the command:


hevea -I browser-support-files my-document.tex

would produce HTML suitable for viewing in a web browser. In contrast, executing the command:


hevea -I ebook-support-files my-document.tex

would produce HTML suitable for converting into an ebook format via Calibre.
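
If this idea were adopted, the choice between the two directories could be wrapped in a small script so that users need to remember only the kind of output they want. The sketch below is one way to do that. The function name hevea_command and the target names browser and ebook are hypothetical, and the function only prints the command line rather than running hevea.

```shell
#!/bin/sh
# Print the hevea command line for a given target ("browser" or "ebook").
# The directory names are the hypothetical ones used in the text above.
hevea_command() {
    target=$1
    file=$2
    case $target in
        browser) dir=browser-support-files ;;
        ebook)   dir=ebook-support-files ;;
        *)       echo "unknown target: $target" >&2; return 1 ;;
    esac
    echo "hevea -I $dir $file"
}

# Example: prints  hevea -I ebook-support-files my-document.tex
hevea_command ebook my-document.tex
```

Printing the command keeps the sketch safe to try on a machine that lacks the two directories. Piping the output to sh would run the command.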

14.7.4  Playing with HeVeA and Calibre

If you want to experiment with using HeVeA and Calibre to convert LaTeX documents into ebooks, then the following information might help you to get started.

Before using Calibre, you should convert your LaTeX document into a single-page HTML document. You can do this by editing Canthology.cfg and ensuring the @copyFrom statement copies from the book:html-one-page scope. Afterwards, run canthology.

The web page below contains documentation on the ebook-convert command (provided as part of Calibre):

http://manual.calibre-ebook.com/cli/ebook-convert.html

Within that page, you should scroll down to the “HTML Input” list item, and then click on one of the sub-items, such as “HTML Input to EPUB Output”. Doing that will lead you to documentation on the command-line options for performing the desired conversion.

If you want to convert a file called my-anthology.html into EPUB format, then the following example illustrates how you might do that (\ denotes a line continuation):


ebook-convert my-anthology.html my-anthology.epub \
                  --chapter "//*[name()='h1']" \
                  --page-breaks-before "//h:h1" \
                  --no-default-epub-cover \
                  --chapter-mark none \
                  --authors "Example Name" \
                  --publisher "Example Company"

You will probably want to play with different command-line options to see what effect they have.
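
Since Section 14.7.1 suggests producing both EPUB and Mobipocket versions of a book, a small loop can prepare one ebook-convert command per format. The sketch below assumes that ebook-convert selects the output format from the file extension and that ".mobi" is the extension for Mobipocket. The function name ebook_commands is hypothetical, and the only option passed is one taken from the example above.

```shell
#!/bin/sh
# Print one ebook-convert command per output format for an HTML file.
# The commands are printed, not run; pipe the output to sh to run them.
ebook_commands() {
    html=$1
    base=${html%.html}          # strip the ".html" suffix
    for ext in epub mobi; do
        echo "ebook-convert $html $base.$ext --chapter-mark none"
    done
}

# Example: prints two command lines, one ending in .epub and one in .mobi
ebook_commands my-anthology.html
```

EPUB-specific options, such as --no-default-epub-cover, are left out on purpose because they would not apply to the Mobipocket conversion.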

14.7.5  Small Screen Sizes of ebook Readers

Support for a limited subset of HTML and CSS is not the only challenge when creating an ebook. Another challenge is the small displays of most ebook readers. For example, the diagonal screen size of the Amazon Kindle is just 6 inches (15 cm). That is slightly smaller than a postcard.

The small screen size of an ebook reader is acceptable for reading a novel, because most novels contain minimal formatting. However, the small screen size is likely to cause problems for technical books, because they often have more ambitious formatting requirements. For example, Figure 11.4 is a listing of a Makefile. The longest line in that listing is almost 70 characters across, which is too wide to display in its entirety on the screen of most ebook readers (unless you reduce the font size to something that is uncomfortably small to read). Likewise, some technical books contain tables of data that are too big for viewing on an ebook reader.
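
One way to judge how badly a document would be affected is to search its listings for lines that exceed a chosen width. The sketch below does that with awk. The function name long_lines is hypothetical, and any width limit passed to it is an assumption to be adjusted for the target device and font size, not a measured figure.

```shell
#!/bin/sh
# Report lines longer than a given width, in the form "file:line: length".
# Usage: long_lines MAX FILE...
long_lines() {
    max=$1
    shift
    awk -v max="$max" \
        'length($0) > max { print FILENAME ":" FNR ": " length($0) }' "$@"
}
```

For example, long_lines 50 Makefile would list every line of Makefile that is longer than 50 characters, together with its line number and length.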

Perhaps a good rule of thumb is to consider creating ebook versions of novels, but to avoid trying to create ebook versions of product manuals and other technical documentation.

14.8  Providing Customisable Anthologies as Demos

One particular benefit of Canthology is its ability to easily customise an anthology. This benefit would be more readily apparent if Canthology was shipped with some book-length demos that are useful in their own right.

For example, Shakespeare’s poems are old enough to be out of copyright, and they can be downloaded (legally) from numerous websites. Let’s assume one person took the time to download all those poems, save each poem in a separate ".tex" file, and typeset them with LaTeX commands. Such a task might be completed in, say, a weekend. Then, a Canthology.cfg file could be written to provide a title page and table of contents, and have an \input command for each poem. The result, obviously, would be a complete anthology of Shakespeare’s poems. However, the result would be something else too: a customisable anthology of Shakespeare’s poems. This customisability would offer benefits:

Of course, a limitation of such an anthology would be that, by default, it would not contain any critiques of the poems because such critiques tend to be new enough to still be in copyright. However, there would be nothing preventing budding literature geeks from writing their own critiques and contributing them to the anthology.

Customisable anthologies could be useful in other fields too. For example, food lovers might use Canthology to compile a large collection of recipes. The idea of yet another recipe book is not very exciting. But what is exciting is distributing the recipes in Canthology format, so people can edit the configuration file to create a personalised book that contains only the recipes they like.

14.9  Configuration Support for Additional LaTeX Tools

Currently, the etc/defaults.cfg file shipped with Canthology assumes people will use pdflatex to convert LaTeX documents into a printable format. However, not everybody uses pdflatex. Some people prefer to use latex to produce a ".dvi" file, and then use a post-processing tool to convert that file into a printable format, such as PDF or PostScript. Other people prefer to use luatex or xelatex. Likewise, not everyone uses HeVeA to generate HTML documents. Some people prefer to use another tool, such as TTH, LaTeX2HTML, TeX4ht, or LaTeXML.

The choice of a LaTeX tool affects not only the build_commands configuration variable. It can also affect optional arguments passed to packages, and set-up commands used in the preamble of a document.

It would be nice to see Canthology extended to provide out-of-the-box configuration support for tools other than pdflatex and HeVeA. One way to do this would be to extend etc/defaults.cfg to contain scopes for each tool. However, doing that might result in the file growing too large to be easily maintained. Another approach would be to provide a separate configuration file for each tool. For example: etc/pdflatex.cfg could provide configuration support for pdflatex; etc/hevea.cfg could provide configuration support for HeVeA; etc/luatex.cfg could provide configuration support for luatex; and so on. Presumably, configuration settings that are common to many tools could be factored out into, say, etc/common.cfg and a tool-specific configuration file would access those settings via an @include statement.

14.10  Add Windows Support for Generating HTML

In Section 10.4, I explained why the use of Canthology with HeVeA (to produce HTML files) is not supported on Windows. Overcoming this restriction would require several pieces of work to be carried out.

First, somebody would have to enhance the Windows port of HeVeA to contain ports of the UNIX utilities required to run imagen.

Second, Canthology uses a Makefile to execute the sequence of commands required to generate HTML output. A Windows-compatible replacement (perhaps a ".bat" file) would need to be written.

Finally, most of Canthology is written in Java, but Canthology also contains some simple Tcl scripts. Ideally, we should write Java replacements for those Tcl scripts. Doing this would eliminate the requirement to have Tcl installed on a computer, which would make it easier to run Canthology on a Windows-based PC.

14.11  A Graphical User Interface

Canthology is simple to use—at least for people who are comfortable executing commands from a command-line prompt. But, of course, many people do not know how to execute commands in a UNIX shell or a Windows command window. Such people would find Canthology much easier to use if there was a graphical user interface (GUI) “wrapper” for it.

14.12  A Web Interface

Consider the following scenario. Fred has no Canthology-related software installed on his computer. He briefly looks through this Canthology manual, thinks Canthology might be useful, and decides to try it. But to do so, he first has to install the following software:

Having to install more than one GB of software might be enough to dissuade Fred from trying Canthology. But, if Canthology was installed on a website and he needed only a web browser to use it, then Fred could try Canthology without having to install it. Fred could write some ".tex" files and upload them to the website. An application on the website could help him create a configuration file. Then, when he clicks on a “create document” button, the web server would run canthology on his configuration file and ".tex" files, and provide him with a downloadable PDF file.

