Chapter 11  Using HeVeA with Canthology

11.1  Introduction

To use HeVeA with Canthology, you need to modify your Canthology configuration file so your document’s scope copies from one of the html-related scopes previously listed in Table 7.1. You can see an example of this in line 3 of Figure 11.1.

Figure 11.1: Configuration file for generating HTML
 1  @include getenv("CANTHOLOGY_HOME") + "/etc/defaults.cfg";
 2  anthology1 {
 3      @copyFrom "book:html-many-pages";
 4      root_file {
 5          base_name = "my-anthology" + macro.paperSizeSuffix;
 6          front_matter = [...];        # omitted for brevity
 7          main_matter  = [...];        # omitted for brevity
 8          back_matter  = [...];        # omitted for brevity
 9      }
10      substitutions.search_replace_pairs = [
11          ...                          # omitted for brevity
12      ] + substitutions.search_replace_pairs;
13  }

When you run canthology on this modified configuration file, the generated HTML file(s) are placed in the output-html subdirectory. Within that directory, the index.html file contains the title page of your document.

The rest of this chapter explains how the Canthology-HeVeA integration works. That knowledge will enable you to customise some aspects of the generated HTML pages.

11.2  The default.html Configuration Scope

Figure 11.2 shows a slightly abridged version of the default.html scope in the etc/defaults.cfg file of an installation of Canthology.

Figure 11.2: Outline of the default.html configuration scope
 1  default.html {
 2      @copyFrom "default.common";
 3      working_dir = "output-html";
 4      macro {
 5          paperSizeSuffix = "";
 6          ...
 7      }
 8      root_file {
 9          documentclass.name = "book";
10          package.names = [..., "hevea", "hevea-fix", ...];
11          ... # omitted for brevity
12      }
13      copy {
14          file_extensions = [".hva"] + file_extensions;
15          extra_files_to_copy = [
16              "Makefile",
17              "canthology.css",
18              "html-header.txt",
19              "html-footer.txt",
20              "tidy.conf",
21              "scripts/remove_pseudo_toc.tcl",
22              "scripts/insert_html_header_and_footer.tcl",
23          ];
24          ... # omitted for brevity
25      }
26      build_commands = ["make"];
27      substitutions {
28          search_replace_pairs = [
29            "(MAKEFILE-IMAGEN-OPTIONS-PLACEHOLDER)",
30                                              "-pdf -png -mag 2000",
31            "(MAKEFILE-INSTALL-DIR-PLACEHOLDER)", "/var/www",
32          ];
33      }
34  }

The @copyFrom statement (line 2) copies default settings that are shared with other configuration scopes.

On line 3, the string "output-html" is assigned to the working_dir variable.

The macro.paperSizeSuffix variable is assigned the value "" (line 5) because HTML is independent of paper size.

Both hevea and hevea-fix are included in the list of packages used by the document (line 10). The hevea-fix package was discussed in Section 10.10.1.

Line 14 adds ".hva" to the copy.file_extensions list. This enables Canthology to find the ".hva" implementations of packages when it encounters \usepackage commands.

The extra_files_to_copy variable (lines 15–23) instructs Canthology to copy specific files, which are used to help turn a document into HTML pages, into the working directory. The first file to be copied is Makefile (line 16), and this is used by the make build command (line 26). The other files to be copied are required for the Makefile to work properly; Section 11.4 shows where each of them is used.

The substitutions.search_replace_pairs variable (lines 28–32) is used to provide values for two variables used in the Makefile.

11.3  The book:html-many-pages Configuration Scope

One variable missing from the default.html scope is copy.search_path, which specifies the directories in which Canthology should look for the files listed in copy.extra_files_to_copy. This is because the value of copy.search_path depends on whether the HTML document will consist of one or many pages.

Figure 11.3 shows that the book:html-many-pages scope copies from the default.html scope (line 2) and then sets copy.search_path appropriately (lines 4–9).

Figure 11.3: Outline of the book:html-many-pages configuration scope
 1  book:html-many-pages {
 2      @copyFrom "default.html";
 3      copy {
 4          search_path = [
 5              ".",
 6              getenv("CANTHOLOGY_HOME") + "/etc/html-many-pages",
 7              getenv("CANTHOLOGY_HOME") + "/etc/html-common",
 8              getenv("CANTHOLOGY_HOME") + "/etc/latex",
 9          ];
10      }
11      substitutions {
12          search_replace_pairs = search_replace_pairs + [
13              "(MAKEFILE-TOC-OPTIONS-PLACEHOLDER)",
14                    "-link %"Table of contents%" -contentsname %"Contents%"",
15          ];
16      }
17  }

The book:html-one-page scope is similar, but sets copy.search_path to contain etc/html-one-page instead of etc/html-many-pages (line 6).
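
The way a search path resolves a file name can be sketched in shell. This is an illustration only, not Canthology's own implementation. The function name find_in_search_path is hypothetical, the placement of files in the demonstration directories is arbitrary, and the sketch assumes that directories are tried in the order listed and that the first match wins.

```shell
#!/bin/sh
# Hypothetical helper: print the first path at which a file is found.
# Assumption: directories are searched in order and the first match wins.
find_in_search_path() {
    wanted=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$wanted" ]; then
            printf '%s\n' "$dir/$wanted"
            return 0
        fi
    done
    return 1
}

# Throwaway directories standing in for the entries of copy.search_path.
demo=$(mktemp -d)
mkdir -p "$demo/local" "$demo/etc/html-many-pages" "$demo/etc/html-common"
echo "default header" > "$demo/etc/html-many-pages/html-header.txt"
echo "default css"    > "$demo/etc/html-common/canthology.css"
echo "my own css"     > "$demo/local/canthology.css"

header_path=$(find_in_search_path html-header.txt \
    "$demo/local" "$demo/etc/html-many-pages" "$demo/etc/html-common")
css_path=$(find_in_search_path canthology.css \
    "$demo/local" "$demo/etc/html-many-pages" "$demo/etc/html-common")

echo "$header_path"   # only the installed default exists
echo "$css_path"      # the local copy is found before the default
```

Under that first-match assumption, a file placed in the directory listed first would be picked up in preference to a file of the same name in a later directory.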

11.4  The Makefile

The Makefile used to convert the input document into multiple HTML pages is shown in Figure 11.4.

Figure 11.4: The Makefile used by the html-many-pages scopes
 1  DOC=(ROOT_FILE_BASE_NAME)
 2  IMAGEN_OPTIONS=(MAKEFILE-IMAGEN-OPTIONS-PLACEHOLDER)
 3  INSTALL_DIR=(MAKEFILE-INSTALL-DIR-PLACEHOLDER)
 4  TOC_OPTIONS = (MAKEFILE-TOC-OPTIONS-PLACEHOLDER)
 5  HTML_FILES  = `find . -name "*.html"`
 6  GRAPHIC_FILES  = `find . -name "*.jpg" -o -name "*.gif" -o -name "*.png"`
 7  
 8  html:  clean
 9      hevea $(DOC).tex
10      if [ -f $(DOC).image.tex ]; then \
11          imagen $(IMAGEN_OPTIONS) $(DOC); \
12      fi
13      if [ `grep -c \\bibliography $(DOC).tex` -ne 0 ]; then \
14          bibhva $(DOC); \
15          hevea $(DOC).tex; \
16      fi
17      hevea $(DOC).tex
18      hacha -o index.html $(DOC).html
19      rm $(DOC).html
20      tclsh scripts/remove_pseudo_toc.tcl index.html
21      tclsh scripts/insert_html_header_and_footer.tcl \
22                                  -header html-header.txt \
23                                  -footer html-footer.txt \
24                                  $(HTML_FILES)
25      cat canthology.css >> $(DOC).css
26      -tidy -config tidy.conf -m $(HTML_FILES)
27  
28  install:
29      mkdir -p  $(INSTALL_DIR)
30      -chmod 775 $(INSTALL_DIR)
31      cp $(HTML_FILES) $(INSTALL_DIR)
32      cp $(DOC).css $(INSTALL_DIR)
33      -cp -f $(GRAPHIC_FILES) $(INSTALL_DIR)
34  
35  clean:
36      rm -f *.aux *.log *.toc *.out *.dvi $(DOC).pdf *.blg
37      rm -f *.haux *.htoc *.hbbl $(HTML_FILES) $(DOC).css
38      rm -f $(DOC)[0-9][0-9][0-9].gif $(DOC)[0-9][0-9][0-9].png
39      rm -f contents_motif.gif next_motif.gif previous_motif.gif
40      rm -f $(DOC).image.tex

Readers not familiar with Makefiles should read Section 11.4.1 to get an overview of the basic concepts and syntax used in Makefiles. Readers who are already familiar with Makefiles can skip ahead to Section 11.4.2.

11.4.1  Overview of Makefile Concepts and Syntax

A Makefile typically contains instructions for turning one or more source-code files into an executable application. In the case of Canthology, a Makefile contains instructions for converting ".tex" files into HTML.

An application called make reads a Makefile and executes the instructions contained in it.

Within a Makefile, a line of the form name=value defines a variable called name. The value of the variable can later be accessed with the syntax $(name). For example, line 1 in Figure 11.4 defines a variable called DOC, and the value of this variable is used in line 9 (and several other lines).

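A line starting with name: defines a target called name. Any names that follow the colon are prerequisites, which make brings up to date before the target itself; for example, line 8 of Figure 11.4 lists clean as a prerequisite of html, so the commands for clean run before those for html. Figure 11.4 contains three targets: html, install and clean. The indented lines immediately following the name of a target (each must begin with a tab character) are shell commands that need to be executed to “make” that target. The list of shell commands is terminated by a blank line, the name of the next target or the end of the file. Thus, lines 9–26 are the commands for making the html target, lines 29–33 are the commands for making the install target, and lines 36–40 are the commands for making the clean target.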

If the name of a target is specified when running make, then make will execute the commands associated with that target. For example, running "make html" executes the commands associated with the html target, while running "make clean" executes the commands associated with the clean target. The first target that appears in a Makefile is the default target. Thus, in Figure 11.4, the default target is html, so running "make" is equivalent to running "make html".

If a command is too long to fit on one line, then \ can be used as a line continuation character. For example, \ is used to merge lines 10–12 into one long command.
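
The concepts above can be tried out on a tiny Makefile that has nothing to do with Canthology. The following is a minimal sketch: the variable GREETING and the targets first and second are hypothetical names chosen for the demonstration, and the script assumes that a make program is installed (it skips the demonstration otherwise).

```shell
#!/bin/sh
# Write a tiny, hypothetical Makefile into a throwaway directory.
# printf is used so that each command line starts with the tab make requires.
demo=$(mktemp -d)
printf 'GREETING=hello\n\n' > "$demo/Makefile"
printf 'first: second\n\t@echo "$(GREETING) from first"\n\n' >> "$demo/Makefile"
printf 'second:\n\t@echo "second runs"\n' >> "$demo/Makefile"

if command -v make >/dev/null 2>&1; then
    # No target named: make builds the first target in the file, after
    # building the target named to the right of its colon.
    default_output=$(cd "$demo" && make -s)
    # A target named explicitly: only that target's commands run.
    second_output=$(cd "$demo" && make -s second)
    printf '%s\n' "$default_output"
    printf '%s\n' "$second_output"
else
    echo "make is not installed; skipping the demonstration"
fi
```

Running make without arguments prints "second runs" followed by "hello from first", whereas "make second" prints only "second runs".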

11.4.2  Variables Used in the Makefile

The Makefile in Figure 11.4 defines four variables (lines 1–4) whose values are placeholder strings that will be replaced by real values when Canthology is run. The (ROOT_FILE_BASE_NAME) placeholder will be replaced by the value of root_file.base_name in the Canthology configuration file. Values for the other placeholder strings are specified in the substitutions.search_replace_pairs configuration variable, as can be seen in lines 29–31 of Figure 11.2 and in line 13 of Figure 11.3.
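
The effect of the placeholder replacement can be imitated with sed. Canthology performs the substitution itself, so the sketch below only mimics the outcome; the helper name substitute is hypothetical, and the values are the ones shown in Figures 11.1 and 11.2 (with an empty paperSizeSuffix).

```shell
#!/bin/sh
# Hypothetical helper: replace every occurrence of a literal placeholder.
# Assumption: neither argument contains characters that are special to sed;
# "|" is the delimiter so that values such as /var/www need no escaping.
substitute() {
    sed "s|$1|$2|g"
}

template='DOC=(ROOT_FILE_BASE_NAME)
IMAGEN_OPTIONS=(MAKEFILE-IMAGEN-OPTIONS-PLACEHOLDER)
INSTALL_DIR=(MAKEFILE-INSTALL-DIR-PLACEHOLDER)'

result=$(printf '%s\n' "$template" \
    | substitute '(ROOT_FILE_BASE_NAME)' 'my-anthology' \
    | substitute '(MAKEFILE-IMAGEN-OPTIONS-PLACEHOLDER)' '-pdf -png -mag 2000' \
    | substitute '(MAKEFILE-INSTALL-DIR-PLACEHOLDER)' '/var/www')

printf '%s\n' "$result"
```

After substitution, the three lines read DOC=my-anthology, IMAGEN_OPTIONS=-pdf -png -mag 2000 and INSTALL_DIR=/var/www.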

11.4.3  The html Target

The definition of the html target (lines 8–26 in Figure 11.4) is long, but straightforward. It first runs hevea on the root ".tex" file (line 9). Then, it checks for a particular file whose existence indicates that imagen needs to be run (lines 10–12). Next (lines 13–16), it checks if a \bibliography command appears in the root ".tex" file; if so, it runs bibhva to convert the bibliography’s contents into LaTeX format, and then runs hevea to process the newly generated LaTeX. Afterwards, hevea is run again (line 17) to resolve cross references.
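
The decision logic of these steps can be explored with a dry run that prints the commands instead of executing them. This is a sketch under stated assumptions rather than part of Canthology: the function plan_html_build and the file names are hypothetical, the imagen options are omitted, and grep is given -F so that the backslash is matched literally (a choice made for this sketch, not what the Makefile does).

```shell
#!/bin/sh
# Hypothetical dry run: print the commands that the first part of the html
# target would run, without invoking hevea, imagen or bibhva.
plan_html_build() {
    doc=$1
    echo "hevea $doc.tex"
    # The existence of DOC.image.tex signals that imagen is needed.
    if [ -f "$doc.image.tex" ]; then
        echo "imagen $doc"
    fi
    # Count the lines of the root file that mention \bibliography.
    if [ "$(grep -c -F '\bibliography' "$doc.tex")" -ne 0 ]; then
        echo "bibhva $doc"
        echo "hevea $doc.tex"
    fi
    # Final run to resolve cross references.
    echo "hevea $doc.tex"
}

# Two throwaway documents: one with a bibliography, one with images.
demo=$(mktemp -d)
cd "$demo" || exit 1
printf '%s\n' '\documentclass{book}' '\bibliography{refs}' > with-bib.tex
printf '%s\n' '\documentclass{book}' > plain.tex
: > plain.image.tex

with_bib_plan=$(plan_html_build with-bib)
plain_plan=$(plan_html_build plain)
printf '%s\n' "$with_bib_plan"
printf '%s\n' "$plain_plan"
```

The document with a bibliography is planned as hevea, bibhva, hevea, hevea; the document with an image file is planned as hevea, imagen, hevea.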

At this point, the LaTeX document has been converted into a monolithic HTML page. To split that into multiple HTML pages, the hacha utility is run (line 18) and the no-longer-needed monolithic HTML file is deleted (line 19). Then, some Tcl scripts are run to remove the pseudo table of contents (line 20) and insert headers and footers into each HTML page (lines 21–24). The canthology.css file is appended to the ".css" file created by hevea and hacha (line 25). Finally, tidy is run to convert the HTML pages into XHTML format (line 26).

11.4.4  The install Target

The install target copies HTML files plus supporting CSS and image files into the directory specified by the INSTALL_DIR variable. Line 31 of Figure 11.2 sets the default value of this variable to be "/var/www", which is the root directory for web servers on many UNIX machines.

11.4.5  The clean Target

The clean target uses the UNIX rm command to remove generated files.

11.5  Customising the HTML Pages

If you have a working knowledge of the syntax used in HTML and CSS, then it is possible to customise the “look and feel” of the HTML pages generated by Canthology. You can do this by providing your own versions of the html-header.txt, html-footer.txt and canthology.css files, which, as I explained in Section 11.4.3, are used by the Makefile.

The default versions of html-header.txt and html-footer.txt used by the book:html-many-pages scope are shown in Figures 11.5 and 11.6.

Figure 11.5: The html-header.txt file
<div class="banner">
    <div id="bannerleft">
        Change the look-and-feel with your own versions of
        "html-header.txt", "html-footer.txt" and
        "canthology.css".
    </div>
    <div id="bannerright">
        <a href="index.html">title page</a>&nbsp;&nbsp;
        <a href="contents.html">contents</a>
    </div>
    <div class="endtwocolumntext"></div>
</div>
<div class="content">

Figure 11.6: The html-footer.txt file
</div><br/>

The html-header.txt file uses div elements to divide the HTML page into a “banner” area followed by a “content” area (for the main content of the page). The “banner” area is further subdivided into “bannerleft” and “bannerright” areas.

The html-footer.txt file closes the “content” area and adds a br element (a line break) to provide a bottom margin, thus ensuring the end of the page’s content is not uncomfortably close to the bottom of the window in which it appears.
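
The way the header and footer wrap a page's content can be sketched with a small awk filter. This is not the Tcl script shipped with Canthology, whose exact behaviour is not described here. The function insert_header_and_footer is a hypothetical stand-in, and it assumes that the body start and end tags each sit on a line of their own, that the header is inserted immediately after the former, and that the footer is inserted immediately before the latter.

```shell
#!/bin/sh
# Hypothetical stand-in for the header/footer insertion step.
# Assumptions: <body> and </body> each occupy their own line; the header goes
# right after <body> and the footer right before </body>.
insert_header_and_footer() {
    awk -v header="$1" -v footer="$2" '
        /<\/body>/ { while ((getline line < footer) > 0) print line; close(footer) }
        { print }
        /<body>/   { while ((getline line < header) > 0) print line; close(header) }
    ' "$3"
}

# Throwaway files: a shortened header, the default footer and a minimal page.
demo=$(mktemp -d)
cd "$demo" || exit 1
printf '%s\n' '<div class="banner">my banner</div>' '<div class="content">' > html-header.txt
printf '%s\n' '</div><br/>' > html-footer.txt
printf '%s\n' '<html>' '<body>' '<p>Chapter text</p>' '</body>' '</html>' > page.html

wrapped=$(insert_header_and_footer html-header.txt html-footer.txt page.html)
printf '%s\n' "$wrapped"
```

In the output, the page's own text ends up between the div that html-header.txt opens for the “content” area and the closing tag supplied by html-footer.txt.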

The visual formatting of the div elements mentioned above is defined in the canthology.css file. A discussion of that file is outside the scope of this chapter, but I encourage readers to examine the file to see how the visual formatting rules are specified.

The effect of html-header.txt and html-footer.txt is illustrated in Figures 11.7 and 11.8, which show the start and end of a HTML page.

Figure 11.7: The start of a HTML page
Figure 11.8: The end of the HTML page

You might wish to replace the text in the “bannerleft” area with, say, the logo for your website. Likewise, you might modify html-footer.txt to add some text (perhaps contact details for your organisation) to the bottom of HTML pages. If you do not wish to have any header or footer contents, then you can delete all the contents of html-header.txt and html-footer.txt, thus making them empty files.

Among other things, the canthology.css file sets a white background for the HTML page and highlights a hypertext link when the mouse pointer hovers over it. You can create your own version of canthology.css to implement a different “look and feel” for the generated HTML pages. This enables you to ensure that the Canthology-generated HTML pages can blend into the overall visual style of a parent website.

11.6  Other HTML Configuration Scopes

The discussion so far has assumed the use of the book:html-many-pages configuration scope. The report:html-many-pages and article:html-many-pages scopes are almost identical—they just use a different value for the root_file.documentclass.name configuration variable.

There is another set of configuration scopes called book:html-one-page, report:html-one-page and article:html-one-page. They use a slightly different setting for the copy.search_path variable, so they look for support files in the etc/html-one-page directory instead of in etc/html-many-pages. Because of this, they pick up different versions of Makefile, html-header.txt and html-footer.txt.

The etc/html-one-page version of Makefile runs hevea to convert a LaTeX document into a single-page HTML file; it does not run hacha to split the file into multiple HTML pages.

The etc/html-one-page version of html-header.txt does not contain “title page” or “contents” links.

