Wednesday 12 November 2014

Converting a LaTeX Thesis to HTML with latex2html

 https://math.dartmouth.edu/~ahb/thesis/latex2html_tips.html

Here are my tips for putting your LaTeX'ed thesis online. Now that Google can search PDF files, there is less urgency to do this, but I like having it online, and I'm sure your potential collaborators would like it too... I'm assuming a Harvard-style document (i.e. using huthesis.cls; see my page on this), but most of what I say applies to any custom thesis style.
Keep in mind that it is at least a day's work (at least it was for me, with all the debugging below). More if you want it perfect.

HOW TO DO IT (rough guide):

  • Back up your thesis. Make a new version (e.g. thesis_html.tex etc.) which is in regular \documentclass[11pt]{book} form. Cut out the table of contents. Add 'Chapter 1: ', 'Appendix A: ', etc. at the start of the \chapter{} argument for each chapter and appendix. (Here you have to do the counting by hand: it's clumsy.)
  • Cut out everything specific to huthesis.cls: reorganise the frontmatter into a separate chapter (reset the chapter counter after this so your chapter numbers are not offset). Put each frontmatter piece in a regular \section{} rather than the huthesis.cls environments. Build the title page by hand within the \author{} and \date{} commands (clumsy).
  • Make sure it compiles with regular latex. Run it 3 times so the cross-references come out right.
  • Change the bibliography to be \input{thesis.bbl} if you used BibTeX, rather than the call to bibs.bib. This hack gives you a regular book-style bibliography so latex2html can handle it.
  • Make sure you have a recent version of latex2html. Version 2k.1 seems to come with recent Linux distributions.
  • Check the version of your netpbm library via rpm -qa | grep netpbm (if you're on Linux). If it is 9.11 or older, either update (see the sourceforge distribution) or make yourself a new patched version of the pstoimg script as described in Ross Moore's black border fix.
  • Copy the icons directory in your latex2html distribution to your public_html directory (the icons must be accessible from the outside world!)
  • Use the following .latex2html_init in the local directory:
    # Force white background and black text
    $BODYTEXT = "text=\"\#000000\" bgcolor=\"\#FFFFFF\"";
    # This ensures that some figures do not end up with a grey background
    $WHITE_BACKGROUND = 1;
    # Patched pstoimg (only needed if you made your own version):
    $PSTOIMG = "$PERL /home/barnett/lib/pstoimg";
    # Tell LaTeX to use a white page colour:
    $LATEX_COLOR = "\\pagecolor{white}";
    # Where you put your icons directory:
    $ICONSERVER = "http://monsoon.harvard.edu/~barnett/icons";
    # Use GIF rather than PNG:
    $IMAGE_TYPE = $IMAGE_TYPES[1];
    1; # This must be the last line
    Note that you only need the $PSTOIMG line if you made your own patched version of pstoimg, in which case have it point to wherever you put your version. The $ICONSERVER line points to where you put your icons directory.
  • Run latex2html on a pared-down document (e.g. all but 1 chapter removed) first.
  • My final top-level document: thesis_html.tex (cf. original thesis.tex). My final frontmatter: frontmatter_html.tex (cf. original frontmatter.tex).
  • My command line:
    latex2html -white -antialias -image_type gif -html_version 3.2 thesis_html
    
    This creates (or modifies) the directory thesis_html
  • If not all inline images or equations appear in the final HTML, scan through the images.log file generated in the output directory. Most likely latex got stuck in a loop and died, failing to produce the remaining .ps images. You may need to alter your math LaTeX code until it works. I needed to remove \mathletters calls altogether. Does anyone know of a less buggy version of \mathletters?
  • Move the resulting directory into your public_html directory. The resulting size was only a few MB, much less than the (uncompressed) PostScript version, and about the same as the PDF. The entrance URL is then thesis_html/index.html
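
To make the restructuring steps at the top of the list concrete, here is a minimal sketch of what a thesis_html.tex skeleton might look like. It is not the actual file linked above: the title, author text, chapter names and placeholder sentences are all made up for illustration, and it assumes a thesis.bbl already exists from an earlier BibTeX run on the original thesis.

```latex
% thesis_html.tex: hypothetical skeleton of the latex2html-friendly copy.
% All titles, names and body text below are placeholders.
\documentclass[11pt]{book}

% Title page built by hand inside \author{} and \date{}
\title{Thesis Title}
\author{Author Name\\[1ex]
  A thesis presented to the Department of Physics}
\date{Month Year}

\begin{document}
\maketitle
% No \tableofcontents: latex2html builds its own navigation.

% Frontmatter as an ordinary chapter, with plain sections in place of
% the huthesis.cls environments.
\chapter{Frontmatter}
\section{Abstract}
Abstract text goes here.
\section{Acknowledgments}
Acknowledgments go here.

% Reset the counter so the real chapters are numbered from 1 again.
\setcounter{chapter}{0}

% Chapter labels are typed by hand into the \chapter{} arguments.
\chapter{Chapter 1: Introduction}
Chapter text goes here.

\appendix
\chapter{Appendix A: Derivations}
Appendix text goes here.

% Pre-built BibTeX output in place of the call to bibs.bib.
\input{thesis.bbl}

\end{document}
```

Compile this with regular latex three times before handing it to latex2html, so that any cross-references have settled.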

Alex Barnett, October 2001.     
