LaTeX lets you create PDF and PostScript files directly, but these formats are historically more print-media oriented. I say "historically" because that is no longer really true of the modern PDF specification. See, for example, my page on movies in PDF documents for a case in point.
Nevertheless, the format of choice for web-based communication is currently HTML, and increasingly the more general XML format. For about a decade, people have been waiting more or less patiently for the MathML standard, a type of XML, to be implemented in common web browsers, and that's finally becoming a reality.
So the question is: how do you get all your LaTeX source files, which produce such beautiful PDF, to make beautiful HTML, too? Here I'll focus on one solution, based on tex4ht. [Note, July 2009: the creator of tex4ht is Eitan Gurari. Sadly, he died last month.] I get this package through fink.
Here is a sample LaTeX source:
\documentclass[12pt]{article}
\usepackage[latin1]{inputenc}
\usepackage{graphicx}
\begin{document}
% Inline math, written with both $...$ and \(...\)
This shows inline math, where $\alpha$ is related to $\sqrt{\beta}=2$.
Not to be confused with \(\sum_{\nu}a_{\nu}x^{\nu}\) and finally
% Displayed equation
\[
\sin\frac{1}{2\gamma\sum_{\mu=1}^{\infty}C_{\mu}} \ne \alpha
\]
% Floating figure, requested at the top of the page
\begin{figure}[t]
\centering
\includegraphics{../IllustratorScreenShot.png}
\caption{This is an unrelated figure.}
\end{figure}
And bye.
\end{document}
Assuming this file is called tex4htExample.tex and can be processed successfully with pdflatex, you should get a PDF file that looks like the one linked here: tex4htExample.pdf
Now try the following two procedures from the command line:
1. htlatex tex4htExample
This produces tex4htexample.html, which you can inspect here.
2. /sw/share/tex4ht/bin/mzlatex tex4htExample
(The path prefix is only needed if you have installed tex4ht from fink.) This produces tex4htexample.xml, which you can inspect here, provided your browser understands this XML dialect. Some more remarks on this approach are at the Tex4Moz web site.
The difference between the HTML in the first case and the XML in the second is the way math is represented. In case 1, the displayed equation and some of the inline math are bitmapped, which makes the baseline of the inline square root look wrong. This is something that latex2html handles better (out of the box).
Case 2, though, has no bitmapped math at all. Everything is represented using MathML, which means the semantic information about the mathematical expression is completely preserved. Even with case 1, improvements are possible by adding your own configuration commands to help htlatex avoid bitmaps. However, since MathML support is getting better, it may soon be unnecessary to worry about HTML export at all.
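The remark about private configuration commands can be made a little more concrete. What follows is only a sketch of the general shape of a tex4ht configuration file, not a tested recipe: the file name myconfig.cfg is hypothetical, and choosing the mathml option as the way to sidestep bitmaps is my assumption rather than something the two procedures above use.

```latex
% myconfig.cfg (hypothetical file name): a private tex4ht configuration file.
% Assumed invocation: htlatex tex4htExample "myconfig"
\Preamble{xhtml,mathml} % assumed options: XHTML output, math as MathML instead of bitmaps
% Early \Configure commands would go here.
\begin{document}
% Late \Configure commands would go here.
\EndPreamble
```

The point of this arrangement is that all HTML-specific choices live in the configuration file, so the LaTeX source itself stays untouched and still compiles with pdflatex.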
The figure is placed where it appeared in the source, because the specification [t] (for top of the page) makes no sense in an HTML document that isn't page-oriented.
The fonts used for math display in case 2 are still incomplete on my Mac. But there is hope: the Scientific and Technical Information Exchange (STIX) font creation project will release a free set of fonts that can be used across platforms and applications, specifically aimed at scientists who use TeX typesetting. Mozilla browsers will likely be using these fonts once they are released.