Transforming LaTeX to HTML and EPUB Formats Using TEX4ht
Abstract
Open Educational Resources (OER) are defined by UNESCO as “learning, teaching and research materials in any format and medium that reside in the public domain or are under copyright that have been released under an open license, that permit no-cost access, re-use, re-purpose, adaptation and redistribution by others”. Many OER are created and shared, as documents accessible to readers who use screen readers, through several prominent platforms (Pressbooks, Manifold, etc.). While these platforms support some LaTeX (a typesetting system for producing high-quality documents that involve complex formatting), they do not ingest documents typeset entirely in LaTeX, nor is it easy to use them to create math-intensive documents. The primary output of LaTeX is generally Portable Document Format (PDF). However, a LaTeX-to-PDF workflow requires labor-intensive PDF tagging, and OER authored in LaTeX are unlikely to remain consistently accessible, because the PDF would need to be remediated each time the source is customized and recompiled. This is neither practical nor sustainable. For LaTeX-authored publications to be accessible, accessibility markup must be embedded in the LaTeX source and template, and the contents exported to formats that are more accessible than PDF, such as HTML and EPUB. This paper explores one successful path for transforming LaTeX source code into HTML and EPUB using the TeX4ht framework. It provides a brief overview of the TeX4ht system before covering the HTML and EPUB conversion processes separately. We provide a detailed, step-by-step guide for converting a ‘.tex’ file to an ‘.html’ file using make4ht, demonstrated through an example. Similarly, we outline a step-by-step process for converting a ‘.tex’ file to an ‘.epub’ file using the tex4ebook framework.
The conversion is run from the command line on Windows and from Terminal on macOS, and each step is illustrated with screenshots. We also highlight best practices for creating accessible LaTeX documents, particularly for authors who want to transform their work into digitally accessible formats. Furthermore, we summarize our findings from extensive experimentation with the make4ht and tex4ebook frameworks, providing insights for readers of varying skill levels. Key findings include advanced techniques such as creating and using custom configuration scripts for personalized settings and using the built-in “options” provided by these frameworks. We also found three practices to be critical for successful transformation: requiring the use of a pre-made LaTeX template, enclosing math and symbols in \(...\) instead of $...$, and configuring MathJax JavaScript to accurately replicate chapter- or section-level equation numbers in the exported HTML or EPUB. This transformation method will be very helpful to the open education community, libraries, and institutions that support OER authors. It may also interest the publishing industry in general and non-technical individuals who wish to host a website generated from their LaTeX work. In the future, we look forward to providing guidance to LaTeX OER authors on creating a LaTeX source-code template that is more easily transformed into an accessible format. Whether you are a researcher, educator, or content creator, mastering this conversion process empowers you to make your resources more widely accessible and impactful in today’s inclusive and equitable digital landscape.
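
The recommendation to enclose inline math in \(...\) rather than $...$ can be applied to an existing source file mechanically. The following is a minimal Python sketch of that idea, not a tool from the paper: the function name dollars_to_parens and the regular expression are illustrative assumptions. It leaves escaped dollar signs (\$) and display math ($$...$$) untouched, and it does not understand comments or verbatim environments, so its output should be reviewed before recompiling.

```python
import re

# Hypothetical helper, not part of TeX4ht, make4ht, or tex4ebook.
# Matches one inline math span delimited by single dollar signs:
#   - the opening "$" is not escaped (no "\" before it) and not part of "$$"
#   - the body is non-empty and may contain escaped characters such as "\$"
#   - the closing "$" is not part of "$$"
INLINE_MATH = re.compile(r"(?<![\\$])\$(?!\$)((?:\\.|[^$\\])+?)\$(?!\$)")


def dollars_to_parens(tex: str) -> str:
    """Rewrite $...$ inline math as \\(...\\), leaving \\$ and $$...$$ alone."""
    # A function is used as the replacement so that the backslashes in the
    # new delimiters are inserted literally rather than parsed as escapes.
    return INLINE_MATH.sub(lambda m: r"\(" + m.group(1) + r"\)", tex)


if __name__ == "__main__":
    sample = r"Let $x^2$ and $y$ be given; the book costs \$5."
    print(dollars_to_parens(sample))
```

A regular expression is enough for simple documents; sources that use dollar signs inside verbatim blocks, listings, or comments would need a proper LaTeX-aware parser instead.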