
Making a modern ebook with Standard Ebooks

I am a big fan of e-readers, and of ebooks, but I've never really dug into what it takes to make one. Inspired by Joel's website tour to dip my toes into some digital publishing, I decided to take a stab at producing an ebook.

I of course didn't want to have to write a book, so I needed some material to start with. I decided to go for Democracy and Education by John Dewey because:

If you're into it, you can download the final file (epub, azw3), or read on to learn how I did it.

With a book chosen, I just needed to figure out what specific things I was going to fix. An ebook, like a physical book, depends on a plethora of creative decisions, from the structuring of text, to the typography, to spelling choices. Luckily, I didn't have to make them myself, as the incredible Standard Ebooks project exists.

It pulls together a comprehensive style manual and a set of tools, and gives you a step-by-step process for producing a beautiful, semantically meaningful ebook.

Doing the work

The process starts with a "raw" ebook from Project Gutenberg. The one for Democracy and Education was transcribed by someone named David Reed and began with this dedication:

I have tried to make this the most accurate text possible but I am sure that there are still mistakes.

I would like to dedicate this etext to my mother who was a elementary school teacher for more years than I can remember. Thanks.

David Reed

From what I can tell Mr. Reed did a pretty great job. The text itself contains almost no mistakes. However, the markup, while pretty functional, is a little outdated. It has many elements that are used for things like spacing, when that job should be left to CSS styles for the document.

The first step is to take the one big XHTML file and split it into one file per chapter. After that I have to go through the markup for each chapter and make it match the Standard Ebooks guidelines.
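To give a feel for the splitting step, here is a minimal Python sketch. It assumes every chapter opens with an <h1> tag, which may not match the real Project Gutenberg file, and split_chapters is a name made up for this example. It is not how the Standard Ebooks tools do it, just an illustration of the idea.

```python
import re


def split_chapters(xhtml: str, marker: str = r"<h1\b") -> list[str]:
    """Split one big document into chunks, each starting at a chapter heading.

    Assumption: every chapter opens with an <h1> tag. Change `marker` to
    whatever the real source file uses.
    """
    # Find where each chapter heading begins.
    starts = [m.start() for m in re.finditer(marker, xhtml)]
    # Each chapter runs until the next heading, or the end of the document.
    ends = starts[1:] + [len(xhtml)]
    # Anything before the first heading (front matter) is left out.
    return [xhtml[s:e] for s, e in zip(starts, ends)]


doc = "<p>front</p><h1>One</h1><p>a</p><h1>Two</h1><p>b</p>"
for number, chapter in enumerate(split_chapters(doc), start=1):
    print(number, chapter)
```

Each chunk could then be written out to its own file and cleaned up separately.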

So this:

<h2>Summary. It is the very nature of life ...
Since this continuance can be secured ...

Became this:

<section id="summary" epub:type="part">
<p><h3 epub:type="title">Summary</h3> — It is the very nature of life ...</p>
<p>Since this continuance can be secured ...</p>
</section>

I did this with liberal use of macros. They're a feature in editors like Emacs or Vim that lets you automate complicated editing tasks.

A macro is basically just a recording of keys pressed that you can replay as many times as you need. So you hit record, do the thing you want to do, make sure your cursor is positioned such that you can do it again, and then hit stop. The trick is to use fancy editing key combinations instead of just swinging around with your cursor (as I normally do). So you type /<h2 to search for the header, instead of clicking there or using your arrow keys.

Editing text in this way really gives you an appreciation for the structure of text documents, and how it can be used to manipulate them way more effectively than normal.
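For comparison, the same rewrite could be scripted instead of recorded. This is a rough Python sketch, not what I actually ran: convert_summary is a hypothetical helper, and it assumes the first paragraph starts with exactly "<h2>Summary. " and the rest are bare text, as in the excerpt above.

```python
def convert_summary(paragraphs: list[str]) -> str:
    """Rewrite old-style summary paragraphs into section markup.

    Assumption: the first paragraph starts with '<h2>Summary. ' and the
    remaining paragraphs are plain text with no tags around them.
    """
    old_prefix = "<h2>Summary. "
    first, *rest = paragraphs
    if not first.startswith(old_prefix):
        raise ValueError("first paragraph is not a summary heading")

    heading = '<h3 epub:type="title">Summary</h3>'
    out = ['<section id="summary" epub:type="part">']
    # "\u2014" is the dash that follows the heading in the target markup.
    out.append(f"<p>{heading} \u2014 {first[len(old_prefix):]}</p>")
    # Every other paragraph just gets wrapped in <p> tags.
    out.extend(f"<p>{p}</p>" for p in rest)
    out.append("</section>")
    return "\n".join(out)


print(convert_summary([
    "<h2>Summary. It is the very nature of life ...",
    "Since this continuance can be secured ...",
]))
```

A macro does the same thing keystroke by keystroke; the script just spells out the structure the macro relies on.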


The worst part of this was debugging. When I loaded the epub onto my e-reader for the first time (I was feeling pretty good about myself at this point), I discovered that the table of contents didn't work for sub-chapters, except for the summaries. What a strange bug.

My first attempt at fixing it was creating section headings and giving them ids matching the hyphenated version of the titles.

This didn't work.

I tried removing the colons from the titles, guessing that special characters might be throwing off the links. Also didn't work.

Finally, grasping at straws, I realized that the summaries were the only sub-chapters without numbers preceding them. So I removed the numbers at the start, and it worked!

Why it worked: EPUB 2 is built around HTML 4, which doesn't allow an element id to start with a digit.
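To make the fix concrete, here is a small Python sketch of turning a sub-chapter title into an id that never starts with a digit. make_id is a hypothetical helper, not part of the Standard Ebooks tools, and the slug rules (lowercase, hyphen-separated, leading numbering dropped) are my own assumption. The example title is made up.

```python
import re


def make_id(title: str) -> str:
    """Build a hyphenated id from a title, dropping any leading numbering."""
    # Lowercase, then turn every run of non-alphanumerics into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    # Drop leading digits and hyphens so the id starts with a letter.
    slug = slug.lstrip("0123456789-")
    # Fall back to a fixed name if nothing is left (e.g. a title of "3.").
    return slug or "section"


print(make_id("2. An Example Title"))  # an-example-title
print(make_id("Summary"))              # summary
```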


Anyways, now that everything was in the right shape, and functioned roughly as it was supposed to, it was time to make things look a little prettier with some CSS. I didn't use much, just:

h3 {
  display: inline;
  font-size: inherit;
}

section {
  margin-top: 2rem;
}

I was aiming to keep the ebook pretty close to the original.

That's what the h3 styles are doing. The section styles deviate from the original to make things a little more readable by adding some spacing between sub-chapters.

I also had to pick out a public domain image for the book cover, and chose The School of Athens by Raphael upon suggestion from Celine.

And with that, it was done!

So what could be better?

The main frustration here is that ebooks aren't really designed to be versioned. There's no way, as far as I know, for me to push a new version to my e-reader and maintain my highlights, position, and other metadata. In that way ebooks are surprisingly similar to traditional media: once a book is published, it's out there and there's not much you can do to it.

Also, as magical as macros were, my use of them still feels quite primitive. I don't yet know how to modify them, either by adding keys to the end or the beginning, or how to chain them meaningfully.

Ultimately this feels like a very typical 80/20 process. Standard Ebooks removes a ton of the overhead, but I think there's still room to make it easier to produce beautiful books.