Formatting Poetry for Kindle (with OEBPS Template)

n.b. The possibilities for Kindle formatting have been increased with the introduction of KF8, although at present these options are only available on the Kindle Fire. It is uncertain how this will affect e-ink Kindle devices. Fixed layout would (in theory) remove the problem of Kindle-broken lines, but we are not quite there yet. This guide works on the assumption of reflowable lines, with as much consistency across Kindle devices (Kindle Fire, Kindle readers, & Kindle for iPad & iPhone) as possible. Different screen sizes & resolutions cause some problems here, but we shall do the best we can. I will work on a post on what the new KF8 formatting options could do for poetry, & how we can work with better html markup & css media queries to improve backwards compatibility.

Contact me if you have any suggestions or comments. (Thanks to @tadaja & @cdcasey for some proof-reading & suggestions.)

Introduction

The Kindle is not kind to poetry. For those who want to self-publish their poetry on Kindle, formatting your poems is a gloomy prospect, & one that requires reducing your expectations. If you want your poetry ebook to look at least acceptable, the best chance is by doing the conversion to mobi yourself. The workflow which allows the most hands-on control is to create your ebook in oebps format (a predecessor of epub), & use Kindlegen to convert & package your files into a mobi ebook. This is not as difficult as it sounds, & this guide will go through the process step-by-step, along with code examples and a full sample ebook of some of Shakespeare’s sonnets. Teaching you about basic html, css & xml is not a part of this guide, as you can find enough about this elsewhere. The knowledge required for putting together an ebook is very small, & you should be able to see most of what you need from the example code.

You will need:

  • A text editor. In Windows, Notepad will do the job. However, something with syntax highlighting (e.g., Notepad++) will be helpful.
  • Kindlegen (to convert the oebps ebook into a mobi file) & Kindle Previewer (to check the results). Both can be downloaded for free from Amazon. Kindlegen also comes with sample oebps ebooks, which include comments beyond the scope of this guide (applying to all types of ebooks).

An OEBPS ebook

As mentioned above, you will first of all be putting together an oebps ebook. It is a predecessor of epub, so those familiar with its descendant will recognise most of it. Basically, the components of an oebps ebook which we are going to be using are:

  • One (or more) html files, incorporating the actual text of the ebook
  • A css stylesheet, which controls how the text is formatted
  • A cover image
  • An xml .ncx file, which lists the table of contents for the ebook
  • An xml .opf file, which contains the ebook’s metadata (title, author, etc.), lists the files which make up the ebook (html, image & ncx files), & lists the reading order of the html files (if there is more than one)

A Kindle, when displaying a mobi file (which the oebps ebook will be converted into), does not use all of the information that an oebps ebook provides, so there is some degree of superfluity in this structure. This means that the oebps does not need to be complete, & so in the following example I have only included the elements which appear in the produced mobi.

Kindlegen comes with a ‘Sample’ folder, containing all of the files for an oebps ebook. It complicates things a bit with individual html files for each chapter, with each .html having its own folder for the images used in that chapter. Other than that, this ‘Sample’ is the base for the (much simplified) oebps sample I have put together.

You can look at or download the sample here.

I suggest that you go through this guide with the Kindlegen sample code (which is comprehensive & includes comments), my sample code (which is stripped-down), & any reference material you might need on html, css & xml.

The book

The ebook we are creating is that of the Dark Lady sequence of Shakespeare’s sonnets, 127 to 152. (I’ve never understood why the last two sonnets are sometimes included in this.) The source text is here. You’ll notice that Wikisource indents the final couplet; indents are going to be an issue, which we shall return to below, but I (rather conveniently) don’t like indenting those two lines like this, & won’t be doing it. It will need a contents page with links to each of the sonnets, some front matter (title, author, etc.), & a cover image.

The text (& stylesheet)

It is easiest to work in plain text (i.e., not Word) here, as all the formatting will need to be added in html. A workflow that begins in Word wastes a lot of time cleaning up unnecessary markup. It will be easier to paste plain text into an html file.

The basic structure of an html file is:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title></title>
<link rel="stylesheet" href="" type="text/css" />
</head>
<body>
</body>
</html>

The body of the text goes (surprisingly) within the body tags, which is what we will be dealing with in a moment. But first of all, to be done with the head section: The title tags are quite probably redundant, as metadata for the ebook is found elsewhere. The only other thing to be aware of so far is the css stylesheet, the filename of which needs to go in the href attribute of the link tag. In my sample ebook, the stylesheet is called ‘style.css’, &, since it will be in the same folder as the html file, its URL in the href attribute doesn’t need to be any more than that.
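If you would rather assemble the file with a script than by hand, the skeleton above can be filled in as follows. This is only a sketch in Python (which this guide does not otherwise require); the function name build_page & its parameters are my own invention, not part of the sample ebook.

```python
# Sketch: fill in the XHTML skeleton shown above.
# build_page and its parameters are illustrative names, not part of the sample.
DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" '
    '"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">'
)


def build_page(body, stylesheet="style.css", title=""):
    """Return a complete html file as a string, with `body` between the body tags."""
    return "\n".join([
        DOCTYPE,
        '<html xmlns="http://www.w3.org/1999/xhtml">',
        "<head>",
        f"<title>{title}</title>",
        # The stylesheet sits in the same folder, so a bare filename is enough.
        f'<link rel="stylesheet" href="{stylesheet}" type="text/css" />',
        "</head>",
        "<body>",
        body,
        "</body>",
        "</html>",
    ])
```

Calling build_page("<h1>THE DARK LADY SONNETS</h1>") returns the whole page as one string, ready to be saved as book.html.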

The first thing to go in the text is the front matter, which looks as follows:

<h1>THE DARK LADY SONNETS</h1><br/>
<h2>William Shakespeare</h2><br/><br/>
<h4>Originally Published: 1609</h4><br/>
<p class="centeralign">A poetry template formatted for Kindle by Charles Whalley. Source files available at <a href="http://charleswhalley.co.uk">http://charleswhalley.co.uk</a>.</p>

As you can see, we have the title, author, a note about the publication, & another note about the ebook itself. The header tags, of successive size (h1, h2, & h4), do the job, along with extra line-breaks to increase the spacing (br). I wanted the final note to be the same size as the normal text of the book, but centre-aligned along with the headers, & so did so using a class to be styled by css in the stylesheet.

h1, h2, h3, h4 {
        text-align: center;
}
 
.centeralign {
        text-align: center;
}

The next thing to go in the ebook is the table of contents, which looks as follows:

<a name="TOC"><h3 class="section">Contents</h3></a>
 
<a href="#s127" class="toc_item">127</a><br/>
<a href="#s128" class="toc_item">128</a><br/>
<a href="#s129" class="toc_item">129</a><br/>
<a href="#s130" class="toc_item">130</a><br/>
<a href="#s131" class="toc_item">131</a><br/>
<a href="#s132" class="toc_item">132</a><br/>
<a href="#s133" class="toc_item">133</a><br/>
<a href="#s134" class="toc_item">134</a><br/>
<a href="#s135" class="toc_item">135</a><br/>
<a href="#s136" class="toc_item">136</a><br/>
<a href="#s137" class="toc_item">137</a><br/>
<a href="#s138" class="toc_item">138</a><br/>
<a href="#s139" class="toc_item">139</a><br/>
<a href="#s140" class="toc_item">140</a><br/>
<a href="#s141" class="toc_item">141</a><br/>
<a href="#s142" class="toc_item">142</a><br/>
<a href="#s143" class="toc_item">143</a><br/>
<a href="#s144" class="toc_item">144</a><br/>
<a href="#s145" class="toc_item">145</a><br/>
<a href="#s146" class="toc_item">146</a><br/>
<a href="#s147" class="toc_item">147</a><br/>
<a href="#s148" class="toc_item">148</a><br/>
<a href="#s149" class="toc_item">149</a><br/>
<a href="#s150" class="toc_item">150</a><br/>
<a href="#s151" class="toc_item">151</a><br/>
<a href="#s152" class="toc_item">152</a>

There are a few things to notice here. First of all, the title of the contents is marked with an anchor called ‘TOC’. This name is required. Second, the ‘section’ class of the h3 tags around the title allows it to be marked as something which requires a page-break before it. The Publishing Guidelines suggest using a custom mbp (mobipocket) tag, but when the same can be done with standard css it seems unnecessary.

.section, .sonnet_title {
	page-break-before: always;
}

Third, the links around the items on the table all refer to anchors (hence the hash before the name; e.g. “#s150”). These anchors will need to be put around the title of each sonnet, so the links refer to those points in the text. They will become more important with the ncx file, as, since all the text is only in one html file, it is only with anchors that we can differentiate between different points in the text.
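Typing out twenty-six near-identical links invites typos, so here is a hedged sketch of how the list could be generated, & how you might check that every link has a matching anchor. It is written in Python, which is my assumption rather than a requirement of this guide; toc_links & missing_anchors are made-up names.

```python
import re


def toc_links(first=127, last=152):
    """Build the table-of-contents links, one per sonnet, as in the sample."""
    lines = [
        f'<a href="#s{n}" class="toc_item">{n}</a><br/>'
        for n in range(first, last + 1)
    ]
    # The sample leaves the line-break off the final entry.
    lines[-1] = lines[-1][: -len("<br/>")]
    return "\n".join(lines)


def missing_anchors(html):
    """Return the link targets in `html` that have no <a name="..."> anchor."""
    targets = re.findall(r'href="#([^"]+)"', html)
    anchors = set(re.findall(r'<a name="([^"]+)"', html))
    return [t for t in targets if t not in anchors]
```

Running missing_anchors over the text of book.html should give an empty list once every sonnet title has its anchor.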

All we need is the title of the text, & then we come to the sonnets themselves, & the dreaded issue of indents.

<a name="start"><h3 class="section">The Sonnets 127 to 152</h3></a>

The anchor named ‘start’ is the second required anchor (after ‘TOC’). This indicates where the actual text of the ebook begins.

Hanging indents

In my opinion, the best way to format poetry for the Kindle is with hanging indents, where lines broken by the Kindle (rather than the poet) are shown by being indented. The trouble is that this takes us out of the frying pan & into the fire, as we have really only moved the ambiguity onto the indented lines. What I mean is, this is only a clear system (indented, Kindle; not indented, poet) if the poet doesn’t indent any lines herself. We could, of course, use different amounts of indentation, but this can become even more of a tangle. The solution I have chosen for the sonnets is to not indent any of the lines (so, to do nothing with the final couplet), & to have hanging indents for when the line doesn’t fit on the screen. There are a few methods to trick the Kindle into hanging indents, but this one works for me. A poem, marked up for this, looks as follows (also note the anchor around the sonnet’s title):

<a name="s127"><h4 class="sonnet_title">127</h4></a>
<p class="line">In the old age black was not counted fair,</p>
<p class="line">Or if it were, it bore not beauty's name;</p>
<p class="line">But now is black beauty's successive heir,</p>
<p class="line">And beauty slander'd with a bastard shame:</p>
<p class="line">For since each hand hath put on Nature's power,</p>
<p class="line">Fairing the foul with Art's false borrowed face,</p>
<p class="line">Sweet beauty hath no name, no holy bower,</p>
<p class="line">But is profan'd, if not lives in disgrace.</p>
<p class="line">Therefore my mistress' eyes are raven black,</p>
<p class="line">Her eyes so suited, and they mourners seem</p>
<p class="line">At such who, not born fair, no beauty lack,</p>
<p class="line">Sland'ring creation with a false esteem:</p>
<p class="line">Yet so they mourn becoming of their woe,</p>
<p class="line">That every tongue says beauty should look so.</p>
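Wrapping every line in p tags by hand is tedious for a whole collection. A minimal sketch of doing it with a script, assuming Python & a poem pasted in as plain text with one line of verse per line (sonnet_markup is a hypothetical helper, not something in the sample):

```python
from html import escape


def sonnet_markup(number, text):
    """Wrap a plain-text sonnet in the anchor, title & p.line markup shown above."""
    # Drop blank lines and stray whitespace left over from copy-and-paste.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    out = [f'<a name="s{number}"><h4 class="sonnet_title">{number}</h4></a>']
    # quote=False escapes only &, < and >, so apostrophes stay as typed.
    out += [f'<p class="line">{escape(ln, quote=False)}</p>' for ln in lines]
    return "\n".join(out)
```

Because the text is escaped on the way in, any &nbsp; entities should be added after this step rather than before it.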

Then we need css to make the hanging indents work (& to stop each line from having too much spacing, since the html thinks each is a new paragraph):

p.line {
    text-indent: -40px;
    padding-left: 40px;
    margin-top: 0;
    margin-bottom: 0;
}

(Thanks to @cdcasey for pointing out that this needs to be px to work on Kindle for iOS. Since this is css to work for both mobi7 & kf8, we need to use margins for the line-spacing, but padding makes more sense for the indent.)

This is a bit of jiggerypokery, but basically it works by putting two properties in opposition to each other: text-indent, which only applies to the first line of the content of the line class p; & padding-left, which applies to every line. Padding is applied first, so every line is indented. Then text-indent has a negative value, so it moves the first line back to its original position. When the entire line (the entire contents of the p) fits on the page, everything is the first line, so text-indent moves everything back. But when the line is broken because it doesn’t fit, the overspill isn’t covered by the negative text-indent, so it remains indented. To change the size of the indent, both values (here 40px) need to be changed. To give an idea of how this works, here are some screenshots of it in action on my Kindle 3:

Sonnet 129 with no hanging indents

All of the lines from sonnet 129 fit on the screen with this font size.

Sonnet 129

In this font size the penultimate line is too long to fit, & so the Kindle-broken extra line is indented.
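The arithmetic behind the hanging indent can be sketched in a few lines. This is only a toy model in Python (my choice of language, & both function names are made up): one function writes the css rule from a single number, so the two values cannot drift apart, & the other works out where each displayed row of a single p.line starts.

```python
def hanging_indent_css(indent_px=40):
    """Write the p.line rule with both values taken from one number."""
    return (
        "p.line {\n"
        f"    text-indent: -{indent_px}px;\n"
        f"    padding-left: {indent_px}px;\n"
        "    margin-top: 0;\n"
        "    margin-bottom: 0;\n"
        "}"
    )


def row_offsets(indent_px, rows):
    """Left offset in px of each displayed row of one p.line.

    Row 0 is the start of the poet's line; later rows are Kindle-broken overspill.
    """
    padding_left = indent_px   # applies to every row
    text_indent = -indent_px   # applies to the first row only
    return [
        padding_left + text_indent if row == 0 else padding_left
        for row in range(rows)
    ]
```

A line that fits on the screen is a single row starting at 0px; a line broken once gives row_offsets(40, 2), i.e. the first row at 0px & the overspill at 40px.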

Having a single word on a new line can be disruptive to reading. To counteract this, you can tell the Kindle to keep certain phrases together by using non-breaking spaces instead of spaces between words; for instance:

<a name="s127"><h4 class="sonnet_title">127</h4></a>
<p class="line">In the old age black was not counted fair,</p>
<p class="line">Or if it were, it bore not beauty's&nbsp;name;</p>
<p class="line">But now is black beauty's successive&nbsp;heir,</p>
<p class="line">And beauty slander'd with a&nbsp;bastard&nbsp;shame:</p>
<p class="line">For since each hand hath put on Nature's&nbsp;power,</p>
<p class="line">Fairing the foul with Art's false borrowed&nbsp;face,</p>
<p class="line">Sweet beauty hath no name, no&nbsp;holy&nbsp;bower,</p>
<p class="line">But is profan'd, if not lives in disgrace.</p>
<p class="line">Therefore my mistress' eyes are raven&nbsp;black,</p>
<p class="line">Her eyes so suited, and they mourners seem</p>
<p class="line">At such who, not born fair, no&nbsp;beauty&nbsp;lack,</p>
<p class="line">Sland'ring creation with a&nbsp;false&nbsp;esteem:</p>
<p class="line">Yet so they mourn becoming of their woe,</p>
<p class="line">That every tongue says beauty should look so.</p>

Those phrases with &nbsp; between the words will be kept together. Obviously, this can make everything even more clunky, as more will be carried into the next line, but it is worth doing if it helps readability. The sample file only uses these in the first poem.
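If you want to apply this to a whole book, a script can bind the last few words of each line together. A sketch in Python under the assumption that the line is already plain (or already escaped) text; bind_last_words is a hypothetical helper, & how many words to bind is still a judgement you make line by line.

```python
def bind_last_words(line, count=2):
    """Join the last `count` words of a line with &nbsp; so they stay together."""
    words = line.split()
    # Nothing to bind if fewer than two words are requested or available.
    if count < 2 or len(words) < count:
        return line
    head, tail = words[:-count], words[-count:]
    return " ".join(head + ["&nbsp;".join(tail)])
```

For example, bind_last_words("And beauty slander'd with a bastard shame:", 3) keeps ‘a bastard shame:’ together, as in the markup above.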

It is advisable to push the Kindle-broken lines across as far as possible: the greater the hanging indent, the less distance the reader’s eye has to track back to the left to start the next line, & the more likely the reader is to interpret it as linked to the end of the previous one. You have about 550px of width to work with (on the new Kindle ereaders, not Kindle for iOS). However, if you put it too far across you’ll cramp the Kindle’s ability to make the best use of the space on the screen (especially on an iPhone, which is narrower). Working out the size of the hanging indent is a design decision (& one of the few you’ll get with a Kindle ebook). It is a compromise between the average length of the lines (which determines how much they’ll get broken), how much poet-indentation you wish to use (which the hanging indent probably wants to exceed), & what remains readable in practice. If you have any poet-indentation (or even if not), a value around 200px might be best. Since I have not used any indented lines in this ebook, the hanging indent does not need to be very big (only 40px). But as soon as you want to indent lines of your own (which you can do by putting multiple &nbsp;s between the opening p tag & the start of the line to be indented), a greater hanging indent will make a considerable difference in helping the reader distinguish between poet- & Kindle-broken lines.

Here’s how the relevant html & css would look if I had chosen to indent the couplet of sonnet 127. (This is not in the sample.)

p.line {
	text-indent: -200px;
	padding-left: 200px;
}

N.B. the hanging indent applies to all lines, & is blind to the level of poet-indentation, as consistency will help the reader. 200px is much greater than any poet-indentation. If you want better cross-platform consistency, you would need to reduce the indentation (to something which would fit the iPhone, so perhaps around 120px).

<a name="s127"><h4 class="sonnet_title">127</h4></a>
<p class="line">In the old age black was not counted fair,</p>
<p class="line">Or if it were, it bore not beauty's&nbsp;name;</p>
<p class="line">But now is black beauty's successive&nbsp;heir,</p>
<p class="line">And beauty slander'd with a&nbsp;bastard&nbsp;shame:</p>
<p class="line">For since each hand hath put on Nature's&nbsp;power,</p>
<p class="line">Fairing the foul with Art's false borrowed&nbsp;face,</p>
<p class="line">Sweet beauty hath no name, no&nbsp;holy&nbsp;bower,</p>
<p class="line">But is profan'd, if not lives in disgrace.</p>
<p class="line">Therefore my mistress' eyes are raven&nbsp;black,</p>
<p class="line">Her eyes so suited, and they mourners seem</p>
<p class="line">At such who, not born fair, no&nbsp;beauty&nbsp;lack,</p>
<p class="line">Sland'ring creation with a&nbsp;false&nbsp;esteem:</p>
<p class="line">&nbsp;&nbsp;&nbsp;&nbsp;Yet so they mourn becoming of their woe,</p>
<p class="line">&nbsp;&nbsp;&nbsp;&nbsp;That every tongue says beauty should look so.</p>

The four &nbsp;s indent the couplet. With multiple levels of indentation, it is advisable to step each level by at least four of these, to give sufficient difference for the reader.
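For poems with several levels of poet-indentation, counting &nbsp;s by hand gets error-prone. A small sketch, again assuming Python (indent_line is a made-up name), which steps each level by four:

```python
NBSP = "&nbsp;"


def indent_line(text, level, step=4):
    """Return a p.line whose text is pushed in by `level` steps of `step` &nbsp;s."""
    return f'<p class="line">{NBSP * (step * level)}{text}</p>'
```

Level 0 gives an ordinary line, level 1 gives the couplet indentation shown above, & level 2 gives eight &nbsp;s.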

The cover image

The cover image is a 600 x 800 jpg. You can include this in the html, but as long as it is referenced in the opf file (which we will come to below) this is unnecessary. For the sample, the cover image I put together uses a public domain image & Igino Marini’s Fell Type, which is perhaps one of my favourite fonts (in terms of its look).

 

Cover for The Dark Lady Sonnets

(A few minutes in GIMP.) Bear in mind, this will be in b&w on the Kindle.

The .ncx file

The ncx file is essentially a table of contents. It allows the Kindle to mark points in the progress bar, & to allow the reader to jump between sections. The basic structure is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ncx PUBLIC "-//NISO//DTD ncx 2005-1//EN"
	"http://www.daisy.org/z3986/2005/ncx-2005-1.dtd">
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1" xml:lang="en-US">
<head>
<meta name="dtb:uid" content="BookId"/>
<meta name="dtb:depth" content="1"/>
<meta name="dtb:totalPageCount" content="0"/>
<meta name="dtb:maxPageNumber" content="0"/>
</head>
<docTitle><text>The Dark Lady Sonnets</text></docTitle>
<docAuthor><text>William Shakespeare</text></docAuthor>
  <navMap>
  </navMap>
</ncx>

dtb:uid will be set in the opf file. Depth refers to the number of nested layers (e.g., sub-sections, within sections, within chapters, would be 3 layers). All navigation points are on the same level in this ebook, so this is 1. totalPageCount & maxPageNumber can be ignored. docTitle & docAuthor are self-explanatory.

The different marked sections (in this case, each sonnet) are ‘nav points’ which go within the navMap tags. A nav point looks like this:

<navPoint class="toc" id="toc" playOrder="1">
  <navLabel>
    <text>Table of Contents</text>
  </navLabel>
  <content src="book.html#TOC"/>
</navPoint>

For nested layers you put navPoints within navPoints, so the sub-sections within a section would be listed as complete navPoints before the section’s closing navPoint tag. Class & id don’t seem to have that much impact on the end result; the important things to look at here are the playOrder which marks when this point occurs in the sequence, the navLabel (which does what it says on the tin), & the src attribute of the content tag, which should show the correct point in the html file (or which html file, if you are using multiple). I’ll put the next few navPoints to give a better idea:

<navPoint class="section" id="sonnets" playOrder="2">
  <navLabel>
    <text>Sonnets</text>
  </navLabel>
  <content src="book.html#start"/>
</navPoint>
<navPoint class="section" id="s127" playOrder="3">
  <navLabel>
    <text>Sonnet 127</text>
  </navLabel>
  <content src="book.html#s127"/>
</navPoint>
<navPoint class="section" id="s128" playOrder="4">
  <navLabel>
    <text>Sonnet 128</text>
  </navLabel>
  <content src="book.html#s128"/>
</navPoint>

It is otherwise quite simple.
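Since the navPoints only differ in their id, label, target & playOrder, they are a good candidate for generating rather than typing. A hedged sketch in Python (not required by this guide; nav_point & nav_map are hypothetical names), assuming a single html file called book.html & the anchors used above:

```python
from xml.sax.saxutils import escape


def nav_point(nav_id, label, src, play_order, css_class="section"):
    """Build one navPoint in the shape shown above."""
    return (
        f'<navPoint class="{css_class}" id="{nav_id}" playOrder="{play_order}">\n'
        "  <navLabel>\n"
        f"    <text>{escape(label)}</text>\n"
        "  </navLabel>\n"
        f'  <content src="{src}"/>\n'
        "</navPoint>"
    )


def nav_map(first=127, last=152, html_file="book.html"):
    """Build the navMap: contents, start of text, then one point per sonnet."""
    points = [
        nav_point("toc", "Table of Contents", f"{html_file}#TOC", 1, css_class="toc"),
        nav_point("sonnets", "Sonnets", f"{html_file}#start", 2),
    ]
    # playOrder carries on from 3, in reading order.
    for order, n in enumerate(range(first, last + 1), start=3):
        points.append(nav_point(f"s{n}", f"Sonnet {n}", f"{html_file}#s{n}", order))
    return "<navMap>\n" + "\n".join(points) + "\n</navMap>"
```

The returned string replaces the empty navMap tags in the basic structure above.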

The .opf file

The opf file contains metadata for the ebook. The basic structure is as follows:

<?xml version="1.0" encoding="utf-8"?>
 
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="BookId">
 
</package>

Within this there are four sections: metadata, manifest, spine, & guide:

<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
	<dc:title>The Dark Lady Sonnets</dc:title>
	<dc:language>en-gb</dc:language>
	<meta name="cover" content="cover" />
	<dc:identifier id="BookId" opf:scheme="ISBN"></dc:identifier>
	<dc:creator>William Shakespeare</dc:creator>
	<dc:publisher>Charles Whalley</dc:publisher>
	<dc:subject>Poetry</dc:subject>
	<dc:date>2012-01-06</dc:date>
	<dc:description>Shakespeare's Sonnets 127-152</dc:description>
</metadata>
 
<manifest>
	<item id="book" media-type="application/xhtml+xml" href="book.html"/>
	<item id="toc" media-type="application/x-dtbncx+xml" href="toc.ncx"/>
	<item id="cover" media-type="image/jpeg" href="cover.jpg"/>
</manifest>
 
<spine toc="toc">
	<itemref idref="book"/>
</spine>
 
<guide>
	<reference type="toc" title="Contents" href="book.html#TOC"></reference>
	<reference type="text" title="Start" href="book.html#start"></reference>
</guide>

The Kindlegen sample has notes which are useful for working with the opf file. The metadata section does what you’d imagine from the name. This should be the same as the metadata you put with your file if/when you upload it to the Amazon store. I have ignored the identifier for this sample. The manifest lists all of the files used (although, since the stylesheet is referenced within the html file & since Kindlegen processes it all anyway, the stylesheet does not need to be listed). The spine puts all of the files that make up the ebook in order (which seems a bit redundant when we only have one). The guide marks our two important points in the book, the table of contents & the start of the actual text, in the format shown.
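The ids in the opf file have to agree with each other: the spine & the cover meta tag both point at ids in the manifest. Here is a sketch of a quick check you could run before converting, assuming Python & its standard xml.etree.ElementTree module; check_opf & its messages are my own, not anything Kindlegen provides.

```python
import xml.etree.ElementTree as ET

# Elements inside <package> inherit its default namespace.
OPF = "{http://www.idpf.org/2007/opf}"


def check_opf(opf_text):
    """Return a list of id mismatches between metadata, manifest & spine."""
    root = ET.fromstring(opf_text)
    ids = {item.get("id") for item in root.iter(f"{OPF}item")}
    problems = []

    spine = root.find(f"{OPF}spine")
    if spine is not None:
        if spine.get("toc") not in ids:
            problems.append(f"spine toc points at unknown id {spine.get('toc')!r}")
        for ref in spine.iter(f"{OPF}itemref"):
            if ref.get("idref") not in ids:
                problems.append(f"itemref points at unknown id {ref.get('idref')!r}")

    for meta in root.iter(f"{OPF}meta"):
        if meta.get("name") == "cover" and meta.get("content") not in ids:
            problems.append(f"cover meta points at unknown id {meta.get('content')!r}")
    return problems
```

Read the file as bytes (e.g. open("book.opf", "rb").read()) & pass the result in; an empty list means the references line up.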

Now we have a bare-bones oebps ebook, ready to be converted into a mobi file. It will help if you put all your files together in the same folder, along with the Kindlegen .exe.

Running it through Kindlegen

Run a command prompt in Windows & change the current directory to that which contains all of your files (with ‘cd’ & then the address of this directory; e.g., cd c:\ebook\). Then you need to run Kindlegen & point it at your .opf file (which packages everything together). This is done by typing the command ‘kindlegen filename.opf’. It will run (hopefully swimmingly) & a new mobi file, with the same filename as the opf, will appear in the same folder. You can then run Kindle Previewer & check your results.
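If you find yourself converting repeatedly while tweaking the css, the two steps above (change directory, run Kindlegen on the opf) can be wrapped in a script. A minimal sketch in Python, assuming the kindlegen executable can be found on your PATH; the function names are hypothetical, & the only command used is the ‘kindlegen filename.opf’ form given above.

```python
import subprocess
from pathlib import Path


def kindlegen_command(opf_path, exe="kindlegen"):
    """Return the command to run & the folder to run it in."""
    opf = Path(opf_path)
    return [exe, opf.name], opf.parent


def mobi_path(opf_path):
    """The mobi file appears beside the opf, with the same filename."""
    return Path(opf_path).with_suffix(".mobi")


def run_kindlegen(opf_path, exe="kindlegen"):
    """Run Kindlegen from the ebook's folder & return its exit code."""
    argv, folder = kindlegen_command(opf_path, exe)
    completed = subprocess.run(argv, cwd=folder)
    return completed.returncode
```

Calling run_kindlegen("c:/ebook/book.opf") is then the equivalent of the two commands typed by hand; check Kindlegen’s own output for warnings & errors.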

Here is the finished product:

Free Kindle ebook: The Dark Lady Sonnets by William Shakespeare
