Channel: MobileRead Forums - Calibre

HTML to ePub stripping out Content text

Here is a puzzler. I am running ebook-convert on an HTML TOC document with the following settings:

sudo ebook-convert tmp/temptoc.html "$mediatargetpath$sku.epub" --max-levels=1 --toc-threshold=100 --cover="$imagedir$sku$cover_image_extension" --book-producer="Nimble Combinatorial Publishing" --publisher="Nimble Combinatorial Publishing" --max-toc-links=100 --preserve-cover-aspect-ratio

The document 1.html referenced by tmp/temptoc.html

http://en.wikipedia.org/w/index.php?...&title=Magento

has a "Contents" section whose HTML source looks like this:

Quote:

<table id="toc" class="toc">
<tr>
<td>
<div id="toctitle">
<h2>Contents</h2>
</div>
<ul>
<li class="toclevel-1 tocsection-1"><a href="#History"><span class="tocnumber">1</span> <span class="toctext">History</span></a></li>
<li class="toclevel-1 tocsection-2"><a href="#See_also"><span class="tocnumber">2</span> <span class="toctext">See also</span></a></li>
<li class="toclevel-1 tocsection-3"><a href="#References"><span class="tocnumber">3</span> <span class="toctext">References</span></a></li>
<li class="toclevel-1 tocsection-4"><a href="#External_links"><span class="tocnumber">4</span> <span class="toctext">External links</span></a></li>
</ul>
</td>
</tr>
</table>
When Calibre processes this document, it removes the text from the bullets: the <span> elements inside each link are gone, so all that shows up is four empty bullets, which looks stupid. I used Sigil to inspect the HTML inside the ePub, and it looks as if Calibre is applying new styles to what it detects as TOC bullets.

Quote:

<body class="calibre">
<table class="toc" id="toc">
<tr class="calibre11">
<td class="calibre15">
<div class="calibre8" id="toctitle">
<h2 class="calibre16" id="calibre_pb_1">Contents</h2>
</div>

<ul class="calibre9">
<li class="toclevel"><a class="calibre5" href="../Text/1_split_000.html#History"></a></li>

<li class="toclevel"><a class="calibre5" href="../Text/1_split_000.html#See_also"></a></li>

<li class="toclevel"><a class="calibre5" href="../Text/1_split_000.html#References"></a></li>

<li class="toclevel"><a class="calibre5" href="../Text/1_split_000.html#External_links"></a></li>
</ul>
</td>
</tr>
</table>
</body>
</html>
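
To see how widespread the problem is across the ePub, a small script can list every link that ended up with no text. This is only a rough sketch using the Python standard library; the names find_empty_anchors and scan_epub are made up for illustration and are not part of Calibre, and it assumes the HTML files inside the ePub are UTF-8 encoded.

```python
import zipfile
from html.parser import HTMLParser


class EmptyAnchorFinder(HTMLParser):
    """Collect the href of every <a> element that contains no visible text."""

    def __init__(self):
        super().__init__()
        self.empty_hrefs = []    # hrefs of anchors that had no text
        self._in_anchor = False  # True while inside an open <a>
        self._href = None        # href of the anchor currently open
        self._text = []          # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_anchor = True
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._in_anchor:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_anchor:
            if not "".join(self._text).strip():
                self.empty_hrefs.append(self._href)
            self._in_anchor = False


def find_empty_anchors(html_text):
    """Return the hrefs of all text-less links in one HTML string."""
    finder = EmptyAnchorFinder()
    finder.feed(html_text)
    finder.close()
    return finder.empty_hrefs


def scan_epub(path):
    """Map each HTML file inside the ePub to its list of empty-link hrefs."""
    report = {}
    with zipfile.ZipFile(path) as epub:
        for name in epub.namelist():
            if name.lower().endswith((".html", ".xhtml", ".htm")):
                # Assumption: files are UTF-8; undecodable bytes are replaced.
                text = epub.read(name).decode("utf-8", errors="replace")
                empty = find_empty_anchors(text)
                if empty:
                    report[name] = empty
    return report
```

Running scan_epub on the converted file and find_empty_anchors on the original 1.html should show whether the link text is already missing in the source or is only lost during conversion.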
Apparently this has something to do with TOC detection, but I've been pulling my hair out and haven't gotten anywhere. Can some kind soul speed things along for me?

Attached Files
File Type: epub 614162738.epub (148.6 KB)
File Type: pdf wikisourcehtml.pdf (127.9 KB)
