Channel: MobileRead Forums - Calibre

[GUI Plugin] eBookCleaner

Ebooks, even retail ones, are often badly formatted, to the point that it detracts from the reading experience. And while editing programs exist, they fall short of the task. Do you have the time or patience to trawl through thousands of paragraphs of mangled HTML and CSS (div-div-div-p-span-span-br-span...)? A close friend of mine, 'burbleburbleburble', wrote a prototype of this program a while back. While he has more or less retired from it, I have taken up the project, and it has finally reached an acceptable level of sophistication.

eBookCleaner is a fully functional program, relatively intuitive, tested and workable! It finds patterns, allowing for batch editing; it finds common punctuation errors; it cleans up the quotes; it has a host of editing tools; it creates TOCs and title pages; it has a clean interface (sparse, one HTML, etc.); and you can drag and drop covers, edit basic metadata, search and replace... There is one caveat, however: I wrote it in Python 3.2, as a standalone program (originally on www.ebookcleaner.com, but the version posted there is long outdated...).
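
To give a rough idea of what "cleaning up the quotes" can mean, here is a minimal sketch. It is not eBookCleaner's actual code, just an illustration in Python 3 using only the standard library, and the function name curl_quotes is made up for this example. It assumes a quote mark is an opening one when it sits at the start of the text or right after whitespace or an opening bracket, and a closing one (or an apostrophe) everywhere else.

```python
import re


def curl_quotes(text):
    """Turn straight quotes into curly ones with a simple positional heuristic."""
    # A double quote at the start of the text, or after whitespace or an
    # opening bracket, is treated as an opening quote.
    text = re.sub(r'(^|[\s(\[{])"', '\\1\u201c', text)
    # Every remaining straight double quote is treated as a closing quote.
    text = text.replace('"', '\u201d')
    # Same rule for single quotes; an opening double quote may also precede one.
    text = re.sub(r"(^|[\s(\[{\u201c])'", '\\1\u2018', text)
    # What is left is a closing single quote or an apostrophe (same glyph).
    text = text.replace("'", '\u2019')
    return text


if __name__ == "__main__":
    print(curl_quotes('"Don\'t go," she said.'))
```

A heuristic this simple will get some cases wrong (leading apostrophes as in 'tis, for instance), which is why a tool that lists the matches for review before batch editing is useful.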

I realize that it doesn't really make sense for most people to run it separately from calibre, as a 40 MB application (god, PyQt is unwieldy), and to have to first convert the ebook to HTMLZ... So I have ported most of the code to Python 2.7 (excruciating, and still buggy...), but I have not written the calibre plugin piece.

As the situation currently stands, eBookCleaner works for me and is highly customized for my needs. Still, I am interested in and willing to work with the community to improve it as per everyone's needs. But I am looking for this to be a collaborative effort: I personally don't have the time to write and brainstorm every step of the way, especially when what I already have works for me.

So: below is the code (it should be launchable from gui.py in a calibre development environment: py2.7, PyQt4, lxml), but I would greatly appreciate it if someone could help write some skeleton plugin code for the following (otherwise it may be months before I get around to it):
  • The plugin.py interface to calibre.
  • A function to retrieve the selected book in XHTML or a similar format (I am still confused by calibre's internal API).
  • I am horrible at writing directions (I tend to run on and not be all that clear). Is anyone willing to work with me on this?

Personally, I am almost done creating code that will flatten the HTML (divs, spans, brs, ad nauseam), merge and simplify the CSS, and provide the user with an ebook that is virtually the same but a million times easier to edit, especially with eBookCleaner's ability to find patterns and batch edit.
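
To show what flattening means in the simplest case, here is a minimal sketch. Again, this is not the actual eBookCleaner code (which depends on lxml); it uses the standard library's html.parser under Python 3, and the names SpanFlattener and flatten_spans are hypothetical. It only unwraps spans that carry no attributes and leaves everything else as it was; divs, brs and CSS merging are out of scope here.

```python
from html.parser import HTMLParser


class SpanFlattener(HTMLParser):
    """Re-emit HTML unchanged, except that attribute-less <span> wrappers are dropped."""

    def __init__(self):
        # Keep entities as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.span_kept = []  # one flag per open <span>: True if it was emitted

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            keep = bool(attrs)  # a span with class/style/etc. carries meaning
            self.span_kept.append(keep)
            if not keep:
                return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are passed through as written.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        # Drop the closing tag only if its matching opening span was dropped.
        if tag == "span" and self.span_kept and not self.span_kept.pop():
            return
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def flatten_spans(markup):
    parser = SpanFlattener()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.out)


if __name__ == "__main__":
    print(flatten_spans('<p><span><span class="i">Hi</span> there</span><br/></p>'))
```

Spans that do carry a class or style are kept on purpose, since removing them would change how the book looks; deciding which of those are redundant is where merging and simplifying the CSS comes in.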

Also, I often don't make it onto the internet more than a few times a week... please be patient when waiting for my response.

Attached Thumbnails
image1.PNG (152.1 KB, ID 82512)
image2.PNG (307.5 KB, ID 82513)
image3.PNG (105.7 KB, ID 82514)

Attached Files
File Type: zip eBookCleaner Source py2.7 v1.9.0.zip (601.8 KB)
