This is probably a question for Kovid.
I'm getting a trap in lxml.etree._utf8 with the message "ValueError: All strings must be XML compatible: Unicode or ASCII, no NULL bytes or control characters"
With recursions=0 and simultaneous_downloads=1, this crashes ebook-convert with the first traceback below.
With recursions set to 1 and simultaneous_downloads left at the default, ebook-convert doesn't crash, but the second traceback below does appear, indicating that a subprocess of the main ebook-convert process crashed.
In that case, feed_1/article_4/index.html is sitting in the debug-pipeline directories looking happy as a clam, so I'm not sure what is going on here.
I've looked at the calibre source at http://bazaar.launchpad.net/~kovid/calibre/trunk/files, but the line numbers in the tracebacks don't seem to line up with it, so I'm at a loss here.
My question: what is causing this, and could calibre be made a little more bulletproof here?
Code:
Python function terminated unexpectedly
All strings must be XML compatible: Unicode or ASCII, no NULL bytes or control characters (Error Code: 1)
Traceback (most recent call last):
File "site.py", line 132, in main
File "site.py", line 109, in run_entry_point
File "site-packages\calibre\ebooks\conversion\cli.py", line 325, in main
File "site-packages\calibre\ebooks\conversion\plumber.py", line 979, in run
File "site-packages\calibre\customize\conversion.py", line 208, in __call__
File "site-packages\calibre\ebooks\conversion\plugins\recipe_input.py", line 105, in convert
File "site-packages\calibre\web\feeds\news.py", line 881, in download
File "site-packages\calibre\web\feeds\news.py", line 1130, in build_index
File "site-packages\calibre\web\feeds\news.py", line 974, in feed2index
File "site-packages\calibre\web\feeds\templates.py", line 43, in generate
File "site-packages\calibre\web\feeds\templates.py", line 177, in _generate
File "site-packages\lxml\builder.py", line 222, in __call__
File "site-packages\lxml\builder.py", line 185, in add_text
File "lxml.etree.pyx", line 916, in lxml.etree._Element.text.__set__ (src/lxml/lxml.etree.c:36134)
File "apihelpers.pxi", line 721, in lxml.etree._setNodeText (src/lxml/lxml.etree.c:17141)
File "apihelpers.pxi", line 1366, in lxml.etree._utf8 (src/lxml/lxml.etree.c:22211)
ValueError: All strings must be XML compatible: Unicode or ASCII, no NULL bytes or control characters
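The ValueError above is what lxml raises when it is handed text containing characters that XML does not allow. As a rough illustration of one way to guard against that, here is a minimal Python 3 sketch that drops such characters from a string before it goes anywhere near lxml. The helper name `strip_xml_incompatible` and the sample title are made up for this example, and the idea that the offending text comes from the feed data is only an assumption; none of this is taken from the calibre source.

```python
import re

# Characters outside the XML 1.0 "Char" production: C0 control characters
# other than tab, line feed, and carriage return, plus lone surrogates and
# the noncharacters U+FFFE and U+FFFF.
_XML_INCOMPATIBLE = re.compile(
    r'[\x00-\x08\x0B\x0C\x0E-\x1F\uD800-\uDFFF\uFFFE\uFFFF]'
)


def strip_xml_incompatible(text):
    """Return text with the characters that XML 1.0 forbids removed.

    Hypothetical helper for illustration only.
    """
    return _XML_INCOMPATIBLE.sub('', text)


# Made-up feed title containing a NULL and an escape character.
title = 'Breaking\x00 news\x1b today\tok'
print(repr(strip_xml_incompatible(title)))
```

Tab, line feed, and carriage return are left alone because XML permits them. Whether silently dropping the bad characters is better than replacing them with a placeholder is a design choice; dropping is simply the shortest thing to show.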
Code:
Parsing feed_1/article_4/index.html as HTML
HTML 5 parsing failed, falling back to older parsers
Traceback (most recent call last):
File "site-packages\calibre\ebooks\oeb\parse_utils.py", line 259, in parse_html
File "site-packages\calibre\ebooks\oeb\parse_utils.py", line 86, in html5_parse
File "site-packages\html5lib\html5parser.py", line 38, in parse
File "site-packages\html5lib\html5parser.py", line 211, in parse
File "site-packages\html5lib\html5parser.py", line 111, in _parse
File "site-packages\html5lib\html5parser.py", line 179, in mainLoop
File "site-packages\html5lib\html5parser.py", line 447, in processStartTag
File "site-packages\html5lib\html5parser.py", line 725, in startTagMeta
File "site-packages\html5lib\treebuilders\_base.py", line 259, in insertElementNormal
File "site-packages\html5lib\treebuilders\etree_lxml.py", line 219, in _setAttributes
File "site-packages\html5lib\treebuilders\etree_lxml.py", line 189, in __init__
File "lxml.etree.pyx", line 2145, in lxml.etree._Attrib.__setitem__ (src/lxml/lxml.etree.c:46818)
File "apihelpers.pxi", line 563, in lxml.etree._setAttributeValue (src/lxml/lxml.etree.c:15781)
File "apihelpers.pxi", line 1366, in lxml.etree._utf8 (src/lxml/lxml.etree.c:22211)
ValueError: All strings must be XML compatible: Unicode or ASCII, no NULL bytes or control characters
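Since feed_1/article_4/index.html looks fine in a viewer, one way to check whether it really contains a stray control character is to scan the raw bytes. The following is a minimal Python 3 sketch under the assumption that the file uses an ASCII-compatible encoding such as UTF-8; the function names `find_control_bytes` and `report` and the sample markup are hypothetical, and the scan will not catch a character that only appears after entity decoding.

```python
import re

# Raw control bytes that XML text may not contain.
# Tab (0x09), line feed (0x0A), and carriage return (0x0D) are allowed.
_CONTROL_BYTES = re.compile(rb'[\x00-\x08\x0B\x0C\x0E-\x1F]')


def find_control_bytes(data):
    """Return a list of (offset, byte value) for each control byte in data."""
    return [(m.start(), data[m.start()]) for m in _CONTROL_BYTES.finditer(data)]


def report(path):
    """Print the offset and value of each control byte found in a file."""
    with open(path, 'rb') as f:
        data = f.read()
    for offset, value in find_control_bytes(data):
        print('%s: offset %d: byte 0x%02x' % (path, offset, value))


# Made-up sample: a meta tag whose attribute value hides a vertical tab.
sample = b'<meta name="description" content="bad\x0bvalue">\n'
print(find_control_bytes(sample))
```

Calling `report('feed_1/article_4/index.html')` from the debug-pipeline directory would then list any suspicious bytes along with their offsets.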