Channel: MobileRead Forums - Calibre

Discard non-existent redirected article?

Hi folks,

With my recipe here, I occasionally hit a case where the RSS feed points to an invalid article (I guess it's some sort of race condition). When this happens, the request for the article redirects to an index page. This wouldn't be a problem, except that the index page has a heap of content and my Sony PRS-T1 spends a minute or two trying to render it.

Ideally, I'd like to discard the page if I can detect a redirect to an index URL. Here's part of a debug log with the cookie hidden:


Fetching http://www.autosport.com/news/report.php/id/101289
Downloaded article: Kovalainen upbeat after aero test from http://www.autosport.com/news/report.php/id/101288
17% Article downloaded: Kovalainen upbeat after aero test
send: 'GET /news/report.php/id/101289 HTTP/1.1\r\nAccept-Encoding: identity\r\nHost: www.autosport.com\r\nCookie: xxx\r\nConnection: close\r\nAccept: */*\r\nUser-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.13) Gecko/20101210 Gentoo Firefox/3.6.13\r\n\r\n'
reply: 'HTTP/1.1 302 Found\r\n'
header: Date: Sat, 21 Jul 2012 13:55:09 GMT
header: Server: Apache
header: Expires: Thu, 19 Nov 1981 08:52:00 GMT
header: Last-Modified: Sat, 21 Jul 2012 13:55:09GMT
header: Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
header: Pragma: no-cache
header: Location: http://www.autosport.com/news/
header: Vary: Accept-Encoding,User-Agent
header: Content-Length: 0
header: Connection: close
header: Content-Type: text/html


I'd like to detect the redirect in the log above (the 302 Found reply with Location: http://www.autosport.com/news/). I've spent a bit of time digging around, and the closest I can find is feed.articles.remove() when called from parse_feeds(), but that seems to run before the articles are downloaded, so it's too early to see the redirect.
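
To make the check itself concrete, here is a rough sketch of what I have in mind. It is plain Python and not tied to any calibre hook; the names INDEX_URL, _key and redirected_to_index are made up for illustration, and the index URL is just the Location value from the log above.

```python
from urllib.parse import urlsplit

# Assumption: the index page is the Location value seen in the debug log.
INDEX_URL = "http://www.autosport.com/news/"


def _key(url):
    """Reduce a URL to (host, path), ignoring host case and a trailing slash."""
    parts = urlsplit(url)
    return parts.netloc.lower(), parts.path.rstrip("/")


def redirected_to_index(requested_url, final_url, index_url=INDEX_URL):
    """Return True if an article request ended up on the index page."""
    if _key(requested_url) == _key(index_url):
        # The index itself was requested, so landing there is not a redirect.
        return False
    return _key(final_url) == _key(index_url)
```

For the request in the log, redirected_to_index("http://www.autosport.com/news/report.php/id/101289", "http://www.autosport.com/news/") would return True. The part I haven't worked out is where in the recipe I can get at the final URL after the redirect.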

Is what I want to do possible?

Cheers,
Simon.
