2008-11-22 Christopher Blizzard <blizzard@0xdeadbeef.com>
* whoisi/test_controller.py (TestController.modified): Now returns
a valid RSS feed + data type for tests.
* whoisi/model.py (Site): Add etag, last_modified and entity_url
entries to the site table.
* tests/twisted/network/test_download.py: Update the tests to
support the new download command format (using a hash instead of
a direct call).
* tests/twisted/local/test_feedparse_perf.py: Same.
* tests/twisted/local/test_feedparse.py: Same.
* tests/twisted/network/test_feedrefresh.py
(TestFeedRefresh.test_RefreshSiteManagerEntityProperties): This
test makes sure that we set entity properties in the site table
after we hit a site that includes them.
(TestFeedRefresh.test_RefreshSiteManagerEntityHit): This test
makes sure that we return early and don't parse when we send a
matching etag or last-modified along with a request.
* services/command/siterefresh.py (RefreshSiteDone.srDone): Save
etag, last-modified and entity_url info in the site if we have it.
(RefreshSiteDone.done): When returning the data to the master
process, add an http_entity_hit=0 to the dict so we know we did a
download. (For future use.)
(RefreshSiteError.handleError): Handle the DownloadCommand
throwing a NotModifiedError, which means that we don't have to do
any parsing or updating of information. Shortcut to exit. The
return value will include an http_entity_hit=1 for future use. We
also set the error field to http_not_modified when we hit this
condition. Also update the error field in the SiteRefresh table
when there's a real error.
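
A minimal sketch of the not-modified shortcut described above, not
the actual whoisi code. The NotModifiedError class here is a local
stand-in, handle_refresh_error is a hypothetical helper name, and
the shape of the returned dict for a real error is an assumption;
only the http_entity_hit=1 and http_not_modified values come from
this entry.

```python
class NotModifiedError(Exception):
    """Stand-in for the error raised when the server answers 304."""


def handle_refresh_error(exc):
    """Map a refresh failure to the dict handed back to the master process.

    Hypothetical helper: the real handler also updates the SiteRefresh
    table, which is left out of this sketch.
    """
    if isinstance(exc, NotModifiedError):
        # Nothing changed upstream, so skip parsing and exit early.
        return {"http_entity_hit": 1, "error": "http_not_modified"}
    # Assumption: a real error only records its message.
    return {"error": str(exc)}
```

The early return keeps the not-modified case off the parse path,
while real errors still carry a message that can be written to the
error field.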
* services/command/controller.py (RefreshManager.__init__): Use
new DownloadResourceSaveState after a download as part of a
refresh.
* services/command/newsite.py (NewSiteTryURL.doCommand): When
calling the download command pass in the url as part of a
dictionary.
(NewSiteTryURL.downloadDone): More args["filename"] changes.
(NewSiteTryURL.startSecondDownload): Same.
(NewSiteTryURL.secondDownloadDone): Same.
(NewSiteTryURL.tryFeed): Same.
* services/command/download.py (DownloadResourceSaveState): Shim
command that takes the download data and saves it into the state
for later commands.
(DownloadCommand.doCommand): New code to handle etag,
last_modified and entity_url info as arguments to this command.
(DownloadCommand.downloadDone): Data is now returned as a hash
that includes filename, etag, last_modified and the url stack of
downloads.
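
As an illustration of the argument and return formats described
above, here is a rough sketch under stated assumptions. The helper
names and the "url_stack" key are hypothetical; only the filename,
etag and last_modified keys are named in this entry. The header
names are the standard HTTP conditional request headers.

```python
def build_conditional_headers(args):
    """Build conditional GET headers from the download command args.

    Hypothetical helper: sends a validator only when one was saved
    from an earlier download.
    """
    headers = {}
    if args.get("etag"):
        headers["If-None-Match"] = args["etag"]
    if args.get("last_modified"):
        headers["If-Modified-Since"] = args["last_modified"]
    return headers


def build_download_result(filename, etag, last_modified, url_stack):
    """Bundle the download results into a hash for the next command.

    The "url_stack" key name is an assumption made for this sketch.
    """
    return {
        "filename": filename,
        "etag": etag,
        "last_modified": last_modified,
        "url_stack": list(url_stack),
    }
```

Later commands then read args["filename"] from the hash instead of
receiving a bare filename.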
* services/command/feedparse.py (FeedRefreshSetup.gotNewSite):
Gets the etag, last_modified and entity_url out of the database
when setting up for a feed refresh.
(FeedRefreshSetup.gotFeed): When returning from a refresh setup,
the next command is the download, so set up everything the download
needs to send an etag + last-modified header if we can.
(FeedParseCommand.doCommand): Convert to use args["filename"]
instead of just filename, since the DownloadCommand now returns
more than just the filename.
* services/command/linkedin.py (LinkedInScrapeCommand.doCommand):
Convert linkedin code to use a hash["filename"] instead of just
the filename.
git-svn-id: svn://trac.whoisi.com/whoisi/trunk@13 ae879524-a8bd-4c4c-a5ea-74d2e5fc5a2c