Python Scrapy: set a stats value in an extension


I'm trying to write a simple Scrapy extension class that sends the crawler stats via email when the spider closes. This is what I have so far, and it works fine.

    from scrapy import signals


    class SpiderClosedCommit(object):

        def __init__(self, stats):
            self.stats = stats

        @classmethod
        def from_crawler(cls, crawler):
            # Build the extension and subscribe it to the spider_closed signal
            ext = cls(crawler.stats)
            crawler.signals.connect(ext.spider_closed, signal=signals.spider_closed)
            return ext

        def spider_closed(self, spider):
            spider_stats = self.stats.get_stats(spider)
            # more code to send the email with the stats ...

But now I'm trying to figure out how to add a list of the scraped domains to the stats. I looked through the docs but couldn't figure out what the code should look like or where to put it: in the extension or in the spider class. How can I access the scraped domains in the extension class, or how can I access the stats in the spider class?

Thanks in advance, and best regards,

Jacques

Here's one way to do it:

  1. Make the extension hook into the response_received signal and extract the domain from response.url.
  2. Keep a set() in the extension with the domains seen so far.
  3. When closing the spider, add the domains to spider_stats before sending the email.
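
The three steps above could look roughly like the sketch below. It is only an illustration, not a tested drop-in: the class name DomainStatsExtension and the stats key "domains_scraped" are made up for this example, and the email part is left out, as in the question. The scrapy import sits inside from_crawler only so that the domain-tracking logic can be tried without Scrapy installed; a module-level import works just as well.

```python
from urllib.parse import urlparse


class DomainStatsExtension:
    """Hypothetical extension that records the domain of every response."""

    def __init__(self, stats):
        self.stats = stats
        self.domains = set()  # step 2: domains seen so far

    @classmethod
    def from_crawler(cls, crawler):
        # Imported here (an assumption of this sketch) so the rest of the
        # class can be exercised without Scrapy being installed.
        from scrapy import signals

        ext = cls(crawler.stats)
        crawler.signals.connect(ext.response_received, signal=signals.response_received)
        crawler.signals.connect(ext.spider_closed, signal=signals.spider_closed)
        return ext

    def response_received(self, response, request, spider):
        # step 1: extract the domain (hostname without port) from response.url
        domain = urlparse(response.url).hostname
        if domain:
            self.domains.add(domain)

    def spider_closed(self, spider):
        # step 3: store the domains in the stats before they are read
        # ("domains_scraped" is a made-up key name)
        self.stats.set_value("domains_scraped", sorted(self.domains))
        spider_stats = self.stats.get_stats()
        # ... send the email with spider_stats here ...
```

Like the original extension, this one has to be enabled through the EXTENSIONS setting of the project. Storing a sorted list rather than the set itself keeps the stats value easy to print or serialize for the email.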
