Python-Scrapy: set stats value in extension
I'm trying to write a simple Scrapy extension class that sends the crawler stats via email when the spider closes. This is what I have so far, and it works fine.
    from scrapy import signals


    class SpiderClosedCommit(object):

        def __init__(self, stats):
            self.stats = stats

        @classmethod
        def from_crawler(cls, crawler):
            ext = cls(crawler.stats)
            # Call spider_closed when the spider finishes.
            crawler.signals.connect(ext.spider_closed, signal=signals.spider_closed)
            return ext

        def spider_closed(self, spider):
            spider_stats = self.stats.get_stats(spider)
            # more code to send an email with the stats ...

But I'm trying to figure out how to add a list of the domains scraped to the stats. I looked through the docs but couldn't figure out what the code should look like and where to put it, in the extension or in the spider class. How can I access the scraped domains in the extension class, or how can I access the stats in the spider class?
Thanks in advance and best,
Jacques
Here's one way to do it:
- Make your extension hook the response_received signal and extract the domain from response.url
- Keep a set() in the extension of the domains seen
- When closing the spider, add the domains to spider_stats before sending the email
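
The steps above could be sketched roughly as follows, reusing the extension class from the question. This is only a sketch under a few assumptions: the `build_report` helper and the `scraped_domains` key are names made up for illustration, and the email code is still left out as in the question. The `scrapy` import sits inside `from_crawler` only so that the domain-tracking part can be tried without a running crawler.

```python
from urllib.parse import urlparse


class SpiderClosedCommit:

    def __init__(self, stats):
        self.stats = stats
        self.domains = set()  # every domain seen in a response so far

    @classmethod
    def from_crawler(cls, crawler):
        # Imported here so the rest of the class works without a crawler.
        from scrapy import signals

        ext = cls(crawler.stats)
        crawler.signals.connect(ext.response_received,
                                signal=signals.response_received)
        crawler.signals.connect(ext.spider_closed,
                                signal=signals.spider_closed)
        return ext

    def response_received(self, response, request, spider):
        # urlparse(...).hostname drops the scheme, port and path.
        domain = urlparse(response.url).hostname
        if domain:
            self.domains.add(domain)

    def build_report(self):
        # Hypothetical helper: copy the collected stats and add the domains
        # under an illustrative key name.
        spider_stats = dict(self.stats.get_stats())
        spider_stats["scraped_domains"] = sorted(self.domains)
        return spider_stats

    def spider_closed(self, spider):
        spider_stats = self.build_report()
        # more code to send an email with spider_stats ...
```

The extension still has to be enabled through the `EXTENSIONS` setting of the project, using whatever module path the class lives under in your project.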