Cannot get the simplest pipeline example to work in Scrapy
<p>This is my simple code, and I cannot get it to work.</p>
<p>I am subclassing <code>InitSpider</code>.</p>
<p>This is my code:</p>
<pre><code>class MytestSpider(InitSpider):
    name = 'mytest'
    allowed_domains = ['example.com']
    login_page = 'http://www.example.com'
    start_urls = ["http://www.example.com/ist.php"]

    def init_request(self):
        #"""This function is called before crawling starts."""
        return Request(url=self.login_page, callback=self.parse)

    def parse(self, response):
        item = MyItem()
        item['username'] = "mytest"
        return item
</code></pre>
<h1>Pipeline</h1>
<pre><code>class TestPipeline(object):
    def process_item(self, item, spider):
        print item['username']
</code></pre>
<p><strong>I get the same error if I try to print the item.</strong></p>
<p>The error I get is:</p>
<pre><code>  File "crawler/pipelines.py", line 35, in process_item
    myitem.username = item['username']
exceptions.TypeError: 'NoneType' object has no attribute '__getitem__'
</code></pre>
<p>I think the problem is with <code>InitSpider</code>. My pipelines are not getting item objects.</p>
<h1>items.py</h1>
<pre><code>class MyItem(Item):
    username = Field()
</code></pre>
<h1>settings.py</h1>
<pre><code>BOT_NAME = 'crawler'

SPIDER_MODULES = ['spiders']
NEWSPIDER_MODULE = 'spiders'

DOWNLOADER_MIDDLEWARES = {
    'scrapy.contrib.downloadermiddleware.cookies.CookiesMiddleware': 700 # <-
}

COOKIES_ENABLED = True
COOKIES_DEBUG = True

ITEM_PIPELINES = [
    'pipelines.TestPipeline',
]

IMAGES_STORE = '/var/www/htmlimages'
</code></pre>
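One possible reading of the traceback, offered as a guess rather than a confirmed diagnosis: Scrapy hands the return value of each pipeline's `process_item` to the next pipeline, so a pipeline that only prints and returns nothing passes `None` along. The sketch below does not use Scrapy at all. `run_pipelines`, `FixedPrintPipeline`, and `StorePipeline` are hypothetical stand-ins written in Python 3 to imitate that chaining, and the second pipeline is an assumption based on line 35 of the traceback, which is not in the code shown.

```python
# Stand-in simulation of pipeline chaining; this is NOT Scrapy's real engine.
# Assumption: each pipeline's return value becomes the next pipeline's input.


class PrintPipeline:
    """Mirrors the question's TestPipeline: prints, but returns nothing."""

    def process_item(self, item, spider):
        print(item['username'])
        # No return statement, so this method implicitly returns None.


class FixedPrintPipeline:
    """Same as above, but hands the item on to the next pipeline."""

    def process_item(self, item, spider):
        print(item['username'])
        return item


class StorePipeline:
    """Hypothetical later pipeline that reads a field from the item."""

    def process_item(self, item, spider):
        self.username = item['username']  # raises TypeError if item is None
        return item


def run_pipelines(item, pipelines, spider=None):
    """Feed the item through each pipeline in order (hypothetical helper)."""
    for pipeline in pipelines:
        item = pipeline.process_item(item, spider)
    return item
```

With `[PrintPipeline(), StorePipeline()]` the second pipeline receives `None` and fails with a `TypeError`, which resembles the error in the question. With `[FixedPrintPipeline(), StorePipeline()]` the item arrives intact. If this is the cause, adding `return item` at the end of `TestPipeline.process_item` would be the matching change.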
