Error when running scrapy crawl xici? #2

Open
netzeng opened this issue Jul 14, 2019 · 1 comment

netzeng commented Jul 14, 2019

I'm using Windows 7 with Python 3.7.
F:\aox_proxy_pool\proxy_pool>scrapy crawl xici
2019-07-14 08:17:05,617 - log.py[line:146] - INFO: Scrapy 1.6.0 started (bot: proxy_pool)
2019-07-14 08:17:05 [scrapy.utils.log] INFO: Scrapy 1.6.0 started (bot: proxy_pool)
2019-07-14 08:17:05,630 - log.py[line:149] - INFO: Versions: lxml 4.2.5.0, libxml2 2.9.5, cssselect 1.0.3, parsel 1.5.1, w3lib 1.20.0, Twisted 19.2.1, Python 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:06:47) [MSC v.1914 32 bit (Intel)], pyOpenSSL 18.0.0 (OpenSSL 1.1.0h 27 Mar 2018), cryptography 2.3, Platform Windows-7-6.1.7601-SP1
2019-07-14 08:17:05 [scrapy.utils.log] INFO: Versions: lxml 4.2.5.0, libxml2 2.9.5, cssselect 1.0.3, parsel 1.5.1, w3lib 1.20.0, Twisted 19.2.1, Python 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:06:47) [MSC v.1914 32 bit (Intel)], pyOpenSSL 18.0.0 (OpenSSL 1.1.0h 27 Mar 2018), cryptography 2.3, Platform Windows-7-6.1.7601-SP1
2019-07-14 08:17:05,651 - crawler.py[line:38] - INFO: Overridden settings: {'BOT_NAME': 'proxy_pool', 'LOG_LEVEL': 'INFO', 'NEWSPIDER_MODULE': 'proxy_pool.spiders', 'ROBOTSTXT_OBEY': True, 'SPIDER_MODULES': ['proxy_pool.spiders'], 'USER_AGENT': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.92 Safari/537.36'}
2019-07-14 08:17:05 [scrapy.crawler] INFO: Overridden settings: {'BOT_NAME': 'proxy_pool', 'LOG_LEVEL': 'INFO', 'NEWSPIDER_MODULE': 'proxy_pool.spiders', 'ROBOTSTXT_OBEY': True, 'SPIDER_MODULES': ['proxy_pool.spiders'], 'USER_AGENT': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.92 Safari/537.36'}
2019-07-14 08:17:05,754 - telnet.py[line:60] - INFO: Telnet Password: 1c62fc3f0d34ef85
2019-07-14 08:17:05 [scrapy.extensions.telnet] INFO: Telnet Password: 1c62fc3f0d34ef85
2019-07-14 08:17:05,873 - middleware.py[line:48] - INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.logstats.LogStats']
2019-07-14 08:17:05 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.corestats.CoreStats',
'scrapy.extensions.telnet.TelnetConsole',
'scrapy.extensions.logstats.LogStats']
2019-07-14 08:17:07,454 - __init__.py[line:58] - ERROR: Loading "scrapy.core.downloader.handlers.http.HTTPDownloadHandler" for scheme "http"
Traceback (most recent call last):
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\__init__.py", line 48, in load_handler
    dhcls = load_object(path)
  File "c:\python37-32\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "c:\python37-32\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http.py", line 3, in <module>
    from .http11 import HTTP11DownloadHandler as HTTPDownloadHandler
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 16, in <module>
    from twisted.web.client import Agent, ProxyAgent, ResponseDone, \
  File "c:\python37-32\lib\site-packages\twisted\web\client.py", line 42, in <module>
    from twisted.internet.endpoints import HostnameEndpoint, wrapClientTLS
  File "c:\python37-32\lib\site-packages\twisted\internet\endpoints.py", line 41, in <module>
    from twisted.internet.stdio import StandardIO, PipeAddress
  File "c:\python37-32\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
    from twisted.internet import _win32stdio
  File "c:\python37-32\lib\site-packages\twisted\internet\_win32stdio.py", line 9, in <module>
    import win32api
ModuleNotFoundError: No module named 'win32api'
2019-07-14 08:17:07 [scrapy.core.downloader.handlers] ERROR: Loading "scrapy.core.downloader.handlers.http.HTTPDownloadHandler" for scheme "http"
Traceback (most recent call last):
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\__init__.py", line 48, in load_handler
    dhcls = load_object(path)
  File "c:\python37-32\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "c:\python37-32\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http.py", line 3, in <module>
    from .http11 import HTTP11DownloadHandler as HTTPDownloadHandler
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 16, in <module>
    from twisted.web.client import Agent, ProxyAgent, ResponseDone, \
  File "c:\python37-32\lib\site-packages\twisted\web\client.py", line 42, in <module>
    from twisted.internet.endpoints import HostnameEndpoint, wrapClientTLS
  File "c:\python37-32\lib\site-packages\twisted\internet\endpoints.py", line 41, in <module>
    from twisted.internet.stdio import StandardIO, PipeAddress
  File "c:\python37-32\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
    from twisted.internet import _win32stdio
  File "c:\python37-32\lib\site-packages\twisted\internet\_win32stdio.py", line 9, in <module>
    import win32api
ModuleNotFoundError: No module named 'win32api'
2019-07-14 08:17:07,653 - __init__.py[line:58] - ERROR: Loading "scrapy.core.downloader.handlers.http.HTTPDownloadHandler" for scheme "https"
Traceback (most recent call last):
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\__init__.py", line 48, in load_handler
    dhcls = load_object(path)
  File "c:\python37-32\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "c:\python37-32\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http.py", line 3, in <module>
    from .http11 import HTTP11DownloadHandler as HTTPDownloadHandler
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 16, in <module>
    from twisted.web.client import Agent, ProxyAgent, ResponseDone, \
  File "c:\python37-32\lib\site-packages\twisted\web\client.py", line 42, in <module>
    from twisted.internet.endpoints import HostnameEndpoint, wrapClientTLS
  File "c:\python37-32\lib\site-packages\twisted\internet\endpoints.py", line 41, in <module>
    from twisted.internet.stdio import StandardIO, PipeAddress
  File "c:\python37-32\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
    from twisted.internet import _win32stdio
  File "c:\python37-32\lib\site-packages\twisted\internet\_win32stdio.py", line 9, in <module>
    import win32api
ModuleNotFoundError: No module named 'win32api'
2019-07-14 08:17:07 [scrapy.core.downloader.handlers] ERROR: Loading "scrapy.core.downloader.handlers.http.HTTPDownloadHandler" for scheme "https"
Traceback (most recent call last):
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\__init__.py", line 48, in load_handler
    dhcls = load_object(path)
  File "c:\python37-32\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "c:\python37-32\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http.py", line 3, in <module>
    from .http11 import HTTP11DownloadHandler as HTTPDownloadHandler
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 16, in <module>
    from twisted.web.client import Agent, ProxyAgent, ResponseDone, \
  File "c:\python37-32\lib\site-packages\twisted\web\client.py", line 42, in <module>
    from twisted.internet.endpoints import HostnameEndpoint, wrapClientTLS
  File "c:\python37-32\lib\site-packages\twisted\internet\endpoints.py", line 41, in <module>
    from twisted.internet.stdio import StandardIO, PipeAddress
  File "c:\python37-32\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
    from twisted.internet import _win32stdio
  File "c:\python37-32\lib\site-packages\twisted\internet\_win32stdio.py", line 9, in <module>
    import win32api
ModuleNotFoundError: No module named 'win32api'
2019-07-14 08:17:07,830 - __init__.py[line:58] - ERROR: Loading "scrapy.core.downloader.handlers.s3.S3DownloadHandler" for scheme "s3"
Traceback (most recent call last):
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\__init__.py", line 48, in load_handler
    dhcls = load_object(path)
  File "c:\python37-32\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "c:\python37-32\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\s3.py", line 6, in <module>
    from .http import HTTPDownloadHandler
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http.py", line 3, in <module>
    from .http11 import HTTP11DownloadHandler as HTTPDownloadHandler
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 16, in <module>
    from twisted.web.client import Agent, ProxyAgent, ResponseDone, \
  File "c:\python37-32\lib\site-packages\twisted\web\client.py", line 42, in <module>
    from twisted.internet.endpoints import HostnameEndpoint, wrapClientTLS
  File "c:\python37-32\lib\site-packages\twisted\internet\endpoints.py", line 41, in <module>
    from twisted.internet.stdio import StandardIO, PipeAddress
  File "c:\python37-32\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
    from twisted.internet import _win32stdio
  File "c:\python37-32\lib\site-packages\twisted\internet\_win32stdio.py", line 9, in <module>
    import win32api
ModuleNotFoundError: No module named 'win32api'
2019-07-14 08:17:07 [scrapy.core.downloader.handlers] ERROR: Loading "scrapy.core.downloader.handlers.s3.S3DownloadHandler" for scheme "s3"
Traceback (most recent call last):
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\__init__.py", line 48, in load_handler
    dhcls = load_object(path)
  File "c:\python37-32\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "c:\python37-32\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\s3.py", line 6, in <module>
    from .http import HTTPDownloadHandler
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http.py", line 3, in <module>
    from .http11 import HTTP11DownloadHandler as HTTPDownloadHandler
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 16, in <module>
    from twisted.web.client import Agent, ProxyAgent, ResponseDone, \
  File "c:\python37-32\lib\site-packages\twisted\web\client.py", line 42, in <module>
    from twisted.internet.endpoints import HostnameEndpoint, wrapClientTLS
  File "c:\python37-32\lib\site-packages\twisted\internet\endpoints.py", line 41, in <module>
    from twisted.internet.stdio import StandardIO, PipeAddress
  File "c:\python37-32\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
    from twisted.internet import _win32stdio
  File "c:\python37-32\lib\site-packages\twisted\internet\_win32stdio.py", line 9, in <module>
    import win32api
ModuleNotFoundError: No module named 'win32api'
Unhandled error in Deferred:
2019-07-14 08:17:08,096 - _legacy.py[line:154] - CRITICAL: Unhandled error in Deferred:
2019-07-14 08:17:08 [twisted] CRITICAL: Unhandled error in Deferred:

Traceback (most recent call last):
  File "c:\python37-32\lib\site-packages\scrapy\crawler.py", line 172, in crawl
    return self._crawl(crawler, *args, **kwargs)
  File "c:\python37-32\lib\site-packages\scrapy\crawler.py", line 176, in _crawl
    d = crawler.crawl(*args, **kwargs)
  File "c:\python37-32\lib\site-packages\twisted\internet\defer.py", line 1613, in unwindGenerator
    return _cancellableInlineCallbacks(gen)
  File "c:\python37-32\lib\site-packages\twisted\internet\defer.py", line 1529, in _cancellableInlineCallbacks
    _inlineCallbacks(None, g, status)
--- <exception caught here> ---
  File "c:\python37-32\lib\site-packages\twisted\internet\defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
  File "c:\python37-32\lib\site-packages\scrapy\crawler.py", line 80, in crawl
    self.engine = self._create_engine()
  File "c:\python37-32\lib\site-packages\scrapy\crawler.py", line 105, in _create_engine
    return ExecutionEngine(self, lambda _: self.stop())
  File "c:\python37-32\lib\site-packages\scrapy\core\engine.py", line 69, in __init__
    self.downloader = downloader_cls(crawler)
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\__init__.py", line 88, in __init__
    self.middleware = DownloaderMiddlewareManager.from_crawler(crawler)
  File "c:\python37-32\lib\site-packages\scrapy\middleware.py", line 53, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "c:\python37-32\lib\site-packages\scrapy\middleware.py", line 34, in from_settings
    mwcls = load_object(clspath)
  File "c:\python37-32\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "c:\python37-32\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "c:\python37-32\lib\site-packages\scrapy\downloadermiddlewares\retry.py", line 20, in <module>
    from twisted.web.client import ResponseFailed
  File "c:\python37-32\lib\site-packages\twisted\web\client.py", line 42, in <module>
    from twisted.internet.endpoints import HostnameEndpoint, wrapClientTLS
  File "c:\python37-32\lib\site-packages\twisted\internet\endpoints.py", line 41, in <module>
    from twisted.internet.stdio import StandardIO, PipeAddress
  File "c:\python37-32\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
    from twisted.internet import _win32stdio
  File "c:\python37-32\lib\site-packages\twisted\internet\_win32stdio.py", line 9, in <module>
    import win32api
builtins.ModuleNotFoundError: No module named 'win32api'

2019-07-14 08:17:08,236 - _legacy.py[line:154] - CRITICAL:
Traceback (most recent call last):
  File "c:\python37-32\lib\site-packages\twisted\internet\defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
  File "c:\python37-32\lib\site-packages\scrapy\crawler.py", line 80, in crawl
    self.engine = self._create_engine()
  File "c:\python37-32\lib\site-packages\scrapy\crawler.py", line 105, in _create_engine
    return ExecutionEngine(self, lambda _: self.stop())
  File "c:\python37-32\lib\site-packages\scrapy\core\engine.py", line 69, in __init__
    self.downloader = downloader_cls(crawler)
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\__init__.py", line 88, in __init__
    self.middleware = DownloaderMiddlewareManager.from_crawler(crawler)
  File "c:\python37-32\lib\site-packages\scrapy\middleware.py", line 53, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "c:\python37-32\lib\site-packages\scrapy\middleware.py", line 34, in from_settings
    mwcls = load_object(clspath)
  File "c:\python37-32\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "c:\python37-32\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "c:\python37-32\lib\site-packages\scrapy\downloadermiddlewares\retry.py", line 20, in <module>
    from twisted.web.client import ResponseFailed
  File "c:\python37-32\lib\site-packages\twisted\web\client.py", line 42, in <module>
    from twisted.internet.endpoints import HostnameEndpoint, wrapClientTLS
  File "c:\python37-32\lib\site-packages\twisted\internet\endpoints.py", line 41, in <module>
    from twisted.internet.stdio import StandardIO, PipeAddress
  File "c:\python37-32\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
    from twisted.internet import _win32stdio
  File "c:\python37-32\lib\site-packages\twisted\internet\_win32stdio.py", line 9, in <module>
    import win32api
ModuleNotFoundError: No module named 'win32api'
2019-07-14 08:17:08 [twisted] CRITICAL:
Traceback (most recent call last):
  File "c:\python37-32\lib\site-packages\twisted\internet\defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
  File "c:\python37-32\lib\site-packages\scrapy\crawler.py", line 80, in crawl
    self.engine = self._create_engine()
  File "c:\python37-32\lib\site-packages\scrapy\crawler.py", line 105, in _create_engine
    return ExecutionEngine(self, lambda _: self.stop())
  File "c:\python37-32\lib\site-packages\scrapy\core\engine.py", line 69, in __init__
    self.downloader = downloader_cls(crawler)
  File "c:\python37-32\lib\site-packages\scrapy\core\downloader\__init__.py", line 88, in __init__
    self.middleware = DownloaderMiddlewareManager.from_crawler(crawler)
  File "c:\python37-32\lib\site-packages\scrapy\middleware.py", line 53, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "c:\python37-32\lib\site-packages\scrapy\middleware.py", line 34, in from_settings
    mwcls = load_object(clspath)
  File "c:\python37-32\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "c:\python37-32\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "c:\python37-32\lib\site-packages\scrapy\downloadermiddlewares\retry.py", line 20, in <module>
    from twisted.web.client import ResponseFailed
  File "c:\python37-32\lib\site-packages\twisted\web\client.py", line 42, in <module>
    from twisted.internet.endpoints import HostnameEndpoint, wrapClientTLS
  File "c:\python37-32\lib\site-packages\twisted\internet\endpoints.py", line 41, in <module>
    from twisted.internet.stdio import StandardIO, PipeAddress
  File "c:\python37-32\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
    from twisted.internet import _win32stdio
  File "c:\python37-32\lib\site-packages\twisted\internet\_win32stdio.py", line 9, in <module>
    import win32api
ModuleNotFoundError: No module named 'win32api'


netzeng commented Jul 14, 2019

Run pip install pypiwin32, or pip3 install pypiwin32, or python -m pip install pypiwin32.

Solved!
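For anyone hitting the same traceback, here is a rough sketch of a pre-flight check that could be run before `scrapy crawl xici`. It only tests whether `win32api` can be found by the current interpreter and prints the matching install command. The helper names `missing_modules` and `install_hint` are made up for this example and are not part of Scrapy, Twisted, or this project. As far as I know `pypiwin32` is an older PyPI name that pulls in `pywin32`, so treat the package name in the hint as an assumption carried over from the fix above.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def install_hint(missing):
    """Return a pip command for the missing Windows module, or None if nothing is missing."""
    if not missing:
        return None
    # Using sys.executable ensures pip installs into the same interpreter
    # that Scrapy runs under (here that would be c:\python37-32).
    return f"{sys.executable} -m pip install pypiwin32"


if __name__ == "__main__":
    if sys.platform == "win32":
        # Twisted's _win32stdio imports win32api on Windows, which is what failed above.
        hint = install_hint(missing_modules(["win32api"]))
        print(hint or "win32api is importable; this ModuleNotFoundError should not occur.")
    else:
        print("Not on Windows; win32api is not needed here.")
```

Running the install through `python -m pip` rather than a bare `pip` is mostly a precaution for machines with several Python installations, where `pip` on PATH may belong to a different interpreter.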
