Installing the Scrapy crawler framework on CentOS 6.8


Background

I recently started a personal project that needs a web crawler, which is how I came across Scrapy.

Environment

  • OS
    CentOS 6.8 x86_64
  • Default system Python
    Python 2.6

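Installation problem

Under the default Python (2.6) I installed pip and then ran pip install scrapy, but importing the package produced the error below.

The message says that Scrapy 1.4.0 requires Python 2.7: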

[root@iZwz9e75q2nzsxqdr0ll5yZ ~]# python
Python 2.6.6 (r266:84292, Aug 18 2016, 15:13:37)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-17)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import scrapy
Scrapy 1.4.0 requires Python 2.7
[root@iZwz9e75q2nzsxqdr0ll5yZ ~]#

Simple enough: a look at the official site confirms it.

The latest Scrapy release (1.4.0) requires Python 2.7, or Python 3.3 and later.
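The requirement can be checked against any interpreter before running pip. The following is a minimal sketch: the function name supports_scrapy_1_4 is made up for illustration, and the rule it encodes is only the one quoted above (Python 2.7, or Python 3.3 and later).

```python
import sys


def supports_scrapy_1_4(version_info):
    """Return True if the version satisfies the requirement quoted above:
    Python 2.7, or Python 3.3 and later. (Hypothetical helper.)"""
    major, minor = version_info[0], version_info[1]
    if major == 2:
        return minor >= 7
    if major == 3:
        return minor >= 3
    return major > 3


# Report on the interpreter that is running this script.
print("Python %d.%d meets the Scrapy 1.4.0 requirement: %s" % (
    sys.version_info[0],
    sys.version_info[1],
    supports_scrapy_1_4(sys.version_info),
))
```

Run with the system interpreter on this machine (2.6.6), the check reports False, which matches the error shown in the transcript.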


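Solutions

There are two ways around this.

Method 1: Pin an older version at install time

If the latest release will not run, we can simply install an older one.
I checked, and every release from Scrapy 1.0.0 onward requires Python 2.7, so the install below uses 0.9.0.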

[root@iZwz9e75q2nzsxqdr0ll5yZ distribute-0.6.10]# pip install scrapy==0.9.0
You are using pip version 7.1.0, however version 9.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Collecting scrapy==0.9.0
  Downloading http://mirrors.aliyun.com/pypi/packages/1a/70/fc8948afda8349cfd6dbd099da59248fdfde1ee658af63fc04643a928db3/Scrapy-0.9.tar.gz (771kB)
    100% |████████████████████████████████| 774kB 2.2MB/s
Requirement already satisfied (use --upgrade to upgrade): Twisted>=2.5 in
...(omitted)...
/usr/lib/python2.6/site-packages (from Automat>=0.3.0->Twisted>=2.5->scrapy==0.9.0)
Installing collected packages: scrapy
  Running setup.py install for scrapy
Successfully installed scrapy-0.9

The install succeeded, but it is still worth confirming with an import.

[root@iZwz9e75q2nzsxqdr0ll5yZ distribute-0.6.10]# python
Python 2.6.6 (r266:84292, Aug 18 2016, 15:13:37)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-17)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import scrapy
/usr/lib64/python2.6/site-packages/cryptography/__init__.py:26: DeprecationWarning: Python 2.6 is no longer supported by the Python core team, please upgrade your Python. The next version of cryptography will drop support for Python 2.6
  DeprecationWarning
>>>

OK, no problem.

There is only a warning that Python 2.6 is on its way out and that we should upgrade soon:

/usr/lib64/python2.6/site-packages/cryptography/__init__.py:26: DeprecationWarning: Python 2.6 is no longer supported by the Python core team, please upgrade your Python. The next version of cryptography will drop support for Python 2.6
  DeprecationWarning

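Method 2: Upgrade Python to 2.7

I generally pick the latest release of any package I use. Unless the maintainers say otherwise, the newest version is usually the best choice, or at least a good one.

So let's upgrade Python to 2.7 and then install Scrapy 1.4.0 (the latest release).

  • Install Python 2.7

2.7.14 is the latest Python 2 release.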

[root@iZwz9e75q2nzsxqdr0ll5yZ ~]# wget https://www.python.org/ftp/python/2.7.14/Python-2.7.14.tgz
[root@iZwz9e75q2nzsxqdr0ll5yZ ~]# tar -zxvf Python-2.7.14.tgz && cd Python-2.7.14 && ./configure && make all && make install
...(omitted)...
[root@iZwz9e75q2nzsxqdr0ll5yZ ~]# cd /usr/bin/; rm -f python && ln -s /usr/local/bin/python2.7 python  
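(Without the symlink, Python 2.7 has to be invoked as python2.7, and scripts need the shebang #!/usr/local/bin/python2.7, since that is where make install placed the binary.)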
[root@iZwz9e75q2nzsxqdr0ll5yZ bin]# ll python
lrwxrwxrwx 1 root root 24 11月 23 14:42 python -> /usr/local/bin/python2.7

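Note: change the first line of /usr/bin/yum from #!/usr/bin/python to #!/usr/bin/python2.6. yum on CentOS 6 is written for the system Python 2.6 and stops working once /usr/bin/python points at 2.7.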
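The shebang change for yum can also be scripted rather than made in an editor. The sketch below is only an illustration: pin_shebang is a hypothetical helper, it works on the script text rather than the file itself, and the commented usage assumes you run it as root after backing up /usr/bin/yum.

```python
def pin_shebang(script_text,
                old="#!/usr/bin/python",
                new="#!/usr/bin/python2.6"):
    """Return script_text with its first line replaced by `new` when that
    line is exactly `old`; otherwise return the text unchanged."""
    first, sep, rest = script_text.partition("\n")
    if first.rstrip() == old:
        return new + sep + rest
    return script_text


# Hypothetical usage (as root, after making a backup copy of the file):
#   with open("/usr/bin/yum") as f:
#       text = f.read()
#   with open("/usr/bin/yum", "w") as f:
#       f.write(pin_shebang(text))

print(pin_shebang("#!/usr/bin/python\nimport sys\n").splitlines()[0])
```

Because the helper only rewrites an exact match, running it a second time leaves an already pinned script untouched.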

  • Install the latest setuptools
[root@iZwz9e75q2nzsxqdr0ll5yZ ~]# wget -O setuptools-37.0.0.zip  https://pypi.python.org/packages/7c/cb/bdfbb0b6a56459d5461768de824d4f40ec4c4c778f3a8fb0b84c25f03b68/setuptools-37.0.0.zip#md5=f905ca70d2db37b7284c0f6314ab6814
[root@iZwz9e75q2nzsxqdr0ll5yZ ~]# unzip setuptools-37.0.0.zip && cd setuptools-37.0.0
[root@iZwz9e75q2nzsxqdr0ll5yZ setuptools-37.0.0]# python setup.py install 
(or python2.7 setup.py install)

Note: after a successful install, the following entries appear:

[root@iZwz9e75q2nzsxqdr0ll5yZ ~]# ls -d /usr/local/lib/python2.7/site-packages/setuptools*
/usr/local/lib/python2.7/site-packages/setuptools-37.0.0-py2.7.egg /usr/local/lib/python2.7/site-packages/setuptools.pth
[root@iZwz9e75q2nzsxqdr0ll5yZ ~]#
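The download URLs used here carry an #md5=... fragment, which wget ignores. If you want to confirm that an archive arrived intact, the checksum can be compared by hand. This is a rough sketch: md5_matches is a name invented for the example, and the temporary file at the end exists only so the snippet runs on its own.

```python
import hashlib
import os
import tempfile


def md5_matches(path, expected_hex, chunk_size=65536):
    """Return True if the MD5 of the file at `path` equals expected_hex."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so large archives do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()


# For the archive above, the call would look like this (value taken from
# the #md5= fragment of the download URL):
#   md5_matches("setuptools-37.0.0.zip", "f905ca70d2db37b7284c0f6314ab6814")

# Self-contained demonstration on a throwaway file.
fd, demo_path = tempfile.mkstemp()
os.write(fd, b"hello")
os.close(fd)
demo_ok = md5_matches(demo_path, hashlib.md5(b"hello").hexdigest())
os.remove(demo_path)
print(demo_ok)
```

The same check applies to the pip and Twisted archives, using the md5 values from their own URLs.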

  • Install the latest pip
[root@iZwz9e75q2nzsxqdr0ll5yZ ~]# wget -O pip-9.0.1.tar.gz https://pypi.python.org/packages/11/b6/abcb525026a4be042b486df43905d6893fb04f05aac21c32c638e939e447/pip-9.0.1.tar.gz#md5=35f01da33009719497f01a4ba69d63c9
[root@iZwz9e75q2nzsxqdr0ll5yZ ~]# tar -zxvf pip-9.0.1.tar.gz && cd pip-9.0.1
[root@iZwz9e75q2nzsxqdr0ll5yZ pip-9.0.1]# python setup.py install
(or python2.7 setup.py install)

Note: after a successful install, the following entries appear:

[root@iZwz9e75q2nzsxqdr0ll5yZ ~]# ls -d /usr/local/lib/python2.7/site-packages/pip*
/usr/local/lib/python2.7/site-packages/pip-9.0.1-py2.7.egg
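The two ls -d checks above can be folded into one small script if you prefer to verify both packages at once. This is a sketch under the assumption that Python 2.7 was installed with the default prefix, so its site-packages directory is the one shown in the transcripts. installed_entries is a hypothetical helper name.

```python
import glob
import os


def installed_entries(site_packages, name):
    """Return the sorted entries in site_packages whose names start with
    `name`, the same match that `ls -d <dir>/<name>*` performs."""
    return sorted(glob.glob(os.path.join(site_packages, name + "*")))


# Path taken from the transcripts above; adjust it if your prefix differs.
SITE_PACKAGES = "/usr/local/lib/python2.7/site-packages"

for package in ("setuptools", "pip"):
    entries = installed_entries(SITE_PACKAGES, package)
    print("%s: %s" % (package, entries if entries else "not found"))
```

An empty result for either package means the corresponding setup.py install ran under a different interpreter than the new Python 2.7.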

  • Install Scrapy
[root@iZwz9e75q2nzsxqdr0ll5yZ ~]# pip install scrapy

Collecting Twisted>=13.1.0 (from scrapy)
  Could not find a version that satisfies the requirement Twisted>=13.1.0 (from scrapy) (from versions: )
No matching distribution found for Twisted>=13.1.0 (from scrapy)

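The error above means that Scrapy depends on Twisted 13.1.0 or later, but pip could not find a matching version (Twisted is an event-driven networking engine written in Python). The only option is to download that package and install it by hand. The tarball is available from PyPI, as used below, or from the Twisted site:

wget https://twistedmatrix.com/Releases/Twisted/17.9/Twisted-17.9.0.tar.bz2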

[root@iZwz9e75q2nzsxqdr0ll5yZ ~]# wget -O Twisted-17.9.0.tar.bz2  https://pypi.python.org/packages/a2/37/298f9547606c45d75aa9792369302cc63aa4bbcf7b5f607560180dd099d2/Twisted-17.9.0.tar.bz2#md5=6dbedb918f0c7288a4c670f59393ecf8
[root@iZwz9e75q2nzsxqdr0ll5yZ ~]# tar -jxvf Twisted-17.9.0.tar.bz2 && cd Twisted-17.9.0
[root@iZwz9e75q2nzsxqdr0ll5yZ Twisted-17.9.0]# python setup.py install
(or python2.7 setup.py install)
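To see why 17.9.0 satisfies the Twisted>=13.1.0 requirement from the pip error, the versions have to be compared number by number rather than as strings. The sketch below shows the idea. It is not how pip itself evaluates requirements, both function names are invented for the example, and it only handles plain dotted numeric versions (no suffixes such as rc1).

```python
def parse_version(text):
    """Turn a dotted release string such as '17.9.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad the shorter tuple with zeros so that '13.1' equals '13.1.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


print(satisfies_minimum("17.9.0", "13.1.0"))
print(satisfies_minimum("9.0.1", "13.1.0"))
```

The second call is the case where a plain string comparison would go wrong, because "9.0.1" sorts after "13.1.0" as text even though 9 is less than 13.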

Then install Scrapy again:

[root@iZwz9e75q2nzsxqdr0ll5yZ ~]# /usr/local/bin/pip install scrapy
Collecting scrapy
  Downloading http://mirrors.aliyun.com/pypi/packages/a8/96/3affe11cf53a5d2105536919113d5b453479038bb486f7387f4ce4a3b83f/Scrapy-1.4.0-py2.py3-none-any.whl (248kB)
    100% |████████████████████████████████| 256kB 2.1MB/s
Requirement already satisfied: queuelib in /usr/local/lib/python2.7/site-packages (from scrapy)
...(omitted)...
Requirement already satisfied: pycparser in /usr/local/lib/python2.7/site-packages (from cffi>=1.7; platform_python_implementation != "PyPy"->cryptography>=1.9->pyOpenSSL->scrapy)
Installing collected packages: scrapy
Successfully installed scrapy-1.4.0
[root@iZwz9e75q2nzsxqdr0ll5yZ ~]#

Verify the import: perfect, no warnings at all.

[root@iZwz9e75q2nzsxqdr0ll5yZ ~]# python
Python 2.7.14 (default, Nov 23 2017, 13:28:32)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-18)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import scrapy
>>>
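The manual import check can be wrapped in a small script, which is handy when the same verification has to be repeated on several machines. This is a minimal sketch with a made-up helper name, import_status. It catches SystemExit as well as ordinary exceptions because the first transcript in this post shows the interpreter dropping back to the shell after the failed import, which suggests the package exits instead of raising ImportError. That reading is an inference from the transcript.

```python
def import_status(module_name):
    """Try to import module_name and return (ok, detail).

    detail is the module's __version__ when the import works (or
    'unknown' if it has none), and the error text when it fails.
    """
    try:
        module = __import__(module_name)
    except (Exception, SystemExit) as exc:
        return False, "%s: %s" % (type(exc).__name__, exc)
    return True, getattr(module, "__version__", "unknown")


ok, detail = import_status("scrapy")
print("scrapy import ok: %s (%s)" % (ok, detail))
```

The built-in __import__ is used instead of importlib so that the script also runs under the stock Python 2.6.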
