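The rise of big data has, to some extent, driven the rise of web crawlers. Some companies do not produce data themselves, so their only option is to crawl it from the web. I have followed this area for some time and have written several series of posts about crawlers. I have now collected the good frameworks into one list, hoping it can serve as an index for anyone interested in crawlers or anyone who wants to study them further. You are welcome to star, fork, or open an issue at any time, so that we can improve this awesome series together.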

==> GitHub repository <==

Awesome-crawler

A collection of awesome web crawlers, spiders, and resources in different languages

Python

Scrapy - A fast high-level screen scraping and web crawling framework.
pyspider - A powerful spider system.
cola - A distributed crawling framework.
Demiurge - PyQuery-based scraping micro-framework.
feedparser - Universal feed parser.
Grab - Site scraping framework.
MechanicalSoup - A Python library for automating interaction with websites.
portia - Visual scraping for Scrapy.
crawley - Pythonic Crawling / Scraping Framework based on Non Blocking I/O operations.
RoboBrowser - A simple, Pythonic library for browsing the web without a standalone web browser.
MSpider - A simple, easy spider using gevent and JS rendering.

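This is only part of the list. Excellent crawler frameworks for other languages are also collected in the GitHub repository; for more, head over to GitHub: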

https://github.com/BruceDone/awesome-crawler

