
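Install the library with pip install requests, then start IDLE, request the Baidu homepage as a test, and print the response.
The output shows that the Baidu homepage has been fetched.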
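The first test can be sketched as a small function. This is only a minimal sketch and not the exact commands typed in IDLE: the name fetch_text and the 10-second timeout are assumptions made for illustration.

```python
import requests


def fetch_text(url, timeout=10):
    """Fetch url and return the page text (fetch_text is a hypothetical helper)."""
    r = requests.get(url, timeout=timeout)
    r.raise_for_status()              # raise HTTPError on 4xx/5xx responses
    r.encoding = r.apparent_encoding  # decode with the encoding guessed from the body
    return r.text
```

Network permitting, print(fetch_text("http://www.baidu.com")[:500]) should show the beginning of the homepage HTML.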
Opening the source code of the requests library shows that the get method is a thin wrapper around the request method.
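One way to check this without browsing the package files is to print the source of requests.get from the interpreter. The sketch below uses only the standard inspect module; the exact text printed depends on the installed requests version.

```python
import inspect

import requests

# Retrieve the source text of requests.get as installed on this machine.
source = inspect.getsource(requests.get)
print(source)
```

The function body should end with a call to request that passes the "get" method name along with the URL and the remaining arguments.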
r.apparent_encoding: the encoding inferred by analyzing the page content.
The user requests a URL and the server sends back a response.
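Resources are managed through these six HTTP methods (GET, HEAD, POST, PUT, PATCH, DELETE), and each operation is independent of the others.
The get method retrieves content, and it can also send some data to the server along with the request.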
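To see what get sends to the server, the request can be built and prepared without actually being sent. This is a small sketch using requests.Request and its prepare method; the wd key is the Baidu search parameter used in the session further down.

```python
import requests

kv = {"wd": "Python"}

# Build the request object and prepare it; nothing is sent over the network.
req = requests.Request("GET", "http://www.baidu.com/s", params=kv)
prepared = req.prepare()

print(prepared.url)  # http://www.baidu.com/s?wd=Python
```

The params dictionary is encoded into the query string, which is how a GET request carries data to the server.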
requests.head(url, **kwargs)
requests.post(url, data=None, json=None, **kwargs)
requests.put(url, data=None, **kwargs)
requests.patch(url, data=None, **kwargs)
requests.delete(url, **kwargs)
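These six methods (get plus the five above) frequently need a few of the access-control parameters, so those parameters were promoted to explicit arguments in the function signatures. The less common ones stay in the optional access-control fields (**kwargs).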
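The difference between the explicit data and json arguments of post can be seen by preparing two requests without sending them. This is a sketch only: the httpbin.org URL and the payload are placeholders chosen for illustration and do not come from the notes.

```python
import requests

url = "http://httpbin.org/post"   # hypothetical endpoint, never contacted here
payload = {"key1": "value1"}

# data=... is form-encoded; json=... is serialized as a JSON body.
form = requests.Request("POST", url, data=payload).prepare()
as_json = requests.Request("POST", url, json=payload).prepare()

print(form.headers["Content-Type"], form.body)
print(as_json.headers["Content-Type"], as_json.body)
```

Anything passed through **kwargs, such as headers or timeout, is forwarded to request in the same way for all of these methods.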
Python 3.8.8 (default, Apr 13 2021, 15:08:03) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license()" for more information.
>>> import requests
>>> kv = {'wd':'Python'}
>>> r = request.get("http://www.baidu.com/s",params = kv)
Traceback (most recent call last):
File "", line 1, in
r = request.get("http://www.baidu.com/s",params = kv)
NameError: name 'request' is not defined
>>> r = requests.get("http://www.baidu.com/s",params = kv)
>>> r.status_code
200
>>>
The request succeeded (status code 200).
>>> r.requests.url
Traceback (most recent call last):
File "", line 1, in
r.requests.url
AttributeError: 'Response' object has no attribute 'requests'
>>> r.request.url
'https://wappass.baidu.com/static/captcha/tuxing.html?&logid=8036147062640333629&ak=c27bbc89afca0463650ac9bde68ebe06&backurl=https%3A%2F%2Fwww.baidu.com%2Fs%3Fwd%3DPython&signature=2fa70a21294522eebadb45a7c1695212&timestamp=1632972573'
>>> len(r.text)
1545
>>> r.text
'\n\n\n \n ç\x99¾åº¦å®\x89å\x85¨éª\x8cè¯\x81 \n \n \n \n \n \n \n \n \n \n è¿\x94å\x9b\x9eé¦\x96é¡µ\n \n \n \n é\x97®é¢\x98å\x8f\x8dé¦\x88 \n \n\n\n\n\n'
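The unreadable characters in r.text are consistent with UTF-8 bytes being decoded as ISO-8859-1, and the URL above shows that Baidu redirected the request to a captcha page. The sketch below reproduces the effect with the standard library only; the sample string 百度安全验证 ("Baidu security verification") is an assumption about what the page title says, inferred from the garbled bytes.

```python
original = "百度安全验证"

# Encode as UTF-8, then decode with the wrong codec to reproduce the mojibake.
raw = original.encode("utf-8")
garbled = raw.decode("iso-8859-1")
print(repr(garbled))

# Reversing the wrong decode and applying the right one recovers the text.
repaired = garbled.encode("iso-8859-1").decode("utf-8")
print(repaired)
```

With a live response, the usual fix is to set r.encoding = r.apparent_encoding before reading r.text.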
Python 3.8.8 (default, Apr 13 2021, 15:08:03) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license()" for more information.
>>> import requests
>>> path = "E:/tp/abc.jpg"
>>> url = "http://www.sinaimg.cn/dy/slidenews/1_img/2016_43/63957_743512_229766.jpg"
>>> r = requests.get(url)
>>> r.status_code
200
>>> with open(path,'wb') as f:
        f.write(r.content)
156992
>>> f.close()
>>>
However this code is run, it does not raise an error.
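Outside an interactive session, the same download is usually wrapped so that network and file errors are caught instead of stopping the program. The following is a minimal sketch under assumptions: the function name download_file, the choice to take the file name from the last URL segment, and the 10-second timeout are all illustrative and do not appear in the session above.

```python
import os

import requests


def download_file(url, directory, timeout=10):
    """Save url under directory; return the saved path, or None on failure.

    download_file is a hypothetical helper written for this example.
    """
    filename = url.split("/")[-1]            # reuse the last URL segment as the name
    path = os.path.join(directory, filename)
    try:
        os.makedirs(directory, exist_ok=True)
        if os.path.exists(path):             # skip files that were already saved
            return path
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()                 # treat 4xx/5xx responses as failures
        with open(path, "wb") as f:          # the with block closes the file itself
            f.write(r.content)
        return path
    except (requests.RequestException, OSError) as exc:
        print(f"Download failed: {exc}")
        return None
```

Because the with statement closes the file on exit, a separate f.close() call is not needed in this version.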