By yk 坤帝 — reply "requests库详解" to the account backend to get the complete source code.
Contents
1. What is Requests?
2. A first requests example
3. Various request methods
4. Crawler GET requests, step by step
4.1 GET requests with parameters
4.2 Parsing JSON data
4.3 Fetching binary data
4.4 Adding headers
5. Crawler POST requests, step by step
6. Analyzing response status
6.1 Response attributes
6.2 Checking the returned status code
6.3 Status code reference
7. Advanced operations: file upload and configuration
8. How to get cookies
9. Session persistence and simulated login
10. SSL certificate verification
11. Setting an IP proxy
12. Timeout settings and filtering working proxy IPs
13. User authentication
14. Exception handling
1. What is Requests?
Requests is an HTTP library written in Python, built on top of urllib and released under the Apache2 License. It is far more convenient than urllib, saves a great deal of work, and fully meets the needs of HTTP testing.
In one sentence: a simple, easy-to-use HTTP library implemented in Python.
2. A first requests example
import requests

# The URL was lost in the original text; http://httpbin.org/get is used as a stand-in.
response = requests.get('http://httpbin.org/get')
print(type(response))
print(response.status_code)
print(type(response.text))
print(response.text)
print(response.cookies)
3. Various request methods

import requests

# Endpoint URLs were lost in the original text; httpbin.org endpoints are used as stand-ins.
requests.post('http://httpbin.org/post')
requests.put('http://httpbin.org/put')
requests.delete('http://httpbin.org/delete')
requests.head('http://httpbin.org/get')
requests.options('http://httpbin.org/get')
4. Crawler GET requests, step by step
4.1 GET requests with parameters

import requests

# The base URL was lost in the original text; http://httpbin.org/get is used as a stand-in.
response = requests.get('http://httpbin.org/get?name=germey&age=22')
print(response.text)

import requests

data = {
    'name': 'germey',
    'age': 22
}
response = requests.get('http://httpbin.org/get', params=data)
print(response.text)

4.2 Parsing JSON data

import requests
import json

response = requests.get('http://httpbin.org/get')
print(type(response.text))
print(response.json())
print(json.loads(response.text))
print(type(response.json()))
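If you want to see how requests encodes the params dict into the query string without hitting the network, the request can be prepared locally instead of sent (a small sketch; the URL is the same stand-in as above):

```python
from requests import Request

# Prepare (but do not send) a GET request, then inspect the encoded URL.
data = {'name': 'germey', 'age': 22}
req = Request('GET', 'http://httpbin.org/get', params=data).prepare()
print(req.url)  # http://httpbin.org/get?name=germey&age=22
```

The two GET forms above are therefore equivalent: params is simply URL-encoded and appended as a query string.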
4.3 Fetching binary data

import requests

# The URL was lost in the original text; an .ico URL such as GitHub's favicon fits the example.
response = requests.get('http://github.com/favicon.ico')
print(type(response.text), type(response.content))
print(response.text)
print(response.content)

import requests

response = requests.get('http://github.com/favicon.ico')
# 'with' closes the file automatically; an explicit f.close() is not needed
with open('favicon.ico', 'wb') as f:
    f.write(response.content)
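The difference between text and content can be seen offline with a hand-built Response. This is purely an illustration: it fakes a downloaded body by filling the response's internal buffer, which you would never do in real code.

```python
import requests

# Illustration only: fill the internal buffer so the content/text
# distinction is visible without a network call.
r = requests.models.Response()
r.status_code = 200
r._content = '响应'.encode('utf-8')  # raw bytes, as if read off the wire
r.encoding = 'utf-8'

print(type(r.content))  # <class 'bytes'> -- the raw bytes
print(type(r.text))     # <class 'str'>   -- bytes decoded with r.encoding
print(r.text)
```

So content is what you write to disk for binary files like favicon.ico, while text is the decoded string for HTML or JSON.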
4.4 Adding headers

import requests

# The URL was lost in the original text; many sites reject requests without a browser User-Agent.
response = requests.get('http://httpbin.org/get')
print(response.text)

import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36'
}
response = requests.get('http://httpbin.org/get', headers=headers)
print(response.text)
5. Crawler POST requests, step by step

import requests

data = {'name': 'germey', 'age': 22}
# The URL was lost in the original text; http://httpbin.org/post is used as a stand-in.
response = requests.post('http://httpbin.org/post', data=data)
print(response.text)

import requests

data = {'name': 'germey', 'age': 22}
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36'
}
response = requests.post('http://httpbin.org/post', data=data, headers=headers)
print(response.json())
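Besides form data via data=, requests can send a JSON body with the json= parameter. Preparing the request locally shows what would go on the wire (the URL is again a stand-in):

```python
from requests import Request

# json= serializes the dict and sets the Content-Type header for you.
req = Request('POST', 'http://httpbin.org/post',
              json={'name': 'germey', 'age': 22}).prepare()
print(req.headers['Content-Type'])  # application/json
print(req.body)                     # the serialized JSON payload
```

Use data= for form-encoded bodies and json= for JSON APIs; mixing both in one request is not supported.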
6. Analyzing response status
6.1 Response attributes
import requests

# The URL was lost in the original text; a stand-in is used.
response = requests.get('http://httpbin.org/get')
print(type(response.status_code), response.status_code)
print(type(response.headers), response.headers)
print(type(response.cookies), response.cookies)
print(type(response.url), response.url)
print(type(response.history), response.history)
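Note that response.headers is not a plain dict: requests uses a case-insensitive mapping, so header names can be looked up in any casing. A quick demonstration with the same structure:

```python
from requests.structures import CaseInsensitiveDict

# The mapping type requests uses for response.headers.
h = CaseInsensitiveDict({'Content-Type': 'text/html; charset=utf-8'})
print(h['content-type'])   # lookup ignores case
print(h['CONTENT-TYPE'])   # same value
```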
6.2 Checking the returned status code

import requests

# The URL was lost in the original text; httpbin can return any status code on demand.
response = requests.get('http://httpbin.org/status/404')
exit() if not response.status_code == requests.codes.not_found else print('404 Not Found')
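Instead of comparing status_code by hand, response.ok and raise_for_status() do the check for you. A sketch using a hand-built response so it runs offline (in real code the status comes from the server):

```python
import requests

r = requests.models.Response()
r.status_code = 404  # pretend the server answered 404

print(r.ok)  # False: ok is True only for status codes below 400
try:
    r.raise_for_status()  # raises HTTPError for 4xx/5xx codes
except requests.HTTPError as e:
    print('request failed:', e)
```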
6.3 Status code reference

import requests

# The URL was lost in the original text; a stand-in is used.
response = requests.get('http://httpbin.org/get')
exit() if not response.status_code == 200 else print('Request Successfully')

requests maps friendly names to status codes (from requests.status_codes):

_codes = {
    # Informational.
    100: ('continue',),
    101: ('switching_protocols',),
    102: ('processing',),
    103: ('checkpoint',),
    122: ('uri_too_long', 'request_uri_too_long'),
    200: ('ok', 'okay', 'all_ok', 'all_okay', 'all_good', '\\o/', '✓'),
    201: ('created',),
    202: ('accepted',),
    203: ('non_authoritative_info', 'non_authoritative_information'),
    204: ('no_content',),
    205: ('reset_content', 'reset'),
    206: ('partial_content', 'partial'),
    207: ('multi_status', 'multiple_status', 'multi_stati', 'multiple_stati'),
    208: ('already_reported',),
    226: ('im_used',),

    # Redirection.
    300: ('multiple_choices',),
    301: ('moved_permanently', 'moved', '\\o-'),
    302: ('found',),
    303: ('see_other', 'other'),
    304: ('not_modified',),
    305: ('use_proxy',),
    306: ('switch_proxy',),
    307: ('temporary_redirect', 'temporary_moved', 'temporary'),
    308: ('permanent_redirect', 'resume_incomplete', 'resume',),  # These 2 to be removed in 3.0

    # Client Error.
    400: ('bad_request', 'bad'),
    401: ('unauthorized',),
    402: ('payment_required', 'payment'),
    403: ('forbidden',),
    404: ('not_found', '-o-'),
    405: ('method_not_allowed', 'not_allowed'),
    406: ('not_acceptable',),
    407: ('proxy_authentication_required', 'proxy_auth', 'proxy_authentication'),
    408: ('request_timeout', 'timeout'),
    409: ('conflict',),
    410: ('gone',),
    411: ('length_required',),
    412: ('precondition_failed', 'precondition'),
    413: ('request_entity_too_large',),
    414: ('request_uri_too_large',),
    415: ('unsupported_media_type', 'unsupported_media', 'media_type'),
    416: ('requested_range_not_satisfiable', 'requested_range', 'range_not_satisfiable'),
    417: ('expectation_failed',),
    418: ('im_a_teapot', 'teapot', 'i_am_a_teapot'),
    421: ('misdirected_request',),
    422: ('unprocessable_entity', 'unprocessable'),
    423: ('locked',),
    424: ('failed_dependency', 'dependency'),
    425: ('unordered_collection', 'unordered'),
    426: ('upgrade_required', 'upgrade'),
    428: ('precondition_required', 'precondition'),
    429: ('too_many_requests', 'too_many'),
    431: ('header_fields_too_large', 'fields_too_large'),
    444: ('no_response', 'none'),
    449: ('retry_with', 'retry'),
    450: ('blocked_by_windows_parental_controls', 'parental_controls'),
    451: ('unavailable_for_legal_reasons', 'legal_reasons'),
    499: ('client_closed_request',),

    # Server Error.
    500: ('internal_server_error', 'server_error', '/o\\', '✗'),
    501: ('not_implemented',),
    502: ('bad_gateway',),
    503: ('service_unavailable', 'unavailable'),
    504: ('gateway_timeout',),
    505: ('http_version_not_supported', 'http_version'),
    506: ('variant_also_negotiates',),
    507: ('insufficient_storage',),
    509: ('bandwidth_limit_exceeded', 'bandwidth'),
    510: ('not_extended',),
    511: ('network_authentication_required', 'network_auth', 'network_authentication'),
}
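The table above is exposed as requests.codes, so status codes can be referred to by name instead of magic numbers:

```python
import requests

# Any of the aliases in the table resolves to its numeric code.
print(requests.codes.ok)         # 200
print(requests.codes.not_found)  # 404
print(requests.codes.teapot)     # 418
```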
7. Advanced operations: file upload and configuration

import requests

files = {'file': open('favicon.ico', 'rb')}
# The URL was lost in the original text; http://httpbin.org/post is used as a stand-in.
response = requests.post('http://httpbin.org/post', files=files)
print(response.text)
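The files value can also be a tuple giving an explicit filename and content type. Preparing the request locally shows the multipart encoding without sending anything (URL and payload are illustrative):

```python
import io
from requests import Request

# (filename, file object, content type) -- an in-memory file stands in for a real one
files = {'file': ('favicon.ico', io.BytesIO(b'\x00\x01\x02'), 'image/x-icon')}
req = Request('POST', 'http://httpbin.org/post', files=files).prepare()
print(req.headers['Content-Type'])  # multipart/form-data; boundary=...
```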
8. How to get cookies

import requests

# The URL was lost in the original text; a stand-in is used.
response = requests.get('https://www.baidu.com')
print(response.cookies)
for key, value in response.cookies.items():
    print(key + '=' + value)
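Cookies can also be sent with a request via the cookies= parameter; preparing the request locally shows the resulting Cookie header (URL and token are illustrative):

```python
from requests import Request

req = Request('GET', 'http://httpbin.org/cookies',
              cookies={'token': 'abc123'}).prepare()
print(req.headers['Cookie'])  # token=abc123
```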
9. Session persistence and simulated login

import requests

# URLs were lost in the original text; httpbin's cookie endpoints fit the example:
# set a cookie in one request, then try to read it back in a second request.
requests.get('http://httpbin.org/cookies/set/number/123456789')
response = requests.get('http://httpbin.org/cookies')
print(response.text)  # empty: two independent requests do not share cookies

import requests

s = requests.Session()
s.get('http://httpbin.org/cookies/set/number/123456789')
response = s.get('http://httpbin.org/cookies')
print(response.text)  # the session carries the cookie over
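The session's cookie jar can be inspected, and even seeded by hand, which makes the mechanism visible offline: the Session keeps one jar and attaches it to every request it sends.

```python
import requests

s = requests.Session()
# Seed the jar directly -- normally a Set-Cookie response header does this.
s.cookies.set('number', '123456789')
# Every request made through s now sends this cookie automatically.
print(s.cookies.get('number'))  # 123456789
```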
10. SSL certificate verification

import requests

# The URL was lost in the original text; the example targets an HTTPS site whose
# certificate fails verification, which raises an SSLError by default.
response = requests.get('https://example.com')
print(response.status_code)

import requests
from requests.packages import urllib3

urllib3.disable_warnings()  # suppress the InsecureRequestWarning
response = requests.get('https://example.com', verify=False)
print(response.status_code)

import requests

# A client certificate and key can also be supplied (paths as in the original text).
response = requests.get('https://example.com', cert=('/path/server.crt', '/path/key'))
print(response.status_code)
11. Setting an IP proxy

import requests

# Proxy hosts and the target URL were lost in the original text;
# 127.0.0.1:9743 and httpbin.org are used as stand-ins.
proxies = {
    'http': 'http://127.0.0.1:9743',
    'https': 'https://127.0.0.1:9743',
}
response = requests.get('http://httpbin.org/get', proxies=proxies)
print(response.status_code)

import requests

# Proxy credentials were obscured in the original text; user:password is a placeholder.
proxies = {
    'http': 'http://user:password@127.0.0.1:9743/',
}
response = requests.get('http://httpbin.org/get', proxies=proxies)
print(response.status_code)

SOCKS proxies need an extra dependency:

pip3 install requests[socks]

import requests

proxies = {
    'http': 'socks5://127.0.0.1:9742',
    'https': 'socks5://127.0.0.1:9742'
}
response = requests.get('http://httpbin.org/get', proxies=proxies)
print(response.status_code)