A Big-Data Language Shows You How to Control a Browser to Visit Random Web Pages

The result looks like this:

How to control the browser

A Python program controlling the browser to visit Jianshu

Prepare the following environment in advance:

Python 3.6 and selenium 3.141.0

Prepare the chromedriver package

# Choose the chromedriver version that matches the Google Chrome installed on your machine. Download page: https://chromedriver.storage.googleapis.com/index.html
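
To make "matching version" concrete, here is a minimal sketch of a version check. It assumes the common convention that a chromedriver release fits a Chrome build when the first three version components (major.minor.build) are equal. The function names `build_prefix` and `driver_matches_chrome` and the sample version strings are only illustrative, not part of the original setup.

```python
def build_prefix(version):
    """Return the major.minor.build part of a dotted version string."""
    parts = version.strip().split(".")
    if len(parts) < 3 or not all(p.isdigit() for p in parts[:3]):
        raise ValueError("unexpected version string: %r" % version)
    return ".".join(parts[:3])


def driver_matches_chrome(chrome_version, driver_version):
    """Assumed rule: versions match when major.minor.build are identical."""
    return build_prefix(chrome_version) == build_prefix(driver_version)


# Example values only; substitute the versions reported on your machine.
print(driver_matches_chrome("81.0.4044.138", "81.0.4044.69"))
print(driver_matches_chrome("81.0.4044.138", "83.0.4103.39"))
```

The last component (the patch number) is deliberately ignored, since under the assumed convention it may differ between the browser and the driver.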

Installation on macOS:

wget <chromedriver_mac64.zip URL for your Chrome version, taken from the download page above>
unzip chromedriver_mac64.zip
mv chromedriver /usr/local/bin/
chmod u+x,o+x /usr/local/bin/chromedriver
chromedriver --version
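
The final `chromedriver --version` step can also be checked from Python before starting the browser. The sketch below is one possible way to do it with the standard library only; the helper names `parse_driver_version` and `installed_driver_version` are hypothetical, and the sample output string is an illustration of the usual "ChromeDriver <version> (...)" shape rather than a captured log.

```python
import re
import shutil
import subprocess


def parse_driver_version(output):
    """Extract the first dotted version number from `chromedriver --version` output."""
    match = re.search(r"\d+(?:\.\d+)+", output)
    return match.group(0) if match else None


def installed_driver_version(binary="chromedriver"):
    """Return the version of the driver found on PATH, or None if it is missing."""
    path = shutil.which(binary)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_driver_version(result.stdout)


# Illustrative output string, assumed to resemble what the binary prints.
print(parse_driver_version("ChromeDriver 81.0.4044.138 (example-build-info)"))
```

Calling `installed_driver_version()` returns `None` when the `mv` step did not put the binary on the `PATH`, which is an easy failure to diagnose before Selenium raises its own error.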

The Python code is as follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time: 2020-06-10 14:03
# @Author: Anthony
# @Email : [email protected]
# @File: selenium_webdriver.py

from selenium import webdriver
from time import sleep
import requests
import json
import random

headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36"
}


# Self-hosted free proxy pool
def get_proxies_ip():
    try:
        # Put the host of your own proxy-pool service in front of the port
        url = ":35050/api/v1/proxy/"
        proxies_ips = requests.get(url, headers=headers)
        proxies_ip = json.loads(proxies_ips.text)["data"]["proxy"]
        return proxies_ip
    except Exception as e:
        # On failure the error is printed and the function returns None
        print(e)


# Public free proxy address:
# How to build your own proxy:
# You need to obtain a proxy address yourself, format: :8118

# Take any one IP from the IP pool
proxies = {"http": "http://" + str(get_proxies_ip())}

chromeOptions = webdriver.ChromeOptions()
# Set the proxy
chromeOptions.add_argument("--proxy-server=%s" % (proxies["http"]))

# Instantiate a browser object (options= replaces the deprecated chrome_options=)
brower_chrome = webdriver.Chrome(options=chromeOptions)
brower_chrome.maximize_window()

# Define the automated actions
while True:
    brower_list = [
        "https://www.jianshu.com",
        "https://www.zhihu.com",
        "https://www.baidu.com",
    ]
    brower_chrome.delete_all_cookies()
    brower_chrome.get(random.choice(brower_list))
    sleep(10)
    brower_chrome.refresh()
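
Two weak spots in the script above are worth a sketch: when `get_proxies_ip()` fails it returns `None`, so Chrome is started with the proxy `http://None`, and the `while True` loop never ends. The version below is a hedged variant, not the author's method. `proxy_argument` and `visit_random_pages` are hypothetical helper names, and the `driver` parameter is assumed to be any object with `delete_all_cookies`, `get` and `refresh` methods, such as the `brower_chrome` instance.

```python
import random
import time


def proxy_argument(proxy):
    """Return a --proxy-server flag, or None when no proxy was obtained."""
    if not proxy:
        return None
    return "--proxy-server=http://%s" % proxy


def visit_random_pages(driver, urls, rounds, pause=10.0, rng=random, sleep=time.sleep):
    """Visit `rounds` randomly chosen URLs and return them in visit order."""
    visited = []
    for _ in range(rounds):
        url = rng.choice(urls)
        driver.delete_all_cookies()  # start every visit without old cookies
        driver.get(url)
        sleep(pause)                 # keep the page open for a while
        driver.refresh()
        visited.append(url)
    return visited
```

With the real browser this could be used as `arg = proxy_argument(get_proxies_ip())`, adding `arg` to `chromeOptions` only when it is not `None`, followed by `visit_random_pages(brower_chrome, brower_list, rounds=5)`. Passing `sleep` and `rng` as parameters is a design choice that lets the loop be exercised with a stand-in driver and no waiting.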