Mastering Selenium: How to Use Proxy Servers for Enhanced Web Scraping

Posted on: July 20, 2024

Why Use Proxies with Selenium?

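Routing Selenium traffic through a proxy helps in three ways: it spreads requests across several IP addresses so that no single address is likely to be banned, it lets you reach geo-restricted content, and it keeps your own IP address hidden from the target site.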

 

Setting Up Proxies in Selenium

 

Using Proxies with Chrome

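from selenium import webdriver
from selenium.webdriver.chrome.options import Options

def get_chrome_driver(proxy):
    """Return a Chrome driver that routes its traffic through the given proxy."""
    chrome_options = Options()
    # Chrome takes the proxy as a command-line switch, e.g. http://host:port
    chrome_options.add_argument(f'--proxy-server={proxy}')

    driver = webdriver.Chrome(options=chrome_options)
    return driver

# Placeholder address: replace it with the host and port of a real proxy
proxy = "http://123.456.789.000:8080"
driver = get_chrome_driver(proxy)
try:
    driver.get("http://example.com")
finally:
    driver.quit()  # close the browser even if the page load fails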

 

Using Proxies with Firefox

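from selenium import webdriver
from selenium.webdriver.firefox.options import Options

def get_firefox_driver(proxy):
    """Return a Firefox driver that uses the given "host:port" proxy."""
    host, _, port = proxy.rpartition(':')

    options = Options()
    # 1 = manual proxy configuration
    options.set_preference('network.proxy.type', 1)
    options.set_preference('network.proxy.http', host)
    options.set_preference('network.proxy.http_port', int(port))
    # Firefox keeps a separate setting for HTTPS traffic; without it,
    # https:// pages would not go through the proxy
    options.set_preference('network.proxy.ssl', host)
    options.set_preference('network.proxy.ssl_port', int(port))

    # Selenium 4 takes preferences through Options; the firefox_profile
    # argument is deprecated and has been removed in recent releases
    driver = webdriver.Firefox(options=options)
    return driver

# Placeholder address: replace it with the host and port of a real proxy
proxy = "123.456.789.000:8080"
driver = get_firefox_driver(proxy)
driver.get("http://example.com")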
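
The Firefox function above expects a plain `host:port` string, while the Chrome example takes a full URL. A small helper can accept both forms and also pull out any credentials. The sketch below is one way to do that with the standard library; the name `parse_proxy` is hypothetical and not part of Selenium.

```python
from urllib.parse import urlsplit

def parse_proxy(proxy):
    """Split a proxy string into (host, port, username, password).

    Accepts "host:port" as well as "scheme://user:pass@host:port".
    Missing credentials come back as None.
    """
    # urlsplit only recognises the host part when a scheme or "//" precedes it
    candidate = proxy if "://" in proxy else "//" + proxy
    parts = urlsplit(candidate)
    if parts.hostname is None or parts.port is None:
        raise ValueError(f"expected host:port, got {proxy!r}")
    return parts.hostname, parts.port, parts.username, parts.password
```

The returned host and port can be passed straight to the `network.proxy.*` preferences, regardless of which format the proxy list uses.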

 

Advanced Proxy Configuration

 

Using Authenticated Proxies

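from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# Placeholder credentials and address
proxy = "http://username:password@123.456.789.000:8080"
chrome_options = Options()
# Caveat: Chrome does not use a username or password embedded in
# --proxy-server, so this URL alone will not log in to the proxy (the
# browser may show an auth prompt or fail to connect). Common workarounds
# are IP allow-listing at the provider or a local forwarding proxy.
chrome_options.add_argument('--proxy-server=%s' % proxy)

# Accept self-signed or otherwise untrusted TLS certificates.
# Selenium 4 sets this on Options; the desired_capabilities argument
# has been removed and acceptSslCerts is a legacy key.
chrome_options.set_capability('acceptInsecureCerts', True)

driver = webdriver.Chrome(options=chrome_options)
driver.get("http://example.com")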

 

Using Rotating Proxies

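For rotating proxies, you can integrate Selenium with a proxy management service, or borrow the approach of libraries like `scrapy-rotating-proxies` (a Scrapy middleware, so it does not plug into Selenium directly) or `PyProxy`. Because the `--proxy-server` switch is read when the browser starts, rotating with Selenium usually means launching a new driver for each proxy.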
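
As a rough illustration of rotation without any extra library, the sketch below cycles through a fixed list of proxies and starts a fresh Chrome session for each request. The names `ProxyRotator` and `fetch_title` are hypothetical, and the round-robin policy is an assumption; a real setup would likely add retries and health checks.

```python
import itertools

class ProxyRotator:
    """Hand out proxies in round-robin order, skipping ones marked as bad."""

    def __init__(self, proxies):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._proxies = list(proxies)
        self._bad = set()
        self._cycle = itertools.cycle(self._proxies)

    def mark_bad(self, proxy):
        """Exclude a proxy from future rotation, e.g. after a ban or timeout."""
        self._bad.add(proxy)

    def next_proxy(self):
        # Look at each proxy at most once per call so the loop always ends
        for _ in range(len(self._proxies)):
            proxy = next(self._cycle)
            if proxy not in self._bad:
                return proxy
        raise RuntimeError("no working proxies left")


def fetch_title(url, proxy):
    """Open url in a fresh Chrome session routed through proxy; return the title."""
    # Imported here so the rotation logic can be used without Selenium installed
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument(f"--proxy-server={proxy}")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.title
    finally:
        driver.quit()
```

A scraping loop would call `rotator.next_proxy()` before each `fetch_title(url, proxy)` and call `rotator.mark_bad(proxy)` whenever a request through that proxy fails.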

 

Conclusion

Using proxies with Selenium makes your web scraping projects more effective by providing anonymity, avoiding IP bans, and giving access to geo-restricted content. With this guide, you can set up and configure proxies in both Chrome and Firefox, and extend the setup to authenticated and rotating proxies. Happy scraping!

 
