When parsing data, you sometimes need to open a site with the Selenium driver, because scripts running on the page add data to it that regular requests simply cannot reach. But things can get a little worse. For example, the page may sit behind a CDN such as Cloudflare with browser checking enabled. In that case the regular Selenium driver will not get through, because the check detects that automated testing software is in use. Even so, there is a way out. Let's see how to bypass this check by disabling certain options in the browser, and also look at a modified version of the Chrome driver in which these options are already disabled out of the box. But first things first.

First, to confirm that Chrome under the regular driver cannot reach the site, let's write a simple script that loads the driver and opens a protected page.
What do you need?
This code requires selenium. To install it, run the following command in the terminal:
Bash:
pip install selenium
Loading a site with automation options enabled
Once the library is installed, import the required modules into the script. We need the os module to build the path to the driver; time, to pause briefly before closing; and platform, to detect the operating system so that the matching driver is loaded.
We import webdriver from the selenium library, and Service from selenium.webdriver.chrome.service to pass parameters to the driver. Here the code passes the path to the webdriver and the log_path parameter, which sets where the logs are saved. In this case the logs go to os.devnull, that is, they are discarded.
Python:
import os
import time
from platform import system
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
# Pick the driver binary that matches the current operating system.
exec_path = os.path.join(os.getcwd(), 'driver', 'chromedriver.exe') if system() == "Windows" else \
    os.path.join(os.getcwd(), 'driver', 'chromedriver')
# Start Chrome; the driver logs are discarded by sending them to os.devnull.
driver = webdriver.Chrome(service=Service(log_path=os.devnull, executable_path=exec_path))
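The driver path selection can also be pulled out into a small helper, which makes it easy to check without starting a browser. This is only a sketch: the function name driver_path and its parameters are made up for illustration, and it assumes the same layout as in the article, with the binary stored in a driver folder.

```python
from pathlib import Path


def driver_path(base_dir, os_name):
    """Return the expected chromedriver location for the given OS name.

    Hypothetical helper: assumes the binary lives in <base_dir>/driver,
    as in the article's examples.
    """
    # Windows binaries carry the .exe suffix; Linux and macOS ones do not.
    binary = "chromedriver.exe" if os_name == "Windows" else "chromedriver"
    return Path(base_dir) / "driver" / binary


# Possible usage in the script above:
# exec_path = str(driver_path(Path.cwd(), system()))
```

Passing the OS name as an argument, rather than calling platform.system() inside the function, keeps the helper easy to test for all three platforms on one machine.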
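Then go to the target page, wait 10 seconds and close the browser.
Python:
driver.get('https://nowsecure.nl')
time.sleep(10)
driver.close()
driver.quit()
The page the browser opens is protected by Cloudflare. We could not access the site: we were asked to confirm that we are human. In other words, the browser failed the check and was identified as automated software.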
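The fixed pause is the simplest option, but the wait could also be written as polling that stops as soon as some condition holds. The helper below is a minimal sketch; the name wait_until and its parameters are assumptions, not part of Selenium, and the condition in the commented usage line (the title of the challenge page) is only a guess that would need checking against the real page.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.5):
    """Poll predicate() until it returns a truthy value or the timeout expires.

    Returns True if the condition was met, False if time ran out.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Possible usage with a live driver (the title text is an assumption):
# passed = wait_until(lambda: "Just a moment" not in driver.title, timeout=20)
```

The return value shows whether the condition was met, so the script can tell a passed check from a timeout instead of always sleeping for the full interval.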

The complete script:
Python:
import os
import time
from platform import system
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
exec_path = os.path.join(os.getcwd(), 'driver', 'chromedriver.exe') if system() == "Windows" else \
    os.path.join(os.getcwd(), 'driver', 'chromedriver')
driver = webdriver.Chrome(service=Service(log_path=os.devnull, executable_path=exec_path))
driver.get('https://nowsecure.nl')
time.sleep(10)
driver.close()
driver.quit()
Using selenium-stealth to try to bypass protection
This project has helped in many cases in the past, but this time things are not so rosy. Let's try to access the site with its help.
What do you need?
Install selenium together with the selenium-stealth add-on, which is designed to hide traces of automation. To install them, run the command in the terminal:
Bash:
pip install selenium selenium-stealth
Now we repeat the steps from the regular driver, with one small difference. Here you need to import the installed selenium-stealth into the script, and also import Options from selenium.webdriver.chrome.options in order to pass options to the browser, such as headless mode. In our case, the options disable the automation flags. But let's take it in order. Import the libraries into the script.
Python:
import os
import time
from platform import system
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium_stealth import stealth
We create an instance of the Options class and pass the necessary parameters to it.
Python:
options = Options()
# Uncomment to run without a visible browser window.
# options.add_argument("--headless")
options.add_argument("start-maximized")
# Drop the "enable-automation" switch and the automation extension.
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
Next, create the browser object, passing it the path to the driver and the options we set.
Python:
exec_path = os.path.join(os.getcwd(), 'driver', 'chromedriver.exe') if system() == "Windows" else \
    os.path.join(os.getcwd(), 'driver', 'chromedriver')
driver = webdriver.Chrome(options=options, service=Service(log_path=os.devnull, executable_path=exec_path))
Now call the stealth function from the selenium_stealth module, which sets the user agent, browser languages and other parameters.
Python:
stealth(driver=driver,
        user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
                   'Chrome/83.0.4103.53 Safari/537.36',
        languages=["ru-RU", "ru"],
        vendor="Google Inc.",
        platform="Win32",
        webgl_vendor="Intel Inc.",
        renderer="Intel Iris OpenGL Engine",
        fix_hairline=True,
        run_on_insecure_origins=True,
        )
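Then we try again to access the page behind Cloudflare.
Python:
driver.get('https://nowsecure.nl')
time.sleep(20)
driver.close()
driver.quit()
And again, failure. Unfortunately, the stealth module did not cope with this task, as the screenshot below shows: the browser is still detected as being controlled by automated software.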
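One signal that is easy to inspect yourself is the standard navigator.webdriver property, which browsers set to true under automation. The sketch below reads it through the driver's execute_script method. The function name automation_flag and the FakeDriver class are hypothetical and exist only so the example runs without a browser. Note that this is just one signal among many, so a clean value here does not mean a protection service will let the browser through.

```python
def automation_flag(driver):
    """Return True if the current page reports navigator.webdriver as truthy."""
    # execute_script runs JavaScript in the current page and returns its result.
    return bool(driver.execute_script("return navigator.webdriver"))


class FakeDriver:
    """Hypothetical stand-in for a real driver, used only for a dry run."""

    def __init__(self, value):
        self.value = value

    def execute_script(self, script):
        # A real driver would evaluate the script in the browser.
        return self.value


# Dry run without a browser; with Selenium, pass the real driver instead.
flagged = automation_flag(FakeDriver(True))
```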

The complete script:
Python:
import os
import time
from platform import system
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium_stealth import stealth
options = Options()
# options.add_argument("--headless")
options.add_argument("start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
exec_path = os.path.join(os.getcwd(), 'driver', 'chromedriver.exe') if system() == "Windows" else \
    os.path.join(os.getcwd(), 'driver', 'chromedriver')
driver = webdriver.Chrome(options=options, service=Service(log_path=os.devnull, executable_path=exec_path))
stealth(driver=driver,
        user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
                   'Chrome/83.0.4103.53 Safari/537.36',
        languages=["ru-RU", "ru"],
        vendor="Google Inc.",
        platform="Win32",
        webgl_vendor="Intel Inc.",
        renderer="Intel Iris OpenGL Engine",
        fix_hairline=True,
        run_on_insecure_origins=True,
        )
driver.get('https://nowsecure.nl')
time.sleep(20)
driver.close()
driver.quit()
Disabling options by running a script in the driver
This method was presented on the Python Today channel, for which many thanks. To get past the check for automated software, you need to disable some options that are present in a browser run by Selenium but that you will not find in a regular browser.
Here are these options. They are quite easy to view. Launch the browser under the webdriver and press "F12" to open the developer tools. Then go to the console and type: window.cdc. Here you will see the properties that the protection against automated software reads.
JavaScript:
window.cdc_adoQpoasnfa76pfcZLmcfl_Array
window.cdc_adoQpoasnfa76pfcZLmcfl_Promise
window.cdc_adoQpoasnfa76pfcZLmcfl_Symbol
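The same lookup can be done from Python rather than by hand in the console. The sketch below only filters a list of property names; the function name find_cdc_properties is hypothetical, and the commented line that collects the names from a live browser is an untested assumption about how one might obtain them.

```python
def find_cdc_properties(window_keys):
    """Return the window property names that start with the cdc_ prefix."""
    return sorted(key for key in window_keys if key.startswith("cdc_"))


# With a live driver, the names could be collected like this (assumption):
# keys = driver.execute_script("return Object.getOwnPropertyNames(window)")
sample_keys = [
    "document",
    "cdc_adoQpoasnfa76pfcZLmcfl_Symbol",
    "location",
    "cdc_adoQpoasnfa76pfcZLmcfl_Array",
]
found = find_cdc_properties(sample_keys)
```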
Now it's time to move on to practice.
What do you need?
Apart from selenium, this code needs no third-party libraries. To install it, run in the terminal:
Bash:
pip install selenium
Now import the necessary libraries into the script.
Python:
import os
import time
from platform import system
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
As you can see, the Options class is imported here too, which means we will pass parameters to the browser. In this case we pass one parameter that disables the automation flag. Just in case, the channel also gives two additional options, which are left commented out and can be used if necessary.
Python:
options = Options()
# options.add_argument("--headless")
# Two optional extras from the channel, off by default.
# options.add_experimental_option("excludeSwitches", ["enable-automation"])
# options.add_experimental_option('useAutomationExtension', False)
# The one option this method relies on.
options.add_argument("--disable-blink-features=AutomationControlled")
We specify the path to the driver and pass the options and path to the browser object.
Python:
exec_path = os.path.join(os.getcwd(), 'driver', 'chromedriver.exe') if system() == "Windows" else \
    os.path.join(os.getcwd(), 'driver', 'chromedriver')
driver = webdriver.Chrome(options=options, service=Service(log_path=os.devnull, executable_path=exec_path))
Now we need to remove the properties discussed above from the current session. To do this, register a script that is evaluated in every new document before the page's own scripts run:
Python:
driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
    'source': '''
        delete window.cdc_adoQpoasnfa76pfcZLmcfl_Array;
        delete window.cdc_adoQpoasnfa76pfcZLmcfl_Promise;
        delete window.cdc_adoQpoasnfa76pfcZLmcfl_Symbol;
    '''
})
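If the list of properties ever needs to change, the script source can be generated from a list of names instead of being typed by hand. The following is a sketch under that assumption: CDC_PROPERTIES and build_cleanup_script are hypothetical names, and the property names are simply the three shown in the article.

```python
CDC_PROPERTIES = [
    "cdc_adoQpoasnfa76pfcZLmcfl_Array",
    "cdc_adoQpoasnfa76pfcZLmcfl_Promise",
    "cdc_adoQpoasnfa76pfcZLmcfl_Symbol",
]


def build_cleanup_script(names):
    """Build JavaScript source that deletes each named property from window."""
    for name in names:
        # Reject anything that is not a plain identifier, so no other
        # code can slip into the generated script.
        if not name.isidentifier():
            raise ValueError(f"not a valid property name: {name!r}")
    return "\n".join(f"delete window.{name};" for name in names)


script = build_cleanup_script(CDC_PROPERTIES)

# Possible usage with the driver created above:
# driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument",
#                        {"source": script})
```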
After this, we try to enter the protected site.
Python:
driver.get('https://nowsecure.nl')
time.sleep(10)
driver.close()
driver.quit()
And everything works. Disabling the option and deleting the window properties helped in this case: we got access to the page.

As we can see, the method works and can be used when parsing protected sites like these.
The complete script:
Python:
import os
import time
from platform import system
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
options = Options()
# options.add_argument("--headless")
# options.add_experimental_option("excludeSwitches", ["enable-automation"])
# options.add_experimental_option('useAutomationExtension', False)
options.add_argument("--disable-blink-features=AutomationControlled")
exec_path = os.path.join(os.getcwd(), 'driver', 'chromedriver.exe') if system() == "Windows" else \
    os.path.join(os.getcwd(), 'driver', 'chromedriver')
driver = webdriver.Chrome(options=options, service=Service(log_path=os.devnull, executable_path=exec_path))
driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
    'source': '''
        delete window.cdc_adoQpoasnfa76pfcZLmcfl_Array;
        delete window.cdc_adoQpoasnfa76pfcZLmcfl_Promise;
        delete window.cdc_adoQpoasnfa76pfcZLmcfl_Symbol;
    '''
})
driver.get('https://nowsecure.nl')
time.sleep(10)
driver.close()
driver.quit()
Gaining access using undetected-chromedriver
Here's another method that takes much less code. This time we will try to access the protected site using the undetected-chromedriver library. Its developer went further and made the library download the required driver itself and immediately modify it, setting up certain profiles. Moreover, if you already have selenium installed, you can specify the webdriver version, and it will be modified as well. To install the library, run in the terminal:
Bash:
pip install undetected-chromedriver
And that's it. You don't need to install anything else. The Selenium driver, as noted above, is pulled in automatically.
Now that the library is installed, import everything that is necessary for its operation into the script.
Python:
import time
import undetected_chromedriver as uc
That's it. As you can see, we import only the library itself and time, which sets a pause before the browser is closed.
With this library there is no need to specify the path to the webdriver: it is found automatically. All that is required is to create a webdriver object.
Python:
driver = uc.Chrome()
After that, go to the protected site. By the way, the site used throughout this article was made by the library's developer specifically to test the protection. As he writes, everything that can be enabled is enabled on it.
Python:
driver.get('https://nowsecure.nl')
time.sleep(10)
driver.close()
driver.quit()
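The create, use and quit sequence repeats in every example, so it could be wrapped in a context manager that always shuts the browser down, even if the page load raises an exception. This is a minimal sketch: managed_driver and the FakeDriver stand-in are hypothetical names, and the commented usage line assumes the uc.Chrome() call shown above.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver with factory() and always quit it afterwards."""
    driver = factory()
    try:
        yield driver
    finally:
        # quit() ends the whole browser session, including open windows.
        driver.quit()


class FakeDriver:
    """Hypothetical stand-in so the sketch runs without a browser."""

    def __init__(self):
        self.quit_called = False

    def quit(self):
        self.quit_called = True


with managed_driver(FakeDriver) as demo:
    pass  # with a real driver: demo.get('https://nowsecure.nl')

# Possible usage with the library from this section:
# with managed_driver(uc.Chrome) as driver:
#     driver.get('https://nowsecure.nl')
```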
As shown in the screenshot below, access to the site was successful.

Since this is still a Selenium driver, you can import Selenium into the script as well and perform all the same search and click operations as with regular Selenium.
The complete script:
Python:
import time
import undetected_chromedriver as uc
driver = uc.Chrome()
driver.get('https://nowsecure.nl')
time.sleep(10)
driver.close()
driver.quit()
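The search and click operations mentioned above might look like the sketch below. It relies on the standard find_element method, which takes a locator strategy and a value. The helper name click_first_link, the choice of the first link on the page and the fake classes are assumptions made so the example can run without a browser.

```python
def click_first_link(driver):
    """Click the first <a> element on the page and return its text."""
    # "css selector" is the strategy string that Selenium exposes as
    # By.CSS_SELECTOR; the plain string avoids an extra import here.
    link = driver.find_element("css selector", "a")
    text = link.text
    link.click()
    return text


class FakeLink:
    """Hypothetical element used only for a dry run."""

    def __init__(self, text):
        self.text = text
        self.clicked = False

    def click(self):
        self.clicked = True


class FakeDriver:
    """Hypothetical driver that always returns the same element."""

    def __init__(self, link):
        self.link = link
        self.last_query = None

    def find_element(self, by, value):
        self.last_query = (by, value)
        return self.link


demo_link = FakeLink("Home")
demo_driver = FakeDriver(demo_link)
clicked_text = click_first_link(demo_driver)
```

With the real library, the driver returned by uc.Chrome() would be passed in place of FakeDriver.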
This is a small overview. As you can see, these methods can give access even to a site protected from automated software, and at least two of the methods here work. Which one to use is up to you; it's a matter of preference. I use undetected-chromedriver, which takes less code for the same result as the Python Today code.
And that's probably all.
Thank you for your attention. I hope this information is useful to you.

