Replace your current web scraping solution with ZenRows to improve reliability, bypass anti-bot protection, and reduce maintenance overhead. This guide shows you how to integrate ZenRows into existing price monitoring systems.
Who is this for?
This guide is for developers who already have a price monitoring system and want to integrate ZenRows to improve scraping reliability and reduce blocking issues.
What you’ll learn
Replace existing HTTP clients with ZenRows requests
Migrate from HTML parsing to CSS extraction
Optimize performance with concurrency controls
Monitor and control scraping costs
Scale across multiple regions
Prerequisites
An existing price monitoring system
A ZenRows API key (sign up to get one)
Basic understanding of web scraping concepts
Integration Approaches
Choose the integration approach that best fits your current system:
Approach 1: Minimal Integration (HTTP Client Replacement)
Replace your current HTTP client with ZenRows while keeping existing parsing logic.
Before (typical implementation):
import requests
from bs4 import BeautifulSoup

def get_page_content(url):
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
        "Accept-Language": "en-US,en;q=0.9",
    }
    response = requests.get(url, headers=headers)
    if response.status_code != 200:
        raise Exception(f"Request failed: {response.status_code}")
    return response.content

def scrape_price_data(url):
    html_content = get_page_content(url)
    soup = BeautifulSoup(html_content, "html.parser")

    # Extract data using BeautifulSoup
    name_elem = soup.select_one("#productTitle")
    price_elem = soup.select_one("span.aok-offscreen")

    return {
        "name": name_elem.get_text(strip=True) if name_elem else None,
        "price": price_elem.get_text(strip=True) if price_elem else None,
    }
After (ZenRows integration):
import requests
from bs4 import BeautifulSoup

def get_page_content(url):
    params = {
        "url": url,
        "apikey": "YOUR_ZENROWS_API_KEY",
        "js_render": "true",
        "premium_proxy": "true",
        "proxy_country": "us",
        "wait": 2000,
    }
    response = requests.get("https://api.zenrows.com/v1/", params=params)
    if response.status_code != 200:
        raise Exception(f"Request failed: {response.status_code}")
    return response.text

# Keep existing parsing logic unchanged
def scrape_price_data(url):
    html_content = get_page_content(url)
    soup = BeautifulSoup(html_content, "html.parser")

    # Same parsing logic as before
    name_elem = soup.select_one("#productTitle")
    price_elem = soup.select_one("span.aok-offscreen")

    return {
        "name": name_elem.get_text(strip=True) if name_elem else None,
        "price": price_elem.get_text(strip=True) if price_elem else None,
    }
Benefits:
Minimal code changes required
Immediate anti-bot protection
Keep existing data processing logic
Easy rollback if needed (see the toggle sketch below)
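One simple way to keep rollback cheap is to gate the ZenRows path behind a configuration flag so the legacy fetcher stays available. A minimal sketch, assuming a hypothetical USE_ZENROWS environment variable and that you keep both fetchers side by side (renamed here as get_page_content_legacy and get_page_content_zenrows):

import os

def get_page_content(url):
    # Hypothetical toggle: route through ZenRows only while USE_ZENROWS is enabled,
    # so a rollback is a configuration change rather than a code change.
    if os.getenv("USE_ZENROWS", "true").lower() == "true":
        return get_page_content_zenrows(url)  # ZenRows-backed fetcher shown above
    return get_page_content_legacy(url)  # original requests-based fetcher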
Approach 2: Full Integration (CSS Extraction)
Replace both your HTTP client and HTML parsing with ZenRows CSS extraction for cleaner, more maintainable code.
Before:
import requests
from bs4 import BeautifulSoup

def scrape_price_data(url):
    # HTTP request + HTML parsing
    headers = {"User-Agent": "Mozilla/5.0..."}
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.content, "html.parser")

    # Manual element extraction
    name_elem = soup.select_one("#productTitle")
    price_elem = soup.select_one("span.aok-offscreen")
    rating_elem = soup.select_one("span.a-size-base.a-color-base")

    return {
        "name": name_elem.get_text(strip=True) if name_elem else None,
        "price": price_elem.get_text(strip=True) if price_elem else None,
        "rating": rating_elem.get_text(strip=True) if rating_elem else None,
    }
After:
import requests
import json

def scrape_price_data(url):
    # Define CSS selectors
    css_extractor = json.dumps({
        "name": "#productTitle",
        "price": "span.aok-offscreen",
        "rating": "span.a-size-base.a-color-base",
    })

    # Single ZenRows request with extraction
    params = {
        "url": url,
        "apikey": "YOUR_ZENROWS_API_KEY",
        "js_render": "true",
        "premium_proxy": "true",
        "proxy_country": "us",
        "css_extractor": css_extractor,
    }
    response = requests.get("https://api.zenrows.com/v1/", params=params)
    if response.status_code != 200:
        raise Exception(f"Request failed: {response.status_code}")
    return response.json()
Benefits:
Eliminates HTML parsing dependencies
Cleaner, more maintainable code
Built-in data extraction
Reduced code complexity
Step-by-Step Integration
Assess Your Current Implementation
Before integrating ZenRows, analyze your current scraping setup to choose the best integration approach.
Identify your current components:
HTTP client (requests, urllib, etc.)
HTML parser (BeautifulSoup, lxml, etc.)
Data extraction logic
Error handling mechanisms
Proxy management (if any)
Common patterns to look for:

# Pattern 1: Simple requests + BeautifulSoup
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, "html.parser")

# Pattern 2: Session-based scraping
session = requests.Session()
session.headers.update(headers)
response = session.get(url)

# Pattern 3: Selenium-based scraping
driver = webdriver.Chrome()
driver.get(url)
element = driver.find_element(By.CSS_SELECTOR, selector)

# Pattern 4: Custom proxy rotation
proxies = {"http": proxy_url, "https": proxy_url}
response = requests.get(url, proxies=proxies)
Integration complexity assessment:
Low complexity : Simple requests + BeautifulSoup → Use Approach 1
Medium complexity : Custom headers/sessions → Use Approach 1 or 2
High complexity : Selenium/complex proxy logic → Use Approach 2
Replace HTTP Client
Start with the minimal integration approach by replacing your HTTP client with ZenRows.
For requests-based systems:

# Original function
def fetch_page(url, headers=None, proxies=None):
    response = requests.get(url, headers=headers, proxies=proxies)
    return response.text

# ZenRows replacement
def fetch_page(url, apikey, country="us"):
    params = {
        "url": url,
        "apikey": apikey,
        "js_render": "true",
        "premium_proxy": "true",
        "proxy_country": country,
    }
    response = requests.get("https://api.zenrows.com/v1/", params=params)
    return response.text
For Puppeteer/Playwright-based systems:

// Original Puppeteer approach
const puppeteer = require('puppeteer');

async function scrapeWithPuppeteer(url) {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: 'networkidle2' });
  const html = await page.content();
  await browser.close();
  return html;
}

// ZenRows Scraping Browser replacement
const puppeteer = require('puppeteer-core');

async function scrapeWithZenRows(url) {
  const browser = await puppeteer.connect({
    browserWSEndpoint: 'wss://browser.zenrows.com?apikey=YOUR_ZENROWS_API_KEY'
  });
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: 'networkidle2' });
  const html = await page.content();
  await browser.close();
  return html;
}
# Original Playwright approach
from playwright.sync_api import sync_playwright

def scrape_with_playwright(url):
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)
        html = page.content()
        browser.close()
        return html

# ZenRows Scraping Browser replacement
from playwright.sync_api import sync_playwright

def scrape_with_zenrows_browser(url):
    with sync_playwright() as p:
        browser = p.chromium.connect_over_cdp(
            "wss://browser.zenrows.com?apikey=YOUR_ZENROWS_API_KEY"
        )
        page = browser.new_page()
        page.goto(url)
        html = page.content()
        browser.close()
        return html
Migrate to CSS Extraction (Optional)
For cleaner code and better maintainability, replace HTML parsing with ZenRows CSS extraction.
Identify your current selectors:

# Current BeautifulSoup selectors
soup = BeautifulSoup(html, "html.parser")
name = soup.select_one("#productTitle").get_text(strip=True)
price = soup.select_one("span.aok-offscreen").get_text(strip=True)
rating = soup.select_one("span.a-size-base").get_text(strip=True)
reviews = soup.select_one("#acrCustomerReviewText").get_text(strip=True)
Convert to CSS extractor:

import json
import requests

# Define CSS extractor
css_extractor = json.dumps({
    "name": "#productTitle",
    "price": "span.aok-offscreen",
    "rating": "span.a-size-base",
    "reviews": "#acrCustomerReviewText",
})

def scrape_with_css_extractor(url, apikey):
    params = {
        "url": url,
        "apikey": apikey,
        "js_render": "true",
        "premium_proxy": "true",
        "css_extractor": css_extractor,
    }
    response = requests.get("https://api.zenrows.com/v1/", params=params)
    return response.json()  # Returns structured data directly
Migration Checklist
Pre-Migration
Document current scraping logic and selectors
Identify proxy and header requirements
Test ZenRows with sample requests (see the sketch after this list)
Plan rollback strategy
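Before committing to the migration, a quick smoke test against one URL you already monitor helps confirm access and response format. A minimal sketch, assuming a valid API key and a placeholder product URL:

import requests

params = {
    "url": "https://www.example.com/product-page",  # any page you already monitor
    "apikey": "YOUR_ZENROWS_API_KEY",
    "js_render": "true",
    "premium_proxy": "true",
}
response = requests.get("https://api.zenrows.com/v1/", params=params)
print(response.status_code)  # expect 200
print(response.text[:500])  # spot-check the returned HTML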
During Migration
Replace HTTP client with ZenRows API calls
Update error handling for ZenRows responses
Test with production URLs
Monitor request costs and concurrency
Post-Migration
Remove old proxy management code
Clean up unused HTML parsing dependencies
Update monitoring and alerting
Document new ZenRows integration
Best Practices
Cost Management
Set daily cost limits to prevent unexpected charges
Monitor request costs with the X-Request-Cost response header (see the sketch after this list)
Use the Concurrency-Limit and Concurrency-Remaining response headers to optimize throughput and avoid hitting concurrency (429) errors
Cache results when appropriate to reduce requests
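A minimal sketch of per-request cost tracking, assuming the X-Request-Cost response header mentioned above; the daily budget value and the in-memory counter are illustrative, not a ZenRows feature:

import requests

DAILY_COST_LIMIT = 100.0  # illustrative budget in API credits
daily_cost = 0.0

def fetch_with_cost_tracking(url, apikey):
    global daily_cost
    if daily_cost >= DAILY_COST_LIMIT:
        raise RuntimeError("Daily cost limit reached; skipping request")
    params = {"url": url, "apikey": apikey, "js_render": "true", "premium_proxy": "true"}
    response = requests.get("https://api.zenrows.com/v1/", params=params)
    # Add this request's reported cost to the running total (0 if the header is absent)
    daily_cost += float(response.headers.get("X-Request-Cost", 0))
    return response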
Reliability
Implement retry logic with exponential backoff (see the sketch after this list)
Monitor selector stability and update as needed
Use fallback selectors for critical data points
Log all requests, responses, and errors for debugging and monitoring
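A sketch of the retry bullet above: retry on network errors and non-200 responses with exponential backoff. The retry count, timeout, and delays are assumptions to adapt to your system, not ZenRows recommendations:

import time
import requests

def fetch_with_retries(url, apikey, max_retries=3):
    params = {"url": url, "apikey": apikey, "js_render": "true", "premium_proxy": "true"}
    last_error = None
    for attempt in range(max_retries):
        try:
            response = requests.get("https://api.zenrows.com/v1/", params=params, timeout=90)
            if response.status_code == 200:
                return response
            last_error = f"HTTP {response.status_code}"
        except requests.RequestException as exc:  # timeouts, connection errors
            last_error = str(exc)
        time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s, ...
    raise RuntimeError(f"Request failed after {max_retries} retries: {last_error}")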
Performance
Use appropriate concurrency based on your plan limits
Batch requests when monitoring multiple products (see the sketch after this list)
Leverage geographic targeting for region-specific data
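One way to batch a product list while staying inside your plan's concurrency limit is a thread pool sized to that limit. A sketch, assuming a scrape_price_data function like the ones above and an illustrative limit of 5:

from concurrent.futures import ThreadPoolExecutor, as_completed

PLAN_CONCURRENCY = 5  # illustrative; match your ZenRows plan's concurrency limit

def scrape_all(urls):
    results = {}
    with ThreadPoolExecutor(max_workers=PLAN_CONCURRENCY) as pool:
        futures = {pool.submit(scrape_price_data, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                results[url] = {"error": str(exc)}  # keep going on individual failures
    return results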
Troubleshooting
Selector Compatibility
Test selectors in ZenRows Request Playground before migration
Some selectors may behave differently with JavaScript rendering
Use more specific selectors if extraction returns unexpected data
Cost Optimization
Disable js_render for static content to reduce costs
Use wait_for instead of wait when possible (see the sketch after this list)
Monitor Concurrency-Limit and Concurrency-Remaining response headers to maximize throughput
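For example, waiting for the price element instead of a fixed delay could look like the sketch below; the selector is the one used earlier in this guide, and the actual savings depend on how long your fixed wait currently is:

import requests

def fetch_when_price_ready(url, apikey):
    params = {
        "url": url,
        "apikey": apikey,
        "js_render": "true",
        "wait_for": "span.aok-offscreen",  # proceed as soon as the price element exists
        # "wait": 2000,  # fixed-delay alternative
    }
    return requests.get("https://api.zenrows.com/v1/", params=params).text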
Error Handling
ZenRows returns specific error codes for different failure modes; see the API Error Codes documentation for details.
Implement specific handling for rate limits (429) and quota exceeded (402), as in the sketch below
Log ZenRows-specific headers for debugging
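A sketch of status-specific handling based on the codes listed above: back off on rate limits, stop and alert on quota exhaustion. How you surface these to monitoring is up to your system; the exceptions here are placeholders:

import time
import requests

def fetch_with_error_handling(url, apikey):
    params = {"url": url, "apikey": apikey, "js_render": "true", "premium_proxy": "true"}
    response = requests.get("https://api.zenrows.com/v1/", params=params)
    if response.status_code == 429:
        # Rate limited: back off, then let your retry logic try again
        time.sleep(5)
        raise RuntimeError("Rate limited (429); retry after backoff")
    if response.status_code == 402:
        # Quota exceeded: retrying won't help, surface this to monitoring instead
        raise RuntimeError("Quota exceeded (402); check your ZenRows plan usage")
    response.raise_for_status()  # any other non-2xx error
    return response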