Replace your current web scraping solution with ZenRows to improve reliability, bypass anti-bot protection, and reduce maintenance overhead. This guide shows you how to integrate ZenRows into existing price monitoring systems.
Who is this for?
This guide is for developers who already have a price monitoring system and want to integrate ZenRows to improve scraping reliability and reduce blocking issues.
What you’ll learn
Replace existing HTTP clients with ZenRows requests
Migrate from HTML parsing to CSS extraction
Optimize performance with concurrency controls
Monitor and control scraping costs
Scale across multiple regions
Prerequisites
An existing price monitoring system
A ZenRows API key (sign up to get one)
Basic understanding of web scraping concepts
Integration Approaches
Choose the integration approach that best fits your current system:
Approach 1: Minimal Integration (HTTP Client Replacement)
Replace your current HTTP client with ZenRows while keeping existing parsing logic.
Before (typical implementation):
import requests
from bs4 import BeautifulSoup

def get_page_content(url):
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
        "Accept-Language": "en-US,en;q=0.9",
    }
    response = requests.get(url, headers=headers)
    if response.status_code != 200:
        raise Exception(f"Request failed: {response.status_code}")
    return response.content

def scrape_price_data(url):
    html_content = get_page_content(url)
    soup = BeautifulSoup(html_content, "html.parser")

    # Extract data using BeautifulSoup
    name_elem = soup.select_one("#productTitle")
    price_elem = soup.select_one("span.aok-offscreen")

    return {
        "name": name_elem.get_text(strip=True) if name_elem else None,
        "price": price_elem.get_text(strip=True) if price_elem else None,
    }
After (ZenRows integration):
import requests
from bs4 import BeautifulSoup

def get_page_content(url):
    params = {
        "url": url,
        "apikey": "YOUR_ZENROWS_API_KEY",
        "js_render": "true",
        "premium_proxy": "true",
        "proxy_country": "us",
        "wait": 2000,
    }
    response = requests.get("https://api.zenrows.com/v1/", params=params)
    if response.status_code != 200:
        raise Exception(f"Request failed: {response.status_code}")
    return response.text

# Keep existing parsing logic unchanged
def scrape_price_data(url):
    html_content = get_page_content(url)
    soup = BeautifulSoup(html_content, "html.parser")

    # Same parsing logic as before
    name_elem = soup.select_one("#productTitle")
    price_elem = soup.select_one("span.aok-offscreen")

    return {
        "name": name_elem.get_text(strip=True) if name_elem else None,
        "price": price_elem.get_text(strip=True) if price_elem else None,
    }
Benefits:
Minimal code changes required
Immediate anti-bot protection
Keep existing data processing logic
Easy rollback if needed (see the toggle sketch below)
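One simple way to keep rollback cheap is to gate the ZenRows path behind a configuration flag so the legacy fetcher stays available. A minimal sketch, assuming a hypothetical USE_ZENROWS environment variable and that you keep both fetchers side by side (renamed here as get_page_content_legacy and get_page_content_zenrows):

import os

def get_page_content(url):
    # Hypothetical toggle: route through ZenRows only while USE_ZENROWS is enabled,
    # so a rollback is a configuration change rather than a code change.
    if os.getenv("USE_ZENROWS", "true").lower() == "true":
        return get_page_content_zenrows(url)  # ZenRows-backed fetcher shown above
    return get_page_content_legacy(url)  # original requests-based fetcher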
Approach 2: Full Integration (CSS Extraction)
Replace both your HTTP client and HTML parsing with ZenRows CSS extraction for cleaner, more maintainable code.
Before:
import requests
from bs4 import BeautifulSoup

def scrape_price_data(url):
    # HTTP request + HTML parsing
    headers = {"User-Agent": "Mozilla/5.0..."}
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.content, "html.parser")

    # Manual element extraction
    name_elem = soup.select_one("#productTitle")
    price_elem = soup.select_one("span.aok-offscreen")
    rating_elem = soup.select_one("span.a-size-base.a-color-base")

    return {
        "name": name_elem.get_text(strip=True) if name_elem else None,
        "price": price_elem.get_text(strip=True) if price_elem else None,
        "rating": rating_elem.get_text(strip=True) if rating_elem else None,
    }
After:
import requests
import json

def scrape_price_data(url):
    # Define CSS selectors
    css_extractor = json.dumps({
        "name": "#productTitle",
        "price": "span.aok-offscreen",
        "rating": "span.a-size-base.a-color-base",
    })

    # Single ZenRows request with extraction
    params = {
        "url": url,
        "apikey": "YOUR_ZENROWS_API_KEY",
        "js_render": "true",
        "premium_proxy": "true",
        "proxy_country": "us",
        "css_extractor": css_extractor,
    }
    response = requests.get("https://api.zenrows.com/v1/", params=params)
    if response.status_code != 200:
        raise Exception(f"Request failed: {response.status_code}")
    return response.json()
Benefits:
Eliminates HTML parsing dependencies
Cleaner, more maintainable code
Built-in data extraction
Reduced code complexity
Step-by-Step Integration
Assess Your Current Implementation
Before integrating ZenRows, analyze your current scraping setup to choose the best integration approach.
Identify your current components:
HTTP client (requests, urllib, etc.)
HTML parser (BeautifulSoup, lxml, etc.)
Data extraction logic
Error handling mechanisms
Proxy management (if any)
Common patterns to look for:

# Pattern 1: Simple requests + BeautifulSoup
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, "html.parser")

# Pattern 2: Session-based scraping
session = requests.Session()
session.headers.update(headers)
response = session.get(url)

# Pattern 3: Selenium-based scraping
driver = webdriver.Chrome()
driver.get(url)
element = driver.find_element(By.CSS_SELECTOR, selector)

# Pattern 4: Custom proxy rotation
proxies = {"http": proxy_url, "https": proxy_url}
response = requests.get(url, proxies=proxies)
Integration complexity assessment:
Low complexity : Simple requests + BeautifulSoup → Use Approach 1
Medium complexity : Custom headers/sessions → Use Approach 1 or 2
High complexity : Selenium/complex proxy logic → Use Approach 2
Replace HTTP Client
Start with the minimal integration approach by replacing your HTTP client with ZenRows.
For requests-based systems:

# Original function
def fetch_page(url, headers=None, proxies=None):
    response = requests.get(url, headers=headers, proxies=proxies)
    return response.text

# ZenRows replacement
def fetch_page(url, apikey, country="us"):
    params = {
        "url": url,
        "apikey": apikey,
        "js_render": "true",
        "premium_proxy": "true",
        "proxy_country": country,
    }
    response = requests.get("https://api.zenrows.com/v1/", params=params)
    return response.text
For Puppeteer/Playwright-based systems:

// Original Puppeteer approach
const puppeteer = require('puppeteer');

async function scrapeWithPuppeteer(url) {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: 'networkidle2' });
  const html = await page.content();
  await browser.close();
  return html;
}

// ZenRows Scraping Browser replacement
const puppeteer = require('puppeteer-core');

async function scrapeWithZenRows(url) {
  const browser = await puppeteer.connect({
    browserWSEndpoint: 'wss://browser.zenrows.com?apikey=YOUR_ZENROWS_API_KEY'
  });
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: 'networkidle2' });
  const html = await page.content();
  await browser.close();
  return html;
}
# Original Playwright approach
from playwright.sync_api import sync_playwright

def scrape_with_playwright(url):
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto(url)
        html = page.content()
        browser.close()
        return html

# ZenRows Scraping Browser replacement
from playwright.sync_api import sync_playwright

def scrape_with_zenrows_browser(url):
    with sync_playwright() as p:
        browser = p.chromium.connect_over_cdp(
            "wss://browser.zenrows.com?apikey=YOUR_ZENROWS_API_KEY"
        )
        page = browser.new_page()
        page.goto(url)
        html = page.content()
        browser.close()
        return html
Migrate to CSS Extraction (Optional)
For cleaner code and better maintainability, replace HTML parsing with ZenRows CSS extraction.
Identify your current selectors:

# Current BeautifulSoup selectors
soup = BeautifulSoup(html, "html.parser")
name = soup.select_one("#productTitle").get_text(strip=True)
price = soup.select_one("span.aok-offscreen").get_text(strip=True)
rating = soup.select_one("span.a-size-base").get_text(strip=True)
reviews = soup.select_one("#acrCustomerReviewText").get_text(strip=True)
Convert to CSS extractor:

import json
import requests

# Define CSS extractor
css_extractor = json.dumps({
    "name": "#productTitle",
    "price": "span.aok-offscreen",
    "rating": "span.a-size-base",
    "reviews": "#acrCustomerReviewText",
})

def scrape_with_css_extractor(url, apikey):
    params = {
        "url": url,
        "apikey": apikey,
        "js_render": "true",
        "premium_proxy": "true",
        "css_extractor": css_extractor,
    }
    response = requests.get("https://api.zenrows.com/v1/", params=params)
    return response.json()  # Returns structured data directly
Migration Checklist
Pre-Migration
Document current scraping logic and selectors
Identify proxy and header requirements
Test ZenRows with sample requests (see the sketch after this list)
Plan rollback strategy
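Before committing to the migration, a quick smoke test against one URL you already monitor helps confirm access and response format. A minimal sketch, assuming a valid API key and a placeholder product URL:

import requests

params = {
    "url": "https://www.example.com/product-page",  # any page you already monitor
    "apikey": "YOUR_ZENROWS_API_KEY",
    "js_render": "true",
    "premium_proxy": "true",
}
response = requests.get("https://api.zenrows.com/v1/", params=params)
print(response.status_code)  # expect 200
print(response.text[:500])  # spot-check the returned HTML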
During Migration
Replace HTTP client with ZenRows API calls
Update error handling for ZenRows responses
Test with production URLs
Monitor request costs and concurrency
Post-Migration
Remove old proxy management code
Clean up unused HTML parsing dependencies
Update monitoring and alerting
Document new ZenRows integration
Best Practices
Cost Management
Set daily cost limits to prevent unexpected charges
Monitor request costs with the X-Request-Cost response header (see the sketch after this list)
Use the Concurrency-Limit and Concurrency-Remaining response headers to optimize throughput and avoid hitting concurrency (429) errors
Cache results when appropriate to reduce requests
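A minimal sketch of per-request cost tracking, assuming the X-Request-Cost response header mentioned above; the daily budget value and the in-memory counter are illustrative, not a ZenRows feature:

import requests

DAILY_COST_LIMIT = 100.0  # illustrative budget in API credits
daily_cost = 0.0

def fetch_with_cost_tracking(url, apikey):
    global daily_cost
    if daily_cost >= DAILY_COST_LIMIT:
        raise RuntimeError("Daily cost limit reached; skipping request")
    params = {"url": url, "apikey": apikey, "js_render": "true", "premium_proxy": "true"}
    response = requests.get("https://api.zenrows.com/v1/", params=params)
    # Add this request's reported cost to the running total (0 if the header is absent)
    daily_cost += float(response.headers.get("X-Request-Cost", 0))
    return response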
Reliability
Implement retry logic with exponential backoff (see the sketch after this list)
Monitor selector stability and update as needed
Use fallback selectors for critical data points
Log all requests, responses, and errors for debugging and monitoring
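A sketch of the retry bullet above: retry on network errors and non-200 responses with exponential backoff. The retry count, timeout, and delays are assumptions to adapt to your system, not ZenRows recommendations:

import time
import requests

def fetch_with_retries(url, apikey, max_retries=3):
    params = {"url": url, "apikey": apikey, "js_render": "true", "premium_proxy": "true"}
    last_error = None
    for attempt in range(max_retries):
        try:
            response = requests.get("https://api.zenrows.com/v1/", params=params, timeout=90)
            if response.status_code == 200:
                return response
            last_error = f"HTTP {response.status_code}"
        except requests.RequestException as exc:  # timeouts, connection errors
            last_error = str(exc)
        time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s, ...
    raise RuntimeError(f"Request failed after {max_retries} retries: {last_error}")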
Performance
Use appropriate concurrency based on your plan limits
Batch requests when monitoring multiple products (see the sketch after this list)
Leverage geographic targeting for region-specific data
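One way to batch a product list while staying inside your plan's concurrency limit is a thread pool sized to that limit. A sketch, assuming a scrape_price_data function like the ones above and an illustrative limit of 5:

from concurrent.futures import ThreadPoolExecutor, as_completed

PLAN_CONCURRENCY = 5  # illustrative; match your ZenRows plan's concurrency limit

def scrape_all(urls):
    results = {}
    with ThreadPoolExecutor(max_workers=PLAN_CONCURRENCY) as pool:
        futures = {pool.submit(scrape_price_data, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                results[url] = {"error": str(exc)}  # keep going on individual failures
    return results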
Troubleshooting
Selector Compatibility
Test selectors in ZenRows Request Playground before migration
Some selectors may behave differently with JavaScript rendering
Use more specific selectors if extraction returns unexpected data
Cost Optimization
Disable js_render for static content to reduce costs
Use wait_for instead of wait when possible (see the sketch after this list)
Monitor Concurrency-Limit and Concurrency-Remaining response headers to maximize throughput
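For example, waiting for the price element instead of a fixed delay could look like the sketch below; the selector is the one used earlier in this guide, and the actual savings depend on how long your fixed wait currently is:

import requests

def fetch_when_price_ready(url, apikey):
    params = {
        "url": url,
        "apikey": apikey,
        "js_render": "true",
        "wait_for": "span.aok-offscreen",  # proceed as soon as the price element exists
        # "wait": 2000,  # fixed-delay alternative
    }
    return requests.get("https://api.zenrows.com/v1/", params=params).text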
Error Handling
ZenRows returns specific error codes for different failure modes; see the API Error Codes documentation for details.
Implement specific handling for rate limits (429) and quota exceeded (402), as in the sketch below
Log ZenRows-specific headers for debugging
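A sketch of status-specific handling based on the codes listed above: back off on rate limits, stop and alert on quota exhaustion. How you surface these to monitoring is up to your system; the exceptions here are placeholders:

import time
import requests

def fetch_with_error_handling(url, apikey):
    params = {"url": url, "apikey": apikey, "js_render": "true", "premium_proxy": "true"}
    response = requests.get("https://api.zenrows.com/v1/", params=params)
    if response.status_code == 429:
        # Rate limited: back off, then let your retry logic try again
        time.sleep(5)
        raise RuntimeError("Rate limited (429); retry after backoff")
    if response.status_code == 402:
        # Quota exceeded: retrying won't help, surface this to monitoring instead
        raise RuntimeError("Quota exceeded (402); check your ZenRows plan usage")
    response.raise_for_status()  # any other non-2xx error
    return response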