CSS Selectors and HTML Parsing

ZenRows® provides multiple ways to extract and format data from web pages. You can use CSS Selectors for direct JSON extraction, apply output filters for data transformation, or retrieve raw HTML for custom processing. This guide covers three main approaches to data extraction with ZenRows.

Using CSS Selectors
Using Output Filters
Using External Libraries

Using CSS Selectors

CSS selectors are patterns for selecting HTML elements. When you enable the css_extractor parameter, ZenRows returns structured JSON data instead of raw HTML.

Let’s say you want to scrape the title from the ScrapingCourse eCommerce page. The title is contained in an h1 tag. To extract it, send the css_extractor parameter with the value {"title": "h1"}. Send the value as a JSON string and make sure it is properly URL-encoded.

import json

import requests

api_key = "YOUR_ZENROWS_API_KEY"
url = "https://www.scrapingcourse.com/ecommerce/"
css_extractor = {"title": "h1"}

response = requests.get(
    "https://api.zenrows.com/v1/",
    params={
        "apikey": api_key,
        "url": url,
        # Send the selectors as a JSON string; requests URL-encodes it
        "css_extractor": json.dumps(css_extractor),
    },
)

print(response.json())

This code sends a request to ZenRows with the CSS selector h1 mapped to the key “title”. ZenRows extracts the content from the first h1 element and returns it as structured JSON data.
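
The encoding step can be sketched offline, without sending a request. This is a minimal illustration, assuming the css_extractor value travels as a JSON string in the query string; encode_css_extractor is a hypothetical helper name, not part of any ZenRows SDK.

```python
import json
from urllib.parse import parse_qs, urlencode


def encode_css_extractor(selectors: dict) -> str:
    """Serialize a selector mapping to a compact JSON string."""
    # Compact separators keep the resulting query string short.
    return json.dumps(selectors, separators=(",", ":"))


params = {
    "apikey": "YOUR_ZENROWS_API_KEY",
    "url": "https://www.scrapingcourse.com/ecommerce/",
    "css_extractor": encode_css_extractor({"title": "h1"}),
}

# urlencode percent-encodes the braces, quotes, and colon of the JSON string.
query_string = urlencode(params)
print(query_string)

# Decoding the query string gives back the original selector mapping.
decoded = json.loads(parse_qs(query_string)["css_extractor"][0])
print(decoded)
```

Passing the JSON string through the params argument of requests.get applies the same percent-encoding automatically, so manual encoding is only needed when you build the URL by hand.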

Extracting Multiple Elements

Now let’s extract multiple elements. Add the product names using the selector .product-name:

import json

import requests

api_key = "YOUR_ZENROWS_API_KEY"
url = "https://www.scrapingcourse.com/ecommerce/"
css_extractor = {
    "title": "h1",
    "products": ".product-name"
}

response = requests.get(
    "https://api.zenrows.com/v1/",
    params={
        "apikey": api_key,
        "url": url,
        # Send the selectors as a JSON string; requests URL-encodes it
        "css_extractor": json.dumps(css_extractor),
    },
)

print(response.json())

This request extracts both the page title and all product names. When a CSS selector matches multiple elements, ZenRows automatically returns them as an array.

The response looks like this:

{
    "title": "E-commerce Products",
    "products": [
        "Product 1",
        "Product 2",
        "Product 3"
        // ...
    ]
}
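
Because a field can hold either one value or an array, downstream code may want a uniform shape. The sketch below is one way to normalize it, assuming a single match comes back as a plain string and a missing match as None or an absent key; as_list and the sample data are hypothetical, not ZenRows output.

```python
def as_list(value):
    """Return the extracted value as a list, whatever shape it arrived in."""
    if value is None:
        return []          # selector matched nothing
    if isinstance(value, list):
        return value       # selector matched several elements
    return [value]         # selector matched exactly one element


# Hypothetical response data for illustration only.
data = {"title": "Example title", "products": ["Product 1", "Product 2"]}

titles = as_list(data.get("title"))
products = as_list(data.get("products"))
prices = as_list(data.get("prices"))  # key not present in the response
print(titles, products, prices)
```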

Extracting attributes

You might need product links to continue scraping individual product details. To extract the href attribute instead of text content, add @href to your selector.

Let’s filter links to only include those containing /product/:

import json

import requests

api_key = "YOUR_ZENROWS_API_KEY"
url = "https://www.scrapingcourse.com/ecommerce/"
css_extractor = {
    "title": "h1",
    "products": ".product-name",
    "links": "a[href*='/product/'] @href"
}

response = requests.get(
    "https://api.zenrows.com/v1/",
    params={
        "apikey": api_key,
        "url": url,
        # Send the selectors as a JSON string; requests URL-encodes it
        "css_extractor": json.dumps(css_extractor),
    },
)

print(response.json())

The @href syntax tells ZenRows to extract the href attribute value instead of the element’s text content. The [href*='/product/'] part filters links to only include those containing /product/ in their href attribute.

This returns:

{
    "title": "Shop",
    "products": [
        "Product 1",
        "Product 2",
        "Product 3"
        // ...
    ],
    "links": [
        "/product/1",
        "/product/2",
        "/product/3"
        // ...
    ]
}
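
To continue scraping individual product pages, extracted links usually need to be absolute URLs. The sketch below shows one way to resolve them with the standard library, assuming the links come back relative as in the sample response; to_absolute is a hypothetical helper name and the link values are illustrative.

```python
from urllib.parse import urljoin


def to_absolute(base_url, links):
    """Resolve links against base_url and drop duplicates, keeping order."""
    resolved = [urljoin(base_url, link) for link in links]
    # dict.fromkeys preserves first-seen order while removing repeats.
    return list(dict.fromkeys(resolved))


base_url = "https://www.scrapingcourse.com/ecommerce/"
links = ["/product/1", "/product/2", "/product/1"]

product_urls = to_absolute(base_url, links)
for product_url in product_urls:
    print(product_url)
```

Links that are already absolute pass through urljoin unchanged, so the helper is safe to apply in both cases.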

Using Output Filters

The outputs parameter extracts predefined data types from scraped HTML. This allows you to efficiently retrieve only the data types you need, reducing processing time and focusing on relevant information.

The parameter accepts a comma-separated list of filter names and returns results in structured JSON format.

Use outputs=* to retrieve all available data types.
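
If you keep filter names in a list, the comma-separated value can be built with a small helper. This is a minimal sketch; build_outputs_param is a hypothetical name, and the helper does not check that the names are valid filters.

```python
def build_outputs_param(filters):
    """Join filter names into the comma-separated outputs value."""
    names = [name.strip() for name in filters if name.strip()]
    if "*" in names:
        return "*"  # the wildcard already covers every data type
    # dict.fromkeys removes repeated names while keeping their order.
    return ",".join(dict.fromkeys(names))


outputs = build_outputs_param(["headings", "links", "menus"])
print(outputs)
```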

Extracting page structure

Get headings, links, and menu items to understand page structure:

import requests

api_key = "YOUR_ZENROWS_API_KEY"
url = "https://www.scrapingcourse.com/ecommerce/"

response = requests.get(
    "https://api.zenrows.com/v1/",
    params={
        "apikey": api_key,
        "url": url,
        "outputs": "headings,links,menus"
    }
)

print(response.json())

This extracts heading text from h1 through h6 elements, URLs from a tags, and menu items from li elements inside menu tags.

Extracting media content

Get all images, videos, and audio files from a page:

import requests

api_key = "YOUR_ZENROWS_API_KEY"
url = "https://www.scrapingcourse.com/ecommerce/"

response = requests.get(
    "https://api.zenrows.com/v1/",
    params={
        "apikey": api_key,
        "url": url,
        "outputs": "images,videos,audios"
    }
)

print(response.json())

This extracts image sources from img tags, video sources from source elements inside video tags, and audio sources from source elements inside audio tags.

For complete output filter options, see our Output Filters Documentation page.

Using External Libraries

If you prefer your own HTML parsing library, you can retrieve raw HTML from ZenRows and process it with tools like BeautifulSoup or Cheerio.

Python with BeautifulSoup

scraper.py

# pip install requests beautifulsoup4
import requests
from bs4 import BeautifulSoup

zenrows_api_base = "https://api.zenrows.com/v1/?apikey=YOUR_ZENROWS_API_KEY"
url = "https://www.scrapingcourse.com/ecommerce/"

response = requests.get(zenrows_api_base, params={"url": url})
response.raise_for_status()  # stop early on an HTTP error
soup = BeautifulSoup(response.text, "html.parser")

# Same selectors as the css_extractor examples above
heading = soup.find("h1")
title = heading.get_text(strip=True) if heading else None
products = [product.get_text(strip=True) for product in soup.select(".product-name")]
links = [link.get("href") for link in soup.select("a[href*='/product/']")]

result = {
    "title": title,
    "products": products,
    "links": links,
}
print(result)

This approach gives you full control over HTML parsing. You first retrieve the raw HTML from ZenRows, then use BeautifulSoup to parse and extract the data you need.

JavaScript with Cheerio

scraper.js

// npm i axios cheerio
const axios = require("axios");
const cheerio = require("cheerio");

const zenrows_api_base = "https://api.zenrows.com/v1/?apikey=YOUR_ZENROWS_API_KEY";
const url = "https://www.scrapingcourse.com/ecommerce/";

axios
    .get(zenrows_api_base, { params: { url } })
    .then((response) => {
        const $ = cheerio.load(response.data);

        const title = $("h1").first().text().trim();
        // Same selectors as the css_extractor examples above
        const products = $(".product-name")
            .map((_, el) => $(el).text().trim())
            .toArray();
        const links = $("a[href*='/product/']")
            .map((_, el) => $(el).attr("href"))
            .toArray();

        console.log({ title, products, links });
    })
    .catch((error) => console.error(error));

Cheerio provides a jQuery-like interface for server-side HTML manipulation. This example shows how to extract the same data using Cheerio’s familiar syntax.

Both approaches, CSS extractors and external libraries, offer flexibility for different use cases. CSS extractors provide immediate JSON output, while external libraries give you more control over complex parsing logic.

Testing Your Selectors

Before implementing your scraper at scale, test your CSS selectors using our Playground. The Playground shows you the extracted data in real-time and generates code in multiple programming languages for easy integration.

For more details, check the CSS Selectors documentation.

When to use each method

Choose your data extraction method based on your specific needs:

CSS Selectors - Best for custom data extraction when you know exactly what elements you need. Returns clean JSON data with your own key names and structure.
Output Filters - Ideal for extracting common data types like emails, phone numbers, images, and links. Perfect when you need standard web data without custom parsing.
External Libraries - Perfect when you need complex parsing logic, custom data transformations, or when integrating with existing parsing workflows.
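
The three options differ only in which query parameter accompanies the target URL. The sketch below summarizes that, using the parameter names from the examples in this guide; build_params and its method argument are hypothetical and not part of a ZenRows SDK.

```python
import json


def build_params(api_key, url, method="raw", css_extractor=None, outputs=None):
    """Build ZenRows query parameters for one extraction method.

    method is "css" (CSS Selectors), "outputs" (Output Filters),
    or "raw" (raw HTML for an external parsing library).
    """
    params = {"apikey": api_key, "url": url}
    if method == "css":
        # Selectors are sent as a JSON string.
        params["css_extractor"] = json.dumps(css_extractor)
    elif method == "outputs":
        # Comma-separated filter names, for example "images,links".
        params["outputs"] = outputs
    elif method != "raw":
        raise ValueError(f"unknown method: {method!r}")
    return params


target = "https://www.scrapingcourse.com/ecommerce/"
css_params = build_params("YOUR_ZENROWS_API_KEY", target, "css", css_extractor={"title": "h1"})
filter_params = build_params("YOUR_ZENROWS_API_KEY", target, "outputs", outputs="images,links")
raw_params = build_params("YOUR_ZENROWS_API_KEY", target)
print(css_params, filter_params, raw_params, sep="\n")
```

Each resulting dictionary can be passed as the params argument of requests.get, as in the earlier examples.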

Further Reading

Monitoring & Tracking
Real Estate
General Topics
