Field processor API

API reference for the provided field processors.

Built-in field processors

zyte_common_items.processors.brand_processor(value: Union[Selector, HtmlElement], page: Any) -> Any

Convert the data into a brand name if possible.

Supported inputs are Selector, SelectorList and HtmlElement. Other inputs are returned as is.
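
The "returned as is" contract above can be illustrated with a minimal sketch. Note that strip_processor is a hypothetical processor, not part of zyte_common_items, and that it handles str rather than selector types only so that the example runs without extra dependencies. It assumes nothing beyond the (value, page) signature shown in this reference.

```python
from typing import Any


def strip_processor(value: Any, page: Any) -> Any:
    """Hypothetical processor: trim strings, pass everything else through."""
    if isinstance(value, str):
        return value.strip()
    # Unsupported input types are returned unchanged, mirroring the
    # pass-through behavior documented for the built-in processors.
    return value


print(strip_processor("  Acme  ", page=None))  # supported input is converted
print(strip_processor(42, page=None))  # unsupported input comes back as is
```

Passing unsupported values through unchanged lets a field return an already final value without the processor interfering.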

zyte_common_items.processors.breadcrumbs_processor(value: Any, page: Any) -> Any

Convert the data into a list of Breadcrumb objects if possible.

Supported inputs are Selector, SelectorList, HtmlElement and an iterable of zyte_parsers.Breadcrumb objects. Other inputs are returned as is.

zyte_common_items.processors.description_processor(value: Any, page: Any) -> Any

Convert the data into cleaned-up text if possible.

Uses the clear-html library.

Supported inputs are Selector, SelectorList and HtmlElement. Other inputs are returned as is.

Puts the cleaned HtmlElement object into page._description_node and the cleaned text into page._description_str.

zyte_common_items.processors.description_html_processor(value: Union[Selector, HtmlElement], page: Any) -> Any

Convert the data into cleaned-up HTML if possible.

Uses the clear-html library.

Supported inputs are Selector, SelectorList and HtmlElement. Other inputs are returned as is.

Puts the cleaned HtmlElement object into page._descriptionHtml_node.

zyte_common_items.processors.gtin_processor(value: Union[SelectorList, Selector, HtmlElement, str], page: Any) -> Any

Convert the data into a list of Gtin objects if possible.

Supported inputs are str, Selector, SelectorList, HtmlElement, an iterable of str and an iterable of zyte_parsers.Gtin objects. Other inputs are returned as is.

zyte_common_items.processors.price_processor(value: Union[Selector, HtmlElement], page: Any) -> Any

Convert the data into a price string if possible.

Uses the price-parser library.

Supported inputs are Selector, SelectorList and HtmlElement. Other inputs are returned as is.

Puts the parsed Price object into page._parsed_price.
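
The pattern of returning a processed value while also storing an intermediate object on the page can be sketched as follows. This is only an illustration under stated assumptions: sketch_price_processor, ParsedPrice and the regular expression are hypothetical, the input is a plain str instead of a selector, and a simple regex stands in for the price-parser library. Returning None when nothing matches is a choice made for this sketch, not documented behavior.

```python
import re
from dataclasses import dataclass
from decimal import Decimal
from types import SimpleNamespace
from typing import Any, Optional


@dataclass
class ParsedPrice:
    """Hypothetical stand-in for the parsed price object."""

    amount: Decimal
    currency: Optional[str]


# Optional currency symbol followed by a plain decimal amount, e.g. "$19.99".
_PRICE_RE = re.compile(r"(?P<currency>[^\d\s.,]+)?\s*(?P<amount>\d+(?:\.\d+)?)")


def sketch_price_processor(value: Any, page: Any) -> Any:
    """Hypothetical processor: return a price string, cache the parsed object."""
    if not isinstance(value, str):
        return value  # unsupported input is returned as is
    match = _PRICE_RE.search(value)
    if match is None:
        return None  # sketch-only choice for unparseable text
    parsed = ParsedPrice(Decimal(match.group("amount")), match.group("currency"))
    # Keep the intermediate result on the page object for later reuse.
    page._parsed_price = parsed
    return str(parsed.amount)


page = SimpleNamespace()  # stand-in for a page object
price = sketch_price_processor("$19.99", page)
print(price)
print(page._parsed_price.currency)
```

One plausible reason for caching the parsed object on the page is that other code on the same page can read it later without parsing the price text again.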

zyte_common_items.processors.rating_processor(value: Any, page: Any) -> Any

Convert the data into an AggregateRating object if possible.

Supported inputs are selector-like objects (Selector, SelectorList, or HtmlElement).

The input can also be a dictionary with one or more of the AggregateRating fields as keys. The values for those keys can be either final values, to be assigned to the corresponding fields, or selector-like objects.

If the dictionary is missing the bestRating field and ratingValue is a selector-like object, bestRating may also be extracted from that object.

For example, for the following input HTML:

<span class="rating">3.8 out of 5 stars</span>
<a class="reviews">See all 7 reviews</a>

You can use:

@field
def aggregateRating(self):
    return {
        "ratingValue": self.css(".rating"),
        "reviewCount": self.css(".reviews"),
    }

To get:

AggregateRating(
    bestRating=5.0,
    ratingValue=3.8,
    reviewCount=7,
)
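
To make the example above more concrete, here is a rough sketch of how text such as "3.8 out of 5 stars" could yield both ratingValue and bestRating, and how "See all 7 reviews" could yield reviewCount. The helper names and regular expressions are hypothetical and work on plain strings; this is not the implementation that rating_processor uses.

```python
import re
from typing import Optional, Tuple

# A number, then "out of" or "/", then a second number (the best rating).
_RATING_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(?:out of|/)\s*(\d+(?:\.\d+)?)")
_COUNT_RE = re.compile(r"\d+")


def parse_rating_text(text: str) -> Tuple[Optional[float], Optional[float]]:
    """Hypothetical helper: return (ratingValue, bestRating) or (None, None)."""
    match = _RATING_RE.search(text)
    if match is None:
        return None, None
    return float(match.group(1)), float(match.group(2))


def parse_review_count(text: str) -> Optional[int]:
    """Hypothetical helper: return the first integer found in the text."""
    match = _COUNT_RE.search(text)
    return int(match.group()) if match else None


print(parse_rating_text("3.8 out of 5 stars"))  # (3.8, 5.0)
print(parse_review_count("See all 7 reviews"))  # 7
```

The sketch shows why bestRating can be filled in only when ratingValue comes from text: a final numeric value such as 3.8 carries no information about the scale.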

zyte_common_items.processors.simple_price_processor(value: Union[Selector, HtmlElement], page: Any) -> Any

Convert the data into a price string if possible.

Uses the price-parser library.

Supported inputs are Selector, SelectorList and HtmlElement. Other inputs are returned as is.