Field processor API
API reference of provided field processors.
Built-in field processors
- zyte_common_items.processors.brand_processor(value: Any, page: Any) -> Any
Convert the data into a brand name if possible.
If the input is a Selector, SelectorList or HtmlElement, attempts to extract brand data from it.
If value is a string, uses it to create a Brand instance.
Other inputs are returned unchanged.
- zyte_common_items.processors.breadcrumbs_processor(value: Any, page: Any) -> Any
Convert the data into a list of Breadcrumb objects if possible.
Supported inputs are Selector, SelectorList, HtmlElement and an iterable of zyte_parsers.Breadcrumb objects. Other inputs are returned as is.
- zyte_common_items.processors.description_processor(value: Any, page: Any) -> Any
Convert the data into cleaned-up text if possible.
Uses the clear-html library.
Supported inputs are Selector, SelectorList and HtmlElement. Other inputs are returned as is.
Puts the cleaned HtmlElement object into page._description_node and the cleaned text into page._description_str.
- zyte_common_items.processors.description_html_processor(value: Selector | HtmlElement, page: Any) -> Any
Convert the data into cleaned-up HTML if possible.
Uses the clear-html library.
Supported inputs are Selector, SelectorList and HtmlElement. Other inputs are returned as is.
Puts the cleaned HtmlElement object into page._descriptionHtml_node.
- zyte_common_items.processors.gtin_processor(value: SelectorList | Selector | HtmlElement | str, page: Any) -> Any
Convert the data into a list of Gtin objects if possible.
Supported inputs are str, Selector, SelectorList, HtmlElement, an iterable of str and an iterable of zyte_parsers.Gtin objects. Other inputs are returned as is.
- zyte_common_items.processors.images_processor(value: Any, page: Any) -> Any
Convert the data into a list of Image objects if possible.
If the input is a string, it is used as the URL of the returned image object.
If the input is an iterable of strings or of mappings with a "url" key, its items are used to populate image objects.
Other inputs are returned unchanged.
- zyte_common_items.processors.metadata_processor(metadata: BaseMetadata | None, page)
Processor for a metadata field that ensures that the output metadata object uses the metadata class declared by page.
- zyte_common_items.processors.price_processor(value: Any, page: Any) -> Any
Convert the data into a price string if possible.
Uses the price-parser library.
Supported inputs are Selector, SelectorList, HtmlElement and numeric values. Other inputs are returned as is.
Puts the parsed Price object into page._parsed_price.
- zyte_common_items.processors.rating_processor(value: Any, page: Any) -> Any
Convert the data into an AggregateRating object if possible.
Supported inputs are selector-like objects (Selector, SelectorList, or HtmlElement).
The input can also be a dictionary with one or more of the AggregateRating fields as keys. The values for those keys can be either final values, to be assigned to the corresponding fields, or selector-like objects.
If the returned dictionary is missing the bestRating field and ratingValue is a selector-like object, bestRating may be extracted.
For example, for the following input HTML:
    <span class="rating">3.8 out of 5 stars</span>
    <a class="reviews">See all 7 reviews</a>
You can use:
    @field
    def aggregateRating(self):
        return {
            "ratingValue": self.css(".rating"),
            "reviewCount": self.css(".reviews"),
        }
To get:
    AggregateRating(
        bestRating=5.0,
        ratingValue=3.8,
        reviewCount=7,
    )
- zyte_common_items.processors.simple_price_processor(value: Any, page: Any) -> Any
Convert the data into a price string if possible.
Uses the price-parser library.
Supported inputs are Selector, SelectorList, HtmlElement and numeric values. Other inputs are returned as is.
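
The processors listed above share one contract: they take (value, page), convert the inputs they support, and return anything else unchanged. The following sketch illustrates that contract using the documented behaviour of images_processor as a model. It is a simplified, standard-library-only stand-in, not the library's implementation: the Image dataclass and the images_processor_sketch function are hypothetical names introduced here, and wrapping a single string in a one-element list is an assumption based on the "list of Image objects" wording.

```python
from collections.abc import Mapping
from dataclasses import dataclass
from typing import Any


@dataclass
class Image:
    """Hypothetical stand-in for the library's Image item."""

    url: str


def images_processor_sketch(value: Any, page: Any) -> Any:
    """Illustrative only: mirrors the documented input handling."""
    # A single string is treated as the URL of one image (assumption:
    # the result is still a list, with one element).
    if isinstance(value, str):
        return [Image(url=value)]
    # This sketch only accepts lists and tuples, so that one-shot
    # iterators are never consumed before being returned unchanged.
    if isinstance(value, (list, tuple)):
        results = []
        for item in value:
            if isinstance(item, str):
                results.append(Image(url=item))
            elif isinstance(item, Mapping) and "url" in item:
                results.append(Image(url=item["url"]))
            else:
                # Unsupported element: return the original input as is.
                return value
        return results
    # Any other input type is returned unchanged.
    return value


# The page argument is unused in this sketch, so None is passed.
print(images_processor_sketch("https://example.com/a.png", None))
print(images_processor_sketch(["https://example.com/a.png", {"url": "https://example.com/b.png"}], None))
print(images_processor_sketch(42, None))
```

Returning unsupported inputs untouched means a field that already yields a final value passes through the processor without being altered.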