Item API

Product

class zyte_common_items.Product(**kwargs)

Product from an e-commerce website.

url is the only required attribute.

classmethod from_dict(item: Optional[Dict]): Read an item from a dictionary.

classmethod from_list(items: Optional[List[Dict]], *, trail: Optional[str] = None) → List: Read items from a list.

get_probability() → Optional[float]: Returns the item probability if available, otherwise None.

additionalProperties: Optional[List[AdditionalProperty]]

List of name-value pairs of data about a specific, otherwise unmapped feature.

Additional properties usually appear in product pages in the form of a specification table or a free-form specification list.

Additional properties that require 1 or more extra requests may not be extracted.

See also url.

color: Optional[str]

Color.

It is extracted as displayed (e.g. "white").

See also url.

Product list

class zyte_common_items.ProductList(**kwargs)

Product list from a product listing page of an e-commerce webpage.

It represents, for example, a single page from a category.

url is the only required attribute.

classmethod from_dict(item: Optional[Dict]): Read an item from a dictionary.

classmethod from_list(items: Optional[List[Dict]], *, trail: Optional[str] = None) → List: Read items from a list.

get_probability() → Optional[float]: Returns the item probability if available, otherwise None.

breadcrumbs: Optional[List[Breadcrumb]]: Webpage breadcrumb trail.

canonicalUrl: Optional[str]

Canonical form of the URL, as indicated by the website.

See also url.

categoryName: Optional[str]

Name of the product listing as it appears on the webpage (no post-processing).

For example, if the webpage is one of the pages of the Robots category, categoryName is 'Robots'.

metadata: Optional[ProductListMetadata]: Data extraction process metadata.

pageNumber: Optional[int]

Current page number, if displayed explicitly on the list page.

Numeration starts with 1.

paginationNext: Optional[Link]: Link to the next page.

products: Optional[List[ProductFromList]]

List of products.

It only includes product information found in the product listing page itself. Product information that requires visiting each product URL is not meant to be covered.

The order of the products reflects their position on the rendered page. Product order is top-to-bottom, and left-to-right or right-to-left depending on the webpage locale.

url: str

Main URL from which the data has been extracted.

Product navigation

class zyte_common_items.ProductNavigation(**kwargs)

Represents the navigational aspects of a product listing page on an e-commerce website.

classmethod from_dict(item: Optional[Dict]): Read an item from a dictionary.

classmethod from_list(items: Optional[List[Dict]], *, trail: Optional[str] = None) → List: Read items from a list.

get_probability() → Optional[float]: Returns the item probability if available, otherwise None.

categoryName: Optional[str]

Name of the category/page with the product list.

Format:

trimmed (no whitespace at the beginning or the end of the description string)

items: Optional[List[ProbabilityRequest]]: List of product links found on the page category ordered by their position in the page.

metadata: Optional[ProductNavigationMetadata]: Data extraction process metadata.

nextPage: Optional[Request]: A link to the next page, if available.

pageNumber: Optional[int]

Number of the current page.

It should only be extracted if the webpage shows a page number.

It must be 1-based. For example, if the first page of a listing is numbered as 0 on the website, it should be extracted as 1 nonetheless.
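The 1-based rule above can be sketched as a small conversion helper. This is only an illustration: `to_one_based` and its `first_page_label` parameter are hypothetical names, not part of zyte_common_items.

```python
from typing import Optional


def to_one_based(displayed: Optional[int], first_page_label: int = 1) -> Optional[int]:
    """Convert the page number shown by a website into a 1-based pageNumber.

    displayed: the number shown on the webpage, or None if none is shown.
    first_page_label: the number the website uses for its first page
    (hypothetical parameter, for illustration only).
    """
    if displayed is None:
        # The webpage shows no page number, so pageNumber stays unset.
        return None
    return displayed - first_page_label + 1


# A website that labels its first page as 0 still yields pageNumber 1.
print(to_one_based(0, first_page_label=0))  # 1
```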

subCategories: Optional[List[ProbabilityRequest]]: List of sub-category links ordered by their position in the page.

url: str: Main URL from which the data is extracted.

class zyte_common_items.ProductNavigationMetadata(**kwargs)

Metadata class for zyte_common_items.ProductNavigation.metadata.

dateDownloaded: Optional[str]: Date and time when the product data was downloaded, in UTC timezone and the following format: YYYY-MM-DDThh:mm:ssZ.
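As a rough illustration of the dateDownloaded format, the sketch below renders a timezone-aware datetime as a UTC string using only the standard library. The helper name `format_date_downloaded` is an assumption made for this example; it does not come from zyte_common_items.

```python
from datetime import datetime, timedelta, timezone

# strftime pattern matching YYYY-MM-DDThh:mm:ssZ
DATE_DOWNLOADED_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def format_date_downloaded(moment: datetime) -> str:
    """Render a timezone-aware datetime as a UTC string in the documented format."""
    return moment.astimezone(timezone.utc).strftime(DATE_DOWNLOADED_FORMAT)


# 13:00 at UTC+01:00 is 12:00 in UTC.
moment = datetime(2024, 1, 31, 13, 0, 0, tzinfo=timezone(timedelta(hours=1)))
print(format_date_downloaded(moment))  # 2024-01-31T12:00:00Z
```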

validationMessages: Optional[Dict[str, List[str]]]: Contains paths to fields with the description of issues found with their values.

Article

class zyte_common_items.Article(**kwargs)

Article, typically seen on online news websites, blogs, or announcement sections.

url is the only required attribute.

classmethod from_dict(item: Optional[Dict]): Read an item from a dictionary.

classmethod from_list(items: Optional[List[Dict]], *, trail: Optional[str] = None) → List: Read items from a list.

get_probability() → Optional[float]: Returns the item probability if available, otherwise None.

articleBody: Optional[str]

Clean text of the article, including sub-headings, with newline separators.

Format:

trimmed (no whitespace at the beginning or the end of the body string),
line breaks included,
no length limit,
no normalization of Unicode characters.

articleBodyHtml: Optional[str]

Simplified and standardized HTML of the article, including sub-headings, image captions and embedded content (videos, tweets, etc.).

Format: HTML string normalized in a consistent way.

audios: Optional[List[Audio]]: All audios.

authors: Optional[List[Author]]: All authors of the article.

breadcrumbs: Optional[List[Breadcrumb]]: Webpage breadcrumb trail.

canonicalUrl: Optional[str]

Canonical form of the URL, as indicated by the website.

See also url.

dateModified: Optional[str]

Date when the article was most recently modified.

Format: ISO 8601 format: “YYYY-MM-DDThh:mm:ssZ” or “YYYY-MM-DDThh:mm:ss±zz:zz”.

With timezone, if available.
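Both documented date forms can be read with the standard library. The following is a minimal sketch that assumes the value carries a timezone (either "Z" or a "±hh:mm" offset); `parse_item_date` is a hypothetical helper, not a library function.

```python
from datetime import datetime

# %z accepts both "Z" and offsets such as "+02:00" (Python 3.7 or later).
ISO_FORMAT = "%Y-%m-%dT%H:%M:%S%z"


def parse_item_date(value: str) -> datetime:
    """Parse a date string such as 2024-01-31T12:00:00Z into an aware datetime."""
    return datetime.strptime(value, ISO_FORMAT)


utc_date = parse_item_date("2024-01-31T12:00:00Z")
offset_date = parse_item_date("2024-01-31T12:00:00+02:00")
print(utc_date.utcoffset(), offset_date.utcoffset())
```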

dateModifiedRaw: Optional[str]: Same date as dateModified, but before parsing/normalization, i.e. as it appears on the website.

datePublished: Optional[str]

Publication date of the article.

Format: ISO 8601 format: “YYYY-MM-DDThh:mm:ssZ” or “YYYY-MM-DDThh:mm:ss±zz:zz”.

With timezone, if available.

If the actual publication date is not found, the value of dateModified is used instead.

datePublishedRaw: Optional[str]: Same date as datePublished, but before parsing/normalization, i.e. as it appears on the website.

description: Optional[str]

A short summary of the article.

It can be either human-provided (if available), or auto-generated.

headline: Optional[str]: Headline or title.

images: Optional[List[Image]]: All images.

inLanguage: Optional[str]

Language of the article, as an ISO 639-1 language code.

Sometimes the article language is not the same as the web page overall language.

mainImage: Optional[Image]: Main image.

metadata: Optional[ArticleMetadata]: Data extraction process metadata.

url: str

The main URL of the article page.

The URL of the final response, after any redirects.

Required attribute.

In case there is no article data on the page or the page was not reached, the returned “empty” item would still contain this URL field.

videos: Optional[List[Video]]: All videos.

class zyte_common_items.ArticleMetadata(**kwargs)

Metadata class for zyte_common_items.Article.metadata.

dateDownloaded: Optional[str]: Date and time when the product data was downloaded, in UTC timezone and the following format: YYYY-MM-DDThh:mm:ssZ.

probability: Optional[float]

The probability (0 for 0%, 1 for 100%) that the resource features the expected data type.

For example, if the extraction of a product from a given URL is requested, and that URL points to the webpage of a product with complete certainty, the value should be 1. If with complete certainty the webpage features a job listing instead of a product, the value should be 0. When there is no complete certainty, the value could be anything in between (e.g. 0.96).
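One way a consumer might use this value is to discard low-probability items. The sketch below works on plain dictionaries shaped like serialized items; the helper names, the 0.5 default threshold, and the choice to keep items without a probability are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def probability_of(item: Dict) -> Optional[float]:
    """Read metadata.probability from a dictionary, or None if it is missing."""
    metadata = item.get("metadata") or {}
    return metadata.get("probability")


def keep_confident(items: List[Dict], threshold: float = 0.5) -> List[Dict]:
    """Keep items whose probability is at least the threshold."""
    kept = []
    for item in items:
        probability = probability_of(item)
        # Items without a probability are kept; this policy is an assumption.
        if probability is None or probability >= threshold:
            kept.append(item)
    return kept


items = [
    {"url": "https://example.com/a", "metadata": {"probability": 0.96}},
    {"url": "https://example.com/b", "metadata": {"probability": 0.1}},
    {"url": "https://example.com/c"},
]
print([item["url"] for item in keep_confident(items)])
```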

validationMessages: Optional[Dict[str, List[str]]]: Contains paths to fields with the description of issues found with their values.

Article list

class zyte_common_items.ArticleList(**kwargs)

Article list from an article listing page.

The url attribute is the only required attribute; all other fields are optional.

classmethod from_dict(item: Optional[Dict]): Read an item from a dictionary.

classmethod from_list(items: Optional[List[Dict]], *, trail: Optional[str] = None) → List: Read items from a list.

get_probability() → Optional[float]: Returns the item probability if available, otherwise None.

articles: Optional[List[ArticleFromList]]

List of article details found on the page.

The order of the articles reflects their position on the page.

breadcrumbs: Optional[List[Breadcrumb]]: Webpage breadcrumb trail.

canonicalUrl: Optional[str]

Canonical form of the URL, as indicated by the website.

See also url.

metadata: Optional[ArticleListMetadata]: Data extraction process metadata.

url: str

The main URL of the article list.

The URL of the final response, after any redirects.

Required attribute.

In case there is no article list data on the page or the page was not reached, the returned item still contains this URL field and all the other available datapoints.

class zyte_common_items.ArticleFromList(**kwargs)

Article from an article list from an article listing page.

See ArticleList.

classmethod from_dict(item: Optional[Dict]): Read an item from a dictionary.

classmethod from_list(items: Optional[List[Dict]], *, trail: Optional[str] = None) → List: Read items from a list.

get_probability() → Optional[float]: Returns the item probability if available, otherwise None.

articleBody: Optional[str]

Clean text of the article, including sub-headings, with newline separators.

Format:

trimmed (no whitespace at the beginning or the end of the body string),
line breaks included,
no length limit,
no normalization of Unicode characters.

authors: Optional[List[Author]]: All authors of the article.

datePublished: Optional[str]

Publication date of the article.

Format: ISO 8601 format: “YYYY-MM-DDThh:mm:ssZ” or “YYYY-MM-DDThh:mm:ss±zz:zz”.

With timezone, if available.

If the actual publication date is not found, the date of the last modification is used instead.

datePublishedRaw: Optional[str]: Same date as datePublished, but before parsing/normalization, i.e. as it appears on the website.

headline: Optional[str]: Headline or title.

images: Optional[List[Image]]: All images.

inLanguage: Optional[str]

Language of the article, as an ISO 639-1 language code.

Sometimes the article language is not the same as the web page overall language.

mainImage: Optional[Image]: Main image.

metadata: Optional[ProbabilityMetadata]: Data extraction process metadata.

url: Optional[str]: Main URL.

class zyte_common_items.ArticleListMetadata(**kwargs)

Metadata class for zyte_common_items.ArticleList.metadata.

dateDownloaded: Optional[str]: Date and time when the product data was downloaded, in UTC timezone and the following format: YYYY-MM-DDThh:mm:ssZ.

validationMessages: Optional[Dict[str, List[str]]]: Contains paths to fields with the description of issues found with their values.

Article navigation

class zyte_common_items.ArticleNavigation(**kwargs)

Represents the navigational aspects of an article listing webpage.

See ArticleList.

classmethod from_dict(item: Optional[Dict]): Read an item from a dictionary.

classmethod from_list(items: Optional[List[Dict]], *, trail: Optional[str] = None) → List: Read items from a list.

get_probability() → Optional[float]: Returns the item probability if available, otherwise None.

categoryName: Optional[str]

Name of the category/page.

Format:

trimmed (no whitespace at the beginning or the end of the description string)

items: Optional[List[ProbabilityRequest]]: Links to listed items in order of appearance.

metadata: Optional[ArticleNavigationMetadata]: Data extraction process metadata.

nextPage: Optional[Request]: A link to the next page, if available.

pageNumber: Optional[int]

Number of the current page.

It should only be extracted if the webpage shows a page number.

It must be 1-based. For example, if the first page of a listing is numbered as 0 on the website, it should be extracted as 1 nonetheless.

subCategories: Optional[List[ProbabilityRequest]]: List of sub-category links ordered by their position in the page.

url: str: Main URL from which the data is extracted.

class zyte_common_items.ArticleNavigationMetadata(**kwargs)

Metadata class for zyte_common_items.ArticleNavigation.metadata.

dateDownloaded: Optional[str]: Date and time when the product data was downloaded, in UTC timezone and the following format: YYYY-MM-DDThh:mm:ssZ.

validationMessages: Optional[Dict[str, List[str]]]: Contains paths to fields with the description of issues found with their values.

Business place

class zyte_common_items.BusinessPlace(**kwargs)

Business place, with properties typically seen on maps or business listings.

url is the only required attribute.

classmethod from_dict(item: Optional[Dict]): Read an item from a dictionary.

classmethod from_list(items: Optional[List[Dict]], *, trail: Optional[str] = None) → List: Read items from a list.

get_probability() → Optional[float]: Returns the item probability if available, otherwise None.

actions: Optional[List[NamedLink]]: List of actions that can be performed directly from the URLs on the place page, including URLs.

additionalProperties: Optional[List[AdditionalProperty]]: List of name-value pairs of any unmapped additional properties specific to the place.

address: Optional[Address]: The address details of the place.

aggregateRating: Optional[AggregateRating]: The overall rating, based on a collection of reviews or ratings.

amenityFeatures: Optional[List[Amenity]]: List of amenities of the place.

categories: Optional[List[str]]: List of categories the place belongs to.

containedInPlace: Optional[ParentPlace]: If the place is located inside another place, these are the details of the parent place.

description: Optional[str]

The description of the place.

Stripped of white spaces.

features: Optional[List[str]]: List of frequently mentioned features of this place.

images: Optional[List[Image]]: A list of URL values of all images of the place.

isVerified: Optional[bool]: If the information is verified by the owner of this place.

map: Optional[str]: URL to a map of the place.

metadata: Optional[BusinessPlaceMetadata]: Data extraction process metadata.

name: Optional[str]: The name of the place.

openingHours: Optional[List[OpeningHoursItem]]: Ordered specification of opening hours, including data for opening and closing time for each day of the week.

placeId: Optional[str]: Unique identifier of the place on the website.

priceRange: Optional[str]: How the price range of the place is viewed by its customers (from z to zzzz).

reservationAction: Optional[NamedLink]: The details of the reservation action, e.g. table reservation in case of restaurants or room reservation in case of hotels.

reviewSites: Optional[List[NamedLink]]: List of partner review sites.

starRating: Optional[StarRating]: Official star rating of the place.

tags: Optional[List[str]]: List of the tags associated with the place.

telephone: Optional[str]: The phone number associated with the place, as it appears on the page.

timezone: Optional[str]

Timezone in which the place is situated.

Standard: Name compliant with IANA tz database (tzdata).

url: Optional[str]

The main URL that the place data was extracted from.

The URL of the final response, after any redirects.

In case there is no product data on the page or the page was not reached, the returned “empty” item would still contain the url field and the metadata field with dateDownloaded.

website: Optional[str]: The URL pointing to the official website of the place.

class zyte_common_items.BusinessPlaceMetadata(**kwargs)

Metadata class for zyte_common_items.BusinessPlace.metadata.

dateDownloaded: Optional[str]: Date and time when the product data was downloaded, in UTC timezone and the following format: YYYY-MM-DDThh:mm:ssZ.

probability: Optional[float]

The probability (0 for 0%, 1 for 100%) that the resource features the expected data type.

For example, if the extraction of a product from a given URL is requested, and that URL points to the webpage of a product with complete certainty, the value should be 1. If with complete certainty the webpage features a job listing instead of a product, the value should be 0. When there is no complete certainty, the value could be anything in between (e.g. 0.96).

searchText: Optional[str]: The search text used to find the item.

validationMessages: Optional[Dict[str, List[str]]]: Contains paths to fields with the description of issues found with their values.

Real estate

class zyte_common_items.RealEstate(**kwargs)

Real estate offer, typically seen on real estate offer aggregator websites.

url is the only required attribute.

classmethod from_dict(item: Optional[Dict]): Read an item from a dictionary.

classmethod from_list(items: Optional[List[Dict]], *, trail: Optional[str] = None) → List: Read items from a list.

get_probability() → Optional[float]: Returns the item probability if available, otherwise None.

additionalProperties: Optional[List[AdditionalProperty]]: A name-value pair field holding information pertaining to specific features. Usually in the form of a specification table or a free-form specification list.

address: Optional[Address]: The details of the address of the real estate.

area: Optional[RealEstateArea]: Real estate area details.

breadcrumbs: Optional[List[Breadcrumb]]: Webpage breadcrumb trail.

currency: Optional[str]: The currency of the price, in 3-letter ISO 4217 format.

currencyRaw: Optional[str]: Currency associated with the price, as appears on the page (no post-processing).

datePublished: Optional[str]

Publication date of the real estate offer.

Format: ISO 8601 format: “YYYY-MM-DDThh:mm:ssZ”

With timezone, if available.

datePublishedRaw: Optional[str]: Same date as datePublished, but before parsing/normalization, i.e. as it appears on the website.

description: Optional[str]

The description of the real estate.

Format:

trimmed (no whitespace at the beginning or the end of the description string),
line breaks included,
no length limit,
no normalization of Unicode characters,
no concatenation of description from different parts of the page.

images: Optional[List[Image]]: A list of URL values of all images of the real estate.

mainImage: Optional[Image]: The details of the main image of the real estate.

metadata: Optional[RealEstateMetadata]: Contains metadata about the data extraction process.

name: Optional[str]: The name of the real estate.

numberOfBathroomsTotal: Optional[int]: The total number of bathrooms in the real estate.

numberOfBedrooms: Optional[int]: The number of bedrooms in the real estate.

numberOfFullBathrooms: Optional[int]: The number of full bathrooms in the real estate.

numberOfPartialBathrooms: Optional[int]: The number of partial bathrooms in the real estate.

numberOfRooms: Optional[int]: The number of rooms (excluding bathrooms and closets) of the real estate.

price: Optional[str]: The offer price of the real estate.

propertyType: Optional[str]: Type of the property, e.g. flat, house, land.

realEstateId: Optional[str]: The identifier of the real estate, usually assigned by the seller and unique within a website, similar to product SKU.

rentalPeriod: Optional[str]: The rental period to which the rental price applies, only available in case of rental. Usually weekly, monthly, quarterly, yearly.

tradeType: Optional[str]: Type of a trade action: buying or renting.

url: str: The url of the final response, after any redirects.

virtualTourUrl: Optional[str]: The URL of the virtual tour of the real estate.

yearBuilt: Optional[int]: The year the real estate was built.

class zyte_common_items.RealEstateMetadata(**kwargs)

Metadata class for zyte_common_items.RealEstate.metadata.

dateDownloaded: Optional[str]: Date and time when the product data was downloaded, in UTC timezone and the following format: YYYY-MM-DDThh:mm:ssZ.

probability: Optional[float]

The probability (0 for 0%, 1 for 100%) that the resource features the expected data type.

For example, if the extraction of a product from a given URL is requested, and that URL points to the webpage of a product with complete certainty, the value should be 1. If with complete certainty the webpage features a job listing instead of a product, the value should be 0. When there is no complete certainty, the value could be anything in between (e.g. 0.96).

validationMessages: Optional[Dict[str, List[str]]]: Contains paths to fields with the description of issues found with their values.

Job posting

class zyte_common_items.JobPosting(**kwargs)

A job posting, typically seen on job posting websites or websites of companies that are hiring.

url is the only required attribute.

classmethod from_dict(item: Optional[Dict]): Read an item from a dictionary.

classmethod from_list(items: Optional[List[Dict]], *, trail: Optional[str] = None) → List: Read items from a list.

get_probability() → Optional[float]: Returns the item probability if available, otherwise None.

baseSalary: Optional[BaseSalary]: The base salary of the job or of an employee in the proposed role.

dateModified: Optional[str]

The date when the job posting was most recently modified.

Format: ISO 8601 format: “YYYY-MM-DDThh:mm:ssZ”

With timezone, if available.

dateModifiedRaw: Optional[str]: Same date as dateModified, but before parsing/normalization, i.e. as it appears on the website.

datePublished: Optional[str]

Publication date of the job posting.

Format: ISO 8601 format: “YYYY-MM-DDThh:mm:ssZ”

With timezone, if available.

datePublishedRaw: Optional[str]: Same date as datePublished, but before parsing/normalization, i.e. as it appears on the website.

description: Optional[str]

A description of the job posting including sub-headings, with newline separators.

Format:

trimmed (no whitespace at the beginning or the end of the description string),
line breaks included,
no length limit,
no normalization of Unicode characters.

descriptionHtml: Optional[str]: Simplified HTML of the description, including sub-headings, image captions and embedded content.

employmentType: Optional[str]: Type of employment (e.g. full-time, part-time, contract, temporary, seasonal, internship).

headline: Optional[str]: The headline of the job posting.

hiringOrganization: Optional[HiringOrganization]: Information about the organization offering the job position.

jobLocation: Optional[JobLocation]: A (typically single) geographic location associated with the job position.

jobPostingId: Optional[str]: The identifier of the job posting.

jobStartDate: Optional[str]

Job start date.

Format: ISO 8601 format: “YYYY-MM-DDThh:mm:ssZ”

With timezone, if available.

jobStartDateRaw: Optional[str]: Same date as jobStartDate, but before parsing/normalization, i.e. as it appears on the website.

jobTitle: Optional[str]: The title of the job posting.

metadata: Optional[JobPostingMetadata]: Contains metadata about the data extraction process.

remoteStatus: Optional[str]: Specifies the remote status of the position.

requirements: Optional[List[str]]: Candidate requirements for the job.

url: str: The url of the final response, after any redirects.

validThrough: Optional[str]

The date after which the job posting is not valid, e.g. the end of an offer.

Format: ISO 8601 format: “YYYY-MM-DDThh:mm:ssZ”

With timezone, if available.

validThroughRaw: Optional[str]: Same date as validThrough, but before parsing/normalization, i.e. as it appears on the website.

class zyte_common_items.JobPostingMetadata(**kwargs)

Metadata class for zyte_common_items.JobPosting.metadata.

dateDownloaded: Optional[str]: Date and time when the product data was downloaded, in UTC timezone and the following format: YYYY-MM-DDThh:mm:ssZ.

probability: Optional[float]

The probability (0 for 0%, 1 for 100%) that the resource features the expected data type.

For example, if the extraction of a product from a given URL is requested, and that URL points to the webpage of a product with complete certainty, the value should be 1. If with complete certainty the webpage features a job listing instead of a product, the value should be 0. When there is no complete certainty, the value could be anything in between (e.g. 0.96).

validationMessages: Optional[Dict[str, List[str]]]: Contains paths to fields with the description of issues found with their values.

Social media post

class zyte_common_items.SocialMediaPost(**kwargs)

Represents a single social media post.

classmethod from_dict(item: Optional[Dict]): Read an item from a dictionary.

classmethod from_list(items: Optional[List[Dict]], *, trail: Optional[str] = None) → List: Read items from a list.

get_probability() → Optional[float]: Returns the item probability if available, otherwise None.

author: Optional[SocialMediaPostAuthor]

Details of the author of the post.

No easily identifiable information can be contained in here, such as usernames.

datePublished: Optional[str]

The timestamp at which the post was created.

Format: Timezone: UTC. ISO 8601 format: “YYYY-MM-DDThh:mm:ssZ”

hashtags: Optional[List[str]]: The list of hashtags contained in the post.

mediaUrls: Optional[List[Url]]: The list of URLs of media files (images, videos, etc.) linked from the post.

metadata: Optional[SocialMediaPostMetadata]: Contains metadata about the data extraction process.

postId: Optional[str]: The identifier of the post.

reactions: Optional[Reactions]: Details of reactions to the post.

text: Optional[str]: The text content of the post.

url: str: The URL of the final response, after any redirects.

class zyte_common_items.SocialMediaPostMetadata(**kwargs)

Metadata class for zyte_common_items.SocialMediaPost.metadata.

dateDownloaded: Optional[str]: Date and time when the product data was downloaded, in UTC timezone and the following format: YYYY-MM-DDThh:mm:ssZ.

probability: Optional[float]

The probability (0 for 0%, 1 for 100%) that the resource features the expected data type.

For example, if the extraction of a product from a given URL is requested, and that URL points to the webpage of a product with complete certainty, the value should be 1. If with complete certainty the webpage features a job listing instead of a product, the value should be 0. When there is no complete certainty, the value could be anything in between (e.g. 0.96).

searchText: Optional[str]: The search text used to find the item.

validationMessages: Optional[Dict[str, List[str]]]: Contains paths to fields with the description of issues found with their values.

Search Request templates

class zyte_common_items.SearchRequestTemplate(**kwargs)

Request template to build a search Request.

classmethod from_dict(item: Optional[Dict]): Read an item from a dictionary.

classmethod from_list(items: Optional[List[Dict]], *, trail: Optional[str] = None) → List: Read items from a list.

get_probability() → Optional[float]: Returns the item probability if available, otherwise None.

request(*, keyword: str) → Request: Return a Request to search for keyword.

body: Optional[str]

Jinja template for Request.body.

It must be a plain str, not bytes or a Base64-encoded str. Base64-encoding is done by request() after rendering this value as a Jinja template.

Defining a non-UTF-8 body is not supported.
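The encoding step described above can be approximated with the standard library. This sketch covers only the Base64 step and assumes the Jinja template has already been rendered into a plain str; `encode_body` is a hypothetical helper and is not how request() is necessarily implemented.

```python
import base64


def encode_body(rendered_body: str) -> str:
    """Encode an already-rendered body as UTF-8, then as a Base64 str."""
    return base64.b64encode(rendered_body.encode("utf-8")).decode("ascii")


print(encode_body("foo"))  # Zm9v
```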

headers: Optional[List[Header]]

List of Header, for Request.headers, where every name and value is a Jinja template.

When a header name template renders into an empty string (after stripping spacing), that header is removed from the resulting list of headers.
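The header-removal rule can be illustrated on already-rendered values. In this sketch headers are plain (name, value) tuples rather than Header objects, and `drop_empty_headers` is a hypothetical helper; the actual rendering with Jinja is left out.

```python
from typing import List, Tuple


def drop_empty_headers(rendered: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Remove headers whose rendered name is empty after stripping whitespace."""
    return [(name, value) for name, value in rendered if name.strip()]


rendered = [
    ("Accept", "text/html"),
    ("   ", "ignored"),  # name rendered to whitespace only, so it is removed
    ("", "also ignored"),
]
print(drop_empty_headers(rendered))  # [('Accept', 'text/html')]
```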

metadata: Optional[SearchRequestTemplateMetadata]: Data extraction process metadata.

method: str: Jinja template for Request.method.

url: str: Jinja template for Request.url.

class zyte_common_items.SearchRequestTemplateMetadata(**kwargs)

Metadata class for zyte_common_items.SearchRequestTemplate.metadata.

dateDownloaded: Optional[str]: Date and time when the product data was downloaded, in UTC timezone and the following format: YYYY-MM-DDThh:mm:ssZ.

probability: Optional[float]

The probability (0 for 0%, 1 for 100%) that the resource features the expected data type.

For example, if the extraction of a product from a given URL is requested, and that URL points to the webpage of a product with complete certainty, the value should be 1. If with complete certainty the webpage features a job listing instead of a product, the value should be 0. When there is no complete certainty, the value could be anything in between (e.g. 0.96).

validationMessages: Optional[Dict[str, List[str]]]: Contains paths to fields with the description of issues found with their values.

Custom items

Subclass Item to create your own item classes.

class zyte_common_items.base.ProbabilityMixin(**kwargs)

Provides get_probability() to make it easier to access the probability of an item or item component that is nested under its metadata attribute.

get_probability() → Optional[float]: Returns the item probability if available, otherwise None.
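The idea behind the mixin can be sketched with stand-in classes. Every class below (`SketchMetadata`, `SketchProbabilityMixin`, `SketchItem`) is hypothetical and only mimics the behavior described above; none of it is the library's own code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchMetadata:
    # Stand-in for a metadata class with a probability field.
    probability: Optional[float] = None


class SketchProbabilityMixin:
    def get_probability(self) -> Optional[float]:
        """Return metadata.probability if available, otherwise None."""
        metadata = getattr(self, "metadata", None)
        if metadata is None:
            return None
        return getattr(metadata, "probability", None)


@dataclass
class SketchItem(SketchProbabilityMixin):
    # Stand-in for an item with a required url and optional metadata.
    url: str
    metadata: Optional[SketchMetadata] = None


item = SketchItem(url="https://example.com", metadata=SketchMetadata(probability=0.96))
print(item.get_probability())  # 0.96
```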

class zyte_common_items.Item(**kwargs)

Base class for items.

_unknown_fields_dict: dict: Contains unknown attributes fed into the item through from_dict() or from_list().

classmethod from_dict(item: Optional[Dict]): Read an item from a dictionary.

classmethod from_list(items: Optional[List[Dict]], *, trail: Optional[str] = None) → List: Read items from a list.

get_probability() → Optional[float]: Returns the item probability if available, otherwise None.
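To make the from_dict(), from_list() and unknown-field behavior more concrete, here is a simplified stand-in written with dataclasses. `SketchRecord`, its fields, and the handling of None inputs are assumptions for illustration and may differ from what Item actually does.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List, Optional


@dataclass
class SketchRecord:
    url: str
    name: Optional[str] = None
    # Stand-in for the storage of attributes the class does not define.
    unknown_fields: Dict = field(default_factory=dict)

    @classmethod
    def from_dict(cls, item: Optional[Dict]) -> Optional["SketchRecord"]:
        """Build a record from a dictionary, setting unknown keys aside."""
        if item is None:
            return None
        known = {f.name for f in fields(cls)} - {"unknown_fields"}
        init_args = {k: v for k, v in item.items() if k in known}
        extras = {k: v for k, v in item.items() if k not in known}
        return cls(**init_args, unknown_fields=extras)

    @classmethod
    def from_list(cls, items: Optional[List[Dict]]) -> List[Optional["SketchRecord"]]:
        """Build one record per dictionary; None is treated as an empty list."""
        return [cls.from_dict(item) for item in items or []]


record = SketchRecord.from_dict({"url": "https://example.com", "extra": 1})
print(record.url, record.unknown_fields)  # https://example.com {'extra': 1}
```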