Changelog
0.29.0 (2025-10-16)
Allowed passing None as the url field value to the item classes.
Explicitly re-exported public names.
0.28.0 (2025-09-02)
Added official Python 3.14 support.
Switched attribute documentation from Sphinx comments (#:) to docstrings.
This allows IDEs to show them when hovering over attributes, and allows ItemAdapter.get_json_schema() to include them as descriptions.
Added an "llmHint" JSON Schema metadata field to some item types and fields.
LLMs should have an easier time writing extraction code when given the corresponding JSON schema (generated with ItemAdapter.get_json_schema()).
We now guarantee that all types importable from zyte_common_items have type hints retrievable at run time with get_type_hints(), i.e. not hidden with TYPE_CHECKING.
This ensures they can be used with scrapy-poet.
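As a rough illustration of the run-time type hint guarantee, the stdlib-only sketch below contrasts a class whose annotation refers to a name imported only under TYPE_CHECKING with a class whose hints can be resolved at run time. HiddenHints, VisibleHints and hints_resolvable() are hypothetical names used for illustration; they are not part of zyte_common_items.

```python
from typing import TYPE_CHECKING, Optional, get_type_hints

if TYPE_CHECKING:
    # Only static type checkers see this import; it does not run at run time.
    from fractions import Fraction


class HiddenHints:
    # Hypothetical class: the annotation names a type that is missing at run time.
    amount: "Fraction"


class VisibleHints:
    # Hypothetical class: every annotated type is importable at run time.
    url: Optional[str] = None


def hints_resolvable(cls: type) -> bool:
    """Return True if typing.get_type_hints() can resolve all hints of cls."""
    try:
        get_type_hints(cls)
    except NameError:
        return False
    return True
```

Under these assumptions, hints_resolvable(VisibleHints) succeeds while hints_resolvable(HiddenHints) fails, which is the situation the guarantee is meant to rule out.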
0.27.1 (2025-06-26)
Added displayedUrlText to SerpOrganicResult.
Importing DropLowProbabilityItemPipeline no longer triggers a warning about the deprecation of the zyte_common_items.ae module.
0.27.0 (2025-01-16)
The DropLowProbabilityItemPipeline now supports nested items, i.e. dict objects with items as values.
Added an add-on to make Scrapy configuration easier.
Metadata now also has all fields from SerpMetadata.
Messages about dropped items, e.g. due to low probability, are now logged as information and not as warnings.
0.26.2 (2024-11-12)
Fixed the package build missing all nested packages:
- zyte_common_items.components
- zyte_common_items.items
- zyte_common_items.pages
0.26.1 (2024-11-12)
Note
This version was yanked, see 0.26.2 (2024-11-12).
Migrated from setup.py to pyproject.toml.
Fixed Serp.from_dict returning an instance where organicResults list items were dict instead of instances of SerpOrganicResult.
0.26.0 (2024-11-11)
Added ForumThread and related classes.
0.25.0 (2024-11-11)
Removed Python 3.8 support, added Python 3.13 support.
Backward-incompatible change: SearchRequestTemplatePage now subclasses Page, adding a dependency on HttpResponse. A new BaseSearchRequestTemplatePage that subclasses BasePage has been added as well.
Tip
Where a dependency on HttpResponse is not needed, BaseSearchRequestTemplatePage is a better replacement for the SearchRequestTemplatePage class from zyte-common-items 0.24.0 and lower, as it only depends on web_poet.page_inputs.http.RequestUrl.
The keyword parameter of SearchRequestTemplate.request() has been deprecated in favor of query. As a result, Jinja templates in SearchRequestTemplate field values should now use the query variable (e.g. {{ query|quote_plus }}) instead of the keyword variable.
Unexpected variables in Jinja templates of SearchRequestTemplate field values (e.g. {{ foo }}), which used to be silently removed, will now trigger an UndefinedError exception when calling SearchRequestTemplate.request().
Fixed coverage data generation during tests.
0.24.0 (2024-10-02)
Added JobPostingNavigation and related classes.
0.23.0 (2024-09-19)
Added CustomAttributes and related classes.
0.22.0 (2024-09-09)
Added Serp and related classes.
0.21.0 (2024-08-27)
The new images_processor(), used by default in images fields, can convert a string, a list of strings or a list of dicts into an Image list. Strings become Image.url. Dicts get their url key mapped as Image.url.
brand_processor() now converts strings into Brand objects with the input string as Brand.name.
price_processor() and simple_price_processor() now convert numeric values into strings with 2 decimal places.
metadata_processor() no longer assumes that the input metadata is not None.
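The input shapes accepted by images_processor() can be pictured with the stdlib-only sketch below. It is not the library's implementation: ExampleImage and to_image_list() are hypothetical stand-ins, and the handling of anything other than the three documented input shapes is an assumption.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class ExampleImage:
    # Hypothetical stand-in for the Image item; only the url field is modelled.
    url: str


def to_image_list(value: Any) -> List[ExampleImage]:
    """Normalize a string, a list of strings or a list of dicts into images."""
    if isinstance(value, str):
        # A single string becomes a one-element list.
        return [ExampleImage(url=value)]
    images = []
    for entry in value:
        if isinstance(entry, str):
            images.append(ExampleImage(url=entry))
        elif isinstance(entry, dict):
            # The "url" key of the dict is mapped to the image URL.
            images.append(ExampleImage(url=entry["url"]))
        else:
            # Assumption: unsupported entries are rejected rather than skipped.
            raise TypeError(f"Unsupported image entry: {entry!r}")
    return images
```

For example, to_image_list("https://example.com/a.png") and to_image_list([{"url": "https://example.com/a.png"}]) both yield a single ExampleImage with that URL.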
0.20.0 (2024-06-19)
Fields of auto page object classes now have auto_field set to True in their field metadata, to make it easier to check if a page object subclass is overriding a given field.
0.19.0 (2024-04-24)
Now requires attrs >= 22.2.0.
New deprecations:
- zyte_common_items.components.request_list_processor (use zyte_common_items.processors.probability_request_list_processor)
- zyte_common_items.items.RequestListCaster (use zyte_common_items.converters.to_probability_request_list)
- zyte_common_items.util.metadata_processor (use zyte_common_items.processors.metadata_processor)
Added DropLowProbabilityItemPipeline that drops items with the probability value lower than a set threshold.
Added the BaseMetadata, ListMetadata, and DetailsMetadata classes (they were previously private).
Added the ListMetadata.validationMessages attribute.
Added the ListMetadata.get_date_downloaded_parsed() method.
Added the zyte_common_items.converters module with useful attrs converters.
Reorganized the module structure.
Documentation improvements.
Test and CI fixes and improvements.
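The idea behind dropping items whose probability is lower than a set threshold can be sketched without Scrapy as follows. This is only an illustration of the filtering rule: ExampleMetadata, ExampleItem and keep_probable() are hypothetical names, the 0.1 default threshold is arbitrary, and keeping items without a probability is an assumption rather than documented pipeline behavior.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional


@dataclass
class ExampleMetadata:
    # Hypothetical stand-in for an item metadata class.
    probability: Optional[float] = 1.0


@dataclass
class ExampleItem:
    # Hypothetical stand-in for an item class such as Product.
    name: str
    metadata: ExampleMetadata = field(default_factory=ExampleMetadata)


def keep_probable(items: Iterable[ExampleItem], threshold: float = 0.1) -> List[ExampleItem]:
    """Return the items whose probability is not lower than the threshold."""
    kept = []
    for item in items:
        probability = item.metadata.probability
        # Assumption: items without a probability are kept.
        if probability is not None and probability < threshold:
            continue
        kept.append(item)
    return kept
```

An item whose probability equals the threshold is kept in this sketch, since only values strictly lower than the threshold are dropped.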
0.18.0 (2024-03-15)
Initial support for request templates, starting with search requests.
0.17.1 (2024-03-13)
Added Python 3.12 support.
description_processor() and description_html_processor() now raise an exception when they receive an unsupported input value such as a non-HtmlElement node.
Documentation improvements.
0.17.0 (2024-02-14)
Implemented the zyte_common_items.ae module and the zyte_common_items.pipelines.AEPipeline item pipeline to make it easier to migrate from Zyte Automatic Extraction to Zyte API automatic extraction.
0.16.0 (2024-02-06)
Auto-prefixed versions of page objects, such as AutoProductPage(), now have all their fields defined as synchronous instead of asynchronous.
0.15.0 (2024-01-30)
Now requires zyte-parsers >= 0.5.0.
Added SocialMediaPost and related classes.
Added ProductFromListExtractor, ProductFromListSelectorExtractor, ProductVariantExtractor and ProductVariantSelectorExtractor.
Added zyte_common_items.processors.rating_processor() and enabled it for the aggregateRating fields in the page classes for BusinessPlace and Product.
Improved the documentation about the processors.
0.14.0 (2024-01-16)
Now requires zyte-parsers >= 0.4.0.
Added zyte_common_items.processors.gtin_processor() and enabled it for the gtin fields in the page classes for Product.
Improved the API documentation.
0.13.0 (2023-11-09)
Added Auto-prefixed versions of page objects, such as AutoProductPage(), that return data from Zyte API automatic extraction from their fields by default, and can be used to more easily override that data with custom parsing logic.
0.12.0 (2023-10-27)
Added a get_probability() helper method to item classes (e.g. Product, Article) and to ProbabilityRequest.
0.11.0 (2023-09-08)
Now requires clear-html >= 0.4.0.
Added zyte_common_items.processors.description_processor() and enabled it for the description fields in the page classes for BusinessPlace, JobPosting, Product and RealEstate.
Added zyte_common_items.processors.description_html_processor() and enabled it for the descriptionHtml fields in the page classes for JobPosting and Product.
Added default implementations for the description (in the page classes for BusinessPlace, JobPosting, Product and RealEstate) and descriptionHtml (in the page classes for JobPosting and Product) fields: if one of these fields is user-defined, the other one will use it.
price_processor() and simple_price_processor() now keep at least two decimal places when formatting the result.
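One plausible reading of "at least two decimal places" is sketched below with the standard decimal module: values with fewer than two decimals are padded, and values with more are left untouched. The function format_price() is a hypothetical helper written for this illustration, not the processors' actual code.

```python
from decimal import Decimal


def format_price(price: Decimal) -> str:
    """Format a price as a string with at least two decimal places."""
    exponent = price.as_tuple().exponent
    # An exponent below -2 means the value already has more than two decimals;
    # keep all of them instead of rounding.
    if isinstance(exponent, int) and exponent < -2:
        return format(price, "f")
    # Otherwise pad (or keep) exactly two decimal places.
    return f"{price:.2f}"
```

Under this reading, Decimal("10") is rendered as "10.00" and Decimal("10.5") as "10.50", while Decimal("10.125") stays "10.125".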
0.10.0 (2023-08-24)
Now requires price-parser >= 0.3.4 (a new dependency) and zyte-parsers >= 0.3.0 (a version increase).
Added zyte_common_items.processors.price_processor() and enabled it for the price fields.
Added zyte_common_items.processors.simple_price_processor() and enabled it for the regularPrice fields.
Added default implementations for the currency (uses the CURRENCY attribute on the page class) and currencyRaw (uses the data extracted by the price field) fields.
0.9.0 (2023-08-03)
Now requires web-poet >= 0.14.0.
Fixed detection of the HasMetadata base class.
0.8.0 (2023-07-27)
Updated minimum versions for the following requirements:
- attrs >= 22.1.0
- web-poet >= 0.9.0
- zyte-parsers >= 0.2.0
Added JobPosting and related classes.
Added zyte_common_items.processors.brand_processor() and enabled it for the brand fields.
Added zyte_common_items.Request.to_scrapy() to convert zyte_common_items.Request instances to scrapy.Request instances.
0.7.0 (2023-07-11)
Now requires zyte-parsers.
Added navigation classes: ArticleNavigation, ProductNavigation, the page classes that produce them, and other related classes.
Improved the metadata field handling, also fixing some bugs:
- Added item-specific metadata classes. The metadata item fields were changed to use them.
- Backward-incompatible change: the DateDownloadedMetadata class was removed. The item-specific ones are now used instead.
- Backward-incompatible change: ArticleFromList no longer has a probability field and instead has a metadata field like all other similar classes.
- Backward-incompatible change: while in most items the old and the new type of the metadata field have the same fields, the one in Article now has probability, the one in ProductList no longer has probability, and the one in ProductFromList no longer has dateDownloaded.
- The default probability value is now 1.0 instead of None.
- Added the HasMetadata mixin which is used similarly to Returns to set the page metadata class.
- Metadata objects assigned to the metadata fields of the items or returned from the metadata() methods of the pages are now converted to suitable classes.
Added zyte_common_items.processors.breadcrumbs_processor() and enabled it for the breadcrumbs fields.
0.6.0 (2023-07-05)
Added Article and ArticleList.
Added support for Python 3.11 and dropped support for Python 3.7.
0.5.0 (2023-05-10)
Now requires itemadapter >= 0.8.0.
Added RealEstate.
Added the zyte_common_items.BasePage.no_item_found() and zyte_common_items.Page.no_item_found() methods.
Improved the error message for invalid input.
Added ZyteItemKeepEmptyAdapter and documented how to use it and ZyteItemAdapter in custom subclasses of itemadapter.ItemAdapter.
0.4.0 (2023-03-27)
Added support for business places.
0.3.1 (2023-03-17)
Fixed fields from BasePage subclasses leaking across subclasses. (#29, #30)
Improved how the from_dict() and from_list() methods report issues in the input data. (#25)
0.3.0 (2023-02-03)
Added page object classes for e-commerce product detail and product list pages.
0.2.0 (2022-09-22)
Supports web_poet.RequestUrl and web_poet.ResponseUrl and automatically converts them into a string on URL fields like Product.url.
Bumps the web_poet dependency version from 0.4.0 to 0.5.0 which fully supports type hints using the py.typed marker.
This package now also supports type hints using the py.typed marker. This means mypy would properly use the type annotations in the items when using it in your project.
Minor improvements in tests and annotations.
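The automatic conversion of URL objects into strings on URL fields can be pictured with the stdlib-only sketch below. ExampleResponseUrl and ExampleProduct are hypothetical stand-ins for web_poet.ResponseUrl and Product, and the sketch assumes that calling str() on the URL object yields the URL text; the package's own conversion mechanism may differ.

```python
from dataclasses import dataclass
from typing import Union


class ExampleResponseUrl:
    # Hypothetical stand-in for a URL wrapper object such as web_poet.ResponseUrl.
    def __init__(self, url: str) -> None:
        self._url = url

    def __str__(self) -> str:
        return self._url


@dataclass
class ExampleProduct:
    # Hypothetical stand-in for an item with a URL field.
    url: Union[str, ExampleResponseUrl]

    def __post_init__(self) -> None:
        # Store the URL as a plain string regardless of the input type.
        self.url = str(self.url)
```

With this sketch, ExampleProduct(url=ExampleResponseUrl("https://example.com/p/1")).url is the plain string "https://example.com/p/1".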
0.1.0 (2022-07-29)
Initial release.