Changelog
0.19.0 (2024-04-24)
Now requires attrs >= 22.2.0.
New deprecations:
    zyte_common_items.components.request_list_processor (use zyte_common_items.processors.probability_request_list_processor)
    zyte_common_items.items.RequestListCaster (use zyte_common_items.converters.to_probability_request_list)
    zyte_common_items.util.metadata_processor (use zyte_common_items.processors.metadata_processor)
Added DropLowProbabilityItemPipeline that drops items with the probability value lower than a set threshold.
Added the BaseMetadata, ListMetadata, and DetailMetadata classes (they were previously private).
Added the ListMetadata.validationMessages attribute.
Added the ListMetadata.get_date_downloaded_parsed() method.
Added the zyte_common_items.converters module with useful attrs converters.
Reorganized the module structure.
Documentation improvements.
Test and CI fixes and improvements.
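The threshold-based dropping described for DropLowProbabilityItemPipeline can be pictured with a small standalone sketch. It does not use the library or Scrapy: SketchItem and keep_confident are hypothetical names, and keeping items that carry no probability at all is an assumption made here for illustration, not documented pipeline behaviour.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class SketchItem:
    """Hypothetical stand-in for an item that carries a probability value."""

    name: str
    probability: Optional[float] = None


def keep_confident(items: Iterable[SketchItem], threshold: float) -> Iterator[SketchItem]:
    """Yield only the items whose probability is not lower than the threshold."""
    for item in items:
        # Assumption: an item without a probability is kept rather than dropped.
        if item.probability is None or item.probability >= threshold:
            yield item


items = [
    SketchItem("likely a product", 0.9),
    SketchItem("probably noise", 0.05),
    SketchItem("no probability"),
]
kept = [item.name for item in keep_confident(items, threshold=0.1)]
```

With a threshold of 0.1, only the item with probability 0.05 is filtered out.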
0.18.0 (2024-03-15)
Initial support for request templates, starting with search requests.
0.17.1 (2024-03-13)
Added Python 3.12 support.
description_processor() and description_html_processor() now raise an exception when they receive an unsupported input value such as a non-HtmlElement node.
Documentation improvements.
0.17.0 (2024-02-14)
Implement the zyte_common_items.ae module and the zyte_common_items.pipelines.AEPipeline item pipeline to make it easier to migrate from Zyte Automatic Extraction to Zyte API automatic extraction.
0.16.0 (2024-02-06)
Auto-prefixed versions of page objects, such as AutoProductPage(), now have all their fields defined as synchronous instead of asynchronous.
0.15.0 (2024-01-30)
Now requires zyte-parsers >= 0.5.0.
Added SocialMediaPost and related classes.
Added ProductFromListExtractor, ProductFromListSelectorExtractor, ProductVariantExtractor and ProductVariantSelectorExtractor.
Added zyte_common_items.processors.rating_processor() and enabled it for the aggregateRating fields in the page classes for BusinessPlace and Product.
Improved the documentation about the processors.
0.14.0 (2024-01-16)
Now requires zyte-parsers >= 0.4.0.
Added zyte_common_items.processors.gtin_processor() and enabled it for the gtin fields in the page classes for Product.
Improved the API documentation.
0.13.0 (2023-11-09)
Added Auto-prefixed versions of page objects, such as AutoProductPage(), that return data from Zyte API automatic extraction from their fields by default, and can be used to more easily override that data with custom parsing logic.
0.12.0 (2023-10-27)
Added get_probability() helper method in item classes (e.g. Product, Article) and ProbabilityRequest.
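As a rough illustration of what a get_probability() helper offers, the sketch below reads the probability from an item's metadata and returns None when there is none. SketchMetadata and SketchProduct are hypothetical stand-ins, not the library's classes, and the lookup through a metadata attribute is an assumption based on the metadata fields mentioned elsewhere in this changelog.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchMetadata:
    """Hypothetical metadata object holding an optional probability."""

    probability: Optional[float] = None


@dataclass
class SketchProduct:
    """Hypothetical item class with an optional metadata field."""

    name: str
    metadata: Optional[SketchMetadata] = None

    def get_probability(self) -> Optional[float]:
        # Assumption: the probability lives on the metadata object, if present.
        if self.metadata is None:
            return None
        return self.metadata.probability


with_metadata = SketchProduct("shoes", SketchMetadata(probability=0.8))
without_metadata = SketchProduct("socks")
```

The point of such a helper is that calling code does not need to check whether metadata is set before reading the probability.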
0.11.0 (2023-09-08)
Now requires clear-html >= 0.4.0.
Added zyte_common_items.processors.description_processor() and enabled it for the description fields in the page classes for BusinessPlace, JobPosting, Product and RealEstate.
Added zyte_common_items.processors.description_html_processor() and enabled it for the descriptionHtml fields in the page classes for JobPosting and Product.
Added default implementations for the description (in the page classes for BusinessPlace, JobPosting, Product and RealEstate) and descriptionHtml (in the page classes for JobPosting and Product) fields: if one of these fields is user-defined, the other one will use it.
price_processor() and simple_price_processor() now keep at least two decimal places when formatting the result.
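The "at least two decimal places" rule can be illustrated with the standard decimal module. This is a minimal sketch of the formatting idea only, not the code used by price_processor(); format_price and its min_places parameter are hypothetical names, and the input is assumed to be a finite Decimal.

```python
from decimal import Decimal


def format_price(value: Decimal, min_places: int = 2) -> str:
    """Format a finite Decimal with at least min_places decimal places."""
    # normalize() strips trailing zeros, so the exponent tells how many
    # significant decimal places the value really has.
    significant_places = -value.normalize().as_tuple().exponent
    # Pad up to the minimum, but never cut off significant digits.
    places = max(min_places, significant_places)
    return f"{value:.{places}f}"


whole = format_price(Decimal("10"))        # "10.00"
one_place = format_price(Decimal("10.5"))  # "10.50"
three_places = format_price(Decimal("10.125"))  # "10.125"
```

Values with fewer than two decimal places are padded with zeros, while values with more are left untouched.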
0.10.0 (2023-08-24)
Now requires price-parser >= 0.3.4 (a new dependency) and zyte-parsers >= 0.3.0 (a version increase).
Added zyte_common_items.processors.price_processor() and enabled it for the price fields.
Added zyte_common_items.processors.simple_price_processor() and enabled it for the regularPrice fields.
Added default implementations for the currency (uses the CURRENCY attribute on the page class) and currencyRaw (uses the data extracted by the price field) fields.
0.9.0 (2023-08-03)
Now requires web-poet >= 0.14.0.
Fixed detection of the HasMetadata base class.
0.8.0 (2023-07-27)
Updated minimum versions for the following requirements:
    attrs >= 22.1.0
    web-poet >= 0.9.0
    zyte-parsers >= 0.2.0
Added JobPosting and related classes.
Added zyte_common_items.processors.brand_processor() and enabled it for the brand fields.
Added zyte_common_items.Request.to_scrapy() to convert zyte_common_items.Request instances to scrapy.http.Request instances.
0.7.0 (2023-07-11)
Now requires zyte-parsers.
Added navigation classes: ArticleNavigation, ProductNavigation, the page classes that produce them, and other related classes.
Improved the metadata field handling, also fixing some bugs:
    Added item-specific metadata classes. The metadata item fields were changed to use them.
    Backwards incompatible change: the DateDownloadedMetadata class was removed. The item-specific ones are now used instead.
    Backwards incompatible change: ArticleFromList no longer has a probability field and instead has a metadata field like all other similar classes.
    Backwards incompatible change: while in most items the old and the new type of the metadata field have the same fields, the one in Article now has probability, the one in ProductList no longer has probability, and the one in ProductFromList no longer has dateDownloaded.
    The default probability value is now 1.0 instead of None.
    Added the HasMetadata mixin which is used similarly to Returns to set the page metadata class.
    Metadata objects assigned to the metadata fields of the items or returned from the metadata() methods of the pages are now converted to suitable classes.
Added zyte_common_items.processors.breadcrumbs_processor() and enabled it for the breadcrumbs fields.
0.6.0 (2023-07-05)
Added Article and ArticleList.
Added support for Python 3.11 and dropped support for Python 3.7.
0.5.0 (2023-05-10)
Now requires itemadapter >= 0.8.0.
Added RealEstate.
Added the zyte_common_items.BasePage.no_item_found() and zyte_common_items.Page.no_item_found() methods.
Improved the error message for invalid input.
Added ZyteItemKeepEmptyAdapter and documented how to use it and ZyteItemAdapter in custom subclasses of itemadapter.ItemAdapter.
0.4.0 (2023-03-27)
Added support for business places.
0.3.1 (2023-03-17)
Fixed fields from BasePage subclasses leaking across subclasses. (#29, #30)
Improved how the from_dict() and from_list() methods report issues in the input data. (#25)
0.3.0 (2023-02-03)
Added page object classes for e-commerce product detail and product list pages.
0.2.0 (2022-09-22)
Supports web_poet.RequestUrl and web_poet.ResponseUrl and automatically converts them into a string on URL fields like Product.url.
Bumps the web_poet dependency version from 0.4.0 to 0.5.0, which fully supports type hints using the py.typed marker.
This package now also supports type hints using the py.typed marker. This means mypy can properly use the type annotations in the items when you use the package in your project.
Minor improvements in tests and annotations.
0.1.0 (2022-07-29)
Initial release.