Improvements
- Added a new scraper: CoinMarketCap::LastPrice
- Scraper SE::Yandex::Register has been completely rewritten; the ability to register accounts has been restored
- Scrapers SE::Yandex::WordStat, SE::Yandex::WordStat::ByDate and SE::Yandex::WordStat::ByRegion have been completely rewritten, with new functionality added (choice of authorization method, account registration "on the fly", sessions...)
- HTML::EmailExtractor has been completely rewritten using HTML::LinkExtractor as its basis; in addition to email scraping itself, many features are now available: link collection, Parse to level, the Chrome engine, etc.
- All Instagram scrapers have been rewritten to JS APIv2 and adapted to changes on the source
- Increased the maximum number of pages in SE::Google to 100
- Added date collection from snippets in Google scrapers
- SE::Bing: added the Fix pagination bug option, which works around a Bing search bug that causes the 2nd and subsequent pages to be returned blank
- Shop::Wildberries::ProductInfo: added collection of seller data and the ability to determine product availability
- SE::Startpage: added the Links per page option and updated the list of available values in existing options
- SE::DuckDuckGo: added the Use HTTP/2 option
- Net::HTTP: added the Ban Proxy Code RegEx option
- Added the ability to set an arbitrary level for subqueries (query.add)
- Added the needResults option for this.parser.request
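
To illustrate the idea behind the Ban Proxy Code RegEx option in Net::HTTP, here is a minimal sketch of how a regular expression over HTTP status codes could drive a proxy-ban decision. The helper name shouldBanProxy and the sample pattern are hypothetical and are not taken from the scraper's actual implementation; the real option may match codes differently.

```typescript
// Hypothetical helper, NOT the actual Net::HTTP implementation.
// Decides whether a proxy should be banned, based on the HTTP status code
// received through it and a user-supplied regular expression.
function shouldBanProxy(statusCode: number, banCodePattern: string): boolean {
  // Anchor the pattern so that "403" cannot match a longer string such as "4030".
  const re = new RegExp(`^(?:${banCodePattern})$`);
  return re.test(String(statusCode));
}

// Sample pattern (an assumption): ban on 403, 407, or any 5xx response.
const pattern = "403|407|5\\d\\d";

console.log(shouldBanProxy(403, pattern)); // true
console.log(shouldBanProxy(200, pattern)); // false
console.log(shouldBanProxy(502, pattern)); // true
```

Matching on the code as a string keeps the rule configurable without code changes: widening or narrowing the set of ban-triggering responses only requires editing the pattern.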
Fixes due to changes on source sites
- Adaptation to changes in Google and Yandex layout
- Restored reCAPTCHA handling in SE::Google
- Fixed scraping of $title in Shop::Wildberries::ProductInfo
- SE::Google::TrustCheck, SE::Google::Images, Shop::Yandex::Market, Shop::Wildberries::ProductsList, SE::Dogpile, SE::Startpage
Bug fixes
- Fixed HTTP keep-alive: in some cases the socket was closed prematurely
- Fixed a bug in Follow common redirects
- Redis API: fixed a problem that affected scrapers using result optimization
