1.2.912 - NodeJS update, performance improvement, adaptation to changes in recaptchas

Support Artur · Jun 11, 2020

We have completed the transition to NodeJS as the main engine for scrapers and present a new stable version 1.2.912 with support for NodeJS 14.2.0. This update combines many improvements, including increased performance, reduced memory consumption, a completely new network stack, and support for native NodeJS modules, which lets you use the full power of the npm registry in A-Parser.

This update also includes changes to ReCaptcha2 handling in the Google scraper. Our team was one of the first to find a way to bypass the new version of the recaptcha and tested it together with the RuCaptcha service, to whom we extend special thanks. At the moment, correct captcha bypass has been tested with RuCaptcha, Anti-Captcha, XEvil and CapMonster.

In addition, many optimizations were made in the core of A-Parser, and performance was significantly increased when using a large number of tasks or large proxy lists. The Rank::CMS scraper has been completely rewritten and stabilized, and support for the new apps.json format and for user rules has been added.

Improvements

NodeJS updated to v14.0.0, v8 to 8.1
Added support for the data-s parameter in recaptchas for SE::Google, and added the ReCaptcha2 pass proxy option
Increased thread limit to 10,000 for Windows OS
Significantly improved performance with a large number of active proxies and/or jobs: the proxy-handling stack was completely rewritten and work with large lists was optimized
Added a new scraper, Rank::KeysSo
Completely rewrote SE::Yahoo::Suggest, Rank::Alexa::API and Rank::Archive in JS
Improved performance when using regular expressions, as well as improved compatibility
In SE::Google::KeywordPlanner added automatic token retrieval
In SE::Bing added the ability to scrape links to cached pages, as well as the ability to scrape mobile results
In the scraper Util::ReCaptcha2, when choosing the provider Capmonster or Xevil it is now optional to specify the Provider url
In SE::Google::Trends added the ability to specify an arbitrary date range
In Rank::CMS added a choice of regular expression engine and support for a user-supplied rules file
In SE::Yandex::ByImage added the Don't scrape if no other sizes option, which disables the collection of results when the desired image is not available in other sizes
[NodeJS] Fixed this.cookies.getAll()
[NodeJS] Added protection against endless loops and long-running regular expressions
[JS scrapers] Added follow_meta_refresh option for this.request
[JS scrapers] Added bypass_cloudflare option for this.request
[JS scrapers] Underscore replaced by Lodash
[JS scrapers] Added a mark in the log when calling other scrapers
[JS scrapers] The previous proxy is now reused after a request to another scraper
[JS scrapers] Added destroy() method
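
The release notes do not describe how A-Parser's protection against endless loops and long-running regular expressions is implemented. Purely as an illustration of the general idea, the sketch below uses the timeout option of Node's built-in vm module to interrupt code that runs for too long. The helper name runGuarded is hypothetical and is not part of the A-Parser API.

```javascript
const vm = require('vm');

// Hypothetical helper (not an A-Parser API): run a code snippet in a fresh
// V8 context and abort it if it runs longer than `timeoutMs` milliseconds.
function runGuarded(code, timeoutMs) {
  try {
    const value = vm.runInNewContext(code, {}, { timeout: timeoutMs });
    return { timedOut: false, value };
  } catch (err) {
    // Node reports an exceeded vm timeout with this error code.
    if (err.code === 'ERR_SCRIPT_EXECUTION_TIMEOUT') {
      return { timedOut: true, value: undefined };
    }
    throw err; // unrelated errors (syntax errors, thrown exceptions) propagate
  }
}

console.log(runGuarded('1 + 1', 100));          // completes normally
console.log(runGuarded('while (true) {}', 50)); // interrupted by the timeout
```

The same kind of guard could, in principle, wrap a regular expression match passed in as the code string. Whether A-Parser works this way internally is an assumption.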

Fixes due to changes in search results and source pages

Many fixes in SE::Google
Fixed SE::Youtube, including scraping by tags
Fixed collection of links in Shop::eBay
Fixed phone scraping in Maps::Google
Fixed work with captchas in SE::Yandex::ByImage
In Rank::Social::Signal the variable $facebook_comment was removed because it is no longer relevant
Fixed SE::Startpage, Rank::Linkpad, Social::Instagram::post, SE::Yandex::Translate

Corrections

Fixed a bug that caused the selected proxy checker to be ignored
Fixed the Decode HTML entities and Extract domain functions in Result Constructor
Fixed a problem with encoding detection
Fixed an error when using $tools.query
Fixed a bug in Rank::MajesticSEO that caused all attempts to be used up when there were no results
Fixed http2 support
Fixed a scraper crash that occurred when alive.txt could not be written
Fixed captcha handling in SE::Yandex::Register and Check::RosKomNadzor
Fixed the difference in requests sent via Net::HTTP and JS
Fixed a bug in SE::Yahoo
Fixed bugs in Rank::CMS when choosing an application without a category
[NodeJS] Fixed calculation of scraper code execution time
[JS scrapers] Fixed the content-length header not being sent in POST requests with an empty body
[JS scrapers] Fixed CloudFlare bypass
[JS scrapers] Fixed session handling
[JS scrapers] Fixed handling of overrides for this.parser.request
[JS scrapers] Fixed an error in encoding detection
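
As a rough illustration of how the follow_meta_refresh and bypass_cloudflare options listed under Improvements might be passed to this.request, here is a minimal sketch that runs outside A-Parser. Several things in it are assumptions, not something these notes confirm: the argument order (method, URL, query parameters, options), the option values of 1, and the shape of the response. FakeBaseScraper is a hypothetical stand-in for the real scraper base class and only records what it was called with.

```javascript
// Hypothetical stand-in for the real scraper base class. It performs no
// network requests; it only records the arguments passed to request().
class FakeBaseScraper {
  constructor() {
    this.calls = [];
  }

  // Assumed argument order: method, URL, query parameters, options.
  async request(method, url, query, opts) {
    this.calls.push({ method, url, query, opts });
    return { success: true, data: '<html>stub</html>' };
  }
}

class ExampleScraper extends FakeBaseScraper {
  async parse(set, results) {
    const response = await this.request('GET', set.query, {}, {
      follow_meta_refresh: 1, // assumed value: follow <meta http-equiv="refresh">
      bypass_cloudflare: 1,   // assumed value: enable the CloudFlare bypass
    });
    results.success = response.success ? 1 : 0;
    return results;
  }
}

const scraper = new ExampleScraper();
scraper
  .parse({ query: 'https://example.com/' }, {})
  .then((results) => console.log(results, scraper.calls[0].opts));
```

In a real scraper the base class and this.request come from A-Parser itself, so only the options object is of interest here. Check the option names and values against the A-Parser documentation.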
