1.2.1076 - 3 new scrapers. Completing the transition to Node.js. Integration of puppeteer into the build

Support Artur


Improvements

  • As part of moving the main built-in scrapers to the new Node.js platform, the scrapers have been completely rewritten and updated:
  • Major improvements from migrating these scrapers to Node.js:
    • performance increased by roughly 1.5 times
    • unification of HTTP engine with JavaScript scrapers, unified bypass of CloudFlare
  • Added new scrapers:
  • In HTML::EmailExtractor, added the Skip non-HTML blocks option, which disables collecting email addresses inside script, style, and similar tags
  • In SE::Google::Translate, added new variables:
    • $translit_orig - original text in transliteration
    • $translit_translated - translated text in transliteration
    • $variants.$i.text - a list of translation options for the original text
  • In SE::Bing, updated the list of regions and languages
  • In Social::Instagram::Profile and Social::Instagram::Post, added the ability to collect the number of video views
  • In SE::Yandex::Translate, added the ability to disable the use of sessions
  • In Net::HTTP, added the ability to specify the user-agent for Chrome
  • In Rank::MOZ, fixed an error that occurred when calling the scraper from the JS method this.parser.request()
  • In Rank::CMS, added support for the new apps.json and the ability to use Net::HTTP
  • In Net::Whois, updated support for all zones
  • Added the Exclude from "All" option for proxycheckers and changed the selection logic:
    • "All" - uses all proxies selected for tasks
    • specific proxychecker - uses it even if it is not selected in the task
  • Added support for outdated SSL versions
  • JS scrapers: added the tlsOpts option for this.request(), which lets you pass settings for HTTPS connections
  • JS scrapers: updated Node.js from 14.2.0 to 14.15.0
  • JS scrapers: the puppeteer module is included in the A-Parser build and does not require a separate installation
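
As a rough illustration of the tlsOpts option for this.request() listed above, the sketch below builds a request options object that relaxes TLS settings for a legacy server. It assumes that tlsOpts accepts the same fields as the Node.js tls.connect() options (minVersion and ciphers are standard Node.js fields). The helper buildLegacyRequestOpts and the URL in the comment are hypothetical, and the this.request() call is shown only as a comment because it runs only inside an A-Parser JS scraper.

```javascript
// Hypothetical helper: merges legacy-TLS settings into options for this.request().
// Assumption: tlsOpts takes standard Node.js tls.connect() fields.
function buildLegacyRequestOpts(extraOpts = {}) {
  return {
    ...extraOpts,
    tlsOpts: {
      minVersion: 'TLSv1',            // permit outdated protocol versions
      ciphers: 'DEFAULT@SECLEVEL=0',  // permit older cipher suites (OpenSSL syntax)
      ...(extraOpts.tlsOpts || {}),   // caller-supplied settings take precedence
    },
  };
}

// Example: keep the legacy ciphers but require at least TLS 1.1.
const opts = buildLegacyRequestOpts({ tlsOpts: { minVersion: 'TLSv1.1' } });
console.log(opts.tlsOpts);

// Inside a JS scraper (not runnable outside A-Parser; URL is a placeholder):
// const response = await this.request('GET', 'https://legacy.example.com/', {}, opts);
```

Keeping the TLS settings in one helper makes it easy to apply them only to requests that target old servers, leaving the defaults in place everywhere else.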
Corrections due to changes in the SERP
Bug fixes
  • In SE::Yandex, fixed the behavior of Extra query string
  • Fixed the regex in HTML::EmailExtractor that caused errors in some cases
  • Fixed the behavior of SE::Google::KeywordPlanner when a query returns no results
  • Maps::Yandex fixed and migrated to puppeteer
  • Fixed a bug in proxychecker selection priorities
  • JS scrapers: fixed follow_meta_refresh
  • API: fixed the handling of the rawResults parameter

 