Statistics of the parsing results

Support


Code:
eyJwcmVzZXQiOiJTdGF0IiwidmFsdWUiOnsicHJlc2V0IjoiU3RhdCIsInBhcnNl
cnMiOltbIlNFOjpHb29nbGUiLCJkZWZhdWx0Il1dLCJyZXN1bHRzRm9ybWF0Ijoi
JHAxLnByZXNldFxuWyUtIEZPUkVBQ0ggcDEuc2VycDtcblx0ZG9tYWluID0gbGlu
ay5tYXRjaCgnKCg/Omh0fGYpdHBzPzpcXC9cXC8uKz8pKD86XFwvfCQpJywgMSku
MDtcblx0c3RhdERvbWFpbi4kZG9tYWluID0gc3RhdERvbWFpbi4kZG9tYWluICsg
MTtcbkVORDtcbklGIHAxLmluZm8uc3VjY2VzcyA9PSAwO1xuXHRmYWlsZWQucHVz
aChxdWVyeSk7XG5FTkQgJV0iLCJyZXN1bHRzU2F2ZVRvIjoiZmlsZSIsInJlc3Vs
dHNGaWxlTmFtZSI6IiRkYXRlZmlsZS5mb3JtYXQoKS50eHQiLCJhZGRpdGlvbmFs
Rm9ybWF0cyI6W10sInJlc3VsdHNVbmlxdWUiOiJubyIsInF1ZXJ5Rm9ybWF0Ijpb
IiRxdWVyeSJdLCJ1bmlxdWVRdWVyaWVzIjpmYWxzZSwic2F2ZUZhaWxlZFF1ZXJp
ZXMiOmZhbHNlLCJpdGVyYXRvck9wdGlvbnMiOnsib25BbGxMZXZlbHMiOmZhbHNl
LCJxdWVyeUJ1aWxkZXJzQWZ0ZXJJdGVyYXRvciI6ZmFsc2UsInF1ZXJ5QnVpbGRl
cnNPbkFsbExldmVscyI6ZmFsc2V9LCJyZXN1bHRzT3B0aW9ucyI6eyJvdmVyd3Jp
dGUiOmZhbHNlfSwiZG9Mb2ciOiJubyIsImtlZXBVbmlxdWUiOiJObyIsIm1vcmVP
cHRpb25zIjp0cnVlLCJyZXN1bHRzUHJlcGVuZCI6IlslIHN0YXREb21haW4gPSB7
fTtcbmZhaWxlZCA9IFtdICVdIiwicmVzdWx0c0FwcGVuZCI6IlxcbioqKioqXHUw
NDIxXHUwNDQyXHUwNDMwXHUwNDQyXHUwNDM4XHUwNDQxXHUwNDQyXHUwNDM4XHUw
NDNhXHUwNDMwIFx1MDQzN1x1MDQzMFx1MDQzNFx1MDQzMFx1MDQzZFx1MDQzOFx1
MDQ0Zjpcblx1MDQxZFx1MDQzNVx1MDQ0M1x1MDQzNFx1MDQzMFx1MDQ0N1x1MDQz
ZFx1MDQ0Ylx1MDQ0NSBcdTA0MzdcdTA0MzBcdTA0M2ZcdTA0NDBcdTA0M2VcdTA0
NDFcdTA0M2VcdTA0MzI6ICRmYWlsZWQuc2l6ZVxuWyUgSUYgZmFpbGVkLnNpemUg
PiAwO1xuXHRmYWlsZWQuam9pbignLCAnKSBfIFwiXFxuXCI7XG5FTkQgJV1cblx1
MDQxYVx1MDQzZVx1MDQzYi1cdTA0MzJcdTA0M2UgXHUwNDQxXHUwNDQxXHUwNDRi
XHUwNDNiXHUwNDNlXHUwNDNhIFx1MDQzZlx1MDQzZSBcdTA0M2FcdTA0MzBcdTA0
MzZcdTA0MzRcdTA0M2VcdTA0M2NcdTA0NDMgXHUwNDM0XHUwNDNlXHUwNDNjXHUw
NDM1XHUwNDNkXHUwNDQzOlxuWyUgRk9SRUFDSCBrZXkgSU4gc3RhdERvbWFpbi5u
c29ydC5yZXZlcnNlO1xuXHRrZXkgXyAnOiAnIF8gc3RhdERvbWFpbi4ka2V5IF8g
XCJcXG5cIjtcbkVORCAlXSIsInF1ZXJ5QnVpbGRlcnMiOltdLCJyZXN1bHRzQnVp
bGRlcnMiOltdLCJjb25maWdPdmVycmlkZXMiOltdfX0=
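The code above appears to be Base64-encoded JSON, so its settings can be inspected before importing. Below is a minimal Python sketch of that idea. The helper name decode_preset is illustrative and not part of A-Parser, and the miniature preset used in the demo is made up to keep the example self-contained.

```python
import base64
import json


def decode_preset(code: str) -> dict:
    """Decode a preset export string, assuming it is Base64-encoded JSON."""
    # Forum posts wrap the code across several lines, so drop all whitespace first.
    compact = "".join(code.split())
    return json.loads(base64.b64decode(compact).decode("utf-8"))


# Made-up miniature preset, encoded here only so the demo runs on its own.
sample = {"preset": "Stat", "value": {"parsers": [["SE::Google", "default"]]}}
encoded = base64.b64encode(json.dumps(sample).encode("utf-8")).decode("ascii")

# Simulate the line wrapping seen in the post.
wrapped = "\n".join(encoded[i:i + 16] for i in range(0, len(encoded), 16))

decoded = decode_preset(wrapped)
print(decoded["value"]["parsers"])
```

To inspect the real preset, pass the full code from this post (all lines) to decode_preset instead of the made-up sample.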
  • This preset requires A-Parser version 1.1.336 or later.
  • The preset uses the SE::Google parser as an example.
  • Besides the statistics, the preset outputs the standard result, whose format can be changed as usual.
  • A template that counts the required statistics is added to the general result format. It increments a per-domain counter for every link and adds each failed query to a list.
  • The template extracts the domain from each link with a regular expression, which is needed to count the links for each domain.
  • The Prepend text declares the variables: statDomain (an empty hash) and failed (an empty list).
  • The Append text outputs the statistics.
Whatever is written in the Prepend and Append texts is output at the beginning and end of the file, which corresponds to the beginning and end of parsing. Because the variables share a common scope, this makes it possible to output statistics or other data once per task, with full support for Template Toolkit templates.
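The three stages described above (declare variables once, update them for every result, print them once at the end) can be sketched in Python. This is only a simulation of the template logic, not A-Parser code: the function name run_task, the shape of the input tuples, and the demo data are assumptions made for illustration. The regular expression is the one used in the preset.

```python
import re
from collections import Counter

# Same pattern as in the preset: scheme plus host, up to the first "/" or end.
DOMAIN_RE = re.compile(r"((?:ht|f)tps?://.+?)(?:/|$)")


def run_task(results):
    """results: iterable of (query, success, links) tuples (hypothetical shape)."""
    # "Prepend" stage: declare the shared variables once, before any result.
    stat_domain = Counter()
    failed = []

    # "Result format" stage: runs for every query and updates the shared state.
    for query, success, links in results:
        for link in links:
            match = DOMAIN_RE.search(link)
            if match:
                stat_domain[match.group(1)] += 1
        if not success:
            failed.append(query)

    # "Append" stage: runs once at the end and reports the accumulated totals.
    lines = ["*****Statistics of a task:", f"Failed queries: {len(failed)}"]
    if failed:
        lines.append(", ".join(failed))
    lines.append("")
    lines.append("Number of links on each domain:")
    # most_common() lists domains from the highest count to the lowest.
    for domain, count in stat_domain.most_common():
        lines.append(f"{domain}: {count}")
    return "\n".join(lines)


# Made-up demo data: one successful query and one failed query.
demo = [
    ("a-parser a", True, ["https://github.com/x/y", "https://github.com/z",
                          "http://stackoverflow.com/q/1"]),
    ("a-parser q", False, []),
]
report = run_task(demo)
print(report)
```

Keeping the counters outside the per-result loop mirrors the shared variable scope: each result only updates the totals, and the report is produced a single time.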

Example of the statistics output:

*****Statistics of a task:
Failed queries: 37
a-parser ew, a-parser q, a-parser ih, a-parser ht, a-parser dd, a-parser ki, a-parser h, a-parser z, a-parser ea, a-parser cc, a-parser es, a-parser cr, a-parser ky, a-parser fn, a-parser ga, a-parser gi, a-parser ap, a-parser lc, a-parser hn, a-parser a, a-parser lt, a-parser em, a-parser ff, a-parser fe, a-parser iz, a-parser as, a-parser dv, a-parser gy, a-parser hj, a-parser cg, a-parser eu, a-parser if, a-parser go, a-parser nj, a-parser tk, a-parser mk, a-parser en

Number of links on each domain:
https://books.google.com: 6590
https://github.com: 812
http://stackoverflow.com: 773
https://en.wikipedia.org: 608
http://dl.acm.org: 602
http://link.springer.com: 598
https://searchcode.com: 583
http://www.sciencedirect.com: 574
http://arxiv.org: 560
http://citeseerx.ist.psu.edu: 549
...
 