I think I'm having a character encoding problem, but I can't tell whether it is with how A-Parser is set up or with something on my server.
I have a custom Net::HTTP parser that queries Google and then saves the results page as an HTML file. It works great, but for some reason a few queries will not save as a file.
They appear to work fine in the parser test, and the queries aren't failing; they simply aren't being saved. All the queries that fail are non-English words/phrases.
I've included the parser code below. The one way I was able to get it to work was to change this line for the results file name:
serp_raw/[% IF p1.info.success == 1 %][% USE Math; "test_4"_ Math.int(query.num / 2500) _"/"_ query _".html" %][% END %]
To this:
serp_raw/[% IF p1.info.success == 1 %][% USE Math; "test_4"_ Math.int(query.num / 2500) _"/test.html" %][% END %]
With the updated line I can perform one query at a time and the file will be generated, but of course the file naming is no longer dynamic.
I can then manually rename the files to the correct query names with no problems.
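To help tell whether the server or A-Parser is at fault, something like the following could be run on the server outside of A-Parser. It is only a rough sketch: it tries to create a file named after each of the four failing queries (written as the same escapes that appear in the decoded preset) and reports any error. The function name `try_write` is made up for this example, and the temporary directory is a stand-in; to test the relevant filesystem it would need to point at the actual results folder.

```python
import os
import tempfile

# The four failing queries, as \u escapes taken from the decoded preset.
QUERIES = [
    "\ube45\ubc45monster\ub4e3\uae30",
    "\u0434\u044d\u0443 \u0442\u0438\u043a\u043e \u043e\u0442\u0437\u044b\u0432\u044b",
    "g\u266d major",
    "dr pero vr\u017eogi\u0107",
]


def try_write(directory, names):
    """Try to create '<name>.html' for each name.

    Returns a dict mapping each name to None on success,
    or to the repr of the error that was raised.
    """
    results = {}
    for name in names:
        path = os.path.join(directory, name + ".html")
        try:
            with open(path, "w", encoding="utf-8") as fh:
                fh.write("<html></html>")
            results[name] = None
        except (OSError, UnicodeError) as exc:
            results[name] = repr(exc)
    return results


if __name__ == "__main__":
    # Assumption: a temp directory is used here; swap in the real results path.
    with tempfile.TemporaryDirectory() as tmp:
        for name, err in try_write(tmp, ["test"] + QUERIES).items():
            # ascii() keeps the output printable even on an ASCII-only console.
            print(ascii(name), "->", err or "ok")
```

If every name reports "ok" in the real results folder, the filesystem itself can store these names, which would point more towards how the file name is built or encoded before it reaches the disk.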
Here is the parser code, including the failing queries:
Code:
eyJwcmVzZXQiOiJUZXN0IC0gUkFXIEhUTUwiLCJ2YWx1ZSI6eyJwcmVzZXQiOiJU
ZXN0IC0gUkFXIEhUTUwiLCJwYXJzZXJzIjpbWyJOZXQ6OkhUVFAiLCJkZWZhdWx0
Iix7InR5cGUiOiJvdmVycmlkZSIsImlkIjoidXNlci1hZ2VudCIsInZhbHVlIjoi
TW96aWxsYS81LjAgKFdpbmRvd3MgTlQgNi4xOyBXT1c2NCkgQXBwbGVXZWJLaXQv
NTM3LjM2IChLSFRNTCwgbGlrZSBHZWNrbykgQ2hyb21lLzQ3LjAuMjUyNi4xMTEg
U2FmYXJpLzUzNy4zNiJ9LHsidHlwZSI6Im92ZXJyaWRlIiwiaWQiOiJmb3JtYXRy
ZXN1bHQiLCJ2YWx1ZSI6IlslIElGIGluZm8uc3VjY2VzcyA9PSAxICVdJHBhZ2Vz
LmZvcm1hdCgnJGRhdGFcXG4nKVslIEVORCAlXSJ9LHsidHlwZSI6Im92ZXJyaWRl
IiwiaWQiOiJwcm94eXJldHJpZXMiLCJ2YWx1ZSI6IjIwMCJ9LHsidHlwZSI6Im92
ZXJyaWRlIiwiaWQiOiJ1c2Vwcm94eSIsInZhbHVlIjp0cnVlfSx7InR5cGUiOiJv
dmVycmlkZSIsImlkIjoiZ29vZENvZGUiLCJ2YWx1ZSI6WzIwMF19LHsidHlwZSI6
Im92ZXJyaWRlIiwiaWQiOiJwcm94eWJhbm5lZGNsZWFudXAiLCJ2YWx1ZSI6IjAi
fSx7InR5cGUiOiJvdmVycmlkZSIsImlkIjoicXVlcnlmb3JtYXQiLCJ2YWx1ZSI6
Imh0dHBzOi8vd3d3Lmdvb2dsZS5jb20vc2VhcmNoP3E9JHF1ZXJ5JnB3cz0wJnV1
bGU9dytDQUlRSUNJTlZXNXBkR1ZrSUZOMFlYUmxjdyJ9LHsidHlwZSI6Im92ZXJy
aWRlIiwiaWQiOiJyZXF1ZXN0ZGVsYXkiLCJ2YWx1ZSI6IjAifSx7InR5cGUiOiJv
dmVycmlkZSIsImlkIjoidGltZW91dCIsInZhbHVlIjoiMzAifV1dLCJyZXN1bHRz
Rm9ybWF0IjoiJHAxLnByZXNldCIsInJlc3VsdHNTYXZlVG8iOiJmaWxlIiwicmVz
dWx0c0ZpbGVOYW1lIjoic2VycF9yYXcvWyUgSUYgcDEuaW5mby5zdWNjZXNzID09
IDEgJV1bJSBVU0UgTWF0aDsgXCJ0ZXN0XzRcIl8gTWF0aC5pbnQocXVlcnkubnVt
IC8gMjUwMCkgX1wiL1wiXyBxdWVyeSBfXCIuaHRtbFwiICVdWyUgRU5EICVdIiwi
YWRkaXRpb25hbEZvcm1hdHMiOltbImZhaWxlZC9mYWlsZWQudHh0IiwiWyUgSUYg
cDEuaW5mby5zdWNjZXNzID09IDAgJV0kcXVlcnlcXG5bJSBFTkQgJV0iXV0sInJl
c3VsdHNVbmlxdWUiOiJubyIsInF1ZXJpZXNGcm9tIjoidGV4dCIsInF1ZXJ5Rm9y
bWF0IjpbIiRxdWVyeSJdLCJ1bmlxdWVRdWVyaWVzIjp0cnVlLCJzYXZlRmFpbGVk
UXVlcmllcyI6ZmFsc2UsIml0ZXJhdG9yT3B0aW9ucyI6eyJvbkFsbExldmVscyI6
ZmFsc2UsInF1ZXJ5QnVpbGRlcnNBZnRlckl0ZXJhdG9yIjpmYWxzZSwicXVlcnlC
dWlsZGVyc09uQWxsTGV2ZWxzIjpmYWxzZX0sInJlc3VsdHNPcHRpb25zIjp7Im92
ZXJ3cml0ZSI6ZmFsc2V9LCJkb0xvZyI6ImRiIiwia2VlcFVuaXF1ZSI6Ik5vIiwi
bW9yZU9wdGlvbnMiOmZhbHNlLCJyZXN1bHRzUHJlcGVuZCI6IiIsInJlc3VsdHNB
cHBlbmQiOiIiLCJxdWVyeUJ1aWxkZXJzIjpbXSwicmVzdWx0c0J1aWxkZXJzIjpb
XSwiY29uZmlnT3ZlcnJpZGVzIjpbXSwicXVlcmllcyI6Ilx1YmU0NVx1YmM0NW1v
bnN0ZXJcdWI0ZTNcdWFlMzBcblx1MDQzNFx1MDQ0ZFx1MDQ0MyBcdTA0NDJcdTA0
MzhcdTA0M2FcdTA0M2UgXHUwNDNlXHUwNDQyXHUwNDM3XHUwNDRiXHUwNDMyXHUw
NDRiXG5nXHUyNjZkIG1ham9yXG5kciBwZXJvIHZyXHUwMTdlb2dpXHUwMTA3In19
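
For anyone who wants to read the failing queries without importing the preset: the block above is base64-encoded JSON, and in this particular preset the queries sit under `value` -> `queries` as one newline-separated string. A minimal Python sketch for pulling them out is below. The function name `preset_queries` is made up for this example, and the demo uses a small hypothetical preset of the same shape rather than the full block.

```python
import base64
import json


def preset_queries(blob):
    """Decode a base64-encoded JSON preset and return its queries as a list."""
    # Remove the line breaks introduced by pasting the block into a post.
    compact = "".join(blob.split())
    preset = json.loads(base64.b64decode(compact).decode("utf-8"))
    # Assumption: queries are stored as a single newline-separated string,
    # as they are in the preset shown in this post.
    return preset["value"]["queries"].split("\n")


if __name__ == "__main__":
    # Hypothetical stand-in preset with the same shape as the one above.
    sample = {
        "preset": "demo",
        "value": {"queries": "g\u266d major\ndr pero vr\u017eogi\u0107"},
    }
    blob = base64.b64encode(json.dumps(sample).encode("utf-8")).decode("ascii")
    for query in preset_queries(blob):
        # ascii() keeps the output printable even on an ASCII-only console.
        print(ascii(query))
```

Passing the full block from this post to `preset_queries` in place of `blob` should list the four non-English queries that are not being saved.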