Customize Google Images Scraper

smilesmile

New Member
I need to extract some additional information from Google Images results and am not sure how to go about it.

On the Google Images results page, each image links to a URL like this:

href="http://www.google.com/imgres?imgurl...BQ&tbm=isch&ved=0CDQQMygCMAI&biw=1366&bih=631"

I need to extract the values for these parameters:

imgurl=
imgrefurl=
tbnid=

And finally, is there a way to extract the filetype of the image into a variable as well (jpg, png, etc)? Something like $filetype?

So for the final result I would like stored on each line:
$query;$loop.count;$imgurl;$imgrefurl;$tbnid.$filetype\n
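
For reference, the three parameters can be read straight out of such an href with an ordinary query-string parser, and the file type can be taken from the extension of the image URL. Below is a minimal Python sketch of that idea, not a ready-made scraper preset. The function names `parse_imgres` and `format_line` and the example link are made up for illustration, and the sketch assumes the href contains `imgurl`, `imgrefurl` and `tbnid` as ordinary query parameters.

```python
from urllib.parse import urlsplit, parse_qs
import posixpath


def parse_imgres(href):
    """Return (imgurl, imgrefurl, tbnid, filetype) from an imgres link.

    Hypothetical helper: missing parameters come back as empty strings.
    """
    params = parse_qs(urlsplit(href).query)
    imgurl = params.get("imgurl", [""])[0]
    imgrefurl = params.get("imgrefurl", [""])[0]
    tbnid = params.get("tbnid", [""])[0]
    # File type = extension of the image URL's path, lowercased, without the dot.
    ext = posixpath.splitext(urlsplit(imgurl).path)[1]
    filetype = ext.lstrip(".").lower()
    return imgurl, imgrefurl, tbnid, filetype


def format_line(query, count, href):
    """Build one output line in the layout requested above."""
    imgurl, imgrefurl, tbnid, filetype = parse_imgres(href)
    return f"{query};{count};{imgurl};{imgrefurl};{tbnid}.{filetype}\n"


# Made-up example link (not a real Google result).
example_href = (
    "http://www.google.com/imgres?imgurl=http://example.com/pics/cat.JPG"
    "&imgrefurl=http://example.com/page.html&tbnid=AbC123&tbm=isch"
)
```

Calling `format_line("cats", 1, example_href)` produces one line in the `$query;$loop.count;$imgurl;$imgrefurl;$tbnid.$filetype` layout. If the image URL has no extension, the file type is simply left empty.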
 
The source code of the result pages is stored in the $pages array.
If you look through it, you will see that each picture is represented by a JSON object that already contains all the data you need. So the task comes down to scraping these objects and outputting the required fields.
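
To make the idea concrete outside the scraper itself, here is a rough Python sketch of the same approach: find the JSON objects in each page, decode them, and print the wanted fields. The markup and key names are assumptions, since the thread does not show them: the sketch supposes each object sits in a `<div class="rg_meta">` element and uses the keys `ou`, `ru`, `id` and `ity` for the image URL, referring page, thumbnail ID and file type. Check these against the real page source and adjust `META_RE` and `KEYS` accordingly.

```python
import html
import json
import re

# ASSUMPTION: each image's JSON object is wrapped like <div class="rg_meta">{...}</div>.
META_RE = re.compile(r'<div class="rg_meta"[^>]*>(\{.*?\})</div>', re.S)

# ASSUMPTION: key names inside each JSON object; verify against the real pages.
KEYS = {"imgurl": "ou", "imgrefurl": "ru", "tbnid": "id", "filetype": "ity"}


def extract_images(page_html):
    """Return one dict per JSON object found in a page's source."""
    results = []
    for raw in META_RE.findall(page_html):
        try:
            obj = json.loads(html.unescape(raw))
        except ValueError:
            continue  # skip anything that is not valid JSON
        results.append({name: obj.get(key, "") for name, key in KEYS.items()})
    return results


def build_lines(query, pages):
    """Build the output lines for all pages, with a running counter."""
    lines = []
    count = 0
    for page in pages:
        for img in extract_images(page):
            count += 1
            lines.append(
                f"{query};{count};{img['imgurl']};{img['imgrefurl']};"
                f"{img['tbnid']}.{img['filetype']}\n"
            )
    return lines
```

Here `pages` plays the role of the $pages array, and the counter runs across all pages rather than restarting on each one. With this approach the file type comes directly from the JSON object, so it does not have to be guessed from the image URL.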
[Attached screenshot: x24fz_181015135900.png]

Code:
eJxtVE1v2zAM/SsG0aJJEXjoYRf3C2nXbC26uEvaU5QVQswYTmXJk+QshZH/Pkp2
7CZrDo5JkY+Pj7QqsNy8mSeNBq2BaFZB4d8hggSXvBQWBlBwbVC74xlM76Lou1Kp
wCi6z3mKhgLa0Arse4GUvCiNVfkETY2g65doRlhNCrcc5u4kxQ0lXCTZuncdBSsj
eY6XDGa/GcxPGfSvg4XgxpBLp685Wr47ueoxVoWn14xt+xeMfSGEK2gQn2saJm2L
Nx6uNX8np/8fUynyZRZz0wa6PmFV89vO5617pHTOnTCz42AUT+6Gtz8Clxncj4Pi
LPQg50wyuzJKBpeBVUqY0Gv3MI3HPRcQetw+hQX0+1Oifg9eg5PzE3oKpYpwoUpp
W5dDClW5b+sDO0sObOswGTDiAlTpbvwtOJ537U35Gp8V9bHMBHbuEVmNHkdEEt1p
uPQ99/qh3bgx8iTJbKYkF7UYTqpOoBeZUUeULxXFuuYyNCOtcnJZ9AC+452QMzjy
tluD0uf+qnMgWnJhcACGqI44EUkOT0hMza3SceH4kL8CJYdCPOIaRRfm8W/KTCS0
v8MlJd03iZ+HxP9hbNv2PpZao/6riUOL4q2b+GeXlahHle7EeEMsWnnGzpMrjS1i
A9IUoq+xQJlQZDedYdG59hjvTWDfuVBymaUxcdVZgrvIUj7TJx/LW5UXAl0LshSC
JmBw0m3C0DSKO6MjeJh860vsXRZ+7R+mNdVCZ7RpXx3BnET7WLWBXHAhXiaPH0+g
2x6/OcbBLmglU0XL4u4ov00RKO2bHQBuCi4TJH3OtvTNtjdWe69Vn91bUbWlOa3M
Ux3sOvXOAZBkhibj4P4BnL/CrQ==
 