SE::Yandex::ByImage - image search by link

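Scraper overview
A scraper for Yandex reverse image search. With SE::Yandex::ByImage you can collect a database of image links, or the images themselves, for further use. Queries can be used in the same form in which they are entered in the Yandex search box.
A-Parser lets you save the Yandex scraper's settings for later use (presets), run scraping on a schedule, and more. To get as many results as possible, you can use automatic query multiplication, substitution of subqueries from files, and iteration over alphanumeric combinations and lists.
Thanks to the built-in Template Toolkit template engine, results can be saved in the form and structure you need: the engine lets you apply additional logic to results and output data in various formats, including JSON, SQL and CSV.
Scraper use cases
Downloading images by link
A-Parser supports task chains: once the first task finishes, a second task starts, using the links collected by the first task as its queries.
Download example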
eJyNVl1P2zAU/SvI4qFsbRIm7SUvrLBVY2KUQXmYSid5zU1mcOxgO6VV6H/ftROS
NAtlb7F9P33POU5BDNUP+kqBBqNJOC9I5r5JSH5SEcE6DE835ylN4CCST4JLGpEh
yajSoKz9nNx8CcOOKVpEENOcG7JYDAkGxE89kSqlNvBhduxVWerDG7qCmcTDmHFo
tie4uqQpWK+IGrCnXuwCDY48s7YRaBQxw6SgvMxgy2qy3gr2mFt/bRQTCdrjUjHQ
EyVT3DbggtjNzUuFc3Lo1gTD5M7/R+lDwphyDUOisdwJxWKi7gkzoKiRaprZmnC/
IFKMOb+AFfDGzMU/zRmP8B7HMTqdV479JtN/YmzrFtupVqCeFNZQR3Gr0+n3xiuS
FzLBzqPf2DdnKTO41mcyF3Y4AW4+AGT1vV1K3EmlgjqNUTnUyRE6GYgIDZupjbNm
a6eLncnsbi6liFkyxfoVi+DFMhczxOdUnMk042DbIg5h+uBzhUZQmCTXcN0AZqyr
odhFXW431JlLaO+hguqQGCm5/nZTFp4phnj8aMtN8VrbNVRXu6Sc315f7FTX4AsX
f4zJdOj7dFQSxlvK1Gdp4nOZyBHNPn1Ye5nD5BKxnUhEHLa9HRY7BLsEE4ZfZ7Or
Fq3QREECa0yCF2dAmJHZZBB6706YvR9/gHGf7zM4SZ4TFh+52aD9DI3Qh1naKEU3
Fbds7+XJMtdGpuVl1gO1rQB1s3rZqgLFnLZIbJlj6eRaWFHu8COkgCY+Eti4kUmH
JstLELqdqeRjaVdlEdLAY0452bblpJEG17L2S9J6Ik9Hh0X1jQZbzwpOGbYHUCLn
/BUe9EG8Vz3+B1x7MRo0tBTyTU3riuUrQGyDql+FUVJpD0cLomWuljZMqYOWGna4
9jrJYlhjzx/Mf/mL90d3d97gJNyF3GEP5ioQlO7bxbB5bPpY3aMmHTUO+rSqJm6P
SAdtenYZvyNywSs61X1W3LjeEv1gv+B3j3fEPtj2SVCw58nql9J9b0XQfidsQjcD
vPNjN6RSjOqfhKL3zQ8L9Gu0Cpe4vtdXpbOFuS5tMK121D/e/gXAoPIV
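Collected data


- Keywords related to the image
- Image links
- Links and domains of the pages where the images are placed
- Snippets
- Anchors
- Image width and height
Capabilities
- Converts shortened links to full links
- Can skip collecting results when no other sizes of the target image are found
Use cases
- Collecting images for personal use
- Building image databases
- Collecting image descriptions
- Collecting image links
- Collecting keywords related to images
Queries
The query must be a link to an image, for example: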
https://a-parser.com/img/[email protected]
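Output results examples
Thanks to the built-in Template Toolkit template engine, A-Parser supports flexible result formatting, so results can be output in any form, including structured formats such as CSV or JSON.
Default output
Result format: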
$serp.format('$link\n')
Example of result:
https://c7.hotpng.com/preview/982/127/829/logo-brand-trademark-design.jpg
https://img2.freepng.ru/20180512/zhe/kisspng-logo-brand-trademark-5af7aa709338e4.2161971915261804646031.jpg
https://a-parser.com/img/[email protected]
https://openssource.info/proxy.php?image=https%3A%2F%2Ffiles.a-parser.com%2Fimg%2Ffvvik_200716143725.png&hash=5c3e010f0b33ccadf7b5215b42435bef
https://a-parser.com/img/scr/g58tg.png
https://openssource.info/proxy.php?image=https%3A%2F%2Ffiles.a-parser.com%2Fimg%2F1.2.799.png&hash=89f3b5f010ba5d9c846c104d1df3e174
https://w7.pngwing.com/pngs/982/127/png-transparent-logo-brand-trademark-design.png
https://w7.pngwing.com/pngs/982/127/png-transparent-logo-brand-trademark-design.png
https://a-parser.com/wp-content/uploads/2020/10/[email protected]
https://cdn-front.kwork.ru/pics/t3/44/5340106-1584381244.jpg
https://cdn-front.kwork.ru/pics/t3/44/5340106-1584381244.jpg
https://cdn-front.kwork.ru/pics/t3/44/5340106-1584381244.jpg
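The sample output above contains repeated links. If a list of unique links is needed when post-processing the result file outside A-Parser, a minimal Python sketch could look like the following. The helper name `unique_links` is hypothetical and not part of A-Parser; the input is assumed to be one link per line, as in the default output.

```python
def unique_links(lines):
    """Return links in first-seen order, dropping duplicates and blank lines."""
    seen = set()
    result = []
    for line in lines:
        link = line.strip()
        if link and link not in seen:
            seen.add(link)
            result.append(link)
    return result


# Lines taken from the sample output above.
sample = [
    "https://w7.pngwing.com/pngs/982/127/png-transparent-logo-brand-trademark-design.png",
    "https://w7.pngwing.com/pngs/982/127/png-transparent-logo-brand-trademark-design.png",
    "https://cdn-front.kwork.ru/pics/t3/44/5340106-1584381244.jpg",
    "https://cdn-front.kwork.ru/pics/t3/44/5340106-1584381244.jpg",
]

for link in unique_links(sample):
    print(link)
```

Keeping the first-seen order preserves the ranking in which the links were returned.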
Output of keywords to CSV
Result format:
[% FOREACH item IN keywords;
tools.CSVline(query, item.key);
END %]
Example of result:
https://a-parser.com/img/[email protected],"logo"
https://a-parser.com/img/[email protected],"scraper logo"
https://a-parser.com/img/[email protected],"brand logo"
https://a-parser.com/img/[email protected],"logo text"
https://a-parser.com/img/[email protected],"mobilebase logo"
Saving keywords in SQL format
Result format:
[% FOREACH keywords; "INSERT INTO serp VALUES('" _ query _ "', '"; key _ "')\n"; END %]
Example of result:
INSERT INTO serp VALUES('https://a-parser.com/img/[email protected]', 'logo')
INSERT INTO serp VALUES('https://a-parser.com/img/[email protected]', 'mobilebase logo')
INSERT INTO serp VALUES('https://a-parser.com/img/[email protected]', 'scraper logo')
Dumping keywords to JSON
General result format:
[% IF notFirst;
",\n";
ELSE;
notFirst = 1;
END;
obj = {};
obj.query = query;
obj.keywords = [];
FOREACH item IN p1.keywords;
obj.keywords.push({
key = item.key
});
END;
obj.json %]
Prepend text:
[
Append text:
]
Example of result:
[
{
"keywords": [
{
"key": "scraper logo"
},
{
"key": "logo"
},
{
"key": "brand logo"
},
{
"key": "free logo"
},
{
"key": "system"
}
],
"query": "https://a-parser.com/img/[email protected]"
}
]
Tip
To make the "Prepend text" and "Append text" options available in the Task Editor, activate "More options".
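Once the keyword dump has been saved, it can be read back with any JSON library. The sketch below is a minimal Python illustration that assumes the file has the structure shown in the example above (a list of objects with `query` and `keywords`); the function name `keywords_by_query` and the `example.com` query are hypothetical.

```python
import json


def keywords_by_query(json_text):
    """Map each query to the list of keyword strings collected for it."""
    result = {}
    for entry in json.loads(json_text):
        keys = [kw["key"] for kw in entry.get("keywords", [])]
        result.setdefault(entry["query"], []).extend(keys)
    return result


# Hypothetical sample in the same shape as the dump shown above.
sample = """[
  {
    "keywords": [{"key": "logo"}, {"key": "system"}],
    "query": "https://example.com/logo.png"
  }
]"""

for query, keys in keywords_by_query(sample).items():
    print(query, keys)
```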
Output to a CSV table
Result format:
[% FOREACH item IN serp;
tools.CSVline(query, item.link, item.width, item.height, item.domain, item.anchor, item.snippet);
END %]
Example of result:
https://a-parser.com/img/[email protected],"http://yandex.ru/clck/jsredir?from=yandex.ru%3Bimages%2Fsearch%3Bimages%3B%3B&text=&etext=9185.ETuLDRARHEaaFwvbgNKw9uM4q71GWnOQWYj5gryTT3A.dce07e8678375a61f9da58b9d746be75b7c4d624&uuid=&state=iric5OQ0sS2054x1_o8yG9mmGMT8WeQxqpuwa4Ft4KVzd9aE_Y4Dfw,,&data=eEwyM2lDYU9Gd1VROE1ZMXhZYkJTV1BmN1I3WUhYbU8yYzNOUTBxMk5pV0xtNWV1LU1RcVVLRzlVeDVPdkgwWGNEaUtRQ3g2VmdOTEJwNHlCeFVfeWtMUXJFUnc3UnNHLVNrcVpaRDVnSkdnUENXUGVtaTN2RTFCbE9BV2t1c3M,&sign=34fd31e6b6c4280c4b1db67ed6a734e1&keyno=IMGS_0&b64e=2&l10n=ru",800,150,Hotpng.com,"Logo Brand Trademark, Design PNG HotPNG","SEO. Art. Scraper."
https://a-parser.com/img/[email protected],"http://yandex.ru/clck/jsredir?from=yandex.ru%3Bimages%2Fsearch%3Bimages%3B%3B&text=&etext=9185.ETuLDRARHEaaFwvbgNKw9uM4q71GWnOQWYj5gryTT3A.dce07e8678375a61f9da58b9d746be75b7c4d624&uuid=&state=iric5OQ0sS2054x1_o8yG9mmGMT8WeQxqpuwa4Ft4KVzd9aE_Y4Dfw,,&data=eEwyM2lDYU9Gd1VROE1ZMXhZYkJTUTlnNkFnVWVCb2pvdVhLTGZ5bjVyTTFYVlRTWmx3NWM3Z3NvTmhQTjVHSjh3QkFodW5UQVJhNUZTRlkwNE8waUNMNXdfZzhDQ1JSWUtGVDA3MWVCbmNxSldZazRrdkM1QSws&sign=718910eff1f976158209921f37155f74&keyno=IMGS_0&b64e=2&l10n=ru",900,180,Freepng.ru,"logo, brand, trademark","free logo, brand, trademark transparent image"
https://a-parser.com/img/[email protected],"http://yandex.ru/clck/jsredir?from=yandex.ru%3Bimages%2Fsearch%3Bimages%3B%3B&text=&etext=9185.ETuLDRARHEaaFwvbgNKw9uM4q71GWnOQWYj5gryTT3A.dce07e8678375a61f9da58b9d746be75b7c4d624&uuid=&state=iric5OQ0sS2054x1_o8yG9mmGMT8WeQxqpuwa4Ft4KVzd9aE_Y4Dfw,,&data=eEwyM2lDYU9Gd1VROE1ZMXhZYkJTWDVrTGhVcE1wemlkSk5EM3laa2tHWV94OHNXcHk4RnRlc1FIVklQNWt0VGhiclNzek1jUjFLRkREbDgzZFZWY09USTgxcmhDaWRvQlFUS3QwQlBOY3FpcnlWTjhzdVljdyws&sign=bc640a009f27c908c8e933b2c21f23a1&keyno=IMGS_0&b64e=2&l10n=ru",812,168,A-parser.com,"Moldova Anti-DDos servers, shared hosting, virtual servers - AlexHost.md A-Parser - scraper for SEO professionals","Participant names (comma-separated)."
https://a-parser.com/img/[email protected],"http://yandex.ru/clck/jsredir?from=yandex.ru%3Bimages%2Fsearch%3Bimages%3B%3B&text=&etext=9185.ETuLDRARHEaaFwvbgNKw9uM4q71GWnOQWYj5gryTT3A.dce07e8678375a61f9da58b9
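Rows in this layout can be read back with a standard CSV parser, which handles the commas inside the quoted link field. The sketch below is a minimal Python illustration; the column names are taken from the order of arguments in the `tools.CSVline` call above, while the function name `read_rows` and the `example.com` sample row are hypothetical. Lines with an unexpected number of fields (for example, a row cut off mid-link) are skipped.

```python
import csv
import io

# Column order follows the tools.CSVline(...) call in the result format above.
COLUMNS = ["query", "link", "width", "height", "domain", "anchor", "snippet"]


def read_rows(csv_text):
    """Parse CSV text into dicts, skipping lines with the wrong field count."""
    rows = []
    for record in csv.reader(io.StringIO(csv_text)):
        if len(record) != len(COLUMNS):
            continue  # truncated or malformed line
        row = dict(zip(COLUMNS, record))
        row["width"] = int(row["width"])
        row["height"] = int(row["height"])
        rows.append(row)
    return rows


# Hypothetical row in the same shape as the sample output.
sample = (
    'https://example.com/logo.png,"http://example.com/page?a=1,,&b=2",'
    '800,150,Hotpng.com,"Logo Brand Trademark, Design PNG HotPNG","SEO. Art. Scraper."\n'
    'https://example.com/logo.png,"http://example.com/cut-off\n'
)

for row in read_rows(sample):
    print(row["domain"], row["width"], row["height"])
```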
Saving in SQL format
Result format:
[% FOREACH serp; "INSERT INTO serp VALUES('" _ query _ "', '"; link _ "', '"; anchor _ "', '"; snippet _ "')\n"; END %]
Example of result:
INSERT INTO serp VALUES('https://a-parser.com/img/[email protected]', 'http://yandex.ru/clck/jsredir?from=yandex.ru%3Bimages%2Fsearch%3Bimages%3B%3B&text=&etext=9185.uxDvfCNKxEc5m2Ng0E898hRKXfLpKX45_I37SUneIIw.835ff0ed4890d11f17ca31577ed7f5655791c30d&uuid=&state=iric5OQ0sS2054x1_o8yG9mmGMT8WeQxqpuwa4Ft4KVzd9aE_Y4Dfw,,&data=eEwyM2lDYU9Gd1VROE1ZMXhZYkJTV1BmN1I3WUhYbU8yYzNOUTBxMk5pV0xtNWV1LU1RcVVLRzlVeDVPdkgwWGNEaUtRQ3g2VmdOTEJwNHlCeFVfeXJFcUJ3VzYxM2U5U3p0aU9VeDBUWVF4ZmpfeXJWYTRPVzI4MGNIcVVVdXM,&sign=d97654624d5d234f495a10f2357e86af&keyno=IMGS_0&b64e=2&l10n=ru', 'Logo Brand Trademark, Design PNG HotPNG', 'SEO. Art. Scraper.')
INSERT INTO serp VALUES('https://a-parser.com/img/[email protected]', 'http://yandex.ru/clck/jsredir?from=yandex.ru%3Bimages%2Fsearch%3Bimages%3B%3B&text=&etext=9185.uxDvfCNKxEc5m2Ng0E898hRKXfLpKX45_I37SUneIIw.835ff0ed4890d11f17ca31577ed7f5655791c30d&uuid=&state=iric5OQ0sS2054x1_o8yG9mmGMT8WeQxqpuwa4Ft4KVzd9aE_Y4Dfw,,&data=eEwyM2lDYU9Gd1VROE1ZMXhZYkJTUTlnNkFnVWVCb2pvdVhLTGZ5bjVyTTFYVlRTWmx3NWM3Z3NvTmhQTjVHSjh3QkFodW5UQVJhMzktQThKb3poMGhneTNjUW85bWd3T0xOWG1sc2NfVTBDR0dqSGpsM1hvZyws&sign=017aec6f768d2737acb2e14d46ef1d29&keyno=IMGS_0&b64e=2&l10n=ru', 'logo, brand, trademark', 'free logo, brand, trademark transparent image')
INSERT INTO serp VALUES('https://a-parser.com/img/[email protected]', 'http://yandex.ru/clck/jsredir?from=yandex.ru%3Bimages%2Fsearch%3Bimages%3B%3B&text=&etext=9185.uxDvfCNKxEc5m2Ng0E898hRKXfLpKX45_I37SUneIIw.835ff0ed4890d11f17ca31577ed7f5655791c30d&uuid=&state=iric5OQ0sS2054x1_o8yG9mmGMT8WeQxqpuwa4Ft4KVzd9aE_Y4Dfw,,&data=eEwyM2lDYU9Gd1VROE1ZMXhZYkJTWDVrTGhVcE1wemlkSk5EM3laa2tHWV94OHNXcHk4RnRlc1FIVklQNWt0VGhiclNzek1jUjFJQkh3QU1mQ3RYMzRLemtzWWFOUkNHVWMtQjBuNG9MNE1EUXY2WTRHdlF6USws&sign=36d07408817d9f6cb632a07a1b8fdf27&keyno=IMGS_0&b64e=2&l10n=ru', 'Moldova Anti-DDos servers, shared hosting, virtual servers - AlexHost.md A-Parser - scraper for SEO professionals', 'Participant names (comma-separated).')
INSERT INTO serp VALUES('https://a-parser.com/img/[email protected]', 'http://yandex.ru/clck/jsredir?from=yandex.ru%3Bimages%2Fsearch%3Bimages%3B%3B&text=&etext=9185.uxDvfCNKxEc5m2Ng0E898hRKXfLpKX45_I37SUneIIw.835ff0ed4890d11f17ca31577ed7f5655791c30d&uuid=&state=iric5OQ0sS2054x1_o8yG9mmGMT8WeQxqpuwa4Ft4KVzd9aE_Y4Dfw,,&data=eEwyM2lDYU9Gd1VROE1ZMXhZYkJTWUpKSVpuZ1NOanZJbFJRTUVtX3VvWGpMWklYSjUzU0k0a0lzX05oWHctQ1VtbmtiSFZja3NreVlRZUJWQ19iZjZfRU1SbzRFc0JDOWxwOXB1b0hjdGRVYjdJellvZFNJYUdhRVluMEwwN0Z4VkZpN3Zpa09GMzNnNjl3cE1vVkktNFpId1FTUUhDVmdNUzVFMFdrNW5ybGZnN2MwbHBsbEZPRDZTemhZMkszS1FpYk1qSFEtYzdvSDFKeVhxYkl0UFREVl9JdFl4aG5VM25XN2VIMU1TZyw,&sign=d9e51f729589a46e246c862e189bfd9c&keyno=IMGS_0&b64e=2&l10n=ru', 'For sale - A-Parser 1.1 - advanced search engine scraper, Suggest, PR, DMOZ, Whois, etc Page 6 Openssource forum - Paid', 'Improvements.')
INSERT INTO serp VALUES('https://a-parser.com/img/[email protected]', 'http://yandex.ru/clck/jsredir?from=yandex.ru%3Bimages%2Fsearch%3Bimages%3B%3B&text=&etext=9185.uxDvfCNKxEc5m2Ng0E898hRKXfLpKX45_I37SUneIIw.835ff0ed4890d11f17ca31577ed7f5655791c30d&uuid=&state=iric5OQ0sS2054x1_o8yG9mmGMT8WeQxqpuwa4Ft4KVzd9aE_Y4Dfw,,&data=eEwyM2lDYU9Gd1VROE1ZMXhZYkJTWDVrTGhVcE1wemlkSk5EM3laa2tHWV94OHNXcHk4RnRWWXVjbVdIS0pBRXVKT0Vqam9ZYzhJb0JqWE1NVXJ2bzJZNmdZRDVKUmh3RGtxa1B6T0VJaFdoODZzaVlNaFJzZyws&sign=2eca863b00a2bab3476f52a9606630fb&keyno=IMGS_0&b64e=2&l10n=ru', '1.2.31 - x64 version for Windows, JS engine update, improved handling of saved tasks A-Parser - scraper for SEO professionals', 'Improvements')
Dumping results to JSON
General result format:
[% IF notFirst;
",\n";
ELSE;
notFirst = 1;
END;
obj = {};
obj.query = query;
obj.images = [];
FOREACH item IN p1.serp;
obj.images.push({
width = item.width
height = item.height
link = item.link
domain = item.domain
anchor = item.anchor
snippet = item.snippet
});
END;
obj.json %]
Prepend text:
[
Append text:
]
Example of result:
[
{
"images": [
{
"link": "http://yandex.ru/clck/jsredir?from=yandex.ru%3Bimages%2Fsearch%3Bimages%3B%3B&text=&etext=9185.fPvS0vLPfHWHZDPPGXubS8FigbFnHfCJbYCF6xqFopg.f1cf96ba17ad739c8628c9c0f74bb0f4d7deeaa0&uuid=&state=iric5OQ0sS2054x1_o8yG9mmGMT8WeQxqpuwa4Ft4KVzd9aE_Y4Dfw,,&data=eEwyM2lDYU9Gd1VROE1ZMXhZYkJTV1BmN1I3WUhYbU8yYzNOUTBxMk5pV0xtNWV1LU1RcVVLRzlVeDVPdkgwWGNEaUtRQ3g2VmdOTEJwNHlCeFVfeXVFRkowMXBsZ3BVcnpZZmVHTEYxUGRvOFV2QUpvczV2cTRuc2xORGhMZDQ,&sign=bba7f70e675fb2aad9c8551b3cd8b6e9&keyno=IMGS_0&b64e=2&l10n=ru",
"width": "800",
"snippet": "SEO. Art. Scraper.",
"anchor": "Logo Brand Trademark, Design PNG HotPNG",
"page": "Hotpng.com",
"height": "150"
},
{
"link": "http://yandex.ru/clck/jsredir?from=yandex.ru%3Bimages%2Fsearch%3Bimages%3B%3B&text=&etext=9185.fPvS0vLPfHWHZDPPGXubS8FigbFnHfCJbYCF6xqFopg.f1cf96ba17ad739c8628c9c0f74bb0f4d7deeaa0&uuid=&state=iric5OQ0sS2054x1_o8yG9mmGMT8WeQxqpuwa4Ft4KVzd9aE_Y4Dfw,,&data=eEwyM2lDYU9Gd1VROE1ZMXhZYkJTUTlnNkFnVWVCb2pvdVhLTGZ5bjVyTTFYVlRTWmx3NWM3Z3NvTmhQTjVHSjh3QkFodW5UQVJaTTRERF92dEZhZFBza21oYnlLc0pZSDhQeGdFaUNFdU16SFJNLWNaclFXQSws&sign=a47c000c53fc80767795a2b0819ea6f7&keyno=IMGS_0&b64e=2&l10n=ru",
"width": "900",
"snippet": "free logo, brand, trademark transparent image",
"anchor": "logo, brand, trademark",
"page": "Freepng.ru",
"height": "180"
},
{
"link": "http://yandex.ru/clck/jsredir?from=yandex.ru%3Bimages%2Fsearch%3Bimages%3B%3B&text=&etext=9185.fPvS0vLPfHWHZDPPGXubS8FigbFnHfCJbYCF6xqFopg.f1cf96ba17ad739c8628c9c0f74bb0f4d7deeaa0&uuid=&state=iric5OQ0sS2054x1_o8yG9mmGMT8WeQxqpuwa4Ft4KVzd9aE_Y4Dfw,,&data=eEwyM2lDYU9Gd1VROE1ZMXhZYkJTWDVrTGhVcE1wemlkSk5EM3laa2tHWV94OHNXcHk4RnRlc1FIVklQNWt0VGhiclNzek1jUjFJcU5MZFJfR3NyX0FoZVNOdnZPVm5TdzBlUnVQb3pIWjFWZng0Q2ZpcXFFUSws&sign=5988df2675527240c78df4632a0bf184&keyno=IMGS_0&b64e=2&l10n=ru",
"width": "812",
"snippet": "Participant names (comma-separated).",
"anchor": "Moldova Anti-DDos servers, shared hosting, virtual servers - AlexHost.md A-Parser - scraper for SEO professionals",
"page": "A-parser.com",
"height": "168"
}
],
"query": "https://a-parser.com/img/[email protected]"
}
]
Tip
To make the "Prepend text" and "Append text" options available in the Task Editor, activate "More options".
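The SQL result formats shown earlier paste each value between single quotes, so an anchor or snippet that itself contains an apostrophe would yield an invalid statement. One possible workaround, sketched here in Python under the assumption that the JSON dump has the structure shown above, is to load the dump into a database with parameterized queries. The table layout mirrors the `serp` statements from the SQL example; the function name `load_images`, the use of SQLite and the sample values are illustrative assumptions.

```python
import json
import sqlite3


def load_images(conn, json_text):
    """Insert every image record from the JSON dump; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS serp "
        "(query TEXT, link TEXT, anchor TEXT, snippet TEXT)"
    )
    rows = [
        (entry["query"], img.get("link"), img.get("anchor"), img.get("snippet"))
        for entry in json.loads(json_text)
        for img in entry.get("images", [])
    ]
    # Placeholders let the driver handle quoting, apostrophes included.
    conn.executemany("INSERT INTO serp VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Hypothetical dump with apostrophes in the text fields.
sample = json.dumps([{
    "query": "https://example.com/logo.png",
    "images": [
        {"link": "http://example.com/a", "anchor": "Bob's logo", "snippet": "it's free"},
    ],
}])

conn = sqlite3.connect(":memory:")
print(load_images(conn, sample), "row(s) inserted")
```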
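Available settings
| Parameter | Default value | Description |
|---|---|---|
| AntiGate preset | default | Selects the Util::AntiGate preset; see the Util::AntiGate documentation for details on its settings |
| AntiGate preset for old captcha | default | Same as AntiGate preset, but used only for regular (old-style, single-image) captchas. If no preset is selected here, the preset chosen in AntiGate preset is used for such captchas. |
| Experimental img captcha max count | 5 | Maximum number of repeated captcha images per attempt |
| Preffered captcha type | Click | Preferred captcha type: Click or Puzzle |
| Yandex domain | yandex.ru | Yandex domain to scrape; all domains are supported |
| Filter pages | Moderate filter | Filters objectionable content out of the results |
| Don't scrape if no other sizes | ☐ | Disables result collection when no other sizes of the target image are found |
| Use sessions | ☑ | Saves good sessions, which speeds up scraping and reduces the number of errors |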
