SE::Google::ByImage - search for images by link

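Scraper overview
A scraper for Google search by image. With the SE::Google::ByImage scraper you can build a database of image links or collect images that are ready for further use. Queries can be used in the same form in which you would type them into the Google search bar.
A-Parser lets you save the scraping settings of the Google scraper for later use (presets), schedule scraping runs, and more. To obtain as many results as possible, you can use automatic query expansion, substitution of subqueries from files, and iteration over alphanumeric combinations and lists.
Results can be saved in whatever form and structure you need thanks to the built-in Template Toolkit engine, which lets you apply additional logic to the results and output data in various formats, including JSON, SQL and CSV.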
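Scraper usage scenarios
Downloading images by link
A-Parser supports task chains: once the first task finishes, the second one starts, and the links collected by the first task are used as queries for the second.
Download example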
eJyNVk1T2zAQ/SuMJofQgm0604svNKGlpUMJhXAKdEaNN65AlowkAxmT/95d2bHj
YGhv3tVqP98+uWSO2zt7bsCCsyyelSz33yxmX7VOJeyMlyLjKex81o9Kap6wPZZz
Y8GQ+YxdfonjyjKOx8sTMkWLBBa8kI7tlcwtc0Bv+gGMEQkdigTl3OinpQFnBFjU
PXBZkNlBFLHVzc0ewyzQgT3WJuOUzSA/COrUmsNL/gBTjYcLIaFVH6N0xjNyN0i4
AzoNFt7RcDdwT+SBJ4lwQisuqwhUTBv1Sol7n47F/FSK9ihSpsdGZ6h24J2QcrnO
cMYGXmbopvD3f1Z3WLzg0sIes5juMcdkku0T4cBwp80kp5xQXzKtRlKewgPI1sz7
HxdCJtj90QIvndQX+00mL3ysmhI3Q+FoHg3m0Hjx0njyo72V6FOdYuXJb6xbikw4
lO2RLhQNJ0LlHUDe9O1MoybTBpowzhTQBEe85aAIBu3URnmr6lTRmUxXOddqIdJJ
Da21ZaGmCOqJOtJZLoHKYh6XtsEwGAxSWLhoATOy9VBIaNLddnXkA1If1gBnTmtp
v19WiedGIB4/UroZtnUzh7q1cy7l1cVpJ7sWXyj8cS63cRjy/WrNgrnOQpGlodSp
3uf5pw9PQe4xOUdspxoRh2WvcNN6CleFlH174VfahhViA1Vk+4Oy/kaDVUDbVmF8
c9fPwMXxt+n0vLvh3Bi+rB37eabwNK22Xqxl/MZhOVBunwghDt4d+hzCIdbyfJvD
YfqcisUuNbS6Oi+s01k1oRYltGeUWIOIOtBC8taKugjcw4T6suYWpRWxhPaQpOUG
Zdt4yBLO46L6qP0q7eC+4HLTN8X3HNW7B30Q72WP/wHXmxiN2rVUegsPrwLtFep8
hQ+3F7FkVhdmTr4qsiP800wINtVQ+ocfDme/wpv3u9fXwfAw7s580A6hMl+9wv9I
5ryfMJpHq3fRu2wc9XFVs7jtIrJexo62Nr6XlzrMF718Vvy4XlJE9O+XIHr7Fdg+
7rwA0eqNJ6ufSt96K6LNd4J8+4bjQA5ofjVvNH8WZe+fQlzivZZWUET51p5Xlwnm
trLBsNZv7cHqL+EHA0s=
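The example above is distributed as an encoded text block. Its `eJ` prefix is what zlib-compressed data looks like after Base64 encoding, so a string of this kind can be inspected outside A-Parser roughly as sketched below. That the payload is zlib-compressed JSON is an assumption, not something this page states, and `decode_export` and `encode_export` are hypothetical helper names.

```python
import base64
import json
import zlib


def decode_export(blob: str) -> object:
    """Decode a Base64 + zlib text block into a Python object.

    Assumption: the payload is zlib-compressed JSON.
    """
    # Line breaks inside the block are only wrapping, so drop all whitespace.
    compact = "".join(blob.split())
    raw = zlib.decompress(base64.b64decode(compact))
    return json.loads(raw.decode("utf-8"))


def encode_export(obj: object) -> str:
    """Inverse of decode_export, used here only to build a demo input."""
    raw = json.dumps(obj).encode("utf-8")
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


if __name__ == "__main__":
    demo = encode_export({"preset": "demo", "parsers": ["SE::Google::ByImage"]})
    print(demo[:2])             # the Base64 form of a default zlib header
    print(decode_export(demo))  # round trip back to the original object
```

If decoding fails with a zlib or JSON error, the assumption about the payload format does not hold for that block.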
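Collected data
- Image links
- Page links
- Snippets
- Anchors
- Image width and height
- Number of search results
- Width and height of the image referenced by the link in the query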
Use cases
- Collecting images for personal use
- Building an image database
- Collecting image descriptions
- Collecting image links
Queries
Each query must be a link to the image that Google should search by, for example:
https://a-parser.com/img/[email protected]
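Result output options
A-Parser supports flexible result formatting through the built-in Template Toolkit engine, so results can be output in any form, including structured formats such as CSV or JSON.
Default output
Result format: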
$serp.format('$link\n')
Example result:
https://en.a-parser.com/img/[email protected]
https://en.a-parser.com/img/[email protected]
https://en.a-parser.com/img/[email protected]
https://en.a-parser.com/img/[email protected]
https://proxylist4you.com/wp-content/uploads/2018/09/[email protected]
https://proxylist4you.com/wp-content/uploads/2018/09/[email protected]
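Because the default format writes one link per line, the output file is easy to post-process. The sketch below, in Python, counts how many links belong to each host; `count_hosts` is a hypothetical helper and the sample links are illustrative only.

```python
from collections import Counter
from urllib.parse import urlsplit


def count_hosts(lines):
    """Count how many result links belong to each host."""
    hosts = Counter()
    for line in lines:
        link = line.strip()
        if not link:
            continue  # skip blank lines between results
        hosts[urlsplit(link).netloc] += 1
    return hosts


if __name__ == "__main__":
    sample = [
        "https://en.a-parser.com/\n",
        "https://en.a-parser.com/online/\n",
        "https://proxylist4you.com/\n",
    ]
    for host, total in count_hosts(sample).most_common():
        print(host, total)
```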
Output to a CSV table
Result format:
[% FOREACH item IN serp;
tools.CSVline(query, item.link, item.width, item.height, item.anchor, item.snippet);
END %]
Example result:
https://a-parser.com/img/[email protected],https://en.a-parser.com/,812,168,,"A-Parser - scraper for SEO professionals","A-Parser - scraper of search engines, WordStat, Whois, PR, YouTube, Alexa, Ahrefs, MajesticSEO, etc."
https://a-parser.com/img/[email protected],https://en.a-parser.com/online/,812,168,,"Current Visitors | A-Parser - scraper for SEO professionals","This is a list of all visitors currently browsing A-Parser - scraper for SEO professionals."
https://a-parser.com/img/[email protected],https://en.a-parser.com/wiki/unique/,812,168,,"Usage of the unique feature | A-Parser - scraper for SEO ...","Unique, deduplication, removing duplicates - all this implies that we don't need the repeating results. In A-Parser is 2 methods of unique, we ..."
https://a-parser.com/img/[email protected],https://en.a-parser.com/pages/support/knowledge-base,812,168,,"Knowledge Base | A-Parser - scraper for SEO professionals","A-Parser has been built with a vast understanding of extracting and processing large volumes of information. We strive to produce only market leading software ..."
https://a-parser.com/img/[email protected],https://proxylist4you.com/,812,168,,"Private Residental Rotating Proxies – Buy Cheapest Private ...
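A file in this layout can be read back with a standard CSV parser. The following Python sketch assumes that the columns follow the argument order of the `tools.CSVline` call in the template (query, link, width, height, anchor, snippet) and that fields are quoted in the usual CSV way; `read_results` is a hypothetical helper and the sample row uses a placeholder query URL. Adjust the column list if your template passes different arguments.

```python
import csv
import io

# Assumed column order: the arguments passed to tools.CSVline in the template.
COLUMNS = ["query", "link", "width", "height", "anchor", "snippet"]


def read_results(text, columns=COLUMNS):
    """Parse CSV output into a list of dicts, one per result row."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        if len(fields) != len(columns):
            continue  # skip truncated rows or rows from a different layout
        row = dict(zip(columns, fields))
        row["width"] = int(row["width"])
        row["height"] = int(row["height"])
        rows.append(row)
    return rows


if __name__ == "__main__":
    sample = (
        'https://example.com/logo.png,https://en.a-parser.com/,812,168,'
        '"A-Parser - scraper for SEO professionals",'
        '"A-Parser - scraper of search engines, WordStat, Whois, etc."\n'
    )
    for row in read_results(sample):
        print(row["link"], row["width"], row["height"])
```

Quoted fields matter here: the snippet contains commas, and a plain `split(",")` would break it into several columns.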
Saving in SQL format
Result format:
[% FOREACH serp; "INSERT INTO serp VALUES('" _ query _ "', '"; link _ "', '"; anchor _ "', '"; snippet _ "')\n"; END %]
Example result:
INSERT INTO serp VALUES('https://a-parser.com/img/[email protected]', 'https://en.a-parser.com/', 'A-Parser - scraper for SEO professionals', 'A-Parser - scraper of search engines, WordStat, Whois, PR, YouTube, Alexa, Ahrefs, MajesticSEO, etc.')
INSERT INTO serp VALUES('https://a-parser.com/img/[email protected]', 'https://en.a-parser.com/online/', 'Current Visitors | A-Parser - scraper for SEO professionals', 'This is a list of all visitors currently browsing A-Parser - scraper for SEO professionals.')
INSERT INTO serp VALUES('https://a-parser.com/img/[email protected]', 'https://en.a-parser.com/wiki/unique/', 'Usage of the unique feature | A-Parser - scraper for SEO ...', 'Unique, deduplication, removing duplicates - all this implies that we don't need the repeating results. In A-Parser is 2 methods of unique, we ...')
INSERT INTO serp VALUES('https://a-parser.com/img/[email protected]', 'https://en.a-parser.com/wiki/settings-and-presets/', 'Settings and presets | A-Parser - scraper for SEO professionals', 'Configs presets - settings of threads and methods of unique of tasks; Parsers presets - opportunity to set up each separate parcer; Proxy checker ...')
INSERT INTO serp VALUES('https://a-parser.com/img/[email protected]', 'https://proxylist4you.com/', 'Private Residental Rotating Proxies – Buy Cheapest Private ...', 'For you business is ready more than 11,000,000 unique monthly HTPP\HTTPS\Socks5\Socks4 Private Proxies from 170 countries all over the world with real ...')
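The template above builds each statement by string concatenation, so a quote character inside an anchor or snippet (such as the apostrophe in "don't") ends the SQL string literal early. If the rows are loaded by your own script rather than as a raw SQL file, bound parameters avoid the problem. A minimal Python sketch using SQLite follows; the column names of the `serp` table are assumptions, and `store_results` is a hypothetical helper.

```python
import sqlite3


def store_results(conn, rows):
    """Insert (query, link, anchor, snippet) tuples using bound parameters."""
    # Assumed schema: the page only shows a four-value INSERT into `serp`.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS serp "
        "(query TEXT, link TEXT, anchor TEXT, snippet TEXT)"
    )
    # Placeholders let the driver handle quoting, including apostrophes.
    conn.executemany("INSERT INTO serp VALUES (?, ?, ?, ?)", rows)
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    store_results(conn, [(
        "https://example.com/logo.png",  # placeholder query URL
        "https://en.a-parser.com/wiki/unique/",
        "Usage of the unique feature | A-Parser - scraper for SEO ...",
        "all this implies that we don't need the repeating results.",
    )])
    print(conn.execute("SELECT snippet FROM serp").fetchone()[0])
```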
Dumping results to JSON
General result format:
[% IF notFirst;
    ",\n";
ELSE;
    notFirst = 1;
END;
obj = {};
obj.query = query;
obj.images = [];
FOREACH item IN p1.serp;
    obj.images.push({
        width = item.width
        height = item.height
        link = item.link
        anchor = item.anchor
        snippet = item.snippet
    });
END;
obj.json %]
Prepend text:
[
Append text:
]
Example result:
[{
"images": [
{
"link": "https://en.a-parser.com/",
"width": "812",
"snippet": "A-Parser - scraper of search engines, WordStat, Whois, PR, YouTube, Alexa, Ahrefs, MajesticSEO, etc.",
"anchor": "A-Parser - scraper for SEO professionals",
"height": "168"
},
{
"link": "https://en.a-parser.com/online/",
"width": "812",
"snippet": "This is a list of all visitors currently browsing A-Parser - scraper for SEO professionals.",
"anchor": "Current Visitors | A-Parser - scraper for SEO professionals",
"height": "168"
},
{
"link": "https://en.a-parser.com/wiki/unique/",
"width": "812",
"snippet": "Unique, deduplication, removing duplicates - all this implies that we don't need the repeating results. In A-Parser is 2 methods of unique, we ...",
"anchor": "Usage of the unique feature | A-Parser - scraper for SEO ...",
"height": "168"
},
{
"link": "https://en.a-parser.com/pages/support/knowledge-base",
"width": "812",
"snippet": "A-Parser has been built with a vast understanding of extracting and processing large volumes of information. We strive to produce only market leading software ...",
"anchor": "Knowledge Base | A-Parser - scraper for SEO professionals",
"height": "168"
},
{
"link": "https://proxylist4you.com/",
"width": "812",
"snippet": "For you business is ready more than 11,000,000 unique monthly HTPP\\HTTPS\\Socks5\\Socks4 Private Proxies from 170 countries all over the world with real ...",
"anchor": "Private Residental Rotating Proxies – Buy Cheapest Private ...",
"height": "168"
},
{
"link": "https://proxylist4you.com/index.php/buyprivateproxies/",
"width": "812",
"snippet": "Worldwide Mixed Residential Reverse Backconnect Rotating Private Proxies. This proxies support HTTP, HTTPS, Socks4, Socks5 protocols. · Worldwide ...",
"anchor": "All of our Proxy Packages – Private Residental Rotating Proxies",
"height": "168"
}
],
"query": "https://a-parser.com/img/[email protected]"
}]
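The resulting file is a single JSON array with one object per query, so it can be loaded in one call. The Python sketch below flattens it into tuples; note that width and height arrive as strings in the example above and are converted to integers here. `summarize` is a hypothetical helper and the sample input uses a placeholder query URL.

```python
import json


def summarize(dump):
    """Flatten the JSON dump into (query, link, width, height) tuples."""
    summary = []
    for entry in json.loads(dump):
        for image in entry["images"]:
            summary.append((
                entry["query"],
                image["link"],
                int(image["width"]),   # stored as a string in the dump
                int(image["height"]),
            ))
    return summary


if __name__ == "__main__":
    sample = """[{
        "images": [
            {"link": "https://en.a-parser.com/", "width": "812",
             "snippet": "A-Parser - scraper of search engines",
             "anchor": "A-Parser - scraper for SEO professionals",
             "height": "168"}
        ],
        "query": "https://example.com/logo.png"
    }]"""
    for item in summarize(sample):
        print(item)
```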
Tip
To make the "Prepend text" and "Append text" options available in the Task Editor, enable "More options".
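Possible settings
| Parameter | Default value | Description |
|---|---|---|
| Pages count | 5 | Number of pages to scrape |
| Google domain | www.google.com | Google domain to scrape; all domains are supported |
| Util::ReCaptcha2 preset | default | Preset of the Util::ReCaptcha2 scraper. Configure Util::ReCaptcha2 beforehand (specify your access key and other parameters), then select the created preset here |
| Interface language | English | Google interface language; choose it so that the scraper's results match those seen in a browser as closely as possible |
| Results language | Auto (Based on IP) | Language of the results (parameter lr=) |
| Search from country | Auto (Based on IP) | Country to search from (geo-dependent search, parameter gl=) |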
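To compare scraper results with what a browser shows, it can help to see how the language and country settings look as Google URL parameters. The Python sketch below builds such a query string. The `lr` and `gl` names come from the table above; `hl` is Google's parameter for the interface language, which the table does not name; treating "Auto (Based on IP)" as simply omitting the parameter is an assumption. `localization_params` is a hypothetical helper, and how the scraper itself assembles its requests is not described on this page.

```python
from urllib.parse import urlencode


def localization_params(interface_lang=None, results_lang=None, country=None):
    """Build the language and country part of a Google query string."""
    params = {}
    if interface_lang:
        params["hl"] = interface_lang          # interface language
    if results_lang:
        params["lr"] = "lang_" + results_lang  # results language, e.g. lang_en
    if country:
        params["gl"] = country                 # country to search from
    return urlencode(params)


if __name__ == "__main__":
    print(localization_params("en", "de", "us"))
    print(localization_params("en"))  # both "Auto" settings left out
```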
