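SE::Bing::Video - Bing video scraper

Scraper overview
A scraper for Bing video search. With SE::Bing::Video you can build a database of video links. Queries can be written in the same format you would type into the Bing search box.
A-Parser lets you save SE::Bing::Video scraping settings for later use (presets), schedule scraping runs, and more. You can use automatic query expansion, substitute subqueries from files, and iterate over alphanumeric combinations and lists to obtain as many results as possible.
Thanks to the built-in Template Toolkit engine, you can save results in whatever format and structure you need. The engine lets you apply additional logic to results and output data in various formats, including JSON, SQL and CSV.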
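Collected data
- Video link
- Title
- Name of the platform hosting the video
- Duration, view count and publication date
- Link to the video preview image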

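Capabilities
- Choosing the number of pages to scrape
- Choosing the region
Use cases
- Collecting videos to populate your own blogs, video sites, doorways, and so on
- Collecting text data
Queries
Specify search phrases as queries, for example: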
Cats
Football
Waterfall
Speak in english
cars
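Query substitution macros
You can use the built-in macros to expand queries. For example, to collect a very large database of forums, specify several base queries in different languages: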
forum
论坛
foro
论坛
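In the query format, specify character iteration from a to zzzz. This approach rotates the search results as much as possible and yields a large number of new unique results: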
$query {az:a:zzzz}
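This macro creates 475254 additional queries for each original search query, which gives 4 x 475254 = 1901016 search queries in total. The number is large, but A-Parser handles it without difficulty: at 2000 queries per minute, such a task completes in about 16 hours.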
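The figures above can be checked with a short Python sketch. It assumes the macro enumerates every lowercase Latin string of 1 to 4 letters, which is consistent with the count of 475254 quoted in the text. The function names are illustrative and are not part of A-Parser.

```python
from itertools import product
from string import ascii_lowercase


def az_range_count(max_len: int) -> int:
    """Count lowercase strings of length 1..max_len (a .. zzzz for max_len=4)."""
    return sum(26 ** n for n in range(1, max_len + 1))


def az_suffixes(max_len: int):
    """Yield a, b, ..., z, aa, ab, ... up to max_len letters (assumed macro order)."""
    for n in range(1, max_len + 1):
        for letters in product(ascii_lowercase, repeat=n):
            yield "".join(letters)


base_queries = 4           # four base queries, as in the example above
queries_per_minute = 2000  # throughput figure quoted in the text

per_query = az_range_count(4)
total = base_queries * per_query
hours = total / queries_per_minute / 60

print(per_query, total, round(hours, 1))
```

The generator `az_suffixes` is only there to make the enumeration concrete. For planning a task, the closed-form count is enough.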
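Result output examples
A-Parser supports flexible result formatting through its built-in Template Toolkit engine, so results can be output in an arbitrary form as well as in a structured form such as CSV or JSON.
Default output
Result format: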
$serp.format('$link\n')
Example result:
https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=79AF507BCEEA455ACC1679AF507BCEEA455ACC16&&FORM=VRDGAR
https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=86FB4CDD27E041A3F95586FB4CDD27E041A3F955&&FORM=VRDGAR
https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=3AD36B1FAFC111F9C6F03AD36B1FAFC111F9C6F0&&FORM=VRDGAR
https://www.msn.com/en-gb/sport/golf/benefits-of-winning-the-masters-golf/vi-AA1lNwOI
https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=D8EB9E5532894EACFB73D8EB9E5532894EACFB73&&FORM=VRDGAR
https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=9CB33DC7E23801445F3F9CB33DC7E23801445F3F&&FORM=VRDGAR
https://talksport.com/football/1685319/troy-deeney-forest-green-rovers-manager/
https://ca.sports.yahoo.com/news/best-30-mens-cricketers-britain-140144281.html
https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=B9593E6DF96A59F4D941B9593E6DF96A59F4D941&&FORM=VRDGAR
https://www.msn.com/en-gb/sport/golf/6-golf-tips-golf-monthly/vi-AA1lNrLU
https://sports.yahoo.com/best-30-mens-cricketers-britain-140144281.html
https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=086DF2129F5807EC02F1086DF2129F5807EC02F1&&FORM=VRDGAR
https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=14632A97F627B502518514632A97F627B5025185&&FORM=VRDGAR
Output to a CSV table
Result format:
[% FOREACH item IN serp;
tools.CSVline(query, item.link, item.name, item.preview, item.duration);
END %]
Example result:
sport,https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=79AF507BCEEA455ACC1679AF507BCEEA455ACC16&&FORM=VRDGAR,"England's Mary Earps wins 2023 Sports Personality of th",w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,3:35
sport,https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=86FB4CDD27E041A3F95586FB4CDD27E041A3F955&&FORM=VRDGAR,"1972 Chevy Super Sport Nova",w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,0:51
sport,https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=3AD36B1FAFC111F9C6F03AD36B1FAFC111F9C6F0&&FORM=VRDGAR,"1968 Super Sport Chevelle",w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,0:51
sport,https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=FBBB3E08963152230A54FBBB3E08963152230A54&&FORM=VRDGAR,"We had to have some hard conversations - Marsters",https://tse2.mm.bing.net/th?id=OVF.O3Nq%2bBQ%2bjnbhZnbfYxDA7w&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,7:51
sport,https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=086DF2129F5807EC02F1086DF2129F5807EC02F1&&FORM=VRDGAR,"Ja Morant Hits Buzzer-Beater, Seals Victory Post-Suspension",https://tse2.mm.bing.net/th?id=OVF.ON%2fSFfXw5e9WwzZEMbbEeQ&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,1:09
sport,https://www.bbc.co.uk/sport/football/67723705,"Ollie Watkins: Aston Villa striker explains controversia",https://tse3.mm.bing.net/th?id=OVF.Hc9LkZQ9XhYo%2bFbAtxpLGg&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,
sport,https://www.bbc.com/sport/articles/c2vyevn0g7zo,"Anthony Ogogo: 'Why I used to hide being a Norwich City fan'",https://tse3.mm.bing.net/th?id=OVF.kvcGexXDQxqqCSiNRXEkRg&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,1:15
sport,https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=9FDCCE66150310EB99CB9FDCCE66150310EB99CB&&FORM=VRDGAR,"Aaron Rodgers Eyes Future Beyond 40 Despite Achilles ",https://tse4.mm.bing.net/th?id=OVF.fMSU0FvKihMc8q2TjXg%2fkw&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,1:13
sport,https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=361720861BF1297ADE98361720861BF1297ADE98&&FORM=VRDGAR,"Dillon Brooks, Ime Udoka Penalized For Outbursts At R",https://tse1.mm.bing.net/th?id=OVF.3TNSq7yVIFY84%2fQsm5KyJQ&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,1:12
sport,https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=B9593E6DF96A59F4D941B9593E6DF96A59F4D941&&FORM=VRDGAR,"Manchester United, Arsenal and the battle for Mary Earps",https://tse3.mm.bing.net/th?id=OVF.bK8xXZhzmQ0PD8CbFvDaGg&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,1:18
sport,https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=14632A97F627B502518514632A97F627B5025185&&FORM=VRDGAR,"Miller desperate for debut",https://tse2.mm.bing.net/th?id=OVF.a8MhMzLvFmPQ5fqRbc3l0g&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,3:38
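When consuming this CSV downstream, note that titles may contain commas inside quotes and the duration column may be empty. The following Python sketch reads two of the rows above with the standard `csv` module. The column names in `FIELDS` mirror the order of arguments in the result format, and the helper `duration_seconds` is a hypothetical convenience function, not something A-Parser provides.

```python
import csv
import io

# Column order taken from the CSVline call in the result format above.
FIELDS = ["query", "link", "name", "preview", "duration"]

sample = (
    'sport,https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=086DF2129F5807EC02F1086DF2129F5807EC02F1&&FORM=VRDGAR,'
    '"Ja Morant Hits Buzzer-Beater, Seals Victory Post-Suspension",'
    'https://tse2.mm.bing.net/th?id=OVF.ON%2fSFfXw5e9WwzZEMbbEeQ&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,1:09\n'
    'sport,https://www.bbc.co.uk/sport/football/67723705,'
    '"Ollie Watkins: Aston Villa striker explains controversia",'
    'https://tse3.mm.bing.net/th?id=OVF.Hc9LkZQ9XhYo%2bFbAtxpLGg&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1,\n'
)


def duration_seconds(text):
    """Convert 'm:ss' or 'h:mm:ss' to seconds; return None for an empty value."""
    if not text:
        return None
    seconds = 0
    for part in text.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


# csv.reader handles the quoted comma in the first title.
rows = [dict(zip(FIELDS, record)) for record in csv.reader(io.StringIO(sample))]

for row in rows:
    print(row["name"], duration_seconds(row["duration"]))
```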
Saving in SQL format
Result format:
[% FOREACH serp; "INSERT INTO serp VALUES('" _ query _ "', '"; directLink _ "', '"; name.replace("\n", '\n') _ "', '"; author _ "')\n"; END %]
Example result:
INSERT INTO serp VALUES('sport', 'https://www.youtube.com/watch?v=d5sxT8CACHM', 'England's Mary Earps wins 2023 Sports Personality of th', 'BBC Sport')
INSERT INTO serp VALUES('sport', 'https://sports.yahoo.com/best-30-mens-cricketers-britain-140144281.html', 'Best 30 men's cricketers in Britain right now', 'Tim Wigmore')
INSERT INTO serp VALUES('sport', 'https://www.msn.com/en-us/sports/more-sports/when-usain-bolt-and-andre-de-grasse-smile-the-whole-world-smiles-with-them-olympic-memories/vi-AA1lMZ2W', 'When Usain Bolt and Andre de Grasse smile, the whole worl', 'The Independent News')
INSERT INTO serp VALUES('sport', 'https://www.msn.com/en-us/sports/more-sports/1968-super-sport-chevelle/vi-AA1lMLLn', '1968 Super Sport Chevelle', 'FOX 13 Tampa Bay')
INSERT INTO serp VALUES('sport', 'https://www.msn.com/en-gb/sport/golf/benefits-of-winning-the-masters-golf/vi-AA1lNwOI', 'Benefits Of Winning The Masters Golf', 'Dailymotion')
INSERT INTO serp VALUES('sport', 'https://www.independent.co.uk/sport/darts/world-darts-championship-live-stream-scores-results-b2467256.html', 'PDC World Darts Championship LIVE: Results', 'Michael Jones')
INSERT INTO serp VALUES('sport', 'https://www.msn.com/en-us/sports/nfl/aaron-rodgers-eyes-future-beyond-40-despite-achilles-setback/vi-AA1lNg0R', 'Aaron Rodgers Eyes Future Beyond 40 Despite Achilles S', 'unbranded - Sport')
INSERT INTO serp VALUES('sport', 'https://www.msn.com/en-gb/sport/golf/6-golf-tips-golf-monthly/vi-AA1lNrLU', '6 Golf Tips | Golf Monthly', 'Dailymotion')
INSERT INTO serp VALUES('sport', 'https://www.msn.com/en-us/autos/news/1972-chevy-super-sport-nova/vi-AA1lN3Px', '1972 Chevy Super Sport Nova', 'FOX 13 Tampa Bay')
INSERT INTO serp VALUES('sport', 'https://www.youtube.com/watch?v=1DtqwboJVFc', 'Desi Cricket Pakistan Final Match Bhutto Club Vs GB Cal', 'Desi Sport GB')
INSERT INTO serp VALUES('sport', 'https://ca.sports.yahoo.com/news/best-30-mens-cricketers-britain-140144281.html', 'Best 30 men's cricketers in Britain right now', 'Tim Wigmore')
INSERT INTO serp VALUES('sport', 'https://www.independent.co.uk/sport/football/mary-earps-manchester-united-arsenal-spoty-b2467111.html', 'Manchester United, Arsenal and the battle for Mary Earps', 'Ben Fleming')
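Some of the sample titles contain apostrophes (for example "England's" and "men's"), and the template above writes them unescaped, so those statements would not parse as they stand. The Python sketch below shows two common ways to handle this when loading such data: bound parameters, and doubling single quotes in a literal. The table layout (four text columns) and the helper `sql_quote` are assumptions made for illustration, and SQLite is used only because it ships with Python.

```python
import sqlite3

# Values copied from the sample output above.
rows = [
    ("sport", "https://www.youtube.com/watch?v=d5sxT8CACHM",
     "England's Mary Earps wins 2023 Sports Personality of th", "BBC Sport"),
    ("sport", "https://sports.yahoo.com/best-30-mens-cricketers-britain-140144281.html",
     "Best 30 men's cricketers in Britain right now", "Tim Wigmore"),
]


def sql_quote(value: str) -> str:
    """Render a string as an SQL literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


conn = sqlite3.connect(":memory:")
# Hypothetical schema: the source only shows a four-value INSERT into serp.
conn.execute("CREATE TABLE serp (query TEXT, link TEXT, name TEXT, author TEXT)")

# Option 1: bound parameters, no manual escaping needed.
conn.executemany("INSERT INTO serp VALUES (?, ?, ?, ?)", rows)

# Option 2: a textual statement with escaped literals.
statement = "INSERT INTO serp VALUES(" + ", ".join(sql_quote(v) for v in rows[1]) + ")"
conn.execute(statement)

stored = conn.execute("SELECT name FROM serp ORDER BY rowid").fetchall()
print(stored)
```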
Dumping results to JSON
General result format:
[% IF notFirst;
",\n";
ELSE;
notFirst = 1;
END;
obj = {};
obj.query = query;
obj.videos = [];
FOREACH item IN p1.serp;
obj.videos.push({
link = item.link
name = item.name
duration = item.duration
author = item.author
preview = item.preview
});
END;
obj.json %]
Prepend text:
[
Append text:
]
Example result:
{
"videos": [{
"link": "https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=79AF507BCEEA455ACC1679AF507BCEEA455ACC16&&FORM=VRDGAR",
"preview": "https://tse1.mm.bing.net/th?id=OVF.BbkN01YgJzwRV0nBF%2ff%2fQQ&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1",
"name": "England's Mary Earps wins 2023 Sports Personality of th",
"author": "BBC Sport",
"duration": "3:35"
}, {
"link": "https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=86FB4CDD27E041A3F95586FB4CDD27E041A3F955&&FORM=VRDGAR",
"preview": "https://tse3.mm.bing.net/th?id=OVF.SPaQMo8Zrt%2fF5bGyKS0rQA&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1",
"name": "1972 Chevy Super Sport Nova",
"author": "FOX 13 Tampa Bay",
"duration": "0:51"
}, {
"link": "https://www.bing.com/videos/riverview/relatedvideo?&q=sport&&mid=3AD36B1FAFC111F9C6F03AD36B1FAFC111F9C6F0&&FORM=VRDGAR",
"preview": "https://tse3.mm.bing.net/th?id=OVF.d1Q3sVw%2fHfzK9x2Z%2fV5Qkg&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1",
"name": "1968 Super Sport Chevelle",
"author": "FOX 13 Tampa Bay",
"duration": "0:51"
}, {
"link": "https://www.msn.com/en-gb/sport/golf/benefits-of-winning-the-masters-golf/vi-AA1lNwOI",
"preview": "https://tse4.mm.bing.net/th?id=OVF.0Qa9k1McfmxqQgQudnQ%2bnw&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1",
"name": "Benefits Of Winning The Masters Golf",
"author": "Dailymotion",
"duration": "1:46"
}, {
"link": "https://www.skysports.com/watch/video/13034880/radek-szaganskis-142-checkout-propels-him-to-round-1-victory",
"preview": "https://tse4.mm.bing.net/th?id=OVF.GBYcZsZ4KRxIcMCTRyvclw&w=309&h=173&c=7&rs=1&qlt=90&o=5&pid=2.1",
"name": "Radek Szaganski’s 142 checkout propels him to Rou",
"author": "",
"duration": "0:41"
}], "query": "sport"
}
Tip
To show the "Prepend text" and "Append text" options in the task editor, enable "More options".
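The way the `notFirst` flag, the prepend text `[` and the append text `]` combine into one valid JSON array can be sketched in Python. This is only a model of the logic in the template, with illustrative function names, not A-Parser code: each query produces one object, a comma separator is written before every object except the first, and the brackets wrap the whole file.

```python
import json


def render_result(query, videos):
    """Serialize one per-query object, like obj.json in the template."""
    return json.dumps({"query": query, "videos": videos})


def assemble(results, prepend="[", append="]"):
    """Join per-query objects the way the template does with notFirst."""
    parts = []
    not_first = False
    for query, videos in results:
        if not_first:
            parts.append(",\n")  # separator before every object but the first
        else:
            not_first = True
        parts.append(render_result(query, videos))
    return prepend + "".join(parts) + append


results = [
    ("sport", [{"link": "https://www.msn.com/en-gb/sport/golf/6-golf-tips-golf-monthly/vi-AA1lNrLU",
                "name": "6 Golf Tips | Golf Monthly",
                "duration": "",
                "author": "Dailymotion",
                "preview": ""}]),
    ("cars", []),
]

document = assemble(results)
parsed = json.loads(document)  # succeeds only if the output is valid JSON
print(len(parsed), parsed[0]["query"])
```

Without the prepend and append text, the file would hold comma-separated objects with no enclosing array, which a strict JSON parser rejects.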
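Available settings
| Parameter | Default value | Description |
|---|---|---|
| Pages count | 1 | Number of pages to scrape |
| Region | Based on IP | Region selection. List of regions. |
| Interface language | Any | Interface language selection. List of languages. |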