Social::Instagram::Post - Instagram post scraper

Scraper overview

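Social::Instagram::Post scrapes posts from Instagram. It automatically collects all post data: post type, text, publication date, comment count, like count, and more. It can scrape up to 1200 comments on a post within 3 minutes. Besides the comment text, it also collects the link to the comment author's profile, the comment's like count, and its creation time.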
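Note

All simple results except $video_link can be obtained without authentication. Collecting comments and $video_link requires authentication, so a value must be specified for the cookie option. This is done the same way as for the Social::Instagram::Profile scraper.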

Scraper use cases

Collecting comments under a post

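Usage example
  1. Specify the values of the options required for authentication (cookie, x-csrftoken, x-ig-app-id).
  2. Override the Pages count option, selecting 100 from the list.
  3. Override the Result format option, setting its value to $comments.format('$text\n')
  4. Specify the post links in the query.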
Download example

How to import the example into A-Parser

eJyNVN1v2jAQ/1+sSl0lROBhe8gbICExsYYV+kRRZSWXzMPxubYDraL87zvnE7p1
6pvv7ve7b1/JHLdHuzFgwVkW7kum6zcLWQIpL6RjI6a5sWC8ec+2GAsuw3ClrOOZ
4XkYbtB6VI8vmXvTQB7wBMaIBMgoEpI1zyDGQnn0icuCMNPJpPqYECMeBQxoxv4H
tibd4RHUJ/Fc61XySWyKJueOOtM0pKPcxJjnoJwdN4Avt53m2cGre3pSt3esOhxG
rKHaZQ3zTD0dt53ujVt+gh36aELCoF6SdM/zOl7CHXhrF+9u7F69B54kwglUXDYR
/KyGqI9KvNT5KiQsPY0AuzSYk8rn2Srfuuz27KaWGbkoau7PhsPClEsLI2Yp1SWn
RJL3FuHAcIcm0j4f0pcM1UzKNZxADrDa/7wQMqHFmqVEWrXEf0Oiv3xUfXmXoWhu
Z0M59F5qaR79GFgJrjHrmiFFLhzJdlHvZcgmpDwC6L5n9x6Wo4E+TOu5jU5/R4Py
SzKMbKYH1VUZV2O5VsaoUpFF7eJ1yELt6INGaoG5luDrUoWUNBYLD8N6zGw7Bi8M
Cb4nL+oQV1/bIUr7fdukqo2g9fvqE8ypk5dRW5cxl/LxYX1pYcNKkfDLOW3DIDif
z2PRnYgxfYpAB4sXO32eq/U2/hYw78tBhrRlVGl16K9Mf4vKD29NWFY0x9920zB8
0R5POuqepSHRXan+AOqBsTk=

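Collected data

  • Post type
  • Text
  • Publication date
  • Comment count
  • Like count
  • Width and height of the image/video
  • Link to the author's profile
  • Author's nickname
  • Geolocation
  • Like information
  • Comment information
  • Location link
  • Comments
    • Link to the author's profile
    • Text
    • Like count
    • Creation time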

Use cases

  • Collecting post information

Queries

Queries must specify post links, for example:

https://www.instagram.com/p/Cqs1_BnLSc6/
https://www.instagram.com/p/CqvaaM5MHVW/
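
Because the scraper expects post links rather than profile links, it can help to check a query list before submitting it. The following is a minimal Python sketch and not part of A-Parser. The regular expression and the helper name `post_shortcode` are assumptions inferred only from the two example links above, so links in other forms may need a different pattern.

```python
import re

# Assumed pattern, inferred only from the example links above:
# https://www.instagram.com/p/<post id>/
POST_LINK_RE = re.compile(r"^https://www\.instagram\.com/p/([A-Za-z0-9_-]+)/?$")


def post_shortcode(link):
    """Return the post id from a post link, or None if the link does not match."""
    match = POST_LINK_RE.match(link.strip())
    return match.group(1) if match else None


queries = [
    "https://www.instagram.com/p/Cqs1_BnLSc6/",
    "https://www.instagram.com/p/CqvaaM5MHVW/",
    "https://www.instagram.com/soniyadhawan49/",  # a profile link, not a post
]

# Keep only the links that look like post links.
valid = [q for q in queries if post_shortcode(q) is not None]
print(valid)
```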

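Output examples

A-Parser supports flexible result formatting thanks to the built-in Template Toolkit template engine, which allows results to be output in arbitrary form as well as in structured form (such as CSV or JSON).

Default output

Result format: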

$query: comments: $comments_count, likes: $likes_count\nText: $text

Example result:

https://www.instagram.com/p/Cqs1_BnLSc6/: comments: 7268, likes: 362777
Text: “Like anyone else, I get lonely or insecure. Making my art is like creating companions or more well-formed versions of myself. My artworks are like mirrors, but ones where I can manipulate what’s being reflected at me.” —Artist @gitte_maria (Gitte Maria Moller)

Realized across mixed media, including wire frames, 3D sculpture, print plexiglass, panel paintings and site-specific installations, Gitte’s art explores themes in multitude: icons of childhood, 90s computer games, walls, fences and dead ends, pain and joy.

“I draw inspiration across time and history, from the ancient to the contemporary, but I like to think that these things all exist on the same plane and have equal value. Things only have value or importance because we ascribe it, so I like to play around with that, both in the materials I work with and in the content.

I try not to be too goal oriented when I am making work. It’s rather a feeling of something interior that spreads outward. The meaning becomes clearer to me when I’ve had some distance from the reality that generated the artwork. I like to be surprised in this way, so I create a lot of space for chance.”

Art by @gitte_maria
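
If the default output is post-processed outside A-Parser, the first line of each result can be split back into its fields. The sketch below is one possible way to do this in Python. It assumes the line follows the default result format exactly, and the function name `parse_header` and the dictionary keys are illustrative choices, not something A-Parser provides.

```python
import re

# Matches the first line produced by the default result format:
# $query: comments: $comments_count, likes: $likes_count
HEADER_RE = re.compile(
    r"^(?P<query>\S+): comments: (?P<comments>\d+), likes: (?P<likes>\d+)$"
)


def parse_header(line):
    """Parse one header line into a dict, or return None if it does not match."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        return None
    return {
        "query": match.group("query"),
        "comments_count": int(match.group("comments")),
        "likes_count": int(match.group("likes")),
    }


sample = "https://www.instagram.com/p/Cqs1_BnLSc6/: comments: 7268, likes: 362777"
print(parse_header(sample))
```

The post text that follows on the `Text:` line can span several paragraphs, so this sketch deliberately handles only the header line.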

Profile links of post commenters

Result format:

$comments.format('$author\n')

Example result:

https://www.instagram.com/soniyadhawan49/
https://www.instagram.com/meb_ok/
https://www.instagram.com/_shiv_mahakal_0/
https://www.instagram.com/badkonak_miti/
https://www.instagram.com/idarebax9.212.189.baxs.9.999.9/
https://www.instagram.com/_shifa_khan_17_/
https://www.instagram.com/_navid.amiri/
https://www.instagram.com/ariani91___/
https://www.instagram.com/jason.mooreeuc9yg5r1/
https://www.instagram.com/intelligent_girl_231/
https://www.instagram.com/_fasilfiros/
https://www.instagram.com/yonder3r/
https://www.instagram.com/kaminisahu1234/
https://www.instagram.com/violet_organics/
https://www.instagram.com/monic_a9243/
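
The same author may comment more than once, so a list like the one above can contain repeated links. As a hedged illustration, the Python sketch below extracts the username from each profile link and removes duplicates while keeping the original order. The helper names are hypothetical, and the sketch assumes every line has the `https://www.instagram.com/<username>/` form shown in the example.

```python
from urllib.parse import urlparse


def username_from_profile_link(link):
    """Return the username from a profile link, or None for other links."""
    path = urlparse(link.strip()).path
    parts = [part for part in path.split("/") if part]
    # A profile link is assumed to have exactly one path segment.
    return parts[0] if len(parts) == 1 else None


def unique_usernames(lines):
    """Collect usernames in first-seen order, skipping duplicates and blanks."""
    seen = {}
    for line in lines:
        name = username_from_profile_link(line)
        if name:
            seen.setdefault(name, None)
    return list(seen)


output = """https://www.instagram.com/soniyadhawan49/
https://www.instagram.com/meb_ok/
https://www.instagram.com/soniyadhawan49/
"""
print(unique_usernames(output.splitlines()))
```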

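Available settings

Parameter     | Default value | Description
Pages count   | 1             | Number of pages to scrape comments from
cookie        |               | Authentication option; required when scraping comments.
x-csrftoken   |               | Authentication option; required when scraping comments.
x-ig-app-id   |               | Authentication option; required when scraping comments.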
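
Since comments can only be scraped when cookie, x-csrftoken, and x-ig-app-id are all set, a quick check before starting a task can catch a missing value early. The following Python sketch is only an illustration: the `options` dictionary is a hypothetical stand-in for however the settings are stored and is not A-Parser's configuration format, and the values are placeholders.

```python
# Option names taken from the settings above.
REQUIRED_FOR_COMMENTS = ("cookie", "x-csrftoken", "x-ig-app-id")


def missing_auth_options(options):
    """Return the names of required authentication options that are unset or empty."""
    return [name for name in REQUIRED_FOR_COMMENTS if not options.get(name)]


# Hypothetical settings with placeholder values; x-ig-app-id is left out on purpose.
options = {
    "Pages count": 100,
    "cookie": "<cookie value>",
    "x-csrftoken": "<token value>",
}

missing = missing_auth_options(options)
if missing:
    print("Cannot scrape comments, missing options:", ", ".join(missing))
```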