Help with scraping comments and comment reactions and exporting CSV #811
Answered
by
neon-ninja
sk1pd1v1d3d
asked this question in
Q&A
I would like to scrape the text of an individual post's comments with each comment's total number of reactions (not the reactions to the post), number of each reaction type, and posted date (number of replies would be great as well if possible). I have successfully used the following code to scrape the text of comments and export them to a CSV file, but I don't know how to add additional columns for the total number of reactions, number of each reaction type, and posted date. I would appreciate any help.

    from pprint import pprint
    from facebook_scraper import *
    import logging
    import os
    import json
    import pandas as pd
    from tqdm import tqdm
    import time

    post_id = ['https://www.facebook.com/XXXX/posts/XXXXX']
    cookies = "cookies.json"
    options = {"comments": 500, "progress": True, "allow_extra_requests": True}

    fb_comments = {}
    for post in get_posts(post_urls=post_id, cookies=cookies, options=options):
        for c in post.get("comments_full"):
            fb_comments[c.get("commenter_id")] = c.get("comment_text")
            for r in c.get("replies", []):
                fb_comments[r.get("commenter_id")] = r.get("comment_text")

    df = pd.DataFrame(fb_comments.items(), columns=["id", "comment"])
    df.to_csv("fbcomments.csv", index=False)
Answered by
neon-ninja
Jul 21, 2022
Here's one way of solving this:

    from pprint import pprint
    from facebook_scraper import *
    import logging
    import os
    import json
    import pandas as pd
    from tqdm import tqdm
    import time

    post_ids = ["pfbid0BwGx3KRtMZtRCb9Aj5vVwugs8jXTcX4QT4hc6UGrfg61DJzFo7mVFTJZReGwWeZZl"]
    cookies = "cookies.txt"
    set_cookies(cookies)
    options = {"comments": 500, "progress": True, "comment_reactors": True}

    def format_comment(c):
        # One flat row per comment or reply
        obj = {
            "comment_id": c["comment_id"],
            "comment_text": c["comment_text"],
            "comment_reaction_count": c["comment_reaction_count"] or 0,
            # Replies carry no "replies" key of their own, so they get 0
            "reply_count": len(c["replies"]) if "replies" in c else 0,
            "comment_time": c["comment_time"]
        }
        # Add one column per reaction type when a breakdown is present
        if c["comment_reactions"]:
            obj.update(c["comment_reactions"])
        return obj

    fb_comments = []
    post = next(get_posts(post_urls=post_ids, options=options))
    for comment in post["comments_full"]:
        fb_comments.append(format_comment(comment))
        for reply in comment["replies"]:
            fb_comments.append(format_comment(reply))

    # Reaction types missing from a given comment are left empty in the CSV
    pd.DataFrame(fb_comments).to_csv("fbcomments.csv", index=False)
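To try the flattening step without hitting Facebook, here is a minimal offline sketch of the same idea. The `sample_comments` data is made up for illustration and only mimics the dictionary keys used in the answer above (real `comment_time` values may be datetime objects rather than strings). The helper names `flatten` and `to_csv_text` are hypothetical, and the standard-library `csv` module stands in for pandas so the block runs on its own.

```python
import csv
import io

# Hypothetical sample data shaped like the comment dicts used in the answer above.
sample_comments = [
    {
        "comment_id": "1",
        "comment_text": "Great post",
        "comment_time": "2022-07-20 10:00:00",
        "comment_reaction_count": 3,
        "comment_reactions": {"like": 2, "love": 1},
        "replies": [
            {
                "comment_id": "2",
                "comment_text": "Agreed",
                "comment_time": "2022-07-20 11:00:00",
                "comment_reaction_count": None,
                "comment_reactions": None,
            }
        ],
    },
]

BASE_COLUMNS = [
    "comment_id",
    "comment_text",
    "comment_reaction_count",
    "reply_count",
    "comment_time",
]


def flatten(comments):
    """Return one flat row per top-level comment and per reply."""
    rows = []
    for comment in comments:
        for c in [comment, *comment.get("replies", [])]:
            row = {
                "comment_id": c["comment_id"],
                "comment_text": c["comment_text"],
                "comment_reaction_count": c.get("comment_reaction_count") or 0,
                "reply_count": len(c.get("replies", [])),
                "comment_time": c["comment_time"],
            }
            # One extra column per reaction type, when a breakdown exists.
            row.update(c.get("comment_reactions") or {})
            rows.append(row)
    return rows


def to_csv_text(rows):
    """Serialize rows to CSV text, writing 0 for reaction types a row lacks."""
    reaction_columns = sorted({key for row in rows for key in row} - set(BASE_COLUMNS))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=BASE_COLUMNS + reaction_columns, restval=0)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = flatten(sample_comments)
csv_text = to_csv_text(rows)
print(csv_text)
```

One design difference from the pandas version: `restval=0` writes an explicit 0 for a reaction type that a comment did not receive, whereas `pd.DataFrame(...).to_csv(...)` leaves that cell empty. Calling `.fillna(0)` on the DataFrame before exporting should give a comparable result.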