Data

CommentR Interaction Dataset: A large-scale collection of human-LLM interactions from Weibo, focusing on posts in which users actively mention the LLM agent account @CommentR. The dataset contains 557,645 posts, the corresponding 304,400 unique users, and all related comments, enabling research into user engagement patterns, demographic traits, and language behaviors in public interactions with a social media AI agent. [Dataset]

GitHub Data: 10,649,574 users, 118,602,740 commits, and 20,999,258 repositories (collected in Jun.-Aug. 2018) [Dataset]

Detailed Check-In Data of the Swarm App: 1,562,452 check-ins generated by 5,112 Swarm users (collected in Apr.-May 2017) [Dataset]

Foursquare Tip Data: Tips of 6.52 million Foursquare users (collected in Nov. 2015)

Social graph and number of check-ins of the Swarm App: all 60+ million users (collected in Aug.-Sep. 2015) [Dataset]

Cross-Site Linking Data: An anonymized dataset of 60+ million Foursquare users with a focus on cross-site linking (collected in Aug. 2015) [Dataset]

Google Scholar Data: A co-authorship network with 402.39K nodes and 1.23 million edges (collected in May 2015) [Dataset]

Please contact me if you are interested in any of these datasets.