Social Media 数据 (Data) Toolkit: Standardization and Anonymization of Social Network 数据 (Data)sets
Social Media Data Toolkit: Standardization and Anonymization of Social Network Datasets
作者
Authors
Ali Najafi | Letizia Iannucci | Mikko Kivelä | Onur Varol
期刊
Journal
暂无期刊信息
年份
Year
2026
分类
Category
国家
Country
-
📝 摘要
Abstract
The rapid diversification of social media platforms and the increasing restrictions on official APIs have significantly complicated cross-platform analysis. Researchers are often forced to rely on heterogeneous datasets obtained through web scraping and historical archives; however they often lack structural consistency. Prior to conducting cross-platform social media analyses, one needs to answer three critical questions: (1) What makes platforms different and similar? (2) How were the datasets collected? (3) How can we align the datasets of different platforms to conduct fair analyses? To address these questions, we introduce the Social Media Data Toolkit (\projectname{}), a comprehensive Python framework designed for the standardization, anonymization, and enrichment of social network datasets. \projectname{} unifies diverse data structures into a generic schema comprising Communities, Accounts, Posts, Actions, and Entities to facilitate multi-platform research. The framework features a configurable anonymization module to secure Personally Identifiable Information (PII) and an extendable enrichment layer that integrates Large Language Models (LLMs) and network analysis tools for downstream tasks such as stance detection and toxicity scoring without creating codebase for different datasets. We demonstrate the versatility of \projectname{} through four case studies spanning from textual analysis of the content to network analysis across platforms. To offer reproducible social media research, \projectname{} is released as an open-source tool featuring detailed documentation and practical guides for researchers at any skill-level. It can be accessed at github.com/ViralLab/SMDT and varollab.com/SMDT.
📊 文章统计
Article Statistics
基础数据
Basic Stats
94
浏览
Views
0
下载
Downloads
15
引用
Citations
引用趋势
Citation Trend
阅读国家分布
Country Distribution
阅读机构分布
Institution Distribution
月度浏览趋势
Monthly Views