This repository contains code to reproduce the results in the paper "Community Notes are Vulnerable to Rater Bias and Manipulation".
We're interested in testing the core version of the Community Notes algorithm.
Some notes about the simulated Community Notes data:
- It does not contain note summaries, thus topic modeling is not applicable.
- It does not contain detailed tags (e.g., SpamHarassmentOrAbuse), so the Harassment-Abuse Tag-Consensus Matrix Factorization and the Note Status Explanation rule are not applicable.
- It assumes that every rating is posted 1 millisecond after the note, so all ratings count as valid.
Changes were made to the scoring model to accommodate these details:
- In data processing, the Post Selection Similarity module is disabled.
- In model training, only the CORE model is retained.
- Inside the CORE model training:
  - Rater helpfulness is decided using only the raterAgreeRatio (threshold 0.66).
  - The notHelpfulSpamHarassmentOrAbuse model is disabled.
  - The low-diligence model is disabled.
- In the note status scoring rules, only InitialNMR, CRH, and CRNH are retained.
- In the post-training phase, the PFLIP model is disabled.
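The rater-helpfulness rule above can be sketched as a simple threshold check. This is an illustration only: the scoring code's actual raterAgreeRatio computation is more involved, and the agree/total definition here is an assumption.

```python
# Illustrative sketch only -- not the scoring code's actual implementation.
# "agree" is assumed to mean ratings that matched the note's final status.
RATER_AGREE_THRESHOLD = 0.66  # the raterAgreeRatio cutoff mentioned above

def is_helpful_rater(agree: int, total: int,
                     threshold: float = RATER_AGREE_THRESHOLD) -> bool:
    """Keep a rater only if their agree ratio meets the threshold."""
    return total > 0 and agree / total >= threshold

print(is_helpful_rater(agree=7, total=10))  # 0.70 >= 0.66 -> True
print(is_helpful_rater(agree=6, total=10))  # 0.60 <  0.66 -> False
```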
- data: C++ code to generate synthetic input data for each experiment
- experiments: code to reproduce the results from scratch, including experiment metadata (definitions.json), the main driver that calls the data-generation code and scoring algorithm (run_all_experiments.sh), and post-processing scripts (count_FP_FN.py, count_filtered.py, corr_stats.py) to calculate summary metrics
- libs: the Community Notes scoring algorithm (git submodule)
- results: pre-computed summary CSVs
- reports: Jupyter notebooks to reproduce all figures in the paper
git submodule update --init --recursive
cp pyproject.toml libs/communitynotes
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install poetry
poetry install --directory libs/communitynotes --no-root
The results reported in the paper are already in results. You can jump directly to Stage 3 and run the Jupyter notebooks to re-create the plots from these results. If you want to replicate the workflow from scratch, follow the steps below.
experiments/run_all_experiments.sh is the main driver. For each experiment it automatically:
- Compiles the C++ data-generation code in data/{exp}/
- Runs the compiled binary to generate synthetic input data
- Runs the Community Notes scoring algorithm on each generated notes file
NOTE: Running this script always performs all three steps. Since generating the synthetic data can take some time, you may want to perform steps 1 and 2 only when the data does not already exist.
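If you want that skip-if-present behavior, a minimal guard could check for existing data before regenerating it. The helper below is hypothetical (not part of the repository); it only assumes the data/{exp}/data/ path convention used by the generated files.

```python
# Hypothetical helper (not in the repository): decide whether steps 1-2
# (compile + generate) still need to run for an experiment.
from pathlib import Path

def needs_data_generation(exp: str, root: str = "data") -> bool:
    """Return True when data/{exp}/data/ is missing or empty."""
    data_dir = Path(root) / exp / "data"
    return not (data_dir.is_dir() and any(data_dir.iterdir()))

print(needs_data_generation("bad_actor_no_bias_grid"))
```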
To run:
bash experiments/run_all_experiments.sh <EXP_NAME> <FILTER>
This script in turn calls experiments/run_multiple_times.sh, which runs multiple replicates of the same argument set.
Arguments:
- EXP_NAME: experiment name, one of {"bad_actor_no_bias_grid", "bad_actor_with_bias_grid", "multi_bad_actor_no_bias", "multi_bad_actor_with_bias", "multi_bad_actor_with_bias_bhvr_1", "homophily", "iu_var", "note_pol", "user_pol", "Homophily_HIGH_UPol_HIGH_NPol_HIGH", "Homophily_HIGH_UPol_HIGH_NPol_LOW", "Homophily_HIGH_UPol_LOW_NPol_HIGH", "Homophily_HIGH_UPol_LOW_NPol_LOW", "Homophily_LOW_UPol_HIGH_NPol_LOW", "Homophily_LOW_UPol_LOW_NPol_HIGH", "Homophily_LOW_UPol_LOW_NPol_LOW", "Homophily_NONE_UPol_HIGH_NPol_LOW", "Homophily_NONE_UPol_LOW_NPol_HIGH"}, or --all. More about each experiment is described in experiments/README.md.
- FILTER: helpfulness model flag, {True | False}
For example, to run all 18 experiments:
bash experiments/run_all_experiments.sh --all True
Calculate summary statistics from the data generated in experiments/ to produce summary CSVs. Each script takes two arguments, EXP_NAME and FILTER, with the same meanings as above.
# FP/FN counts (pollution, suppression, infiltration, waste rates)
python experiments/count_FP_FN.py --exp <EXP_NAME> --helpfulness <FILTER>
# Filtered-user counts (bad-actor vs. good users removed by the algorithm)
python experiments/count_filtered.py --exp <EXP_NAME> --helpfulness <FILTER>
# Correlation statistics (inferred vs. ground-truth intercept and factor)
python experiments/corr_stats.py --exp <EXP_NAME> --helpfulness <FILTER>
These scripts write three summary CSV files per experiment.
To process all 18 experiments at once, replace --exp EXP_NAME with --all:
python experiments/count_FP_FN.py --all --helpfulness True
More information about the data fields resulting from this computation is described in results/README.md.
These notebooks use results/ to plot the main figures.
To run these notebooks, first install the additional packages needed for notebooks:
pip install jupyter ipykernel tqdm
After running Stages 1 and 2, you should expect the following outputs:
Synthetic input data — Each experiment sweeps over a grid of parameter combinations; for each combination, the C++ binary produces five TSV files:
data/{exp}/data/
├── {params}-notes.tsv           # synthetic note ratings (input to the scoring algorithm)
├── {params}-ratings.tsv         # per-rater ratings
├── {params}-TrueNoteParams.tsv  # ground-truth note polarity / intercept parameters
├── {params}-TrueUserParams.tsv  # ground-truth user polarity parameters
└── {params}-userEnrollment.tsv  # user enrollment records
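These TSVs are plain tab-separated files, so they can be inspected with Python's csv module. The field names in this snippet are illustrative assumptions; check the generated headers for the real schema.

```python
import csv
import io

# Mock ratings-style TSV; column names are assumptions, not the real schema.
tsv = (
    "noteId\traterId\thelpfulnessLevel\n"
    "1001\tu1\tHELPFUL\n"
    "1001\tu2\tNOT_HELPFUL\n"
)
rows = list(csv.DictReader(io.StringIO(tsv), delimiter="\t"))
print(len(rows), rows[0]["helpfulnessLevel"])  # 2 HELPFUL
```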
Algorithm output — For each parameter set the scoring algorithm runs once (run_0) and produces:
results/{exp}/output_helpfulness_{True|False}/
└── {params}-notes/
├── run_0/
│ ├── scored_notes.tsv # notes with inferred intercept and factor scores
│ ├── helpfulness_scores.tsv # per-user helpfulness / reputation scores
│ ├── aux_note_info.tsv # auxiliary note metadata
│ ├── note_status_history.tsv # note status over time
│ └── main_tiny.log # algorithm run log
└── timing_results_0.1.csv # wall-clock timing for this run
Summary output
experiments/{exp}/
├── FP_count/
│   └── helpfulness_{True|False}.csv   # FP/FN counts with reputation scoring
├── corr_stats/
│   └── helpfulness_{True|False}.csv   # Pearson correlation statistics
└── filtered_count/
    └── helpfulness_{True|False}.csv   # filter counts with reputation scoring