HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns

To Appear in USENIX Security 2025

Xinyue Shen¹, Yixin Wu¹, Yiting Qu¹, Michael Backes¹, Savvas Zannettou², Yang Zhang¹

¹CISPA Helmholtz Center for Information Security, ²TU Delft

34 Identity Groups Covered in Our Study

The list is adopted from the Measuring Hate Speech project.

BibTeX

If you find this useful in your research, please consider citing:

@inproceedings{SWQBZZ25,
  author = {Xinyue Shen and Yixin Wu and Yiting Qu and Michael Backes and Savvas Zannettou and Yang Zhang},
  title = {{HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns}},
  booktitle = {{USENIX Security Symposium (USENIX Security)}},
  publisher = {USENIX},
  year = {2025}
}