NewsWCL50 is the first open-access evaluation dataset for methods that seek to identify bias by word choice and labeling.
Besides some additional files, such as the readme you are currently reading, the dataset consists of two files:
Name | Description |
---|---|
Annotations.csv | Contains all annotations that we coded during the manual, deductive content analysis. The start and end columns give each annotation's position within the document, measured in tokens. |
Codebook.pdf | The codebook used to conduct the final deductive content analysis. |
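
The token offsets described above can be used to recover the annotated phrase from a tokenized article. The following is a minimal sketch, not the authors' tooling: only the `start` and `end` column names come from this readme, while the sample rows, the sample sentence, the whitespace tokenization, and the treatment of `end` as exclusive are all assumptions that should be checked against Codebook.pdf and the actual file.

```python
import csv
import io


def annotation_span(tokens, start, end):
    """Return the tokens covered by an annotation.

    Assumption: `start` is inclusive and `end` is exclusive.
    """
    return tokens[start:end]


# Hypothetical stand-in for Annotations.csv; only the `start` and `end`
# column names are taken from the readme, the rows are invented.
sample_csv = io.StringIO("start,end\n0,2\n5,7\n")

# Hypothetical article text, tokenized on whitespace (assumption; the
# dataset's real tokenization may differ).
tokens = "The president signed the controversial tax bill today".split()

# Read each annotation row and slice the matching tokens out of the document.
spans = [
    annotation_span(tokens, int(row["start"]), int(row["end"]))
    for row in csv.DictReader(sample_csv)
]

for span in spans:
    print(" ".join(span))
```

To run this against the real data, replace `sample_csv` with an open handle to Annotations.csv and `tokens` with the tokens of the corresponding article.
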
You can find more information on this and other news projects on our website.
If you are using NewsWCL50, please cite our paper (soon to be publicly available):
@InProceedings{Hamborg2019a,
author = {Hamborg, Felix and Zhukova, Anastasia and Gipp, Bela},
title = {Automated Identification of Media Bias by Word Choice and Labeling in News Articles},
booktitle = {Proceedings of the ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL)},
year = {2019},
month = {Jun.},
location = {Urbana-Champaign, USA}
}
Licensed under the Creative Commons Attribution-ShareAlike 4.0 International license (the "License"); you may not use NewsWCL50 except in compliance with the License. A copy of the License is included in the project; see the file LICENSE.
Copyright 2018-2019 The NewsWCL50 team