
SenticNet/FineFake


FineFake

This is the dataset for FineFake: A Knowledge-Enriched Dataset for Fine-Grained Multi-Domain Fake News Detection. The main construction of FineFake is shown below. The construction code for updating the dataset with the latest news will be released when the paper is accepted.

[Figure: construction_00]

Getting Started

Follow the instructions to download the dataset. You can download text data, metadata, image data and knowledge data. The dataset is divided into six topics (Politics, Entertainment, Business, Health, Society, Conflict) and eight platforms (Snopes, Twitter, Reddit, CNN, Apnews, Cdc.gov, Nytimes, Washingtonpost). The dataset and images can be downloaded here.

DataFrame file

The data is stored as a pickle file, which can be loaded into a pandas DataFrame with the following code. Details can be found in demo.py.

    # pickle is part of the Python standard library; only pandas needs installing
    pip install pandas

    import pickle as pkl
    import pandas as pd

    with open(file_name, "rb") as f:
        data_df = pkl.load(f)  # data_df is a pandas DataFrame

There are 13 columns in the pickle file. Each attribute and its corresponding meaning are shown in the table below:

| Column | Meaning |
| --- | --- |
| text | news body text |
| image_path | image_path (relative path) |
| entity_id | text-entity wiki id |
| topic | topic from six topics |
| label | label |
| fine-grained label | fine-grained label |
| knowledge_embedding | knowledge_embedding |
| description | text-entity description |
| relation | relation |
| platform | The source of the news |
| author | author |
| date | The date of the news publication |
| comment | comment |
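As a quick sanity check after loading, the column list above can be compared against whatever the pickle actually contains. The following is a minimal sketch, not part of the repository: `EXPECTED_COLUMNS` and `missing_columns` are hypothetical names, and the exact spelling of the column names in the pickle file is assumed to match the table.

```python
# Column names copied from the table above (assumed to match the pickle exactly).
EXPECTED_COLUMNS = [
    "text", "image_path", "entity_id", "topic", "label",
    "fine-grained label", "knowledge_embedding", "description",
    "relation", "platform", "author", "date", "comment",
]


def missing_columns(columns):
    """Return the expected column names that are absent from `columns`."""
    present = set(columns)
    return [name for name in EXPECTED_COLUMNS if name not in present]


# Stand-in for data_df.columns, used here so the sketch runs without the dataset.
toy_columns = ["text", "topic", "label"]
print(missing_columns(toy_columns))
```

With the real data, the call would be `missing_columns(data_df.columns)`; an empty list would mean all 13 columns are present under the assumed names.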

Labels

For the binary label, "0" represents fake and "1" represents real. For the fine-grained label, each label and its corresponding meaning are shown in the table below:

| Label | Meaning |
| --- | --- |
| 0 | real |
| 1 | text-image inconsistency |
| 2 | content-knowledge inconsistency |
| 3 | text-based fake |
| 4 | image-based fake |
| 5 | others |
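Note that the two label schemes use opposite conventions for real news: the binary label uses 1 for real, while the fine-grained label uses 0 for real. A small lookup helper can make this harder to mix up. The sketch below is illustrative only; `BINARY_NAMES`, `FINE_GRAINED_NAMES` and `decode_labels` are hypothetical names that do not appear in the repository.

```python
# Binary label convention stated above: 0 = fake, 1 = real.
BINARY_NAMES = {0: "fake", 1: "real"}

# Fine-grained label codes, copied from the table above.
FINE_GRAINED_NAMES = {
    0: "real",
    1: "text-image inconsistency",
    2: "content-knowledge inconsistency",
    3: "text-based fake",
    4: "image-based fake",
    5: "others",
}


def decode_labels(binary_label, fine_grained_label):
    """Translate a (binary, fine-grained) pair of integer codes into names."""
    return (
        BINARY_NAMES[int(binary_label)],
        FINE_GRAINED_NAMES[int(fine_grained_label)],
    )


print(decode_labels(0, 3))
```

On the loaded DataFrame, the same dictionaries could be applied column-wise, for example `data_df["fine-grained label"].map(FINE_GRAINED_NAMES)`, assuming the column stores integer codes.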

Guidelines

  • FineFake is designed to advance research in fake news detection and should not be used for any malicious or harmful purposes. Users should refrain from using the dataset for generating or spreading misinformation, manipulating public opinion, or any other activity that could harm individuals, groups, or society at large.
  • It is the responsibility of users to ensure that their models and research outcomes are fair and unbiased. Any biases inherent in the dataset must be carefully addressed in your work. If biases are detected, they should be documented, and appropriate mitigation strategies should be applied.
  • The FineFake dataset contains data sourced from public domains, but it is essential to respect the privacy and anonymity of individuals. Any attempt to de-anonymize individuals or re-identify entities within the dataset is strictly prohibited. All users must ensure that their research upholds the principles of privacy protection.
