- Description:
Sentiment140 allows you to discover the sentiment of a brand, product, or topic on Twitter.
The data is a CSV file with emoticons removed. Each row has 6 fields:
- the polarity of the tweet (0 = negative, 2 = neutral, 4 = positive)
- the id of the tweet (e.g. 2087)
- the date of the tweet (e.g. Sat May 16 23:58:44 UTC 2009)
- the query (e.g. lyx). If there is no query, then this value is NO_QUERY.
- the user that tweeted (e.g. robotickilldozr)
- the text of the tweet (e.g. Lyx is cool)
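The six-field layout above can be sketched as a small parser. This is a minimal illustration, not the dataset builder's own code: it assumes the raw file is header-less, comma-separated, and double-quoted, and the names `FIELDS`, `POLARITY_LABELS`, and `parse_rows` are hypothetical. The sample row is assembled from the example values in the field list.

```python
import csv
import io

# Field order as listed in the description (assumed to match the raw CSV).
FIELDS = ["polarity", "id", "date", "query", "user", "text"]

# Polarity codes from the description.
POLARITY_LABELS = {0: "negative", 2: "neutral", 4: "positive"}


def parse_rows(lines):
    """Yield one dict per well-formed CSV row (hypothetical helper)."""
    for row in csv.reader(lines):
        if len(row) != len(FIELDS):
            continue  # skip rows that do not have exactly 6 fields
        record = dict(zip(FIELDS, row))
        record["polarity"] = int(record["polarity"])
        record["label"] = POLARITY_LABELS.get(record["polarity"], "unknown")
        yield record


# A single row built from the example values in the field list.
sample = io.StringIO(
    '"4","2087","Sat May 16 23:58:44 UTC 2009","lyx","robotickilldozr","Lyx is cool"\n'
)
records = list(parse_rows(sample))
print(records[0]["label"], "|", records[0]["text"])  # positive | Lyx is cool
```

To read a real file, pass an open file object to `parse_rows` instead of the in-memory sample.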
For more information, refer to the paper Twitter Sentiment Classification using Distant Supervision at https://cs.stanford.edu/people/alecmgo/papers/TwitterDistantSupervision09.pdf
Additional Documentation: Explore on Papers With Code
Homepage: http://help.sentiment140.com/home
Source code: tfds.datasets.sentiment140.Builder
Versions:
1.0.0 (default): No release notes.
Download size: 77.59 MiB
Dataset size: 305.13 MiB
Auto-cached (documentation): No
Splits:
Split | Examples |
---|---|
'test' | 498 |
'train' | 1,600,000 |
- Feature structure:
FeaturesDict({
'date': Text(shape=(), dtype=string),
'polarity': int32,
'query': Text(shape=(), dtype=string),
'text': Text(shape=(), dtype=string),
'user': Text(shape=(), dtype=string),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
 | FeaturesDict | | | |
date | Text | | string | |
polarity | Tensor | | int32 | |
query | Text | | string | |
text | Text | | string | |
user | Text | | string | |
Supervised keys (See as_supervised doc): ('text', 'polarity')
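A short sketch of how the supervised keys are typically used: with `as_supervised=True`, `tfds.load` yields `(text, polarity)` tuples instead of the full feature dictionary. The helper names `load_supervised`, `to_class_index`, and `preview` are hypothetical, and the remapping of polarity codes to contiguous class indices is an assumption made for convenience when training a classifier, not something the dataset provides. The import is done inside the loader so that defining these helpers does not require TensorFlow Datasets or trigger a download.

```python
# Assumed remapping of polarity codes (0, 2, 4) to contiguous class indices.
POLARITY_TO_CLASS = {0: 0, 2: 1, 4: 2}


def to_class_index(polarity):
    """Map a polarity code to a class index (hypothetical helper)."""
    return POLARITY_TO_CLASS[int(polarity)]


def load_supervised(split="test"):
    """Return a tf.data.Dataset of (text, polarity) tuples."""
    # Imported lazily; the first call downloads and prepares the dataset.
    import tensorflow_datasets as tfds

    return tfds.load("sentiment140", split=split, as_supervised=True)


def preview(n=3, split="test"):
    """Print the class index and text of the first n examples."""
    for text, polarity in load_supervised(split).take(n):
        print(to_class_index(polarity.numpy()), text.numpy().decode("utf-8"))
```

Calling `preview()` prints a few examples from the 498-example 'test' split; pass `split="train"` for the 1,600,000-example split.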
Figure (tfds.show_examples): Not supported.
- Citation:
@ONLINE {Sentiment140,
author = "Go, Alec and Bhayani, Richa and Huang, Lei",
title = "Twitter Sentiment Classification using Distant Supervision",
year = "2009",
url = "http://help.sentiment140.com/home"
}