- Description:
A collection of email messages from employees of the Enron Corporation.
There are two features:
- email_body: email body text.
- subject_line: email subject text.
Additional Documentation: Explore on Papers With Code
Homepage: https://github.com/ryanzhumich/AESLC
Source code: tfds.datasets.aeslc.Builder
Versions:
1.0.0 (default): No release notes.
Download size: 11.10 MiB
Dataset size: 14.96 MiB
Auto-cached (documentation): Yes
Splits:
Split | Examples |
---|---|
'test' | 1,906 |
'train' | 14,436 |
'validation' | 1,960 |
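As a quick sanity check on the split table, the counts can be summed and expressed as shares of the whole dataset. The sketch below hard-codes the three counts listed above; the names `SPLIT_SIZES`, `total`, and `shares` are illustrative and not part of TFDS.

```python
# Example counts copied from the Splits table.
SPLIT_SIZES = {"train": 14_436, "validation": 1_960, "test": 1_906}

total = sum(SPLIT_SIZES.values())

# Share of each split as a percentage of all examples, rounded to one decimal.
shares = {name: round(100 * n / total, 1) for name, n in SPLIT_SIZES.items()}

print(total)   # 18302
print(shares)  # {'train': 78.9, 'validation': 10.7, 'test': 10.4}
```

The validation and test splits are of similar size, each a little over a tenth of the data.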
- Feature structure:
FeaturesDict({
'email_body': Text(shape=(), dtype=string),
'subject_line': Text(shape=(), dtype=string),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
| | FeaturesDict | | | |
email_body | Text | | string | |
subject_line | Text | | string | |
Supervised keys (See as_supervised doc): ('email_body', 'subject_line')
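The supervised keys mean that loading with `as_supervised=True` should yield `(email_body, subject_line)` tuples instead of feature dictionaries. The following is a minimal sketch, assuming the dataset is registered under the name `"aeslc"` (inferred from the builder path above) and that `tensorflow_datasets` is installed. The helper names `load_aeslc`, `decode_pair`, and `preview` are hypothetical and not part of TFDS.

```python
def load_aeslc(split="train"):
    """Return one AESLC split as (email_body, subject_line) tuples."""
    # Lazy import: tensorflow_datasets is only needed when data is actually loaded.
    import tensorflow_datasets as tfds

    # Assumption: the registered dataset name is "aeslc".
    return tfds.load("aeslc", split=split, as_supervised=True)


def decode_pair(body, subject):
    """Decode the UTF-8 byte strings that tfds.as_numpy yields for Text features."""
    return body.decode("utf-8"), subject.decode("utf-8")


def preview(split="validation", n=2):
    """Print the subject and the first 60 body characters of the first n examples."""
    import tensorflow_datasets as tfds

    for body, subject in tfds.as_numpy(load_aeslc(split).take(n)):
        body_text, subject_text = decode_pair(body, subject)
        print(repr(subject_text), "<-", repr(body_text[:60]))
```

Calling `preview()` triggers the download on first use (about 11 MiB according to the sizes listed above). For a summarization-style model, the body would typically serve as the input and the subject line as the target.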
Figure (tfds.show_examples): Not supported.
- Citation:
@misc{zhang2019email,
title={This Email Could Save Your Life: Introducing the Task of Email Subject Line Generation},
author={Rui Zhang and Joel Tetreault},
year={2019},
eprint={1906.03497},
archivePrefix={arXiv},
primaryClass={cs.CL}
}