qm9

  • Description:

QM9 consists of computed geometric, energetic, electronic, and thermodynamic properties for 134k stable small organic molecules made up of C, H, O, N, and F. Following common practice, we remove the uncharacterized molecules and provide the remaining 130,831.

  • Feature structure:

FeaturesDict({
    'A': float32,
    'B': float32,
    'C': float32,
    'Cv': float32,
    'G': float32,
    'G_atomization': float32,
    'H': float32,
    'H_atomization': float32,
    'InChI': string,
    'InChI_relaxed': string,
    'Mulliken_charges': Tensor(shape=(29,), dtype=float32),
    'SMILES': string,
    'SMILES_relaxed': string,
    'U': float32,
    'U0': float32,
    'U0_atomization': float32,
    'U_atomization': float32,
    'alpha': float32,
    'charges': Tensor(shape=(29,), dtype=int64),
    'frequencies': Tensor(shape=(None,), dtype=float32),
    'gap': float32,
    'homo': float32,
    'index': int64,
    'lumo': float32,
    'mu': float32,
    'num_atoms': int64,
    'positions': Tensor(shape=(29, 3), dtype=float32),
    'r2': float32,
    'tag': string,
    'zpve': float32,
})
  • Feature documentation:

Feature            Class          Shape     Dtype     Description
                   FeaturesDict
A                  Tensor                   float32
B                  Tensor                   float32
C                  Tensor                   float32
Cv                 Tensor                   float32
G                  Tensor                   float32
G_atomization      Tensor                   float32
H                  Tensor                   float32
H_atomization      Tensor                   float32
InChI              Tensor                   string
InChI_relaxed      Tensor                   string
Mulliken_charges   Tensor         (29,)     float32
SMILES             Tensor                   string
SMILES_relaxed     Tensor                   string
U                  Tensor                   float32
U0                 Tensor                   float32
U0_atomization     Tensor                   float32
U_atomization      Tensor                   float32
alpha              Tensor                   float32
charges            Tensor         (29,)     int64
frequencies        Tensor         (None,)   float32
gap                Tensor                   float32
homo               Tensor                   float32
index              Tensor                   int64
lumo               Tensor                   float32
mu                 Tensor                   float32
num_atoms          Tensor                   int64
positions          Tensor         (29, 3)   float32
r2                 Tensor                   float32
tag                Tensor                   string
zpve               Tensor                   float32
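
The per-atom features (charges, positions, Mulliken_charges) have a fixed leading dimension of 29, while num_atoms records how many atoms a molecule actually has. The sketch below shows one way to drop the padded entries. It assumes the real atoms occupy the first num_atoms slots and that charges holds atomic numbers; neither is stated in the table above. The helper name strip_padding and the water-like example are hypothetical, and the example values are illustrative rather than taken from the dataset.

```python
MAX_ATOMS = 29  # leading dimension of the per-atom features listed above


def strip_padding(example):
    """Truncate per-atom features to the first `num_atoms` entries.

    Assumption: real atoms come first and padding fills the remaining slots.
    Slicing works the same way on lists and on NumPy arrays.
    """
    n = int(example["num_atoms"])
    return {
        "charges": example["charges"][:n],
        "positions": example["positions"][:n],
        "Mulliken_charges": example["Mulliken_charges"][:n],
    }


# Hand-made stand-in for one example (illustrative values, not dataset values).
pad = MAX_ATOMS - 3
toy_example = {
    "num_atoms": 3,
    "charges": [8, 1, 1] + [0] * pad,
    "positions": [[0.0, 0.0, 0.12], [0.0, 0.76, -0.47], [0.0, -0.76, -0.47]]
    + [[0.0, 0.0, 0.0] for _ in range(pad)],
    "Mulliken_charges": [-0.6, 0.3, 0.3] + [0.0] * pad,
}

trimmed = strip_padding(toy_example)
print(len(toy_example["positions"]), "->", len(trimmed["positions"]))
```

Slicing by num_atoms avoids relying on any particular padding value, which is why the sketch does not test for zeros.
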
  • Citation:

@article{ramakrishnan2014quantum,
  title={Quantum chemistry structures and properties of 134 kilo molecules},
  author={Ramakrishnan, Raghunathan and Dral, Pavlo O and Rupp, Matthias and von Lilienfeld, O Anatole},
  journal={Scientific Data},
  volume={1},
  year={2014},
  publisher={Nature Publishing Group}
}

qm9/original (default config)

  • Config description: QM9 does not define any splits, so this variant puts the full QM9 dataset in the train split, in the original order (no shuffling).

  • Auto-cached (documentation): Only when shuffle_files=False (train)

  • Splits:

Split Examples
'train' 130,831
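
As a minimal sketch of reading this config, the snippet below builds the config name and loads the train split through tfds.load with shuffle_files=False, which keeps the original order and matches the auto-caching condition noted above. The helper names qm9_config and load_qm9_train are hypothetical. The TensorFlow Datasets import is done lazily inside the loader so that the name helper can be used without the library installed.

```python
KNOWN_VARIANTS = ("original", "cormorant", "dimenet")


def qm9_config(variant="original"):
    """Return the TFDS config name for one of the QM9 variants on this page."""
    if variant not in KNOWN_VARIANTS:
        raise ValueError(f"unknown QM9 variant: {variant!r}")
    return f"qm9/{variant}"


def load_qm9_train(variant="original"):
    """Load the train split without file shuffling.

    Requires the tensorflow_datasets package; the first call downloads and
    prepares the dataset.
    """
    import tensorflow_datasets as tfds  # lazy import, see lead-in

    return tfds.load(qm9_config(variant), split="train", shuffle_files=False)


print(qm9_config())
```

Calling load_qm9_train() returns a tf.data.Dataset of feature dictionaries; for example, load_qm9_train().take(1) yields the first molecule in the original order.
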

qm9/cormorant

Split Examples
'test' 13,083
'train' 100,000
'validation' 17,748

qm9/dimenet

Split Examples
'test' 10,831
'train' 110,000
'validation' 10,000
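
The split tables for the three configs can be checked against the 130,831 molecules stated in the description. The short sketch below copies the counts from the tables on this page and computes the fraction of the data each split holds; the names SPLITS and split_fractions are hypothetical.

```python
TOTAL_MOLECULES = 130_831  # from the dataset description

# Split sizes copied from the tables on this page.
SPLITS = {
    "qm9/original": {"train": 130_831},
    "qm9/cormorant": {"train": 100_000, "validation": 17_748, "test": 13_083},
    "qm9/dimenet": {"train": 110_000, "validation": 10_000, "test": 10_831},
}


def split_fractions(config):
    """Return the share of examples in each split of the given config."""
    sizes = SPLITS[config]
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


for config, sizes in SPLITS.items():
    # Every config partitions the same 130,831 molecules.
    assert sum(sizes.values()) == TOTAL_MOLECULES
    shares = ", ".join(f"{k}={v:.1%}" for k, v in split_fractions(config).items())
    print(f"{config}: {shares}")
```

All three configs cover the full dataset, so they differ only in how the same molecules are partitioned.
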