For this project, I used the Open Catalyst 2020 (OC20) dataset.
A few important points:
→ data is stored as PyTorch Geometric objects inside LMDB files
→ for each task, there are training splits of several sizes
→ validation/test splits are broken into four subsplits:
    → in domain (ID)
    → out of domain adsorbate (OOD-Ads)
    → out of domain catalyst (OOD-Cat)
    → out of domain adsorbate and catalyst (OOD-Both)
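As a rough illustration of how the four subsplits relate to each other, the sketch below maps two flags (was the adsorbate seen in training, was the catalyst seen in training) to a subsplit. The names `Subsplit` and `classify_subsplit` are hypothetical helpers made up for this note, not part of the OC20 tooling, and the mapping is only my reading of the subsplit names.

```python
from enum import Enum


class Subsplit(Enum):
    """Validation/test subsplits (labels taken from the list above)."""
    ID = "in domain"
    OOD_ADS = "out of domain adsorbate"
    OOD_CAT = "out of domain catalyst"
    OOD_BOTH = "out of domain adsorbate and catalyst"


def classify_subsplit(adsorbate_in_train: bool, catalyst_in_train: bool) -> Subsplit:
    """Hypothetical helper: pick the subsplit a system would fall into.

    Assumption: "out of domain" means the adsorbate and/or the catalyst
    does not appear in the training data.
    """
    if adsorbate_in_train and catalyst_in_train:
        return Subsplit.ID
    if catalyst_in_train:
        # Only the adsorbate is new.
        return Subsplit.OOD_ADS
    if adsorbate_in_train:
        # Only the catalyst is new.
        return Subsplit.OOD_CAT
    return Subsplit.OOD_BOTH


# Example: a known catalyst paired with an adsorbate not seen in training.
print(classify_subsplit(adsorbate_in_train=False, catalyst_in_train=True).name)
```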
import matplotlib
matplotlib.use('Agg')
import os
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
params = {
    'font.size': 14,                # default text size
    'font.family': 'DejaVu Sans',   # default font family
    'legend.fontsize': 20,
    'xtick.labelsize': 20,
    'ytick.labelsize': 20,
    'axes.labelsize': 25,           # axis label size
    'axes.titlesize': 25,
    'text.usetex': False,           # render text without LaTeX
    'figure.figsize': [12, 12]      # width, height in inches
}
matplotlib.rcParams.update(params)
import ase.io
from ase.io.trajectory import Trajectory
from ase.io import extxyz
from ase.calculators.emt import EMT
from ase.build import fcc100, add_adsorbate, molecule
from ase.constraints import FixAtoms
from ase.optimize import LBFGS
from ase.visualize.plot import plot_atoms
from ase import Atoms
from IPython.display import Image
matplotlib.use('Agg') selects the "Agg" (Anti-Grain Geometry) backend, a non-interactive backend that renders plots to image files rather than opening a window on screen. The later %matplotlib inline magic switches to the notebook's inline backend, so figures are still displayed in the notebook.
The params dictionary sets default matplotlib options: the font size and family, the sizes of the axis labels, titles, tick labels and legend, and the figure size.
matplotlib.rcParams.update(params) applies these values on top of matplotlib's existing defaults.
The rest of the code imports functions and classes from the ase and IPython modules, which are used for reading and writing atomic simulation data, building and optimizing atomic structures, and displaying images in the notebook.