The S2EF task takes an atomic system as input and predicts the energy of the entire system and forces on each atom. This is our most general task, ultimately serving as a surrogate to DFT. A model that can perform well on this task can accelerate other applications like molecular dynamics and transitions tate calculations.

Steps for training an S2EF model

  1. Define or load a configuration (config), which includes the following
  1. Create a ForcesTrainer object

  2. Train the model

  3. Validate the model

For storage and compute reasons I used a very small subset of the OC20 S2EF - results are considerably worse than presented in the actual paper.

normalize the data

compute the mean and standard deviation for our energy values, because forces are physically related by the negative gradient of energy, use the same multiplicative energy factor for forces.

train_dataset = TrajectoryLmdbDataset({"src": train_src})

energies = []
for data in train_dataset:
  energies.append(data.y)

mean = np.mean(energies)
stdev = np.std(energies)

create the trainer

trainer = ForcesTrainer(
    task=task,
    model=copy.deepcopy(model), # copied for later use, not necessary in practice.
    dataset=dataset,
    optimizer=optimizer,
    identifier="S2EF-example",
    run_dir="./", # directory to save results if is_debug=False. Prediction files are saved here so be careful not to override!
    is_debug=False, # if True, do not save checkpoint, logs, or results
    print_every=5,
    seed=0, # random seed to use
    logger="tensorboard", # logger of choice (tensorboard and wandb supported)
    local_rank=0,
    amp=True, # use PyTorch Automatic Mixed Precision (faster training and less memory usage),
)

on gem-net