This model is a transformer-based model that uses Gaussian pair-wise positional embeddings to train on atomistic graph data. It
is part of the AtomGen project, a suite of datasets, models, and utilities that supports other methods for pre-training and
fine-tuning models on atomistic graphs.
Model description
AtomFormer is a transformer model with modifications to train on atomistic graphs. It builds primarily on the work
from Uni-Mol+, adding pair-wise positional embeddings to the attention mask to leverage 3D positional information.
This model was pre-trained on a diverse set of aggregated atomistic datasets where the target tasks are per-atom
force prediction and per-system energy prediction.
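As a rough illustration of the idea, the sketch below expands pairwise interatomic distances in a set of Gaussian basis functions and sums them into a per-pair scalar that could be added to the attention logits of one head. It is written in plain Python to stay self-contained. The function names, kernel centers, width, and uniform weights are illustrative assumptions; in a trained model these parameters would be learned, and this is not AtomFormer's actual implementation.

```python
import math

def pairwise_distances(coords):
    """Return the N x N matrix of Euclidean distances between 3D points."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def gaussian_expand(d, centers, width):
    """Expand a scalar distance into one Gaussian feature per kernel center."""
    return [math.exp(-0.5 * ((d - mu) / width) ** 2) for mu in centers]

def attention_bias(coords, centers, width, weights):
    """Project each pair's Gaussian features to a scalar bias (one head)."""
    dists = pairwise_distances(coords)
    return [
        [
            sum(w * g for w, g in zip(weights, gaussian_expand(d, centers, width)))
            for d in row
        ]
        for row in dists
    ]

# Hypothetical settings: 4 kernels spaced 1 angstrom apart, uniform weights.
centers = [0.0, 1.0, 2.0, 3.0]
weights = [0.25] * 4
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
bias = attention_bias(coords, centers, width=0.5, weights=weights)
```

The resulting matrix is symmetric because it depends only on distances, so the bias is invariant to rotations and translations of the structure.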
The model also includes metadata about the atomic species being modeled, such as atomic radius,
electronegativity, and valency. The metadata is normalized, projected, and added to the atom embeddings in the model.
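A minimal sketch of that normalize, project, and add step is shown here in plain Python. The property table, the choice of z-score normalization, the projection matrix, and all function names are assumptions made for illustration; the source does not specify which normalization or projection AtomFormer uses.

```python
import statistics

# Approximate per-element properties, for illustration only:
# (covalent radius in pm, Pauling electronegativity).
METADATA = {"H": (31.0, 2.20), "C": (76.0, 2.55), "O": (66.0, 3.44)}

def normalize_columns(table):
    """Z-score each property column across all elements in the table."""
    columns = list(zip(*table.values()))
    stats = [(statistics.mean(c), statistics.pstdev(c)) for c in columns]
    return {
        symbol: [(v - mean) / std for v, (mean, std) in zip(values, stats)]
        for symbol, values in table.items()
    }

def project(features, weight):
    """Linear projection; weight has one row per output dimension."""
    return [sum(w * f for w, f in zip(row, features)) for row in weight]

def embed_atom(symbol, atom_embedding, normalized, weight):
    """Add the projected metadata to the atom's embedding vector."""
    meta = project(normalized[symbol], weight)
    return [e + m for e, m in zip(atom_embedding, meta)]

normalized = normalize_columns(METADATA)
# Hypothetical 3-dimensional embedding and 3x2 projection matrix.
weight = [[0.1, 0.0], [0.0, 0.1], [0.1, 0.1]]
carbon = embed_atom("C", [1.0, 1.0, 1.0], normalized, weight)
```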
Intended uses & limitations
You can use the raw model for force and energy prediction, but it is mostly intended to
be fine-tuned on a downstream task. The model's performance on force and energy prediction
has not been validated; that task was used primarily for pre-training.
How to use
Here is how to use the model to extract features from the pre-trained backbone:
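A minimal sketch, under several assumptions: that the checkpoint is published as vector-institute/atomformer-base on the Hugging Face Hub, that it ships custom modeling code (hence trust_remote_code=True), and that its forward pass accepts input_ids, coords, and attention_mask keyword arguments. The helper names, the token-id range, and the random inputs are hypothetical.

```python
import random

def make_dummy_inputs(num_atoms=10, vocab_size=50, seed=0):
    """Build one random structure as nested lists (batch size 1)."""
    rng = random.Random(seed)
    return {
        # Hypothetical token ids; real ids come from the atom tokenizer.
        "input_ids": [[rng.randrange(1, vocab_size) for _ in range(num_atoms)]],
        "coords": [[[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(num_atoms)]],
        "attention_mask": [[1] * num_atoms],
    }

def extract_features(inputs, repo_id="vector-institute/atomformer-base"):
    """Run the pre-trained backbone and return its raw outputs."""
    # Imported lazily so the helper above works without these packages.
    import torch
    from transformers import AutoModel

    model = AutoModel.from_pretrained(repo_id, trust_remote_code=True)
    model.eval()
    with torch.no_grad():
        # The keyword names below are assumed, not confirmed by the source.
        return model(
            input_ids=torch.tensor(inputs["input_ids"]),
            coords=torch.tensor(inputs["coords"]),
            attention_mask=torch.tensor(inputs["attention_mask"]),
        )

inputs = make_dummy_inputs()
```

Calling extract_features(inputs) downloads the checkpoint and requires torch and transformers to be installed.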
Training data
AtomFormer is trained on an aggregated S2EF (structure to energy and forces) dataset drawn from multiple sources such as
OC20, OC22, ODAC23, MPtrj, and SPICE. The pre-training data includes both total energies and formation energies, but
training uses the formation energy. OC22 does not include formation energies; the has_formation_energy column indicates
whether one is available for a given sample.
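One plausible way to use the has_formation_energy column, sketched here, is to mask the energy term of the loss for samples without a formation energy while still using them for force training. This masking strategy, the mean-absolute-error loss, the row layout, and the numeric values are all assumptions; the source does not say how such samples are handled.

```python
def masked_energy_loss(rows, predictions):
    """Mean absolute error over rows that carry a formation energy label."""
    errors = [
        abs(pred - row["formation_energy"])
        for row, pred in zip(rows, predictions)
        if row["has_formation_energy"]
    ]
    # With no labeled rows there is nothing to penalize.
    return sum(errors) / len(errors) if errors else 0.0

# Made-up rows; the OC22 row has no formation energy and is skipped.
rows = [
    {"source": "OC20", "formation_energy": -1.5, "has_formation_energy": True},
    {"source": "OC22", "formation_energy": None, "has_formation_energy": False},
    {"source": "MPtrj", "formation_energy": 0.5, "has_formation_energy": True},
]
predictions = [-1.0, 3.0, 0.0]
loss = masked_energy_loss(rows, predictions)
```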
Preprocessing
The model expects input in the form of tokenized atomic symbols, represented as input_ids, and 3D coordinates,
represented as coords. For the pre-training task it also expects labels for the forces and formation_energy.
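To make the expected fields concrete, the sketch below builds one example record for a water molecule. The vocabulary mapping, the encode helper, and the label values are hypothetical placeholders; real token ids would come from the tokenizer used with the model, and the coordinates are only approximate.

```python
# Hypothetical symbol-to-id vocabulary; not the model's real tokenizer.
VOCAB = {"<pad>": 0, "H": 1, "C": 2, "O": 3}

def encode(symbols, coords, forces=None, formation_energy=None):
    """Turn atomic symbols and positions into a model-style record."""
    if len(symbols) != len(coords):
        raise ValueError("need one coordinate triple per atom")
    example = {
        "input_ids": [VOCAB[s] for s in symbols],
        "coords": [list(c) for c in coords],
    }
    # Labels are only needed for the pre-training task.
    if forces is not None:
        if len(forces) != len(symbols):
            raise ValueError("need one force vector per atom")
        example["forces"] = [list(f) for f in forces]
    if formation_energy is not None:
        example["formation_energy"] = float(formation_energy)
    return example

# Approximate water geometry in angstroms; labels are placeholders.
water = encode(
    ["O", "H", "H"],
    [(0.0, 0.0, 0.117), (0.0, 0.757, -0.469), (0.0, -0.757, -0.469)],
    forces=[(0.0, 0.0, 0.0)] * 3,
    formation_energy=0.0,
)
```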
The DataCollatorForAtomModeling utility in the AtomGen library has the capacity to perform dynamic padding to batch the
data together. It also offers the option to flatten the data and provide a batch column for GNN-style training.
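The two batching modes can be illustrated with a small plain-Python sketch. This is not the DataCollatorForAtomModeling implementation or its API; the function names, the pad id of 0, and the zero-padded coordinates are assumptions chosen for the example.

```python
def pad_batch(examples, pad_id=0):
    """Dynamic padding: pad every structure to the longest one in the batch."""
    max_len = max(len(e["input_ids"]) for e in examples)
    batch = {"input_ids": [], "coords": [], "attention_mask": []}
    for e in examples:
        n = len(e["input_ids"])
        pad = max_len - n
        batch["input_ids"].append(e["input_ids"] + [pad_id] * pad)
        batch["coords"].append(e["coords"] + [[0.0, 0.0, 0.0] for _ in range(pad)])
        # 1 marks a real atom, 0 marks padding.
        batch["attention_mask"].append([1] * n + [0] * pad)
    return batch

def flatten_batch(examples):
    """GNN-style batching: concatenate atoms and record their structure index."""
    flat = {"input_ids": [], "coords": [], "batch": []}
    for index, e in enumerate(examples):
        flat["input_ids"].extend(e["input_ids"])
        flat["coords"].extend(e["coords"])
        flat["batch"].extend([index] * len(e["input_ids"]))
    return flat

# Two made-up structures with 3 and 2 atoms.
examples = [
    {"input_ids": [3, 1, 1], "coords": [[0.0, 0.0, 0.1], [0.0, 0.8, -0.5], [0.0, -0.8, -0.5]]},
    {"input_ids": [2, 3], "coords": [[0.0, 0.0, 0.0], [0.0, 0.0, 1.1]]},
]
padded = pad_batch(examples)
flat = flatten_batch(examples)
```

Padding keeps a rectangular layout that suits attention over each structure, while the flattened form avoids padding entirely and relies on the batch column to tell structures apart.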
Pretraining
The model was trained on a node of 4x A40 (48 GB) GPUs for 10 epochs (~2 weeks). See the training code for all
hyperparameter details.
Evaluation results
We use the Atom3D dataset to evaluate the model's performance on downstream tasks.
When fine-tuned on downstream tasks, this model achieves the following results:
| Task | SMP | PIP | RES | MSP | LBA | LEP | PSR | RSR |
|---|---|---|---|---|---|---|---|---|
| Result | 1.077 | TBD | TBD | TBD | TBD | TBD | TBD | TBD |