Atomic Descriptors
AtomicAI provides locally-averaged atomic fingerprints (LAAF) that encode the chemical environment of each atom. These descriptors are used as input features for machine learning models.
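As a rough illustration of what "locally averaged" can mean, the sketch below replaces each atom's raw fingerprint with the mean over all atoms inside an averaging cutoff. The helper name `local_average`, the unweighted mean, the inclusion of the central atom, and the lack of periodic boundary handling are all assumptions made for this example; AtomicAI's actual averaging scheme may differ.

```python
import numpy as np

def local_average(positions, fingerprints, cutoff_average):
    """Average each atom's fingerprint over neighbours within cutoff_average.

    Hypothetical helper: unweighted mean, central atom included,
    no periodic boundary conditions.
    """
    positions = np.asarray(positions, dtype=float)
    fingerprints = np.asarray(fingerprints, dtype=float)
    # Pairwise distance matrix, shape (n_atoms, n_atoms).
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Boolean mask of atoms inside the averaging sphere (diagonal is True).
    mask = dist <= cutoff_average
    # Row-normalised weights turn the mask into a mean over neighbours.
    weights = mask / mask.sum(axis=1, keepdims=True)
    return weights @ fingerprints

# Toy data: atoms 0 and 1 are close together, atom 2 is isolated.
positions = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [5.0, 0.0, 0.0]]
fingerprints = [[1.0, 0.0], [3.0, 2.0], [10.0, 10.0]]
averaged = local_average(positions, fingerprints, cutoff_average=2.0)
```

In this toy case the two close atoms end up with the same averaged vector, while the isolated atom keeps its own fingerprint.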
Command-line usage
generate_descriptors trajectory.xyz [--descriptor TYPE [TYPE ...]] [--n-eta N]
Options
| Option | Default | Description |
|---|---|---|
| --descriptor | | One or more descriptor types to compute |
| --n-eta | | Number of eta decay functions |
Descriptor types
ACSF_G2 — Radial symmetry functions
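Two-body Behler-Parrinello G2 functions. Each function is a Gaussian in interatomic distance, parameterised by eta (width) and Rs (shift). In the standard Behler-Parrinello form:
\[ G_i^{2} = \sum_{j \neq i} e^{-\eta (r_{ij} - R_s)^2} \, f_c(r_{ij}) \]
where \(f_c\) is a cosine cutoff function.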
generate_descriptors traj.xyz --descriptor ACSF_G2 --n-eta 80
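The following is a minimal NumPy sketch of a G2 feature vector for a single atom, using the standard Behler-Parrinello Gaussian and cosine cutoff. The function names, the cutoff radius, and the eta grid are illustrative assumptions; they do not show how AtomicAI or `--n-eta` actually choose their values.

```python
import numpy as np

def cosine_cutoff(r, r_cut):
    """Cosine cutoff: 0.5 * (cos(pi * r / r_cut) + 1) inside r_cut, else 0."""
    r = np.asarray(r, dtype=float)
    return np.where(r < r_cut, 0.5 * (np.cos(np.pi * r / r_cut) + 1.0), 0.0)

def g2(distances, eta, r_s, r_cut):
    """Standard G2 value for one atom, given its neighbour distances."""
    distances = np.asarray(distances, dtype=float)
    gauss = np.exp(-eta * (distances - r_s) ** 2)
    return float(np.sum(gauss * cosine_cutoff(distances, r_cut)))

# One feature per eta value; this grid is illustrative only.
neighbour_distances = [1.0, 1.5, 7.0]  # the 7.0 neighbour lies beyond r_cut
etas = [0.5, 1.0, 2.0]
g2_vector = np.array(
    [g2(neighbour_distances, eta, r_s=0.0, r_cut=6.0) for eta in etas]
)
```

Each entry of `g2_vector` probes a different radial length scale: with `r_s = 0`, a larger eta weights only the closest neighbours.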
ACSF_G3 — Cosine basis functions
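G3 functions use a cosine basis parameterised by kappa. In the standard Behler-Parrinello form:
\[ G_i^{3} = \sum_{j \neq i} \cos(\kappa r_{ij}) \, f_c(r_{ij}) \]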
generate_descriptors traj.xyz --descriptor ACSF_G3
ACSF_G4 — Angular symmetry functions (with rjk)
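Three-body functions that encode bond angles. The rjk term is included in both the cutoff product and the exponent sum. In the standard Behler-Parrinello form:
\[ G_i^{4} = 2^{1-\zeta} \sum_{j,k \neq i} (1 + \lambda \cos\theta_{ijk})^{\zeta} \, e^{-\eta (r_{ij}^2 + r_{ik}^2 + r_{jk}^2)} \, f_c(r_{ij}) \, f_c(r_{ik}) \, f_c(r_{jk}) \]
where \(\theta_{ijk}\) is the angle at atom \(i\), \(\lambda = \pm 1\) places the angular maximum at 0° or 180°, and \(\zeta\) sets the angular resolution.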
generate_descriptors traj.xyz --descriptor ACSF_G4
ACSF_G5 — Angular symmetry functions (without rjk)
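Similar to G4, but the rjk distance is omitted, which makes it cheaper to compute for large systems. In the standard Behler-Parrinello form:
\[ G_i^{5} = 2^{1-\zeta} \sum_{j,k \neq i} (1 + \lambda \cos\theta_{ijk})^{\zeta} \, e^{-\eta (r_{ij}^2 + r_{ik}^2)} \, f_c(r_{ij}) \, f_c(r_{ik}) \]
where \(\lambda = \pm 1\) and \(\zeta\) control the angular shape.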
generate_descriptors traj.xyz --descriptor ACSF_G5
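The difference between G4 and G5 can be sketched for a single (i, j, k) triplet as below. The code follows the standard Behler-Parrinello angular form. The function name `angular_term`, the `include_rjk` switch, and the parameter values are assumptions for illustration, not AtomicAI's implementation.

```python
import numpy as np

def cosine_cutoff(r, r_cut):
    """Scalar cosine cutoff: smooth decay to zero at r_cut."""
    return 0.5 * (np.cos(np.pi * r / r_cut) + 1.0) if r < r_cut else 0.0

def angular_term(r_i, r_j, r_k, eta, zeta, lam, r_cut, include_rjk):
    """Contribution of one (j, k) neighbour pair of atom i.

    include_rjk=True gives the G4-style term, False the G5-style term.
    """
    r_i, r_j, r_k = (np.asarray(p, dtype=float) for p in (r_i, r_j, r_k))
    v_ij, v_ik = r_j - r_i, r_k - r_i
    r_ij, r_ik = np.linalg.norm(v_ij), np.linalg.norm(v_ik)
    r_jk = np.linalg.norm(r_k - r_j)
    cos_theta = np.dot(v_ij, v_ik) / (r_ij * r_ik)
    exponent = r_ij**2 + r_ik**2
    cutoff = cosine_cutoff(r_ij, r_cut) * cosine_cutoff(r_ik, r_cut)
    if include_rjk:
        # G4 only: r_jk enters both the Gaussian exponent and the cutoff.
        exponent += r_jk**2
        cutoff *= cosine_cutoff(r_jk, r_cut)
    prefactor = 2.0 ** (1.0 - zeta)
    angular = (1.0 + lam * cos_theta) ** zeta
    return float(prefactor * angular * np.exp(-eta * exponent) * cutoff)

params = dict(eta=0.1, zeta=1.0, r_cut=6.0)
origin = [0.0, 0.0, 0.0]

# Compact triplet with a 90 degree angle: both terms are non-zero.
g4_term = angular_term(origin, [1, 0, 0], [0, 1, 0], lam=1.0, include_rjk=True, **params)
g5_term = angular_term(origin, [1, 0, 0], [0, 1, 0], lam=1.0, include_rjk=False, **params)

# Stretched triplet: j and k are 8 apart, beyond r_cut, so only G5 survives.
g4_far = angular_term(origin, [4, 0, 0], [-4, 0, 0], lam=-1.0, include_rjk=True, **params)
g5_far = angular_term(origin, [4, 0, 0], [-4, 0, 0], lam=-1.0, include_rjk=False, **params)
```

The stretched triplet shows the practical consequence: the G4-style term vanishes whenever the two neighbours are further apart than the cutoff, while the G5-style term still contributes.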
ACSF_G2G4 — Combined radial + angular (recommended)
Concatenates G2 and G4 vectors to produce a complete two-body + three-body descriptor. This is generally the best balance of accuracy and cost.
generate_descriptors traj.xyz --descriptor ACSF_G2G4 --n-eta 60
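Concatenation here is assumed to mean appending the angular features to the radial ones column-wise, so that each atom keeps a single row. The array shapes below are hypothetical placeholders and are not the feature counts AtomicAI produces.

```python
import numpy as np

n_atoms = 4
g2_features = np.zeros((n_atoms, 8))  # hypothetical: 8 radial features per atom
g4_features = np.ones((n_atoms, 6))   # hypothetical: 6 angular features per atom

# Column-wise concatenation: one row per atom, radial columns first.
g2g4_features = np.concatenate([g2_features, g4_features], axis=1)
```

The combined matrix has as many columns as the two parts together, so any model trained on it must be given features in the same column order.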
ACSF_G2G4G5 — Full combined descriptor
Concatenates G2 + G4 + G5. Provides the richest angular description.
generate_descriptors traj.xyz --descriptor ACSF_G2G4G5
SOAP — Smooth Overlap of Atomic Positions
Rotationally invariant descriptor based on the overlap of atomic density functions, computed via DScribe.
generate_descriptors traj.xyz --descriptor SOAP
MBSF — Many-body symmetry functions
Combines a radial term (gr, G2-like) with an angular term (ga) that includes \(\zeta\), \(\theta_s\), eta, and Rs parameters.
generate_descriptors traj.xyz --descriptor MBSF
Output
Descriptor files are written to ./descriptors/ with the naming convention:
<TYPE>_<cutoff_descriptor>_<cutoff_average>_<element1>_<element2>.dat
Each row is one averaged fingerprint vector for a single atom. The cutoff values
(in Å) come from the built-in descriptor_cutoff table in
AtomicAI/data/data_lib.py.
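A small sketch of how the naming convention could be parsed when post-processing the output. The helper `parse_descriptor_filename` and the example filename are hypothetical. The sketch assumes the cutoffs are written as plain decimal numbers and that only the TYPE field may contain underscores.

```python
from pathlib import Path

def parse_descriptor_filename(path):
    """Split '<TYPE>_<cutoff_descriptor>_<cutoff_average>_<el1>_<el2>.dat'.

    TYPE may itself contain underscores (e.g. ACSF_G2), so the name is
    split from the right into exactly five fields.
    """
    stem = Path(path).stem  # drops the directory and the .dat suffix
    desc_type, cutoff_desc, cutoff_avg, element1, element2 = stem.rsplit("_", 4)
    return {
        "type": desc_type,
        "cutoff_descriptor": float(cutoff_desc),
        "cutoff_average": float(cutoff_avg),
        "elements": (element1, element2),
    }

# Hypothetical filename following the documented pattern.
info = parse_descriptor_filename("descriptors/ACSF_G2G4_6.0_3.0_Si_O.dat")
```

Splitting from the right keeps descriptor names such as ACSF_G2G4 intact.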
Running multiple types
You can compute several descriptor types in a single call; they run in parallel using Python multiprocessing:
generate_descriptors traj.xyz --descriptor ACSF_G2 ACSF_G2G4 SOAP MBSF --n-eta 50
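When the call is driven from a script, the argument list can be assembled programmatically. The sketch below uses only the flags documented above. The helper `build_command` is hypothetical, and running the result assumes `generate_descriptors` is available on the PATH.

```python
import shlex

def build_command(trajectory, descriptors, n_eta=None):
    """Assemble the generate_descriptors argument list from the documented flags."""
    cmd = ["generate_descriptors", trajectory, "--descriptor", *descriptors]
    if n_eta is not None:
        cmd += ["--n-eta", str(n_eta)]
    return cmd

cmd = build_command("traj.xyz", ["ACSF_G2", "ACSF_G2G4", "SOAP", "MBSF"], n_eta=50)
command_line = shlex.join(cmd)  # shell-quoted string, handy for logging

# To execute: import subprocess; subprocess.run(cmd, check=True)
```

Passing the list form to `subprocess.run` avoids shell quoting problems with unusual trajectory paths.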