How To Think Like A Computational Chemist: Geometry Optimization

A computational chemist has many tools at their disposal; however, effective project support is more than clicking on a mouse. It is a mindset. In this article, I’ll break down how to think like a computational chemist with respect to a foundational technique: geometry optimization. Sure, I’ll go over common techniques, but I will also show you how we think when solving problems and how to circumvent hidden pitfalls. With this information, you’ll soon be making contributions of your own, should you choose to put on the “computational chemist” hat yourself.

What is Geometry Optimization?
Tools of a Computational Chemist
Common Pitfalls in Geometry Optimization
Diastereomeric Ratio of Dihydrooxepinones
A Computational Chemist Must Balance Precision and Time

What is Geometry Optimization?

Geometry optimization is the process of using computational methods to find molecular conformations using an energy minimization process.

But how can we “minimize the energy” of a molecule? Think of pressing two identical magnetic poles together or setting a ball on a slope. What happens? Forces propagate the system to a lower energy state. Magnets push each other away. The ball rolls down the hill.

A molecular mechanics (MM) model evaluates forces on atoms and propagates the atoms to minimize these forces. Don’t worry: software does the heavy lifting. I haven’t touched a Lagrange multiplier or calculated a gradient by hand in years.

Functional form of an MM force field. Energy is modeled by internal vibrations and rotations, van der Waals forces, and atomic charges

As an example, think of a bond between two atoms as a spring. Stretching or compressing this bond from its equilibrium distance requires energy. The dependence of energy on internal coordinates, e.g. bond distance, forms a molecule’s potential energy surface (PES). In practice, this PES is multi-dimensional and is a function of all collective motions of bond vibrations, angle bends, torsional rotations, etc.

Potential energy surface of bond vibrations modeled as a spring. — Potential energy surface of a bond vibration modeled as a spring.

Alternatively, treating molecules at an electronic level with quantum mechanics (QM) is an option. Ab initio methods you may have heard of, like Density Functional Theory, follow this formalism. Brushing with broad strokes, MM is faster than QM methods but less precise. That being said, each have appropriate use cases and trade-offs.

QM treatment of energy. — Schrödinger equation

“Yeah, yeah enough theory… where’s a molecule I can get to optimizing already?” I hear you, don’t worry. You’ve been patient while I laid some fundamental groundwork, so let’s get to some practical stuff.

Tools of a Computational Chemist

There are many tools available to a computational chemist for geometry optimization. Here are just a few software packages and programs that I thought of (in alphabetical order) to get started:

GAMESS – freeware from Mark Gordon’s group at Iowa State University.
Gaussian – Commercial quantum chemistry software for electronic structure calculations developed by Gaussian, Inc.
NWChem – Open-source designed for large-scale simulations.
OpenMOLCAS – Open-source suite specializing in multiconfigurational methods and electronic structure theory.
ORCA – Freeware for DFT and ab initio calculations developed by Frank Neese’s group (MPI KoFo).
Promethium – Cloud-based software platform for managing and automating computational chemistry workflows.
Psi4 – Open-source package with Python interface.
Spartan – Commercial program with intuitive GUI for molecular modeling.

I will cover these packages in greater detail in another post; however, I recommend something simple and free like Psi4 for simple optimizations. For reference, you can download here and customize the code in this tutorial.

Psi4 has a Python API, but you don’t have to be a coding expert to benefit from its power. Helpful examples online and a solid community of users provide all the support you should need. You don’t need access to high performance computing clusters. Most laptops in 2025 are powerful enough to run calculations at production level efficiency.

I run through set up and analysis of calculations in Gaussian here, if you’re interested.

Common Pitfalls in Geometry Optimization

Now, with your method of choice in hand, let’s discuss potential issues. The energy minimization process is tricky for the beginner and the seasoned veteran alike. Pulling from my own experiences, here are a few issues and how to avoid them.

Poor Initial Structures

The geometry optimization process relies heavily on the starting structure provided: garbage in, garbage out. If the initial structure is far from an equilibrium geometry, convergence to a minimum will take longer or may result in errors. This can happen when building a molecule in 3-dimensional space (Yes, I’ve seen a lot of this kind of thing). To prevent this:

Fix overlapping atoms manually before minimization
Optimize at a lower level of theory first
Provide geometric constraints to the molecule (in certain situations)

Geometry optimization gone wrong. — Bad initial structures can lead to unrealistic results.

QM and MM Rank Conformers Differently

The relationship between MM and QM conformer rankings shows a limitation of classical force fields. MM can adequately and quickly sample conformational space, often showing strong rank correlation with higher-level QM. However, they usually fail to correctly identify the global minimum energy conformer. This discrepancy arises because MM force fields approximate molecular interactions with simple equations (refer to earlier discussion). QM methods explicitly account for electronic effects and quantum phenomena. Let’s look at rezatapopt, a reactivator of p53 Y220C in development for cancer treatment that was recently published in ACS Medicinal Chemistry Letters.

2D graph of rezatapopt — Rezatapopt
*ACS Med Chem Lett* **2025**, 16, 1, 34-39

I conducted a conformational search with OMEGA and energy calculations with MMFF94 followed by single-point energy calculations with QM at the M06-2X/6-31+G(d,p) level. The agreement between the two is decent (Kendall’s tau: 0.91; p = 1.5e-6); however, the global energy minimum was predicted incorrectly. A common protocol for obtaining input structures for QM optimization is to take the top ranked compound from a conformational search. This example is not an isolated case of the failure of such a protocol.

Agreement between QM and MM for conformational analysis. — Plot of QM vs. MM energies for a set of 12 conformers of rezatapopt.

In this day and age, computational resources are no longer the limiting factor for robust calculations of energies and other molecular properties. Still, I would use a hierarchical approach to balance computational efficiency and accuracy to produce meaningful and actionable results. My recommendation is:

Always refine MM conformer energies with QM.
Boltzmann-average ensembles rather than relying on global minimum.

There are many other aspects of geometry optimization that we could discuss (and I’m sure we will in the future); however, now we’ll focus on applying these initial principles to a real example.

Diastereomeric Ratio of Dihydrooxepinones

Authors of a 2018 Org. Lett. study investigated stereoselective syntheses of medium-sized cyclic ethers. Two diastereomeric products of one reaction are shown below, hereafter referred to as R and S. Could we use computational methods to predict the diastereomeric ratio before any experiments are carried out?

Oxepinones: 1D, 2D, and 3D. — Conversion of 1D SMILES to 3D conformations.

Taking their SMILES strings, one may use RDKit or any conformer generator to generate a set of 3D coordinates. Check out the “Working with 3D Molecules” section in the RDKit docs for an example.

Minimizing these coordinates with MMFF94, a molecular mechanics force field, I computed an energy difference, ΔE, of 1.3 kcal/mol favoring the R isomer. Then, I used an equilibrium equation derived from the Boltzmann distribution to calculate a d.r. of 91:9. Please note that I am giving all calculations the benefit of an extra significant figure. In reality, neither method used here is precise enough to report tenths of a kcal.

Next, I computed the QM single-point energy difference as 3.6 kcal/mol in favor of S. After optimizing the two structures with QM, the ΔE is 1.1 kcal/mol in favor of S. Already, we have issues in both quantitative and qualitative agreement between MM and QM.

Method	Optimization	ΔE_R-S, kcal/mol
MM	Y	-1.3
QM	N	+3.6
QM	Y	+1.1

Ensembles Describe Behavior, Not Lone Structures

You, the reader, might take these SMILES strings and get a different answer than what I have. It could be the level of theory or software package you use is different. It could also be because conformer searching algorithms tend to be non-deterministic unless a specific random seed is provided.

Reproducibility aside, it is more realistic to consider a large sample of chemical space of the molecules we investigate. Many molecules relevant to the biopharma and agrochemical spaces are large and flexible, and their potential energy surfaces are flat with many low-lying conformers. In other words, their populations and physical properties likely depend on a set of conformers rather than a single one.

Using a threshold RMSD of 0.75 Angstroms, I generated a set of seven conformers for both R and S diastereomers, and then optimized at two levels of theory: MN15/def2-TZVP and M06-2X/6-311++G(d,p). After removing duplicates, calculating partition functions, and Boltzmann-averaging, the two methods agreed on an energy difference of about 0.1-0.2 kcal/mol in favor of S. That comes out to a d.r. of about 0.8:1. These results predict that the reaction is not diastereoselective.

Regardless of whether the answer is correct or not, notice that all of our calculations predict different levels of diastereoselectivity.

How to Cope With Incorrect Predictions

In the article, the actual d.r. reported is >97:3; an energy difference of ~2 kcal/mol in favor of R. The MM result on a single conformer provided the best agreement with experiment in our case. Which scenarios could account for this?

Kinetic control dictates stereoselectivity
Insufficient sampling
Superior parametrization of force field parameters
Neglect of solvent effects

There are likely others, perhaps some system-dependent. If this scenario occurred while supporting a project, my official hypothesis would be that the diastereoselectivity occurs in the transition state. That is, the selectivity operates under kinetic control rather than the initial assumption of thermodynamic control. It may not be the answer we wanted, but we still learned something about the system and can propose further testing.

A Computational Chemist Must Balance Precision and Time

The ideal balance. It’s never perfect like this.

Not counting the conformational searching and file IO time, my QM calculations at the chosen level of theory clocked about 23 days of CPU time. Through parallelization and multi-threading, the calculations took about 18 hours of “real” time. The MM calculations were done in an instant. Molecular mechanics forcefields are parametrized to reproduce experimental values from typical organic molecules, making MM calculations a decent starting point for geometry optimizations and ranking compounds.

Why Can’t I Use Only MM and Save Time?

It’s true that you will save a lot of time going with MM calculations, and they might be adequate in some cases. Ultimately, the sophistication of QM (which still comes with its own set of assumptions) is closer to reality than the equations of MM. We want to get the right answer for the right reason.

If that doesn’t convince you, here is a taste of what you can’t do with MM:

Modeling bond-forming / bond-breaking transition state structures
Calculating excited states
Treat unparametrized atom types or oxidation states
Predicting spectra, e.g. nuclear magnetic resonance
Accurately predict hyperconjugation (difluoroethane, anomeric effect, etc.)

In practice, I and most computational chemists I know use hierarchical methods that increase in sophistication. Balancing the trade-off between accuracy and time is key, especially while supporting multiple projects under tight deadlines.

Closing Remarks

Geometry optimization is a key computational step in understanding molecular properties. By thinking carefully and applying these techniques, you can tackle complex chemical questions and contribute to groundbreaking discoveries.

Want to learn more detail about the mechanics of running calculations? Check out this post on running and analyzing geometry optimizations with Gaussian.

Know anyone who would benefit from this content or have suggestions for things you’d like me to cover? Pass it along and let’s connect!

Until Next Time,

Avery

Please consider supporting the blog!

https://buymeacoffee.com/averysader

Comments

One response to “How To Think Like A Computational Chemist: Geometry Optimization”

Getting Started With Gaussian: Practical Tips for New Users

February 1, 2025

[…] or import a molecule. The quality of this input structure is highly important, as outlined in a post on geometry optimization. The options for providing a structure […]