.. -*- encoding: utf-8 -*-

.. include:: /includes/defs.rst
.. include:: /includes/links.rst

.. meta::
   :description: Hardware requirements for QM/MM simulations with AMBER, TeraChem and NBO. GPU recommendations for TeraChem CUDA, pmemd.cuda and CPU requirements for sander QM/MM and NBO analysis.
   :keywords: TeraChem v1.96 GPU requirements, AMBER 22 pmemd.cuda GPU, sander CPU QM/MM, NBO 7 CPU, CUDA 12 cores, RTX 3090, RTX 4090, Nvidia GPU QM/MM

****************************
Hardware Requirements
****************************

==========================================
Software Versions Used in This Tutorial
==========================================

The results presented in `Sahrawat et al. (2024)`_ were produced with the following software versions. Other versions may require minor adjustments to input flags or file formats.

.. list-table::
   :header-rows: 1
   :widths: 20 25 55

   * - Software
     - Version
     - Notes
   * - AMBER_
     - **22** (SANDER 2022)
     - Classical MD with ``pmemd.cuda``; QM/MM production and SMD with ``sander``
   * - TeraChem_
     - **v1.96H-beta** (build 2023-04-07)
     - Development build; compiled against CUDA 12.1; supports SM 5.0–9.0 (Maxwell through Hopper, including Turing, Ampere and Ada)
   * - NBO_
     - **7.0** (Linux x64)
     - Binary distribution; called via TeraChem at each QM/MM SMD step

.. note::

   TeraChem v1.96H-beta is a development release from PetaChem. If you are using a stable release or a newer version, the input syntax and available keywords should remain compatible, but check the TeraChem release notes if you encounter unexpected behaviour.

==========================================
What You Need to Run This Tutorial
==========================================

This tutorial couples three distinct computational tools (AMBER_, TeraChem_, and NBO_), each with different hardware demands. The table below gives a quick overview; detailed guidance for each component follows.

.. list-table::
   :header-rows: 1
   :widths: 20 20 20 40

   * - Software
     - Step in Tutorial
     - Hardware
     - Notes
   * - ``pmemd.cuda``
     - Classical MD (Steps 1–3)
     - Nvidia GPU
     - Any modern CUDA-capable Nvidia GPU
   * - ``sander``
     - QM/MM minimisation and production MD
     - CPU (multi-core)
     - AMBER's QM/MM interface does not use the GPU
   * - TeraChem_
     - QM calculations within QM/MM
     - Nvidia GPU (CUDA)
     - CUDA core count is the key metric, not vRAM
   * - NBO_
     - NBO analysis alongside QM/MM SMD
     - CPU (multi-core)
     - Runs on CPU only; scales well with core count

TeraChem: GPU is Essential, CUDA Cores are the Key
===================================================

TeraChem_ performs all its electronic structure calculations directly on the GPU. It is written entirely in CUDA and achieves substantial speedups over CPU-based QM packages because it maps the linear-algebra operations of DFT onto the massive parallelism of a GPU's shader cores.

**What matters most: CUDA core count, not vRAM.** The QM region in a typical enzyme simulation (50–150 atoms with a 6-31G* basis) fits comfortably within 4–8 GB of GPU memory. The limiting factor is floating-point throughput, which scales roughly with the number of CUDA cores running in parallel. A high-end gaming GPU with a large core count will therefore outperform a workstation GPU that has more vRAM but fewer cores.

.. admonition:: Best price-to-performance GPUs for TeraChem

   Consumer-grade Nvidia GPUs offer an excellent price-to-performance ratio for TeraChem. Our recommended choices (in order of increasing performance):

   * **RTX 3090**: 10,496 CUDA cores, 24 GB GDDR6X. A strong performer for QM/MM and widely available second-hand.
   * **RTX 4090**: 16,384 CUDA cores, 24 GB GDDR6X. Currently one of the best single-GPU options for TeraChem.
   * **RTX 5090**: 21,760 CUDA cores, 32 GB GDDR7. The latest generation; best single-GPU performance available as of 2025. Its Blackwell architecture is newer than the SM 5.0–9.0 range listed for the build used in this tutorial, so confirm that your TeraChem build supports it before buying.

   For multi-GPU runs (``ngpus = 2`` in the TeraChem template), two RTX 4090s or two RTX 3090s are a practical and cost-effective setup. The demo version of TeraChem supports up to two GPUs and a maximum runtime of 15 minutes per job, which is sufficient for the QM/MM MD steps in this tutorial.

.. note::

   TeraChem requires an Nvidia GPU with CUDA compute capability ≥ 3.5; AMD and Intel GPUs are not supported. The build used in this tutorial targets SM 5.0–9.0 (see the version table above), so check the range supported by your own build. Always install the CUDA toolkit version that matches your TeraChem build.

AMBER ``pmemd.cuda``: GPU-Accelerated Classical MD
====================================================

The classical MD steps (minimisation, thermalisation, NPT equilibration in :doc:`4-settle_system`) use ``pmemd.cuda``, AMBER's GPU-accelerated production engine. ``pmemd.cuda`` performs essentially the entire MD calculation on the GPU, and any modern Nvidia GPU (Turing architecture or newer) will provide a substantial speedup over CPU-only ``pmemd``.

There is no strict minimum vRAM for the system sizes typical in this tutorial (~70,000 atoms with solvent), but 8 GB or more is comfortable. The same RTX 3090/4090/5090 GPUs recommended for TeraChem work equally well here.

.. important::

   ``pmemd.cuda`` and TeraChem cannot share the same GPU simultaneously, because TeraChem claims the full GPU memory for its QM calculations. In practice this is not an issue: the ``pmemd.cuda`` classical equilibration steps complete before the QM/MM production runs begin. On a multi-GPU node you can run ``pmemd.cuda`` on one GPU while TeraChem runs on another.

AMBER ``sander``: QM/MM Runs on CPU
=====================================

The QM/MM production runs and SMD simulations (:doc:`6-qmmm_production`, :doc:`8-smd_simulations`) use AMBER's ``sander`` engine, not ``pmemd.cuda``. This is because the QM/MM interface in AMBER, the file-based protocol that hands coordinates to TeraChem and reads back energies and gradients, is implemented only in ``sander``.
The MM part of the force evaluation in ``sander`` runs on the CPU. In practice, ``sander`` is not the computational bottleneck: TeraChem on the GPU handles the expensive QM step, and ``sander`` simply manages the I/O and MM calculation between QM calls. A modern multi-core workstation CPU (8–32 cores) is more than sufficient for the ``sander`` side.

NBO: CPU-Only, Scales with Core Count
=======================================

NBO_ (Natural Bond Orbital) analysis runs entirely on the CPU. At each step of the NBO-coupled QM/MM SMD simulations (:doc:`9-nbo-smd`), TeraChem writes a ``.47`` wavefunction file and calls the NBO binary, which processes the orbital information and returns the results to TeraChem before the next MD step proceeds.

NBO7 parallelises well across CPU cores; providing 8–16 dedicated CPU cores will keep the NBO step from becoming a bottleneck in the overall QM/MM SMD pipeline. Ensure the NBO executable path is correctly set in your environment before running these simulations:

.. code-block:: bash

   # Replace $your_nbo_install_dir with the path to your NBO 7 installation
   export NBOEXE="$your_nbo_install_dir/bin/nbo7.i4.exe"

Summary: a Practical Single-Workstation Setup
==============================================

The simulations in this tutorial were carried out on a workstation with two consumer-grade Nvidia GPUs. The following configuration is representative of what you need:

* **GPU**: 2 × Nvidia RTX 3090 (or RTX 4090 / 5090 for better performance)
* **CPU**: 16–32 cores (Intel or AMD) for ``sander`` MM evaluation and NBO analysis
* **RAM**: 64–128 GB system RAM (TeraChem and NBO both make heavy use of system memory)
* **Storage**: Fast NVMe SSD recommended. QM/MM SMD runs generate large numbers of small files (TeraChem scratch, NBO ``.47`` files, AMBER restart files), so I/O speed matters.

Cloud-based GPU instances (e.g., AWS p3/p4, Google Cloud A100 nodes) can also be used, but consumer-grade gaming GPUs remain the most cost-effective option for research groups running TeraChem at this scale.
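
Before starting, it can help to confirm that a workstation roughly matches the requirements described in this section. The following is a minimal sketch, not part of the tutorial's own scripts: the function names ``check_nbo_exe``, ``check_cpu_cores`` and ``list_gpus`` are hypothetical, the 8-core threshold is simply the lower bound quoted above, and the GPU listing assumes the Nvidia driver's ``nvidia-smi`` tool is on your ``PATH``.

.. code-block:: bash

   #!/usr/bin/env bash
   # Pre-flight sketch: helper names and thresholds are illustrative assumptions.

   # Succeeds if NBOEXE is set and points to an executable file.
   check_nbo_exe() {
       [ -n "${NBOEXE:-}" ] && [ -x "$NBOEXE" ]
   }

   # Succeeds if at least $1 CPU cores are online (default: 8).
   check_cpu_cores() {
       local min_cores="${1:-8}"
       local cores
       cores="$(getconf _NPROCESSORS_ONLN)"
       [ "$cores" -ge "$min_cores" ]
   }

   # Prints index, name and total memory of each Nvidia GPU, one per line.
   list_gpus() {
       if command -v nvidia-smi >/dev/null 2>&1; then
           nvidia-smi --query-gpu=index,name,memory.total --format=csv,noheader
       else
           echo "nvidia-smi not found on PATH" >&2
           return 1
       fi
   }

   check_nbo_exe || echo "WARNING: NBOEXE is unset or not executable"
   check_cpu_cores 8 || echo "WARNING: fewer than 8 CPU cores available"
   list_gpus || echo "WARNING: could not query Nvidia GPUs"

The script only prints warnings and never aborts, so it is safe to run on any machine. On a multi-GPU node, the indices printed by ``list_gpus`` are the ones you would pass to the standard ``CUDA_VISIBLE_DEVICES`` environment variable to keep ``pmemd.cuda`` and TeraChem on separate GPUs.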