# NWChem: Open Source High-Performance Computational Chemistry

NWChem aims to provide its users with computational chemistry tools that are scalable both in their ability to treat large scientific computational chemistry problems efficiently, and in their use of available parallel computing resources from high-performance parallel supercomputers to conventional workstation clusters.

NWChem software can handle

• Biomolecules, nanostructures, and solid-state
• From quantum to classical, and all combinations
• Ground and excited-states
• Gaussian basis functions or plane-waves
• Scaling from one to thousands of processors
• Properties and relativistic effects

NWChem is actively developed by a consortium of developers and maintained by the EMSL located at the Pacific Northwest National Laboratory (PNNL) in Washington State. Researchers interested in contributing to NWChem should review the Developers page. The code is distributed as open-source under the terms of the Educational Community License version 2.0 (ECL 2.0).

The NWChem development strategy is focused on providing new and essential scientific capabilities to its users in the areas of kinetics and dynamics of chemical transformations, chemistry at interfaces and in the condensed phase, and enabling innovative and integrated research at EMSL. At the same time continued development is needed to enable NWChem to effectively utilize architectures of tens of petaflops and beyond.

# Science with NWChem

NWChem is used by thousands of researchers worldwide to investigate questions about chemical processes by applying theoretical techniques to predict the structure, properties, and reactivity of chemical and biological species ranging in size from tens to millions of atoms. With NWChem, researchers can tackle molecular systems including biomolecules, nanostructures, actinide complexes, and materials. NWChem offers an extensive array of highly scalable, parallel computational chemistry methods needed to address scientific questions that are relevant to reactive chemical processes occurring in our everyday environment, such as photosynthesis, protein functions, and combustion. These methods include a multitude of highly correlated methods, density functional theory (DFT) with an extensive set of exchange-correlation functionals, time-dependent density functional theory (TDDFT), plane-wave DFT with exact exchange and Car-Parrinello, molecular dynamics with AMBER and CHARMM force fields, and combinations of them.

A list of research publications that utilized NWChem can be found here.

# Software Similar to NWChem

Quantum Espresso [1]
CPMD [2]
Gaussian [3]
CP2k [4]

# Compiling NWChem for the Cheaha cluster

The steps outlined here are adapted from the general guide for a site installation consisting of commodity hardware over MPI [1]. There are many compilation options for differing architectures, network protocols, and optimized mathematics libraries. This guide shows the steps necessary to compile NWChem 6.5 for OpenMPI [2], using OpenBLAS [3] tuned for Intel's Nehalem microarchitecture and ScaLAPACK [4] for optimized linear algebra calculations. It assumes some familiarity with Linux commands and utilities, and that each package will be downloaded, compiled, and installed in the user's home directory. After each subsection, I'll add my own comments from experience that may help with any hiccups.

At the time of writing, the latest available version of OpenBLAS is 0.2.14[5].

Create the source directory

    mkdir -p $HOME/src

Download the source code

    cd $HOME/src
    wget http://github.com/xianyi/OpenBLAS/archive/v0.2.14.tar.gz


Unpack the source code


    mv v0.2.14 v0.2.14.tar.gz
    tar xf v0.2.14.tar.gz         # this step may take a few moments
    mv OpenBLAS-0.2.14 OpenBLAS
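The first `mv` above is presumably needed because `wget` saved the archive without its extension; the guide does not say so, and that reading is an assumption. If the file already arrived as `v0.2.14.tar.gz`, the rename fails. A minimal sketch of a guard that makes the step safe to repeat follows; `ensure_tarball_name` is a hypothetical helper, not part of OpenBLAS or wget.

```shell
# Rename a downloaded archive that was saved without its extension.
# Usage (inside $HOME/src): ensure_tarball_name v0.2.14
ensure_tarball_name() {
    base="$1"
    if [ -f "$base.tar.gz" ]; then
        return 0                      # already has the expected name
    elif [ -f "$base" ]; then
        mv "$base" "$base.tar.gz"     # saved as e.g. "v0.2.14"; add the extension
    else
        echo "neither $base nor $base.tar.gz found" >&2
        return 1
    fi
}
```

Calling `ensure_tarball_name v0.2.14` before `tar xf v0.2.14.tar.gz` covers both cases.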


Edit the configuration file so that its contents match those below. Some special notes: (1) TARGET is set to PENRYN for use in the sipsey queue. As the cluster ages and new generations of hardware are added, this will probably need to be updated. (2) GCC is used as opposed to Intel's compilers. (3) NWChem uses 64-bit integers by default, so INTERFACE64 is set to 1. Most other options are fairly self-explanatory. The default compiler is a little dated at this point, so load the gcc 4.9.3 module to use gcc/gfortran 4.9.3 for some hopeful optimization!

    module load gcc/4.9.3
    cd OpenBLAS
    nano Makefile.rule

    #
    #  Beginning of user configuration
    #

    # This library's version
    VERSION = 0.2.14

    # If you set the suffix, the library name will be libopenblas_$(LIBNAMESUFFIX).a
    # and libopenblas_$(LIBNAMESUFFIX).so. Meanwhile, the soname in shared library
    # is libopenblas_$(LIBNAMESUFFIX).so.0.
    # LIBNAMESUFFIX =

    # You can specify the target architecture, otherwise it's
    # automatically detected.
    TARGET = PENRYN

    # If you want to support multiple architecture in one binary
    #DYNAMIC_ARCH = 1

    # C compiler including binary type(32bit / 64bit). Default is gcc.
    # Don't use Intel Compiler or PGI, it won't generate right codes as I expect.
    CC = gcc

    # Fortran compiler. Default is g77.
    FC = gfortran

    # Even you can specify cross compiler. Meanwhile, please set HOSTCC.

    # cross compiler for Windows
    # CC = x86_64-w64-mingw32-gcc
    # FC = x86_64-w64-mingw32-gfortran

    # cross compiler for 32bit ARM
    # CC = arm-linux-gnueabihf-gcc
    # FC = arm-linux-gnueabihf-gfortran

    # cross compiler for 64bit ARM
    # CC = aarch64-linux-gnu-gcc
    # FC = aarch64-linux-gnu-gfortran

    # If you use the cross compiler, please set this host compiler.
    # HOSTCC = gcc

    # If you need 32bit binary, define BINARY=32, otherwise define BINARY=64
    BINARY=64

    # About threaded BLAS. It will be automatically detected if you don't
    # specify it.
    # For force setting for single threaded, specify USE_THREAD = 0
    # For force setting for multi threaded, specify USE_THREAD = 1
    USE_THREAD = 0

    # If you're going to use this library with OpenMP, please comment it in.
    # USE_OPENMP = 1

    # You can define maximum number of threads. Basically it should be
    # less than actual number of cores. If you don't specify one, it's
    # automatically detected by the the script.
    # NUM_THREADS = 999

    # if you don't need to install the static library, please comment it in.
    # NO_STATIC = 1

    # if you don't need generate the shared library, please comment it in.
    # NO_SHARED = 1

    # If you don't need CBLAS interface, please comment it in.
    # NO_CBLAS = 1

    # If you only want CBLAS interface without installing Fortran compiler,
    # please comment it in.
    # ONLY_CBLAS = 1

    # If you don't need LAPACK, please comment it in.
    # If you set NO_LAPACK=1, the library automatically sets NO_LAPACKE=1.
    # NO_LAPACK = 1

    # If you don't need LAPACKE (C Interface to LAPACK), please comment it in.
    # NO_LAPACKE = 1

    # If you want to use legacy threaded Level 3 implementation.
    # USE_SIMPLE_THREADED_LEVEL3 = 1

    # If you want to drive whole 64bit region by BLAS. Not all Fortran
    # compiler supports this. It's safe to keep comment it out if you
    # are not sure(equivalent to "-i8" option).
    INTERFACE64 = 1

    # Unfortunately most of kernel won't give us high quality buffer.
    # BLAS tries to find the best region before entering main function,
    # but it will consume time. If you don't like it, you can disable one.
    NO_WARMUP = 1

    # If you want to disable CPU/Memory affinity on Linux.
    NO_AFFINITY = 1

    # if you are compiling for Linux and you have more than 16 numa nodes or more than 256 cpus
    BIGNUMA = 1

    # Don't use AVX kernel on Sandy Bridge. It is compatible with old compilers
    # and OS. However, the performance is low.
    # NO_AVX = 1

    # Don't use Haswell optimizations if binutils is too old (e.g. RHEL6)
    # NO_AVX2 = 1

    # Don't use parallel make.
    # NO_PARALLEL_MAKE = 1

    # If you would like to know minute performance report of GotoBLAS.
    # FUNCTION_PROFILE = 1

    # Support for IEEE quad precision(it's *real* REAL*16)( under testing)
    # QUAD_PRECISION = 1

    # Theads are still working for a while after finishing BLAS operation
    # to reduce thread activate/deactivate overhead. You can determine
    # time out to improve performance. This number should be from 4 to 30
    # which corresponds to (1 << n) cycles. For example, if you set to 26,
    # thread will be running for (1 << 26) cycles(about 25ms on 3.0GHz
    # system). Also you can control this mumber by THREAD_TIMEOUT
    # CCOMMON_OPT += -DTHREAD_TIMEOUT=26

    # Using special device driver for mapping physically contigous memory
    # to the user space. If bigphysarea is enabled, it will use it.
    # DEVICEDRIVER_ALLOCATION = 1

    # If you need to synchronize FP CSR between threads (for x86/x86_64 only).
    # CONSISTENT_FPCSR = 1

    # If any gemm arguement m, n or k is less or equal this threshold, gemm will be execute
    # with single thread. You can use this flag to avoid the overhead of multi-threading
    # in small matrix sizes. The default value is 4.
    # GEMM_MULTITHREAD_THRESHOLD = 4

    # If you need santy check by comparing reference BLAS. It'll be very
    # slow (Not implemented yet).
    # SANITY_CHECK = 1

    # Run testcases in utest/ . When you enable UTEST_CHECK, it would enable
    # SANITY_CHECK to compare the result with reference BLAS.
    # UTEST_CHECK = 1

    # The installation directory.
    PREFIX = $(HOME)/OpenBLAS

    # Common Optimization Flag;
    # The default -O2 is enough.
    COMMON_OPT = -O2

    # gfortran option for LAPACK
    # enable this flag only on 64bit Linux and if you need a thread safe lapack library
    # FCOMMON_OPT = -frecursive

    # Profiling flags
    COMMON_PROF = -pg

    # Build Debug version
    # DEBUG = 1

    # Improve GEMV and GER for small matrices by stack allocation.
    # For details, https://github.com/xianyi/OpenBLAS/pull/482
    #
    # MAX_STACK_ALLOC=2048

    # Add a prefix or suffix to all exported symbol names in the shared library.
    # Avoid conflicts with other BLAS libraries, especially when using
    # 64 bit integer interfaces in OpenBLAS.
    # For details, https://github.com/xianyi/OpenBLAS/pull/459
    #
    # SYMBOLPREFIX=
    # SYMBOLSUFFIX=

    #
    #  End of user configuration
    #

Compile OpenBLAS, then install it to $HOME/OpenBLAS. Compiling may take a few minutes, so go grab another cup of coffee.

    make all

When the compilation is finished, you should see output similar to the following:

    OpenBLAS build complete.

      OS               ... GNU/Linux
      Architecture     ... x86_64
      BINARY           ... 64bit

Install the libraries and header files

    make install

### Notes on Compiling OpenBLAS

1. When editing the config file 'Makefile.rule', you may have noticed that the threading option was turned off (USE_THREAD = 0). NWChem uses MPI for parallelism and spawns many distributed processes, each of which calls libopenblas directly; if we were to allow OpenBLAS to spawn its own threads on top of that, 'bad stuff' would happen.
2. Inside the $HOME/src/OpenBLAS directory there is a 'test' directory. It contains several "blat" files that can and should be used to test the installation. After a successful compile and install, these should all run and pass. Otherwise, something isn't correct and things won't work down the line.
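Later steps in this guide point at two specific files under the install prefix: `libopenblas.a` (for ScaLAPACK) and `libopenblas.so` (for NWChem). A quick existence check after `make install` can catch a wrong PREFIX early. This is only a sketch; `check_openblas_install` is a hypothetical helper and it says nothing about whether the libraries actually work.

```shell
# Return non-zero and list anything missing from an OpenBLAS install prefix.
# Usage: check_openblas_install "$HOME/OpenBLAS"
check_openblas_install() {
    prefix="$1"
    status=0
    # The two library files that later sections of this guide reference.
    for f in lib/libopenblas.a lib/libopenblas.so; do
        if [ ! -e "$prefix/$f" ]; then
            echo "missing: $prefix/$f" >&2
            status=1
        fi
    done
    return $status
}
```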

At the time of writing, the latest version of ScaLAPACK is 2.0.2[6].

    cd $HOME/src
    wget http://www.netlib.org/scalapack/scalapack-2.0.2.tgz
    tar xf scalapack-2.0.2.tgz
    mv scalapack-2.0.2 ScaLAPACK
    cd ScaLAPACK

Since ScaLAPACK leverages OpenMPI, load the module openmpi/openmpi-gnu.

    module load openmpi/openmpi-gnu

Copy the example SLmake.inc.example to SLmake.inc and edit SLmake.inc, making the following changes.

    cp SLmake.inc.example SLmake.inc
    nano SLmake.inc

    CDEFS         = -DAdd_
    FC            = mpif90
    CC            = mpicc
    NOOPT         = -O0
    FCFLAGS       = -O3
    CCFLAGS       = -O3
    FCLOADER      = $(FC)
    CCLOADER      = $(CC)
    FCLOADFLAGS   = $(FCFLAGS)
    CCLOADFLAGS   = $(CCFLAGS)
    ARCH          = ar
    ARCHFLAGS     = cr
    RANLIB        = ranlib
    SCALAPACKLIB  = libscalapack.a
    BLASLIB       = $(HOME)/OpenBLAS/lib/libopenblas.a
    LAPACKLIB     = $(HOME)/OpenBLAS/lib/libopenblas.a
    LIBS          = $(LAPACKLIB) $(BLASLIB)

Compile ScaLAPACK; this will also take a while. Grab some lunch!

    make all

While there is no "make install" target in the ScaLAPACK Makefile, we'll create a directory and lib structure in our home directory to keep things consistent, and copy the generated library file there.

    mkdir -p $HOME/ScaLAPACK/lib
    cp libscalapack.a $HOME/ScaLAPACK/lib

### Notes on Compiling ScaLAPACK

1. Before the last few versions of ScaLAPACK, BLACS (a communication layer between the BLAS and MPI parts of ScaLAPACK) had to be compiled separately and referenced in the SLmake.inc file, just like the OpenBLAS libraries. This is no longer necessary, as BLACS has been "absorbed" into ScaLAPACK and is compiled during the 'make all' step. However, the BLACS directory still contains a source folder and a testing folder. After ScaLAPACK has been compiled, try running BLACS/TESTING/xCbtest and xFbtest with mpirun to make sure that part went as expected.
2. Inside the ScaLAPACK directory there is also a TESTING directory. Try these as well.
3. If something goes wrong and you need to start over (for example, if you forgot to load the MPI module), run 'make clean' in the BLACS/SRC as well as the ScaLAPACK/SRC directories. Running 'make clean' from the top-level directory didn't seem to clear out some 'bad stuff' that was left over.

## Download and Compile NWChem

At the time of writing, the latest version of NWChem is 6.5[7]. Download the source code, and the patches that have been released since.

    cd $HOME/src
    tar xfj Nwchem-6.5.revision26243-src.2014-09-10.tar.bz2         # this will take a moment, there are many small files to decompress
    mv Nwchem-6.5.revision26243-src.2014-09-10 nwchem-6.5
    cd nwchem-6.5/src
    mkdir patches
    cd patches


#### Get and Apply Patches

Pull the patches down, decompress them, and apply them. Before doing so, check the NWChem Download page and add any patches that have been released since this document was written. Copy this list of URLs into a file called "patch_list".

    nano patch_list

    http://www.nwchem-sw.org/images/Util_md_sockets.patch.gz
    http://www.nwchem-sw.org/images/Hbar.patch.gz
    http://www.nwchem-sw.org/images/Hnd_giaxyz_noinline.patch.gz
    http://www.nwchem-sw.org/images/Parallelmpi.patch.gz
    http://www.nwchem-sw.org/images/Makefile_gcc4x.patch.gz
    http://www.nwchem-sw.org/images/Bcast_ccsd.patch.gz
    http://www.nwchem-sw.org/images/Elpa_syncs.patch.gz
    http://www.nwchem-sw.org/images/Xlmpoles_ifort15.patch.gz
    http://www.nwchem-sw.org/images/Texas_iorb.patch.gz
    http://www.nwchem-sw.org/images/Dmapp_inc.patch.gz
    http://www.nwchem-sw.org/images/Print1e.patch.gz
    http://www.nwchem-sw.org/images/Hnd_rys.patch.gz
    http://www.nwchem-sw.org/images/Cdft.patch.gz
    http://www.nwchem-sw.org/images/Vdw3_nwchem65.patch.gz
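A stray character or a pasted blank in "patch_list" will only show up later as a missing patch. One possible sanity check, sketched below, prints the file name each URL should leave behind after download and decompression, and flags any line that does not look like a gzipped patch URL. `patch_names` is a hypothetical helper, and the `.patch.gz` pattern is inferred from the list above rather than from any NWChem documentation.

```shell
# Print the file name each URL in a patch list should produce after gunzip,
# and flag any line that does not look like a gzipped patch URL.
# Usage: patch_names patch_list
patch_names() {
    status=0
    # "|| [ -n "$url" ]" also handles a final line with no trailing newline.
    while IFS= read -r url || [ -n "$url" ]; do
        [ -n "$url" ] || continue            # skip blank lines
        case "$url" in
            http*://*.patch.gz)
                name=${url##*/}              # strip everything up to the last "/"
                echo "${name%.gz}" ;;        # drop the .gz suffix
            *)
                echo "unexpected line: $url" >&2
                status=1 ;;
        esac
    done < "$1"
    return $status
}
```

Comparing this output with the contents of the patches directory after decompression shows whether any download was skipped.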


    wget -i patch_list
    gunzip *.gz
    cd $HOME/src/nwchem-6.5/src

Apply the patches by creating a very tiny script here. Copy the following into a file, make it executable, and run it.

    nano apply_patches

    #!/bin/csh
    foreach FILE (patches/*.patch)
        echo $FILE
        patch -N -p0 < $FILE
    end

    chmod +x apply_patches
    ./apply_patches

#### NWChem Environment Variables and Build Script

Many compilation options are set with environment variables, so a build script is an easy way of consolidating the process. Create a file called "build_nwchem", copy the contents below into it, make it executable, and run it. This will set all necessary variables, define the locations of the OpenMPI libraries, and the locations of the OpenBLAS / ScaLAPACK libraries that we've just finished compiling. For the curious, the environment variables set here are all outlined in the compiling guide in the reference section.

    nano build_nwchem

    #!/bin/bash
    export LARGE_FILES=TRUE
    export USE_NOFSCHECK=TRUE
    export USE_NOIO=TRUE
    export LIB_DEFINES=-DDFLT_TOT_MEM=475000000
    export NWCHEM_MODULES="all python"
    export NWCHEM_TOP=$HOME/src/nwchem-6.5
    export NWCHEM_TARGET=LINUX64

    module purge
    # Reload the compiler and MPI modules used earlier in this guide;
    # after a purge, MPI_HOME would otherwise be unset.
    module load gcc/4.9.3
    module load openmpi/openmpi-gnu

    # Language Setting
    export FC=gfortran
    export CC=gcc
    export PYTHONHOME=/opt/python
    export PYTHONVERSION=2.7
    export PYTHONLIBTYPE=a

    # MPI
    export USE_MPI=y
    export USE_MPIF=y
    export USE_MPIF4=y
    export ARMCI_NETWORK=MPI_TS
    export MPI_LOC=$MPI_HOME
    export MPI_INCLUDE="-I$MPI_LOC/include"
    export MPI_LIB="-L$MPI_LOC/lib"
    export LIBMPI="-lmpi_f90 -lmpi_f77 -lmpi -lopen-rte -lopen-pal -ldl -lnsl -lutil -lm -ldl"

    # BLAS and LAPACK
    export HAS_BLAS=y
    export BLAS_SIZE=8
    export BLASOPT=$HOME/OpenBLAS/lib/libopenblas.so
    export BLAS_LIB=$BLASOPT
    export LAPACK_SIZE=8
    export LAPACK_LIB=$BLASOPT

    # BLACS and ScaLAPACK
    export USE_SCALAPACK=y
    export SCALAPACK_SIZE=8
    export SCALAPACK=$HOME/ScaLAPACK/lib/libscalapack.a
    export SCALAPACK_LIB=$SCALAPACK

    cd $NWCHEM_TOP/src
    make > make.log

Make the script executable and run it. This will take even longer than the other two to compile, probably 30 minutes to 1 hour depending on the speed of the system. Time for elevenses? Luncheon? Afternoon tea?

    chmod +x build_nwchem
    ./build_nwchem

After the compilation is complete, create a NWChem folder in the $HOME directory. There we will place some reference data, and a .nwchemrc file to point to it when jobs are run.
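Before copying anything, it can be worth confirming that the build picked up the libraries compiled earlier. The notes at the end of this section quote the configure lines to look for in make.log; a minimal sketch of that check follows. `check_make_log` is a hypothetical helper, and the strings it matches are copied from the log excerpt in this guide, so they may differ for other NWChem versions.

```shell
# Check a build log for the three library-detection lines quoted in this guide.
# Usage: check_make_log "$HOME/src/nwchem-6.5/src/make.log"
check_make_log() {
    log="$1"
    status=0
    for lib in "BLAS" "Fortran 77 LAPACK" "SCALAPACK"; do
        # -F treats the pattern as a fixed string, so the dots are literal.
        if grep -qF "checking for $lib with user-supplied flags... yes" "$log"; then
            echo "found:     $lib"
        else
            echo "not found: $lib"
            status=1
        fi
    done
    return $status
}
```

A "not found" line is a hint to recheck the BLASOPT and SCALAPACK paths in "build_nwchem" for typos.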


    mkdir -p $HOME/NWChem/bin
    mkdir $HOME/NWChem/data
    cp $HOME/src/nwchem-6.5/bin/LINUX64/nwchem $HOME/NWChem/bin
    chmod 755 $HOME/NWChem/bin/nwchem
    cp -r $HOME/src/nwchem-6.5/src/basis/libraries $HOME/NWChem/data
    cp -r $HOME/src/nwchem-6.5/src/data $HOME/NWChem
    cp -r $HOME/src/nwchem-6.5/src/nwpw/libraryps $HOME/NWChem/data

Copy the contents below into a file called ".nwchemrc".

    cd $HOME
    nano .nwchemrc

    nwchem_basis_library $HOME/NWChem/data/libraries/
    nwchem_nwpw_library $HOME/NWChem/data/libraryps/
    ffield amber
    amber_1 $HOME/NWChem/data/amber_s/
    amber_2 $HOME/NWChem/data/amber_q/
    amber_3 $HOME/NWChem/data/amber_x/
    amber_4 $HOME/NWChem/data/amber_u/
    spce    $HOME/NWChem/data/solvents/spce.rst
    charmm_s $HOME/NWChem/data/charmm_s/
    charmm_x $HOME/NWChem/data/charmm_x/

#### Notes on Compiling and Installing NWChem

1. If we've come this far without any error on the first try, something has gone wrong.
2. Something went wrong. Check the make.log file to ensure that both the libscalapack and libopenblas files were detected correctly. If they weren't found, check the paths above for typos. cd to nwchem-6.5/src, and look in the log for the following output. If it's there, all the libs have been found correctly.

        configure:
        configure: Checks for BLAS,LAPACK,ScaLAPACK
        configure:
        configure: Attempting to locate BLAS library
        checking for BLAS with user-supplied flags... yes
        configure: Attempting to locate LAPACK library
        checking for Fortran 77 LAPACK with user-supplied flags... yes
        configure: Attempting to locate SCALAPACK library
        checking for SCALAPACK with user-supplied flags... yes
        checking whether SCALAPACK implements pdsyevr... yes
        configure:

3. The openmpi/openmpi-gnu module uses OpenMPI 1.4, which is a little more dated than I would have liked, but it was 'infiniband aware', which kept me from needing to specify the IB libs and include directory in the NWChem environment variables inside the "build_nwchem" script.
4. More recent versions of OpenMPI do not usually build libmpi_f77 and libmpi_f90. If in the future NWChem is recompiled with a newer version of OpenMPI, the appropriate library is just libmpi_mpifh. The appropriate compiler is just mpifort, as opposed to having both mpif77 and mpif90.
5. The make.log file is hard to dig through, but it contains tons of useful information if things go badly.
6. You may have noticed that the MPI directory was never written out explicitly in the "build_nwchem" script. MPI_LOC is taken from MPI_HOME, which is defined as an environment variable once we load the openmpi/openmpi-gnu module.
7. The setting DFLT_TOT_MEM=475000000 in the "build_nwchem" script represents how many 'doubles' (8 bytes each) to give each processor by default. I used ~4GB per node here, based on the information available about the gen3 Cheaha hardware. This can be overridden in the NWChem input file by using the 'memory' keyword to allocate more or less memory per core.

## Loadable module for NWChem

It is convenient to create a loadable module, so that any time we wish to run a job with NWChem, we can simply load it, run the job, and unload it, without having to specify full paths or set environment variables beforehand. To do this in the $HOME directory, we will create a 'private modules' folder, and create a module file. The contents of the file are a bit beyond the scope here, but I believe you'll be able to guess at what each part does.

Create the privatemodules folder

    mkdir -p $HOME/privatemodules/NWChem

Copy the following contents into a file named "nwchem-6.5"

    nano $HOME/privatemodules/NWChem/nwchem-6.5

    #%Module

    # This NWChem module was created by Kyle Bentley (kwbent@uab.edu) on
    # September 25 2015.  NWChem was compiled with OpenMPI 1.4, OpenBLAS
    # 0.2.14, and ScaLAPACK 2.0.2...
    # Module will load the other appropriate modules, and set paths.

    proc ModulesHelp { } {
        puts stderr "This module loads openMPI, OpenBLAS, gcc-4.9.3, and nwchem-6.5"
    }

    module-whatis   "This module loads openMPI, OpenBLAS, gcc-4.9.3, and nwchem-6.5"

    prepend-path PATH $env(HOME)/NWChem/bin
    prepend-path LD_LIBRARY_PATH $env(HOME)/OpenBLAS/lib


In order to use this newly created module, you first have to load the module 'use.own'. You should then be able to see it when running 'module avail'. It will be at the very bottom of the list.

    module load use.own
    module avail

    -------------- /home/kwbent/privatemodules -----------
    NWChem/nwchem-6.5
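Seeing the module in the list does not by itself prove that its paths are right. One possible follow-up, sketched here, is to load the module and confirm that the `nwchem` binary resolves on PATH. `on_path` is a hypothetical helper built on the standard `command -v` shell builtin.

```shell
# Succeed only if a command resolves on the current PATH, printing where.
# Usage (after "module load use.own" and "module load NWChem/nwchem-6.5"):
#   on_path nwchem
on_path() {
    if location=$(command -v "$1"); then
        echo "$1 -> $location"
    else
        echo "$1 is not on PATH" >&2
        return 1
    fi
}
```

If `on_path nwchem` fails after loading the module, the `prepend-path PATH` line in the module file is the first thing to recheck.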


# Testing the NWChem Package

NWChem includes its own set of example input files. In the $HOME/src/nwchem-6.5/examples directory there are six tests covering different types of calculations.

    dirdyvtst  md  pspw  qmd  rimp2  tcepolar

Though they should all be run and verified, copy one of the examples to the $HOME directory.

    cp $HOME/src/nwchem-6.5/examples/pspw/C6.nw $HOME

Refer to the getting started [8] guide for more information on submitting jobs. Below is an example SGE script that will run the NWChem test file, and place the results back in the home directory.

Create a file "nwchem_test" and put the following content into it.

    nano nwchem_test

    #!/bin/bash

    #$ -S /bin/bash
    #$ -N test_job
    #$ -pe rr_openmpi 8
    #$ -l h_rt=00:20:00,h_vmem=1.25G,vf=1G
    #$ -M $USER@uab.edu
    #$ -m eas
    #$ -q sipsey.q

    module purge
    # Load MPI and the private NWChem module created above
    module load openmpi/openmpi-gnu
    module load use.own
    module load NWChem/nwchem-6.5

    # Copy the NWChem input file to the scratch
    # directory, and run it from there
    INPUT=$HOME/C6
    SCRATCH=$USER_SCRATCH/NWChem_test
    mkdir -p $SCRATCH
    cp $INPUT.nw $SCRATCH
    cd $SCRATCH

    # Set a shared-memory tuning parameter for ARMCI
    export ARMCI_DEFAULT_SHMMAX=4096

    # Run all NWChem jobs with mpiexec or mpirun
    # The output file C6.out will be in the $HOME directory
    # on completion
    mpiexec -np $NSLOTS nwchem $INPUT.nw >& $INPUT.out


Submit this job to the queue with qsub, and wait for the results. Once it starts, it shouldn't be more than a few minutes with this input file.

    qsub nwchem_test
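If the memory settings ever need adjusting, recall from the build notes that DFLT_TOT_MEM is counted in 8-byte doubles, not bytes. The arithmetic is simple but easy to get wrong by a factor of eight, so here is a small sketch of the conversions. The function names are hypothetical, and "MB" here means 1,000,000 bytes, which is an assumption about units rather than something the guide specifies.

```shell
# Convert a count of 8-byte doubles (as used by DFLT_TOT_MEM) to bytes and MB.
doubles_to_bytes() { echo $(( $1 * 8 )); }
doubles_to_mb()    { echo $(( $1 * 8 / 1000000 )); }

# Inverse: how many doubles fit in a given number of megabytes.
mb_to_doubles()    { echo $(( $1 * 1000000 / 8 )); }
```

For the value used in "build_nwchem", `doubles_to_mb 475000000` prints 3800, which is the roughly 4GB figure mentioned in the notes. A different target can be converted the other way with `mb_to_doubles`, or set per job with the 'memory' keyword in the NWChem input file.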

# Final Thoughts

1. With any large software package, the setup is tedious and much can go wrong, especially when it depends on many previous steps. Make sure to run tests at each step of the way, and verify them before moving on.
2. NWChem scales very well with the number of cores used. The more the merrier!
3. Google has been, and shall always be your friend when it comes to compile errors, and NWChem input files.