NWChem: Open Source High-Performance Computational Chemistry

NWChem aims to provide its users with computational chemistry tools that are scalable both in their ability to treat large scientific computational chemistry problems efficiently, and in their use of available parallel computing resources from high-performance parallel supercomputers to conventional workstation clusters.

NWChem software can handle

  • Biomolecules, nanostructures, and solid-state
  • From quantum to classical, and all combinations
  • Ground and excited-states
  • Gaussian basis functions or plane-waves
  • Scaling from one to thousands of processors
  • Properties and relativistic effects

NWChem is actively developed by a consortium of developers and maintained by the EMSL located at the Pacific Northwest National Laboratory (PNNL) in Washington State. Researchers interested in contributing to NWChem should review the Developers page. The code is distributed as open-source under the terms of the Educational Community License version 2.0 (ECL 2.0).

The NWChem development strategy is focused on providing new and essential scientific capabilities to its users in the areas of kinetics and dynamics of chemical transformations, chemistry at interfaces and in the condensed phase, and enabling innovative and integrated research at EMSL. At the same time continued development is needed to enable NWChem to effectively utilize architectures of tens of petaflops and beyond.

Science with NWChem

NWChem is used by thousands of researchers worldwide to investigate questions about chemical processes by applying theoretical techniques to predict the structure, properties, and reactivity of chemical and biological species ranging in size from tens to millions of atoms. With NWChem, researchers can tackle molecular systems including biomolecules, nanostructures, actinide complexes, and materials. NWChem offers an extensive array of highly scalable, parallel computational chemistry methods needed to address scientific questions that are relevant to reactive chemical processes occurring in our everyday environment—photosynthesis, protein functions, and combustion, to name a few. They include a multitude of highly correlated methods, density functional theory (DFT) with an extensive set of exchange-correlation functionals, time-dependent density functional theory (TDDFT), plane-wave DFT with exact exchange and Car-Parrinello, molecular dynamics with AMBER and CHARMM force fields, and combinations of them.

A list of research publications that utilized NWChem is available on the NWChem website.

Software Similar to NWChem

Quantum Espresso
CPMD
Gaussian
CP2k

Compiling NWChem for the Cheaha cluster

The steps outlined here are adapted from the general guide for a site installation consisting of commodity hardware over MPI <ref name="COMPILE">[http://www.nwchem-sw.org/index.php/Compiling_NWChem Compiling NWChem from source]</ref>. There are many compilation options for differing architectures, network protocols, and optimized mathematics libraries. This guide will show the steps necessary to compile NWChem 6.5 for OpenMPI <ref name="OMPI">[http://www.open-mpi.org/ OpenMPI - Message Passing]</ref>, using OpenBLAS<ref name="OBLAS">[http://www.openblas.net/ OpenBLAS - Optimized BLAS Package]</ref> tuned for Intel's Nehalem microarchitecture and ScaLAPACK<ref name="SCALAPACK">[http://www.netlib.org/scalapack/ ScaLAPACK - Scalable Linear Algebra Package]</ref> for optimized linear algebra calculations. This guide assumes some familiarity with Linux commands and utilities. It also assumes that each software package will be downloaded, compiled, and installed in the user's home directory. After each subsection, I'll add my own comments from experience that may help with any hiccups.

Download and Compile OpenBLAS

At the time of writing, the latest available version of OpenBLAS is 0.2.14.<ref name="OBLAS-SRC">[http://github.com/xianyi/OpenBLAS/archive/v0.2.14.tar.gz OpenBLAS - Source Code]</ref>

Create the source directory

 mkdir -p $HOME/src 

Download the source code

cd $HOME/src
wget http://github.com/xianyi/OpenBLAS/archive/v0.2.14.tar.gz 

Unpack the source code

 
mv v0.2.14 v0.2.14.tar.gz
tar xf v0.2.14.tar.gz         # this step may take a few moments
mv OpenBLAS-0.2.14 OpenBLAS

Edit the configuration file so that its contents match those below. Some special notes here: (1) The TARGET is set to PENRYN for use in the sipsey queue. As the cluster ages and new generations are added, this will probably need to be updated. (2) GCC is used as opposed to Intel's compilers. (3) NWChem uses 64-bit integers by default, so INTERFACE64 is set to 1. Most other options are fairly self-explanatory. The default compiler is a little dated at this point. Load the gcc 4.9.3 module to use gcc/gfortran 4.9.3 for some hopeful optimization!
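
If you are unsure which TARGET fits the queue you plan to run on, a quick check (an optional aside, not part of the original recipe) is to inspect /proc/cpuinfo from a shell on one of that queue's compute nodes and map the reported model to an OpenBLAS TARGET such as PENRYN or NEHALEM:

 grep -m1 "model name" /proc/cpuinfo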

module load gcc/4.9.3
cd OpenBLAS
nano Makefile.rule
#
#  Beginning of user configuration
#

# This library's version
VERSION = 0.2.14

# If you set the suffix, the library name will be libopenblas_$(LIBNAMESUFFIX).a
# and libopenblas_$(LIBNAMESUFFIX).so. Meanwhile, the soname in shared library
# is libopenblas_$(LIBNAMESUFFIX).so.0.
# LIBNAMESUFFIX = 

# You can specify the target architecture, otherwise it's
# automatically detected.
TARGET = PENRYN

# If you want to support multiple architecture in one binary
#DYNAMIC_ARCH = 1

# C compiler including binary type(32bit / 64bit). Default is gcc.
# Don't use Intel Compiler or PGI, it won't generate right codes as I expect.
CC = gcc

# Fortran compiler. Default is g77.
FC = gfortran

# Even you can specify cross compiler. Meanwhile, please set HOSTCC.

# cross compiler for Windows
# CC = x86_64-w64-mingw32-gcc
# FC = x86_64-w64-mingw32-gfortran

# cross compiler for 32bit ARM
# CC = arm-linux-gnueabihf-gcc
# FC = arm-linux-gnueabihf-gfortran

# cross compiler for 64bit ARM
# CC = aarch64-linux-gnu-gcc
# FC = aarch64-linux-gnu-gfortran


# If you use the cross compiler, please set this host compiler.
# HOSTCC = gcc

# If you need 32bit binary, define BINARY=32, otherwise define BINARY=64
BINARY=64

# About threaded BLAS. It will be automatically detected if you don't
# specify it.
# For force setting for single threaded, specify USE_THREAD = 0
# For force setting for multi  threaded, specify USE_THREAD = 1
USE_THREAD = 0

# If you're going to use this library with OpenMP, please comment it in.
# USE_OPENMP = 1

# You can define maximum number of threads. Basically it should be
# less than actual number of cores. If you don't specify one, it's
# automatically detected by the the script.
# NUM_THREADS = 999

# if you don't need to install the static library, please comment it in.
# NO_STATIC = 1

# if you don't need generate the shared library, please comment it in.
# NO_SHARED = 1

# If you don't need CBLAS interface, please comment it in.
# NO_CBLAS = 1

# If you only want CBLAS interface without installing Fortran compiler,
# please comment it in.
# ONLY_CBLAS = 1

# If you don't need LAPACK, please comment it in.
# If you set NO_LAPACK=1, the library automatically sets NO_LAPACKE=1.
# NO_LAPACK = 1

# If you don't need LAPACKE (C Interface to LAPACK), please comment it in.
# NO_LAPACKE = 1

# If you want to use legacy threaded Level 3 implementation.
# USE_SIMPLE_THREADED_LEVEL3 = 1

# If you want to drive whole 64bit region by BLAS. Not all Fortran
# compiler supports this. It's safe to keep comment it out if you
# are not sure(equivalent to "-i8" option).
INTERFACE64 = 1

# Unfortunately most of kernel won't give us high quality buffer.
# BLAS tries to find the best region before entering main function,
# but it will consume time. If you don't like it, you can disable one.
NO_WARMUP = 1

# If you want to disable CPU/Memory affinity on Linux.
NO_AFFINITY = 1

# if you are compiling for Linux and you have more than 16 numa nodes or more than 256 cpus
BIGNUMA = 1

# Don't use AVX kernel on Sandy Bridge. It is compatible with old compilers
# and OS. However, the performance is low.
# NO_AVX = 1

# Don't use Haswell optimizations if binutils is too old (e.g. RHEL6)
# NO_AVX2 = 1

# Don't use parallel make.
# NO_PARALLEL_MAKE = 1

# If you would like to know minute performance report of GotoBLAS.
# FUNCTION_PROFILE = 1

# Support for IEEE quad precision(it's *real* REAL*16)( under testing)
# QUAD_PRECISION = 1

# Theads are still working for a while after finishing BLAS operation
# to reduce thread activate/deactivate overhead. You can determine
# time out to improve performance. This number should be from 4 to 30
# which corresponds to (1 << n) cycles. For example, if you set to 26,
# thread will be running for (1 << 26) cycles(about 25ms on 3.0GHz
# system). Also you can control this mumber by THREAD_TIMEOUT
# CCOMMON_OPT	+= -DTHREAD_TIMEOUT=26

# Using special device driver for mapping physically contigous memory
# to the user space. If bigphysarea is enabled, it will use it.
# DEVICEDRIVER_ALLOCATION = 1

# If you need to synchronize FP CSR between threads (for x86/x86_64 only).
# CONSISTENT_FPCSR = 1

# If any gemm arguement m, n or k is less or equal this threshold, gemm will be execute
# with single thread. You can use this flag to avoid the overhead of multi-threading
# in small matrix sizes. The default value is 4.
# GEMM_MULTITHREAD_THRESHOLD = 4

# If you need santy check by comparing reference BLAS. It'll be very
# slow (Not implemented yet).
# SANITY_CHECK = 1

# Run testcases in utest/ . When you enable UTEST_CHECK, it would enable
# SANITY_CHECK to compare the result with reference BLAS.
# UTEST_CHECK = 1

# The installation directory.
PREFIX = $(HOME)/OpenBLAS

# Common Optimization Flag;
# The default -O2 is enough.
COMMON_OPT = -O2

# gfortran option for LAPACK
# enable this flag only on 64bit Linux and if you need a thread safe lapack library
# FCOMMON_OPT = -frecursive

# Profiling flags
COMMON_PROF = -pg

# Build Debug version
# DEBUG = 1

# Improve GEMV and GER for small matrices by stack allocation.
# For details, https://github.com/xianyi/OpenBLAS/pull/482
#
# MAX_STACK_ALLOC=2048

# Add a prefix or suffix to all exported symbol names in the shared library.
# Avoid conflicts with other BLAS libraries, especially when using
# 64 bit integer interfaces in OpenBLAS.
# For details, https://github.com/xianyi/OpenBLAS/pull/459
#
# SYMBOLPREFIX=
# SYMBOLSUFFIX=

#
#  End of user configuration
#

Compile OpenBLAS, and install it to $HOME/OpenBLAS. Compiling may take a few minutes, so go grab another cup of coffee.

 make all 

When the compilation is finished, you should see an output similar to the one below

OpenBLAS build complete. 
OS               ... GNU/Linux             
Architecture     ... x86_64              
BINARY           ... 64bit      

Install the libraries and header files

 make install
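
As a quick sanity check (an optional step, not from the original guide), confirm that the libraries landed in the install prefix and that the shared library exports the expected BLAS symbols:

 ls $HOME/OpenBLAS/lib
 nm -D $HOME/OpenBLAS/lib/libopenblas.so | grep -i dgemm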

Notes on Compiling OpenBLAS

  1. When editing the config file 'Makefile.rule', you may have noticed an option for threading, and that it was ignored. NWChem uses MPI for its parallelism and spawns many distributed processes, each of which calls libopenblas directly; if OpenBLAS were allowed to spawn its own threads on top of that, the cores would be oversubscribed and performance (and possibly correctness) would suffer.
  2. Inside the $HOME/src/OpenBLAS directory there is a 'test' directory. It contains several "blat" test drivers that can and should be used to test the installation. In a successful compile and install, these should all run and pass; otherwise, something isn't correct and things won't work down the line. (See the sketch below.)
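
A minimal way to run that test suite, assuming the Makefile bundled in the test directory still builds and runs the blat drivers (as it does in 0.2.14):

 cd $HOME/src/OpenBLAS/test
 make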

Download and Compile ScaLAPACK

At the time of writing, the latest version of ScaLAPACK is 2.0.2.<ref name="SCALAPACK-SOURCE">[http://www.netlib.org/scalapack/scalapack-2.0.2.tgz ScaLAPACK - Source Code]</ref>

Download and unpack the source code

cd $HOME/src
wget http://www.netlib.org/scalapack/scalapack-2.0.2.tgz
tar xf scalapack-2.0.2.tgz
mv scalapack-2.0.2 ScaLAPACK
cd ScaLAPACK

Since ScaLAPACK leverages OpenMPI, load the module openmpi/openmpi-gnu.

 module load openmpi/openmpi-gnu 

Copy the example SLmake.inc.example to SLmake.inc, then edit SLmake.inc, making the following changes.

cp SLmake.inc.example SLmake.inc 
nano SLmake.inc
CDEFS         = -DAdd_
FC            = mpif90
CC            = mpicc
NOOPT         = -O0
FCFLAGS       = -O3
CCFLAGS       = -O3
FCLOADER      = $(FC)
CCLOADER      = $(CC)
FCLOADFLAGS   = $(FCFLAGS)
CCLOADFLAGS   = $(CCFLAGS)
ARCH          = ar
ARCHFLAGS     = cr
RANLIB        = ranlib
SCALAPACKLIB  = libscalapack.a
BLASLIB       = $(HOME)/OpenBLAS/lib/libopenblas.a
LAPACKLIB     = $(HOME)/OpenBLAS/lib/libopenblas.a
LIBS          = $(LAPACKLIB) $(BLASLIB)
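
Before compiling, it's worth a quick check (optional, not part of the original recipe) that the MPI compiler wrappers from the module actually resolve:

 which mpif90 mpicc
 mpif90 --version        # should report the GNU Fortran compiler behind the wrapper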

Compile ScaLAPACK; this will also take a while. Grab some lunch!

 make all 

While there is no "make install" option in the ScaLAPACK Makefile, we'll create a directory and lib structure in our home directory to keep things consistent, and copy the generated library file there.

mkdir -p $HOME/ScaLAPACK/lib
cp libscalapack.a $HOME/ScaLAPACK/lib

Notes on Compiling ScaLAPACK

  1. Until the last few versions of ScaLAPACK, BLACS, a communication layer between the BLAS and MPI parts of ScaLAPACK, had to be compiled by itself and included in the SLmake.inc file, just like the OpenBLAS libraries. This is no longer necessary, as BLACS has been "absorbed" into ScaLAPACK and is compiled during the 'make all' step. The BLACS directory still contains a source folder and a testing folder, though. After ScaLAPACK has been compiled, try running BLACS/TESTING/xCbtest and xFbtest with mpirun to make sure that part went as expected (see the sketch after this list).
  2. Inside the ScaLAPACK directory there is also a TESTING directory. Try these as well.
  3. If something doesn't go right and you need to start over (for example, if you forgot to load the MPI module), run 'make clean' in the BLACS/SRC as well as the ScaLAPACK/SRC directories. Running 'make clean' from the top-level directory didn't seem to clear out some of the stale objects that were left behind.
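
A minimal sketch of running those communication tests, assuming the drivers were built by 'make all' and at least four MPI slots are available in your session:

 cd $HOME/src/ScaLAPACK/BLACS/TESTING
 mpirun -np 4 ./xCbtest
 mpirun -np 4 ./xFbtest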

Download and Compile NWChem

At the time of writing, the latest version of NWChem is 6.5.<ref name="NWCHEM-SOURCE">[http://www.nwchem-sw.org/download.php?f=Nwchem-6.5.revision26243-src.2014-09-10.tar.bz2 NWChem - Source Code]</ref>

Download the source code, and the patches that have been released since.

 
cd $HOME/src
wget http://www.nwchem-sw.org/download.php?f=Nwchem-6.5.revision26243-src.2014-09-10.tar.bz2
tar xfj Nwchem-6.5.revision26243-src.2014-09-10.tar.bz2         # this will take a moment; there are many small files to decompress
mv Nwchem-6.5.revision26243-src.2014-09-10 nwchem-6.5
cd nwchem-6.5/src
mkdir patches
cd patches

Get and Apply Patches

Pull the patches down, decompress them, and apply them. Before doing so, check the NWChem Download page for any patches released since this document was written, and add them to the list below. Copy this list of URLs into a file called "patch_list".

 nano patch_list 
http://www.nwchem-sw.org/images/Util_md_sockets.patch.gz
http://www.nwchem-sw.org/images/Hbar.patch.gz
http://www.nwchem-sw.org/images/Tcenxtask.patch.gz
http://www.nwchem-sw.org/images/Hnd_giaxyz_noinline.patch.gz
http://www.nwchem-sw.org/images/Parallelmpi.patch.gz
http://www.nwchem-sw.org/images/Makefile_gcc4x.patch.gz
http://www.nwchem-sw.org/images/Bcast_ccsd.patch.gz
http://www.nwchem-sw.org/images/Elpa_syncs.patch.gz
http://www.nwchem-sw.org/images/Xlmpoles_ifort15.patch.gz
http://www.nwchem-sw.org/images/Ifort15_fpp_offload.patch.gz
http://www.nwchem-sw.org/images/Texas_iorb.patch.gz
http://www.nwchem-sw.org/images/Dmapp_inc.patch.gz
http://www.nwchem-sw.org/images/Print1e.patch.gz
http://www.nwchem-sw.org/images/Hnd_rys.patch.gz
http://www.nwchem-sw.org/images/Tddft_grad.patch.gz
http://www.nwchem-sw.org/images/Cdft.patch.gz
http://www.nwchem-sw.org/images/Vdw3_nwchem65.patch.gz

Download the patches and decompress them

wget -i patch_list 
gunzip *.gz
cd $HOME/src/nwchem-6.5/src
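
The patches directory should now hold one .patch file per patch_list entry (17 at the time of writing); a quick count catches a failed download early:

 ls patches/*.patch | wc -l        # expect 17, or however many entries patch_list has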

Apply the patches by creating a very tiny script here. Copy the following into a file, make it executable, and run it.

 nano apply_patches 
#!/bin/csh
# Apply every downloaded patch from the patches directory.
# -N skips patches that have already been applied.
foreach FILE (`ls patches/*.patch`)
  echo $FILE
  patch -N -p0 < $FILE
end
chmod +x apply_patches
./apply_patches
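
patch writes any hunks it cannot apply to .rej files, so an empty result from this optional check means everything applied cleanly:

 find . -name "*.rej"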

NWChem Environment Variables and Build Script

There are many compilation options that are set with environment variables; consolidating them in a single script makes the process easier to reproduce. Create a file called "build_nwchem", copy the contents below into it, make it executable, and run it. This will set all the necessary variables and define the locations of the OpenMPI libraries and of the OpenBLAS / ScaLAPACK builds we've just finished compiling. For the curious, the environment variables set here are all outlined in the compiling guide in the reference section.

 nano build_nwchem 
#!/bin/bash

export LARGE_FILES=TRUE
export USE_NOFSCHECK=TRUE
export USE_NOIO=TRUE
export LIB_DEFINES=-DDFLT_TOT_MEM=475000000
export NWCHEM_MODULES="all python"
export NWCHEM_TOP=$HOME/src/nwchem-6.5
export NWCHEM_TARGET=LINUX64

module purge
module load gcc/4.9.3
module load openmpi/openmpi-gnu

# Language Setting
export FC=gfortran
export CC=gcc
export PYTHONHOME=/opt/python
export PYTHONVERSION=2.7
export PYTHONLIBTYPE=a

# MPI
export USE_MPI=y
export USE_MPIF=y
export USE_MPIF4=y
export ARMCI_NETWORK=MPI_TS
export MPI_LOC=$MPI_HOME
export MPI_INCLUDE="-I$MPI_LOC/include"
export MPI_LIB="-L$MPI_LOC/lib"
export LIBMPI="-lmpi_f90 -lmpi_f77 -lmpi -lopen-rte -lopen-pal -ldl -lnsl -lutil -lm -ldl"

# BLAS and LAPACK
export HAS_BLAS=y
export BLAS_SIZE=8
export BLASOPT=$HOME/OpenBLAS/lib/libopenblas.so
export BLAS_LIB=$BLASOPT
export LAPACK_SIZE=8
export LAPACK_LIB=$BLASOPT

# BLACS and ScaLAPACK
export USE_SCALAPACK=y
export SCALAPACK_SIZE=8
export SCALAPACK=$HOME/ScaLAPACK/lib/libscalapack.a
export SCALAPACK_LIB=$SCALAPACK

cd $NWCHEM_TOP/src

make > make.log

Make the script executable and run it. This will take even longer than the other two builds, probably 30 minutes to an hour depending on the speed of the system. Time for elevenses? Luncheon? Afternoon tea?

 
chmod +x build_nwchem
./build_nwchem
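
Since the script redirects make's output to make.log, progress can be followed from a second shell:

 tail -f $HOME/src/nwchem-6.5/src/make.log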

After the compilation is complete, create an NWChem folder in the $HOME directory. There we will place the binary and some reference data, then create a .nwchemrc file that points to it when jobs are run.

 
mkdir -p $HOME/NWChem/bin
mkdir $HOME/NWChem/data
cp $HOME/src/nwchem-6.5/bin/LINUX64/nwchem $HOME/NWChem/bin
chmod 755 $HOME/NWChem/bin/nwchem
cp -r $HOME/src/nwchem-6.5/src/basis/libraries $HOME/NWChem/data
cp -r $HOME/src/nwchem-6.5/src/data $HOME/NWChem
cp -r $HOME/src/nwchem-6.5/src/nwpw/libraryps $HOME/NWChem/data

Copy the contents below into a file called ".nwchemrc".

cd $HOME
nano .nwchemrc
nwchem_basis_library $HOME/NWChem/data/libraries/
nwchem_nwpw_library $HOME/NWChem/data/libraryps/
ffield amber
amber_1 $HOME/NWChem/data/amber_s/
amber_2 $HOME/NWChem/data/amber_q/
amber_3 $HOME/NWChem/data/amber_x/
amber_4 $HOME/NWChem/data/amber_u/
spce    $HOME/NWChem/data/solvents/spce.rst
charmm_s $HOME/NWChem/data/charmm_s/
charmm_x $HOME/NWChem/data/charmm_x/

Notes on Compiling and Installing NWChem

  1. If we've come this far without any error on the first try, something has gone wrong.
  2. Something went wrong. Check the make.log file to ensure that both the libscalapack and libopenblas files were detected correctly. If they weren't found, check the paths above for typos. cd to nwchem-6.5/src and look in the log for the following output; if it's there, all the libs have been found correctly.
configure: 
configure: Checks for BLAS,LAPACK,ScaLAPACK
configure: 
configure: Attempting to locate BLAS library
checking for BLAS with user-supplied flags... yes
configure: Attempting to locate LAPACK library
checking for Fortran 77 LAPACK with user-supplied flags... yes
configure: Attempting to locate SCALAPACK library
checking for SCALAPACK with user-supplied flags... yes
checking whether SCALAPACK implements pdsyevr... yes
configure: 
  3. The openmpi/openmpi-gnu module provides OpenMPI 1.4, which is a little more dated than I would have liked, but it is 'infiniband aware', which kept me from needing to specify the IB libs and include directory in the NWChem environment variables inside the "build_nwchem" script.
  4. More recent versions of OpenMPI do not usually build libmpi_f77 and libmpi_f90. If in the future NWChem is recompiled with a newer version of OpenMPI, the appropriate library is just libmpi_mpifh, and the appropriate compiler is just mpifort, as opposed to having both mpif77 and mpif90.
  5. The make.log file is hard to dig through, but it contains a wealth of useful information if things go badly.
  6. You may have noticed that the MPI_LOC directory was never explicitly set in the "build_nwchem" script. It's defined (via $MPI_HOME) once we load the openmpi/openmpi-gnu module.
  7. The environment variable DFLT_TOT_MEM=475000000 in the "build_nwchem" script represents how many 'doubles' (8 bytes each) to give each processor by default; 475,000,000 doubles works out to ~3.8 GB, sized from the information available about the gen3 Cheaha hardware. This can be overridden in the NWChem input file by using the 'memory' keyword to allocate more or less memory per core (see the example below).
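
For example, a single directive near the top of an input file overrides that compiled-in default; this uses NWChem's standard 'memory' keyword (the 2 GB figure here is illustrative):

 memory total 2048 mb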

Loadable module for NWChem

It is convenient to create a loadable module, so that any time we wish to run a job with NWChem, we can simply load it, run the job, and unload it, without having to specify full paths or set environment variables beforehand. To do this, we will create a 'private modules' folder in the $HOME directory and add a module file to it. The contents of the file are a bit beyond the scope here, but I believe you'll be able to guess at what each part does.

Create the privatemodules folder

 mkdir -p $HOME/privatemodules/NWChem 

Copy the following contents into a file named "nwchem-6.5"

 nano $HOME/privatemodules/NWChem/nwchem-6.5 
#%Module

# This NWChem module was created by Kyle Bentley (kwbent@uab.edu) on
# September 25 2015.  NWChem was compiled with OpenMPI 1.4, OpenBLAS
# 0.2.14, and ScaLAPACK 2.0.2.
# Module will load the other appropriate modules, and set paths.

proc ModulesHelp { } {
   puts stderr "This module loads openMPI, OpenBLAS, gcc-4.9.3, and nwchem-6.5"
}

module-whatis   "This module loads openMPI, OpenBLAS, gcc-4.9.3, and nwchem-6.5"

module load openmpi/openmpi-gnu
module load gcc/4.9.3

prepend-path PATH $HOME/NWChem/bin
prepend-path LD_LIBRARY_PATH $HOME/OpenBLAS/lib

In order to use this newly created module, you first have to load the module 'use.own'. Then you should be able to see it when running 'module avail'; it will be at the very bottom of the list.

module load use.own
module load NWChem/nwchem-6.5
module list
 
-------------- /home/kwbent/privatemodules -----------
NWChem/nwchem-6.5

Testing the NWChem Package

NWChem includes its own set of example input files. In the $HOME/src/nwchem-6.5/examples directory there are six examples of different types of calculations.

dirdyvtst  md  pspw  qmd  rimp2  tcepolar

Though they should all be run and verified, copy just one of the examples to the $HOME directory.

 cp $HOME/src/nwchem-6.5/examples/pspw/C6.nw $HOME 

Refer to the getting started guide <ref name="STARTMEUP">[https://docs.uabgrid.uab.edu/wiki/Cheaha_GettingStarted#Submitting_Jobs Getting Started]</ref> for more information on submitting jobs. Below is an example SGE script that will run the NWChem test file and place the results back in the home directory.

Create a file "nwchem_test" and put the following content into it.

 nano nwchem_test 
#!/bin/bash

#$ -S /bin/bash
#$ -N test_job
#$ -pe rr_openmpi 8 
#$ -l h_rt=00:20:00,h_vmem=1.25G,vf=1G
#$ -M $USER@uab.edu
#$ -m eas
#$ -q sipsey.q

# Load the NWChem module
module purge
module load use.own
module load NWChem/nwchem-6.5

# Copy the NWChem input file to the scratch
# directory, and run it from there
INPUT=$HOME/C6
mkdir -p $USER_SCRATCH/NWChem_test
SCRATCH=$USER_SCRATCH/NWChem_test
cp $INPUT.nw $SCRATCH
cd $SCRATCH

# Set a tuning parameter for ARMCI; export it so the MPI processes inherit it
export ARMCI_DEFAULT_SHMMAX=4096

# Run all NWChem jobs with mpiexec or mpirun
# The output file C6.out will be in the $HOME directory
# on completion
mpiexec -np $NSLOTS nwchem $INPUT.nw >& $INPUT.out

Submit this job to the queue with qsub, and wait for the results. Once it starts, it shouldn't be more than a few minutes with this input file.

 qsub nwchem_test 
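
To keep an eye on the job (optional; the output path follows from the script above):

 qstat -u $USER          # watch the job state
 tail -f $HOME/C6.out    # follow the NWChem output once the job is running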

Final Thoughts

  1. With any large software package, the setup is tedious and much can go wrong, especially when it depends on many previous steps. Make sure to run tests at each step of the way, and verify them before moving on.
  2. NWChem scales very well with the number of cores used. The more the merrier!
  3. Google has been, and shall always be, your friend when it comes to compile errors and NWChem input files.

References

<references/>