Crawfish is a Python library for pCOHP analysis of JDFTx calculations.
Crawfish (originally called ultraSoftCrawfish) is a Python library intended primarily for bonding analysis of the output of JDFTx calculations. It exists (as the original name alludes) because the state-of-the-art COHP analysis software, LOBSTER, only supports calculations with PAW pseudopotentials. While the developers of LOBSTER have shown unavoidable pitfalls when attempting COHP analysis on non-PAW calculations, COHP analysis on calculations with other pseudopotential types is far from meaningless and still provides substantial insight. The goal of crawfish is therefore to give DFT users who do not use PAW pseudopotentials access to COHP analysis. While this library is intended for JDFTx, crawfish also supports pCOHP analysis on user-provided system data, whether generated by another DFT code or created by the user for learning purposes.
All arrays are named in the general format `name_indices`, where `name` indicates the meaning of the array and `indices` gives the array's dimensionality and the significance of each dimension. For example, in `h_uu`, `h` signifies the system Hamiltonian and `uu` signifies that the array is 2-dimensional, with both dimensions corresponding to atomic orbitals (the meaning of each index name is given below). Parts of the indices are occasionally separated by an underscore for clarity, but the underscore carries no meaning (i.e. `s_tj_uu` is equivalent to `s_tjuu`).
Spin and k-points are collapsed to a single index `t`, called a "state" (`nstates` gives the total number of states for a calculation). When un-collapsed, spin is given the index `s` (`nspin`), and steps along the first, second, and third reciprocal lattice vectors are given the indices `a`, `b`, and `c` (`nka, nkb, nkc = kfolding`). Bands are always indexed by `j` (`nbands`). Orbitals are indexed by either `u` or `v` (`nproj`), written as $\mu$ and $\nu$ in the equations.
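As a small illustration of the index convention, the sketch below collapses an array of shape `sabcj` into shape `tj` with plain numpy. The array contents are random placeholders, and the ordering used here (spin outermost, C-order flattening) is an assumption made for the example; the ordering crawfish uses internally may differ.

```python
import numpy as np

# Hypothetical sizes chosen only for illustration
nspin, nka, nkb, nkc, nbands = 2, 3, 2, 1, 4
nstates = nspin * nka * nkb * nkc

rng = np.random.default_rng(0)
e_sabcj = rng.normal(size=(nspin, nka, nkb, nkc, nbands))  # placeholder eigenvalues

# Collapse spin and k-point indices (s, a, b, c) into one state index t.
# ASSUMPTION: C-order flattening with spin as the slowest index.
e_tj = e_sabcj.reshape(nstates, nbands)

# Recover (s, a, b, c) for a given state index t under the same assumption
t = 7
s, a, b, c = np.unravel_index(t, (nspin, nka, nkb, nkc))
```

Under this assumption `e_tj[t]` and `e_sabcj[s, a, b, c]` refer to the same set of band eigenvalues.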
- `proj` is used to signify the projection vector, typically in shape `tju`. In bra-ket notation, `proj_tju[t,j,u]` = $\langle\phi_\mu|\psi_j(t)\rangle$.
- `e` ($\epsilon$) is used to signify the Kohn-Sham eigenvalues of the DFT calculation, and has either the shape `tj` or `sabcj`.
- `wk` is used to signify the weights of each k-point, and has only the shape `t`.
- `occ` ($f$) is used to signify the occupation at each state (k-point + spin) and band, and thus has either shape `tj` or `sabcj`.
- `s` is used to signify orbital overlaps, and thus has either shape `uu` ($\langle\phi_\mu|\phi_\nu\rangle$) or `tj_uu` ($\langle\phi_\mu|\psi_j(t)\rangle\langle\psi_j(t)|\phi_\nu\rangle$).
- `p` is used to signify orbital-overlap populations, and thus has either shape `uu` ($\langle\phi_\mu|\hat{\rho}|\phi_\nu\rangle$) or `tj_uu` ($f_j(t)\langle\phi_\mu|\psi_j(t)\rangle\rho_j(t)\langle\psi_j(t)|\phi_\nu\rangle$).
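To make the shapes concrete, here is a minimal numpy sketch that builds a `tj_uu`-shaped overlap tensor from a `tju`-shaped projection array, following the bra-ket expression given for `s` above. The projections are random placeholders and the equal k-point weights are an assumption; this illustrates the index bookkeeping and is not the code crawfish runs.

```python
import numpy as np

rng = np.random.default_rng(1)
nstates, nbands, nproj = 3, 5, 4  # hypothetical sizes

# Placeholder complex projections, proj_tju[t, j, u] = <phi_u | psi_j(t)>
proj_tju = rng.normal(size=(nstates, nbands, nproj)) \
    + 1j * rng.normal(size=(nstates, nbands, nproj))

# s_tj_uu[t, j, u, v] = <phi_u|psi_j(t)> <psi_j(t)|phi_v>
s_tj_uu = np.einsum("tju,tjv->tjuv", proj_tju, proj_tju.conj())

# Summing over states (with weights) and bands gives a uu-shaped overlap
wk_t = np.full(nstates, 1.0 / nstates)  # ASSUMPTION: equal weights
s_uu = np.einsum("t,tjuv->uv", wk_t, s_tj_uu)
```

The resulting `s_tj_uu` is Hermitian in its last two indices, and its `u == v` entries are the real, non-negative squared magnitudes of the projections.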
Unless otherwise indicated, all energies are in Hartrees and are not normalized to the Fermi level!!
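For plotting, it is often convenient to shift energies to the Fermi level and convert them to eV. The helper below is a hypothetical convenience function (it is not part of crawfish) and the numbers are made up; only the Hartree-to-eV conversion factor is a physical constant.

```python
import numpy as np

HARTREE_TO_EV = 27.211386245988  # CODATA 2018 value of 1 Hartree in eV


def to_ev_relative_to_fermi(e_hartree, mu_hartree):
    """Shift energies by the Fermi level, then convert Hartree -> eV."""
    return (np.asarray(e_hartree) - mu_hartree) * HARTREE_TO_EV


# Made-up eigenvalues (shape tj) and Fermi level, both in Hartree
e_tj = np.array([[-0.50, -0.20, 0.10]])
mu = -0.10
e_rel_ev = to_ev_relative_to_fermi(e_tj, mu)
```

After the shift, occupied levels sit at negative energies and unoccupied levels at positive energies.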
- `trim_excess_bands` is a bool class variable of `ElecData` under which only `nproj` bands are included in analysis. By trimming excess bands, the projection vector `proj_tju` becomes square at each state. So far this has been useful primarily as a means of normalizing the projections at each state for each band and each orbital. In principle, this also allows the projections to undergo a band Löwdin orthogonalization, but its usefulness has not been investigated. It also allows for using the dual space of the projections (enabling less ad-hoc approaches to charge conservation), but this has yet to be implemented.
- `los_orbs` is a bool class variable of `ElecData` under which orbitals are made orthogonal to one another via the Löwdin orthogonalization technique. This may seem counterintuitive in a framework centered around how orbitals interact with each other, but remember that this orthogonality ($\langle\phi_\mu|\phi_\nu\rangle=\delta_{\mu,\nu}$) does not eliminate overlap between orbitals at individual bands and states ($\langle\phi_\mu|\psi_j(t)\rangle\langle\psi_j(t)|\phi_\nu\rangle$), only over the sum of all bands and states ($\sum_{j,t}w_t\langle\phi_\mu|\psi_j(t)\rangle\langle\psi_j(t)|\phi_\nu\rangle=\delta_{\mu,\nu}$). This is an incredibly useful technique when trying to reformulate our calculation in an LCAO picture, as it ensures that for all bonding interactions (bands `j` at state `t` where $c_{\mu,j}(t)^* c_{\nu,j}(t)>0$), there are enough antibonding interactions ($c_{\mu,j}(t)^* c_{\nu,j}(t)<0$) such that the sum over all bands at that state for any orbital pair $\mu,\nu$ equals $\delta_{\mu,\nu}$. Löwdin orthogonalization is the obvious choice here, as it is simple to employ (it takes 5 lines of vectorized numpy operations here) and minimizes the deviation of each projection from its true value. JDFTx will orthogonalize the orbitals if given the argument `band-projection-params yes no` prior to evaluating and dumping the band projections. However, due to the incompleteness of the space spanned by the bands at each state, this orthogonality is lost when evaluating total overlap with the dumped projections. The same holds for the bands, due to the incompleteness of the space spanned by the orbitals.
- `p_uu_consistent` is a bool class variable of `ElecData` ensuring charge conservation when building the orbital-overlap population matrix. When `True`, it temporarily re-scales `proj` such that summing over `u` and `v` for `p_tj_uu[t,j,u,v]` equals `occ_tj[t,j]` ($\sum_{\mu,\nu}P_{\mu,\nu}(t,j)=f_j(t)$).
- `s_tj_uu_real` is a bool class variable of `ElecData` ensuring that orbital overlap is a real value. Since planewaves have a complex component, orbital/band projections `proj_tju` ($\langle\phi_\mu|\psi_j(t)\rangle$) are typically complex.
- `s_tj_uu_pos` is a bool class variable of `ElecData` ensuring that orbital overlap is a positive value. This is done by subtracting the smallest value from the entire tensor, then rescaling the entire tensor such that the sum over all indices `tjuv` matches the original sum.
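The Löwdin orthogonalization described for `los_orbs` can be sketched in a few lines of numpy. This is a minimal illustration under stated assumptions, not the implementation used inside `ElecData`: the function name `lowdin_orthogonalize` is hypothetical, the projections are random placeholders, and the state weights are taken to be equal.

```python
import numpy as np


def lowdin_orthogonalize(proj_tju, wk_t):
    """Return projections whose weighted total overlap is the identity."""
    # Total overlap S_uv = sum_{t,j} w_t <phi_u|psi_j(t)><psi_j(t)|phi_v>
    s_uu = np.einsum("t,tju,tjv->uv", wk_t, proj_tju, proj_tju.conj())
    # S^(-1/2) from the eigendecomposition of the Hermitian matrix S
    evals, evecs = np.linalg.eigh(s_uu)
    s_inv_sqrt = (evecs / np.sqrt(evals)) @ evecs.conj().T
    # Apply S^(-1/2) along the orbital index
    return np.einsum("uv,tjv->tju", s_inv_sqrt, proj_tju)


rng = np.random.default_rng(2)
nstates, nbands, nproj = 4, 6, 3  # hypothetical sizes
proj_tju = rng.normal(size=(nstates, nbands, nproj)) \
    + 1j * rng.normal(size=(nstates, nbands, nproj))
wk_t = np.full(nstates, 1.0 / nstates)  # ASSUMPTION: equal weights

proj_orth = lowdin_orthogonalize(proj_tju, wk_t)
s_orth = np.einsum("t,tju,tjv->uv", wk_t, proj_orth, proj_orth.conj())
```

After the transformation, the total overlap `s_orth` is the identity to numerical precision, while the per-band, per-state products of projections between different orbitals remain nonzero, which is the point made in the `los_orbs` description.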
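For the following equations, projections (`proj_tju[t,j,u]`) are short-handed as $T_{\mu,j}$, and eigenvalues (`e_tj[t,j]`) are notated as $\epsilon_j$. Spectra are evaluated over the energy array `erange`. By default, Gaussian smearing is employed, with the smearing width set by `sig`. If linear tetrahedron integration (`lti`) is requested, the `libtetrabz` package is used instead.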
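As a rough picture of what Gaussian smearing does, the sketch below turns a set of eigenvalues and per-state, per-band weights into a spectrum on an energy grid. The function name `gaussian_spectrum` is hypothetical, `sig` is assumed to be the standard deviation of a normalized Gaussian, and the pDOS-style weights $w_t\,|T_{\mu,j}|^2$ are an illustrative choice rather than a statement of the exact expression crawfish evaluates.

```python
import numpy as np


def gaussian_spectrum(erange, e_tj, weights_tj, sig):
    """Sum of normalized Gaussians centred at e_tj, scaled by weights_tj."""
    diff = erange[:, None, None] - e_tj[None, :, :]
    gauss = np.exp(-0.5 * (diff / sig) ** 2) / (sig * np.sqrt(2.0 * np.pi))
    return np.einsum("etj,tj->e", gauss, weights_tj)


rng = np.random.default_rng(3)
nstates, nbands, nproj = 4, 6, 3  # hypothetical sizes
e_tj = rng.uniform(-0.3, 0.3, size=(nstates, nbands))  # placeholder, Hartree
proj_tju = rng.normal(size=(nstates, nbands, nproj)) \
    + 1j * rng.normal(size=(nstates, nbands, nproj))
wk_t = np.full(nstates, 1.0 / nstates)  # ASSUMPTION: equal weights

# pDOS-style weights for orbital u = 0 (illustrative choice)
weights_tj = wk_t[:, None] * np.abs(proj_tju[:, :, 0]) ** 2

erange = np.linspace(-1.0, 1.0, 2001)
spectrum = gaussian_spectrum(erange, e_tj, weights_tj, sig=0.02)
```

Because each Gaussian is normalized, the area under `spectrum` equals the sum of the weights as long as `erange` comfortably covers all eigenvalues.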
- pDOS
  Projected density-of-states (pDOS) is primarily included in this package for sanity checks, and is evaluated as
- pCOMO
  The projected Crystal Orbital Mobile Order (pCOMO) is a novel equation for an ancient concept: Coulson's "Mobile Order." This equation is most similar to pCOOP, sharing the same sign convention (bonding = positive, antibonding = negative, nonbonding = zero) but with a noticeable lack of orbital overlap, $S_{\mu,\nu}$. This equation is an unquestionable step down from the familiar equations in the crystal orbital formalism family (pCOOP, pCOHP, and pCOBI). However, it offers an added convenience, as the orbital overlap term $S_{\mu,\nu}$ may be harder to obtain than the KS projection coefficients used to evaluate $T_{\nu,j}$. While the disadvantages of this equation are obvious, the lack of orbital overlap does not influence the identification of a peak as bonding, antibonding, or non-bonding. As orbital overlap is currently only approximated as $\left|T_{\mu,j}^*T_{\nu,j}\right|$ for JDFTx, this is the only faithfully represented bonding equation in this package.
- pCOOP
- pCOBI
- pCOHP
where
and
and
- Non-PAW JDFTx calculations
  The intended audience for `crawfish` is anyone curious about the bonding within a non-PAW pseudopotential calculation performed using JDFTx. While LOBSTER does not explicitly support JDFTx, the output of any unsupported calculation with PAW pseudopotentials can be converted by the user to mimic the output of a calculation that LOBSTER does support, circumventing the need for explicit support. If your calculation does not use PAW pseudopotentials, `crawfish` is here for you.
- General non-PAW calculations
  The techniques used by `crawfish` are made available to other DFT calculators, so long as the user is able to acquire the data required to construct an `ElecData` object. The instructions for doing so are available in the "Creating your own `ElecData`" section of this readme. This process requires providing `crawfish` with the Kohn-Sham eigenvalues and the projections of each Kohn-Sham wavefunction onto each orbital (as well as some other information that is typically much easier to obtain). If you are interested in doing so, please reach out to me (beri9208@colorado.edu) for help with any obstacles that might require fixing some less-tested parts of the code.
- Create an `ElecData` object
  `ElecData` is the class used to house all electronic data and derived tensors for a given calculation. Provided the JDFTx calculation has been run with the required settings (`band-projection-params yes no`, `dump End BandProjections` and `dump End BandEigs`), this can be done in one line as `edata = ElecData.from_calc_dir(calc_dir)`, where `ElecData` has been imported from `crawfish.core.elecdata`, and `calc_dir` is either a `str` or a `Path` giving the full path to the directory containing the calculation output data.
- Change desired settings
  If there are any parameters you wish to change that affect the computed tensors required for pCOHP analysis (i.e. `edata.los_orbs`), you can change these values in the typical fashion (`edata.los_orbs = False`), which triggers a re-evaluation of the affected tensors with this change in mind. If you have multiple settings you want to change, you can avoid repeated re-evaluations by changing each setting's private value (`edata._los_orbs = False`) and then either changing the final setting through its public value or running `edata.alloc_elec_data()`.
- Import function(s) for desired analysis
  All spectrum-generating functions for a given analysis technique `<mode>` can be imported from `crawfish.funcs.<mode>`, which contains a DOS-like spectrum-generating function `get_<mode>` and a spectrum-integrating function `get_i<mode>`.
- Generate spectra and plot
  The DOS-like generating functions and the spectrum-integrating functions both return a length-2 tuple `erange, spectrum`, corresponding to the energy-axis values and the spectrum values at those energies. If you are unfamiliar with plotting in Python, this can be done easily with `matplotlib` (installing this library ensures your Python environment has this package) by running `import matplotlib.pyplot as plt` and then `plt.plot(erange, spectrum)`. The arguments for these functions can all be checked in the docstrings of the function definitions.
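To show how an `erange, spectrum` pair relates to its integrated counterpart, the sketch below applies a cumulative trapezoidal integration to a synthetic spectrum. It is a generic numpy illustration under assumptions: the function `cumulative_integral` is hypothetical, the spectrum is a made-up Gaussian peak, and none of this is claimed to be how the `get_i<mode>` functions are implemented.

```python
import numpy as np


def cumulative_integral(erange, spectrum):
    """Running trapezoidal integral of spectrum over erange, starting at 0."""
    de = np.diff(erange)
    avg = 0.5 * (spectrum[1:] + spectrum[:-1])
    return np.concatenate([[0.0], np.cumsum(de * avg)])


# Synthetic (erange, spectrum) pair: one normalized Gaussian peak
erange = np.linspace(-1.0, 1.0, 2001)
center, sig = -0.2, 0.05  # made-up values, Hartree
spectrum = np.exp(-0.5 * ((erange - center) / sig) ** 2) / (sig * np.sqrt(2.0 * np.pi))

ispectrum = cumulative_integral(erange, spectrum)

# Integrated value up to a chosen energy, e.g. a Fermi level mu
mu = center
value_at_mu = np.interp(mu, erange, ispectrum)
```

Reading the integrated curve at the Fermi level gives a single number summarizing all contributions below it, which is a common way to use integrated bonding spectra.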
- Collect (or artificially construct) the following objects
  - A pymatgen `Structure` of your system. It is critical that the species in this structure are ordered by their atomic number.
  - The band eigenvalues in shape `tj` as a numpy array.
  - The k-point folding and the corresponding k-points. (If you are unsure but know the total number of k-points, the kfolding can be set arbitrarily and the k-points can be left as `None`.)
  - The k-point weights. (If you are unsure but know there was no symmetry reduction, i.e. every k-point on your Monkhorst-Pack grid was evaluated, then all your k-point weights are equal to nspin/nkpts. If there was symmetry reduction of your k-mesh but you know how many k-points were reduced into each of the output k-points, multiply the weights by the number of k-points each output k-point "represents".)
  - The number of projections (orbitals) gathered for each atom type and their corresponding quantum numbers. (The exact principal quantum number n matters less, as long as you know their ordering for multiple shells of a given angular momentum.)
  - The projection coefficients for each band+state on each orbital in shape `tju`. It is critical that you have the actual projections and not their absolute values (the latter are typically what is dumped, as they are all that is needed for pDOS analysis), since taking the absolute value removes all information about the phase of the orbital at that band+state (bonding vs antibonding interaction is determined solely by the matching of phases between two orbitals). The ordering of the projections must match the ordering of the atoms as given in the `Structure` (i.e. for an all-electron calculation of an Li2 structure, projections 0-3 should correspond to Li #1's 1s, Li #1's 2s, Li #2's 1s, and Li #2's 2s).
- Initialize an empty `ElecData` object with the class method `edata = ElecData.as_empty()` to circumvent initialization procedures for JDFTx calculations.
- Set `user_proj_tju` (i.e. `edata.user_proj_tju = np.random.random([10,5,4])`). This array will not be touched, and all projection manipulations will be performed on a copy of it. This will automatically define `nstates`, `nbands`, and `nproj`.
- Set the `atom_orb_labels_dict` property for the class as a dictionary mapping each element in your calculation to a list of strings representing the quantum numbers of the projections gathered for that element (i.e. for a calculation of C2H4O, set `edata.atom_orb_labels_dict = {"H": ["s"], "C": ["s", "px", "py", "pz"], "O": ["s", "px", "py", "pz"]}`). If you do not wish to perform orbital resolution in your analysis, it does not matter what you put here, as long as all the elements of each list are unique and the list length matches the number of projections for that element. The ordering of the projections must match the ordering in which they are listed in your `user_proj_tju`. If you have multiple projections for a given angular momentum, include the principal quantum number as well (i.e. `["1s", "2s"]`); they do not need to be the true principal quantum numbers.
- Set `kfolding` (i.e. `edata.kfolding = [3,3,3]`). If you are unsure but know `nspin`, set it as `[int(edata.nstates/nspin), 1, 1]`.
- Set `wk_t` (i.e. `edata.wk_t = np.ones(edata.nstates)*edata.nspin/np.prod(edata.kfolding)`). If you are unsure, just make sure they sum to `nspin`.
- Set the Fermi level as `edata.mu` (if you are going to set `occ_tj` explicitly, this step becomes optional but is still useful for plotting).
- If you have the state/band occupations, set them as `edata.occ_tj`. Otherwise, they will be calculated for you using `edata.broadening` and `edata.broadening_type`.
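Before handing arrays to an empty `ElecData`, it can help to check that they are mutually consistent. The sketch below does this with plain numpy for the C2H4O example above. The helper `check_user_inputs` is hypothetical (it is not part of crawfish), the species list stands in for the ordered sites of a pymatgen `Structure`, and the projections are random placeholders.

```python
import numpy as np


def check_user_inputs(species, atom_orb_labels_dict, user_proj_tju, wk_t, nspin):
    """Raise ValueError if the inputs disagree with one another."""
    nstates, nbands, nproj = user_proj_tju.shape
    for element, labels in atom_orb_labels_dict.items():
        if len(set(labels)) != len(labels):
            raise ValueError(f"orbital labels for {element} are not unique")
    # One block of projections per atom, in the order the atoms are listed
    expected_nproj = sum(len(atom_orb_labels_dict[el]) for el in species)
    if expected_nproj != nproj:
        raise ValueError(f"expected {expected_nproj} projections, got {nproj}")
    if len(wk_t) != nstates:
        raise ValueError("wk_t must have one weight per state")
    if nstates % nspin != 0:
        raise ValueError("nstates must be a multiple of nspin")
    return nstates, nbands, nproj


# C2H4O with species ordered by atomic number (H, then C, then O)
species = ["H"] * 4 + ["C"] * 2 + ["O"]
atom_orb_labels_dict = {
    "H": ["s"],
    "C": ["s", "px", "py", "pz"],
    "O": ["s", "px", "py", "pz"],
}

nspin, kfolding, nbands = 1, [3, 3, 3], 5  # hypothetical values
nstates = nspin * int(np.prod(kfolding))
nproj = 4 * 1 + 2 * 4 + 1 * 4

rng = np.random.default_rng(4)
user_proj_tju = rng.normal(size=(nstates, nbands, nproj)) \
    + 1j * rng.normal(size=(nstates, nbands, nproj))
# Weight expression taken from the wk_t step above
wk_t = np.ones(nstates) * nspin / np.prod(kfolding)

shape = check_user_inputs(species, atom_orb_labels_dict, user_proj_tju, wk_t, nspin)
```

If the checks pass, the same arrays can then be assigned to the `ElecData` properties in the order listed in the steps above.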
Any GitHub-URL pip installation method should work, but below are the steps I have tested and know should work.
- Clone this repo somewhere: `git clone https://github.com/benrich37/crawfish.git`
- Activate the Python environment you wish to use when performing pCOHP analysis.
  NOTE: At the moment, the JDFTx IO module that part of this library depends on only exists on an independent fork of pymatgen. At the time of writing this (10/24/24) this fork is fully up-to-date, but later on this installation may roll back your pymatgen to an older version. If you are worried about dependency conflicts, I would recommend creating a conda virtual environment with Python version 3.12 (latest as of writing this).
- Navigate to ~/crawfish/ where you cloned this repo (not ~/crawfish/src/crawfish/) and install via pip: `cd ./crawfish`, then `pip install .`