D.80 Derivation of perturbation theory

This note derives the perturbation theory results for the solution of the eigenvalue problem $(H_0+H_1)\psi = E\psi$ where $H_1$ is small. The considerations for degenerate problems use linear algebra.

First, “small” is not a valid mathematical term. There are no small numbers in mathematics, just numbers that become zero in some limit. Therefore, to mathemati­cally analyze the problem, the perturb­ation Hamiltonian will be written as

\begin{displaymath}
H_1 \equiv \varepsilon H_{\varepsilon}
\end{displaymath}

where $\varepsilon$ is some chosen number that physi­cally indicates the magnitude of the perturb­ation potential. For example, if the perturb­ation is an external electric field, $\varepsilon$ could be taken as the reference magnitude of the electric field. In perturb­ation analysis, $\varepsilon$ is assumed to be vanishingly small.

The idea is now to start with a good eigenfunction $\psi_{{\vec n},0}$ of $H_0$ (where “good” is still to be defined), and correct it so that it becomes an eigenfunction of $H = H_0+H_1$. To do so, both the desired energy eigenfunction and its energy eigenvalue are expanded in a power series in terms of $\varepsilon$:

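\begin{eqnarray*}
&&
\psi_{\vec n}= \psi_{{\vec n},0}
+ \varepsilon \psi_{{\vec n},\varepsilon}
+ \varepsilon^2 \psi_{{\vec n},\varepsilon^2}
+ \ldots \\
&&
E_{\vec n}= E_{{\vec n},0}
+ \varepsilon E_{{\vec n},\varepsilon}
+ \varepsilon^2 E_{{\vec n},\varepsilon^2}
+ \ldots
\end{eqnarray*}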

If $\varepsilon$ is a small quantity, then $\varepsilon^2$ will be much smaller still, and can probably be ignored. If not, then surely $\varepsilon^3$ will be so small that it can be ignored. A result that forgets about powers of $\varepsilon$ higher than one is called first order perturb­ation theory. A result that also includes the quadratic powers, but forgets about powers higher than two is called second order perturb­ation theory, etcetera.

Before proceeding with the practical application, a disclaimer is needed. While it is relatively easy to see that the eigenvalues expand in whole powers of $\varepsilon$ (note that they must be real whether $\varepsilon$ is positive or negative), it is much messier to show that the eigenfunctions must expand in whole powers. In fact, for degenerate energies $E_{{\vec n},0}$ they only do if you choose good states $\psi_{{\vec n},0}$. See Rellich’s lecture notes on Perturbation Theory [Gordon & Breach, 1969] for a proof. As a result, the problem with degeneracy is that the good unperturbed eigenfunction $\psi_{{\vec n},0}$ is initially unknown. That leads to lots of messiness in the procedures for degenerate eigenvalues described below.

When the above power series are substituted into the eigenvalue problem to be solved,

\begin{displaymath}
\left(H_0+\varepsilon H_\varepsilon\right)\psi_{\vec n}
= E_{\vec n}\psi_{\vec n}
\end{displaymath}

the net coefficient of every power of $\varepsilon$ must be equal in the left and right hand sides. Collecting these coefficients and rearranging them appro­priately produces:

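\begin{eqnarray*}
&\varepsilon^0:& (H_0-E_{{\vec n},0})\psi_{{\vec n},0} = 0 \\
&\varepsilon^1:& (H_0-E_{{\vec n},0})\psi_{{\vec n},\varepsilon}
= - H_\varepsilon\psi_{{\vec n},0}
+ E_{{\vec n},\varepsilon}\psi_{{\vec n},0} \\
&\varepsilon^2:& (H_0-E_{{\vec n},0})\psi_{{\vec n},\varepsilon^2}
= - H_\varepsilon\psi_{{\vec n},\varepsilon}
+ E_{{\vec n},\varepsilon}\psi_{{\vec n},\varepsilon}
+ E_{{\vec n},\varepsilon^2}\psi_{{\vec n},0} \\
&\varepsilon^3:& (H_0-E_{{\vec n},0})\psi_{{\vec n},\varepsilon^3}
= - H_\varepsilon\psi_{{\vec n},\varepsilon^2}
+ E_{{\vec n},\varepsilon}\psi_{{\vec n},\varepsilon^2}
+ E_{{\vec n},\varepsilon^2}\psi_{{\vec n},\varepsilon}
+ E_{{\vec n},\varepsilon^3}\psi_{{\vec n},0} \\
&\vdots& \cdots
\end{eqnarray*}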

These are the equations to be solved in succession to give the various terms in the expansion for the wave function $\psi_{\vec n}$ and the energy $E_{\vec n}$. The further you go down the list, the better your combined result should be.
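As a check on the bookkeeping, the substitution can be written out through second order. This is only a sketch of the intermediate step, using the same symbols as the expansions above:

\begin{eqnarray*}
&&
H_0\psi_{{\vec n},0}
+ \varepsilon \left(H_0\psi_{{\vec n},\varepsilon} + H_\varepsilon\psi_{{\vec n},0}\right)
+ \varepsilon^2 \left(H_0\psi_{{\vec n},\varepsilon^2} + H_\varepsilon\psi_{{\vec n},\varepsilon}\right)
+ \ldots \\
&& \quad
= E_{{\vec n},0}\psi_{{\vec n},0}
+ \varepsilon \left(E_{{\vec n},0}\psi_{{\vec n},\varepsilon} + E_{{\vec n},\varepsilon}\psi_{{\vec n},0}\right)
+ \varepsilon^2 \left(E_{{\vec n},0}\psi_{{\vec n},\varepsilon^2} + E_{{\vec n},\varepsilon}\psi_{{\vec n},\varepsilon} + E_{{\vec n},\varepsilon^2}\psi_{{\vec n},0}\right)
+ \ldots
\end{eqnarray*}

Equating the terms with the same power of $\varepsilon$ and moving the term with $E_{{\vec n},0}$ to the left hand side gives the listed equations.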

Note that all it takes is to solve problems of the form

\begin{displaymath}
(H_0-E_{{\vec n},0})\psi_{{\vec n},\ldots} = \ldots
\end{displaymath}

The equations for the unknown functions are in terms of the unperturbed Hamiltonian $H_0$, with some additional but in principle knowable terms.

For difficult perturbation problems like you find in engineering, the use of a small parameter $\varepsilon$ is essential to get the mathematics right. But in the simple applications in quantum mechanics, it is usually overkill. So most of the time the expansions are written without it, as in

\begin{eqnarray*}
&&
\psi_{\vec n}= \psi_{{\vec n},0} + \psi_{{\vec n},1} + \psi_{{\vec n},2} + \ldots \\
&&
E_{\vec n}= E_{{\vec n},0} + E_{{\vec n},1} + E_{{\vec n},2} + \ldots
\end{eqnarray*}

where you are assumed to just imagine that $\psi_{{\vec n},1}$ and $E_{{\vec n},1}$ are “first order small,” $\psi_{{\vec n},2}$ and $E_{{\vec n},2}$ are “second order small,” etcetera. In those terms, the successive equations to solve are:

\begin{eqnarray*}
&& (H_0-E_{{\vec n},0})\psi_{{\vec n},0} = 0
\qquad \mbox{(D.55)} \\
&& (H_0-E_{{\vec n},0})\psi_{{\vec n},1}
= - H_1\psi_{{\vec n},0}
+ E_{{\vec n},1}\psi_{{\vec n},0}
\qquad \mbox{(D.56)} \\
&& (H_0-E_{{\vec n},0})\psi_{{\vec n},2}
= - H_1\psi_{{\vec n},1}
+ E_{{\vec n},1}\psi_{{\vec n},1}
+ E_{{\vec n},2}\psi_{{\vec n},0}
\qquad \mbox{(D.57)} \\
&& (H_0-E_{{\vec n},0})\psi_{{\vec n},3}
= - H_1\psi_{{\vec n},2}
+ E_{{\vec n},1}\psi_{{\vec n},2}
+ E_{{\vec n},2}\psi_{{\vec n},1}
+ E_{{\vec n},3}\psi_{{\vec n},0}
\qquad \mbox{(D.58)} \\
&& \cdots
\end{eqnarray*}

Now consider each of these equations in turn. First, (D.55) is just the Hamiltonian eigenvalue problem for $H_0$ and is already satisfied by the chosen unperturbed solution $\psi_{{\vec n},0}$ and its eigenvalue $E_{{\vec n},0}$. However, the remaining equations are not trivial. To solve them, write their solutions in terms of the other eigen­functions $\psi_{\underline{\vec n},0}$ of the unperturbed Hamiltonian $H_0$. In particular, to solve (D.56), write

\begin{displaymath}
\psi_{{\vec n},1} =
\sum_{\underline{\vec n}\ne{\vec n}}
c_{\underline{\vec n},1} \psi_{\underline{\vec n},0}
\end{displaymath}

where the coefficients $c_{\underline{\vec n},1}$ are still to be determined. The coefficient of $\psi_{{\vec n},0}$ is zero on account of the normal­ization requirement. (And in fact, it is easiest to take the coefficient of $\psi_{{\vec n},0}$ also zero for $\psi_{{\vec n},2}$, $\psi_{{\vec n},3}$, ..., even if it means that the resulting wave function will no longer be normalized.)
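To see where the normalization argument comes from, here is a short worked step. It assumes only that $\psi_{{\vec n},0}$ is normalized and orthogonal to the other $\psi_{\underline{\vec n},0}$. If the coefficient of $\psi_{{\vec n},0}$ in $\psi_{{\vec n},1}$ is called $c_{{\vec n},1}$ (a name used only in this remark), then up to first order

\begin{displaymath}
\langle\psi_{{\vec n},0}+\psi_{{\vec n},1}\vert\psi_{{\vec n},0}+\psi_{{\vec n},1}\rangle
= 1 + c_{{\vec n},1}^* + c_{{\vec n},1} + \ldots
\end{displaymath}

So normalization to first order requires the real part of $c_{{\vec n},1}$ to vanish. The imaginary part amounts, to first order, to multiplying the eigenfunction by a phase factor. Since that phase can be chosen freely, the imaginary part can be taken zero too.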

The problem (D.56) becomes

\begin{displaymath}
\sum_{\underline{\vec n}\ne{\vec n}}
c_{\underline{\vec n},1}
(E_{\underline{\vec n},0} - E_{{\vec n},0})
\psi_{\underline{\vec n},0}
= - H_1\psi_{{\vec n},0}
+ E_{{\vec n},1}\psi_{{\vec n},0}
\end{displaymath}

where the left hand side was cleaned up using the fact that the $\psi_{\underline{\vec n},0}$ are eigen­functions of $H_0$. To get the first order energy correction $E_{{\vec n},1}$, the trick is now to take an inner product of the entire equation with $\langle\psi_{{\vec n},0}\vert$. Because of the fact that the energy eigen­functions of $H_0$ are ortho­normal, this inner product produces zero in the left hand side, and in the right hand side it produces:

\begin{displaymath}
0 = - H_{{\vec n}{\vec n},1} + E_{{\vec n},1}
\qquad
H_{{\vec n}{\vec n},1} = \langle\psi_{{\vec n},0}\vert H_1\psi_{{\vec n},0}\rangle
\end{displaymath}

And that is exactly the first order correction to the energy claimed in {A.37.1}; $E_{{\vec n},1}$ equals the Hamiltonian perturb­ation coefficient $H_{{\vec n}{\vec n},1}$. If the problem is not degenerate or $\psi_{{\vec n},0}$ is good, that is.
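As a numerical sanity check, here is a minimal Python sketch for a nondegenerate two-state system written in the basis of the eigenfunctions of $H_0$, so that $H_0$ is diagonal. The numbers and the function names are made up for illustration; they are not part of the derivation. The first order energy $E_{{\vec n},0}+E_{{\vec n},1}$ should differ from the exact eigenvalue by an amount proportional to $\varepsilon^2$, so halving $\varepsilon$ should cut the error by about a factor four.

```python
import math

def exact_energy(e1, e2, h11, h12, h22, eps):
    """Exact eigenvalue of H0 + eps*H_eps that reduces to e1 when eps = 0.

    H0 = diag(e1, e2) with e1 != e2; H_eps = [[h11, h12], [h12, h22]] is real symmetric.
    """
    a = e1 + eps * h11          # perturbed diagonal element of state 1
    d = e2 + eps * h22          # perturbed diagonal element of state 2
    b = eps * h12               # coupling between the two states
    mean = 0.5 * (a + d)
    half_gap = 0.5 * (d - a)
    root = math.sqrt(half_gap**2 + b**2)
    # Pick the branch that continues e1: below the mean if e1 < e2, above otherwise.
    return mean - math.copysign(root, half_gap)

def first_order_energy(e1, h11, eps):
    """E_{n,0} + E_{n,1}, where E_{n,1} = H_{nn,1} = eps*h11."""
    return e1 + eps * h11

def first_order_error(eps, e1=1.0, e2=3.0, h11=0.5, h12=1.0, h22=-0.2):
    """Exact energy minus first order energy for the made-up example values."""
    return exact_energy(e1, e2, h11, h12, h22, eps) - first_order_energy(e1, h11, eps)

if __name__ == "__main__":
    for eps in (0.1, 0.05, 0.025):
        print(eps, first_order_error(eps))
```

The printed errors should shrink roughly fourfold each time $\varepsilon$ is halved.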

To get the coefficients $c_{\underline{\vec n},1}$, and with them the first order correction $\psi_{{\vec n},1}$ to the wave function, just take an inner product with each of the other eigenfunctions $\langle\psi_{\underline{\vec n},0}\vert$ of $H_0$ in turn. In the left hand side, orthonormality leaves only the coefficient of the selected eigenfunction, and for the same reason the final term in the right hand side drops out. The result is

\begin{displaymath}
c_{\underline{\vec n},1} (E_{\underline{\vec n},0} - E_{{\vec n},0})
= - H_{\underline{\vec n}{\vec n},1}
\qquad
H_{\underline{\vec n}{\vec n},1}
= \langle\psi_{\underline{\vec n},0} \vert H_1\psi_{{\vec n},0}\rangle
\end{displaymath}

The coefficients $c_{\underline{\vec n},1}$ can normally be computed from this.
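The following Python sketch shows this computation for a small nondegenerate example. It assumes that everything is expressed in the basis of the unperturbed eigenfunctions, so that $H_0$ is a diagonal matrix; the three energies and the perturbation matrix are invented numbers. With the first order energy and wave function, what is left over of the eigenvalue problem should be of second order, so it should drop by a factor four when $\varepsilon$ is halved.

```python
import math

def first_order_correction(e0, h1, n):
    """First order energy and expansion coefficients for a nondegenerate state n.

    e0: unperturbed energies E_{k,0}; h1[k][m]: coefficients <psi_k,0|H_1 psi_m,0>.
    Returns (E_{n,1}, c) with c[k] = -H_{kn,1} / (E_{k,0} - E_{n,0}) and c[n] = 0.
    """
    e1 = h1[n][n]
    c = [0.0 if k == n else -h1[k][n] / (e0[k] - e0[n]) for k in range(len(e0))]
    return e1, c

def residual_norm(e0, h_eps, n, eps):
    """Norm of (H_0 + H_1 - E) psi with E and psi truncated after first order."""
    size = len(e0)
    h1 = [[eps * h_eps[i][j] for j in range(size)] for i in range(size)]
    e1, c = first_order_correction(e0, h1, n)
    psi = [1.0 if k == n else c[k] for k in range(size)]   # psi_{n,0} + psi_{n,1}
    energy = e0[n] + e1
    residual = []
    for i in range(size):
        h_psi = e0[i] * psi[i] + sum(h1[i][j] * psi[j] for j in range(size))
        residual.append(h_psi - energy * psi[i])
    return math.sqrt(sum(r * r for r in residual))

E0 = [1.0, 2.0, 4.0]                      # made-up nondegenerate energies
H_EPS = [[0.3, 0.5, 0.2],
         [0.5, -0.1, 0.4],
         [0.2, 0.4, 0.6]]                 # made-up real symmetric perturbation

if __name__ == "__main__":
    for eps in (0.1, 0.05, 0.025):
        print(eps, residual_norm(E0, H_EPS, 0, eps))
```

Note that the sketch would divide by zero if two of the energies in `E0` were equal, which is the degenerate case.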

Note however that if the problem is degenerate, there will be eigen­functions $\psi_{\underline{\vec n},0}$ that have the same energy $E_{{\vec n},0}$ as the eigen­function $\psi_{{\vec n},0}$ being corrected. For these the left hand side in the equation above is zero, and the equation cannot in general be satisfied. If so, it means that the assumption that an eigen­function $\psi_{\vec n}$ of the full Hamiltonian expands in a power series in $\varepsilon$ starting from $\psi_{{\vec n},0}$ is untrue. Eigen­function $\psi_{{\vec n},0}$ is bad. And that means that the first order energy correction derived above is simply wrong. To fix the problem, what needs to be done is to identify the submatrix of all Hamiltonian perturb­ation coefficients in which both unperturbed eigen­functions have the energy $E_{{\vec n},0}$, i.e. the submatrix

\begin{displaymath}
\mbox{all}\quad
H_{{\vec n}_i{\vec n}_j,1}
\quad\mbox{with}\quad
E_{{\vec n}_i,0}=E_{{\vec n}_j,0}=E_{{\vec n},0}
\end{displaymath}

The eigen­values of this submatrix are the correct first order energy changes. So, if all you want is the first order energy changes, you can stop here. Otherwise, you need to replace the unperturbed eigen­functions that have energy $E_{{\vec n},0}$. For each ortho­normal eigen­vector $(c_1,c_2,\ldots)$ of the submatrix, there is a corre­sponding replacement unperturbed eigen­function

\begin{displaymath}
c_1 \psi_{{\vec n}_1,0,{\rm old}} +
c_2 \psi_{{\vec n}_2,0,{\rm old}} +
\ldots
\end{displaymath}

You will need to rewrite the Hamiltonian perturbation coefficients in terms of these new eigenfunctions. (Since the replacement eigenfunctions are linear combinations of the old ones, no new integrations are needed.) You then need to reselect the eigenfunction $\psi_{{\vec n},0}$ whose energy to correct from among these replacement eigenfunctions. Choose the first order energy change (eigenvalue of the submatrix) $E_{{\vec n},1}$ that is of interest to you and then choose $\psi_{{\vec n},0}$ as the replacement eigenfunction belonging to a corresponding eigenvector. If the first order energy change $E_{{\vec n},1}$ is not degenerate, the eigenvector is unique, so $\psi_{{\vec n},0}$ is now good. If not, the good eigenfunction will be some combination of the replacement eigenfunctions that have that first order energy change, and the good combination will have to be figured out later in the analysis. In any case, the problem with the equation above for the $c_{\underline{\vec n},1}$ will be fixed, because the new submatrix will be a diagonal one: $H_{\underline{\vec n}{\vec n},1}$ will be zero when $E_{\underline{\vec n},0} = E_{{\vec n},0}$ and $\underline{\vec n} \ne {\vec n}$. The coefficients $c_{\underline{\vec n},1}$ for which $E_{\underline{\vec n},0} = E_{{\vec n},0}$ remain indeterminate at this stage. They will normally be found at a later stage in the expansion.
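For a doubly degenerate level the submatrix is only two by two, and the procedure can be sketched in a few lines of Python. The function names and numbers below are illustrative assumptions. In the example, the two old eigenfunctions have zero diagonal coefficients and an off-diagonal coefficient $v$, so the naive first order energy changes would both be zero, while the eigenvalues of the submatrix are $\pm v$.

```python
import math

def diagonalize_symmetric_2x2(h11, h12, h22):
    """Eigenvalues and orthonormal eigenvectors of [[h11, h12], [h12, h22]].

    Uses a single rotation with tan(2*theta) = 2*h12 / (h11 - h22).
    Returns ((value_a, vector_a), (value_b, vector_b)).
    """
    theta = 0.5 * math.atan2(2.0 * h12, h11 - h22)
    c, s = math.cos(theta), math.sin(theta)
    vec_a, vec_b = (c, s), (-s, c)
    val_a = h11 * c * c + 2.0 * h12 * c * s + h22 * s * s
    val_b = h11 * s * s - 2.0 * h12 * c * s + h22 * c * c
    return (val_a, vec_a), (val_b, vec_b)

def coefficient(vec_i, vec_j, h11, h12, h22):
    """Perturbation coefficient between two replacement eigenfunctions.

    Each vector holds the coefficients (c_1, c_2) of the old eigenfunctions.
    """
    (x1, x2), (y1, y2) = vec_i, vec_j
    return x1 * (h11 * y1 + h12 * y2) + x2 * (h12 * y1 + h22 * y2)

if __name__ == "__main__":
    v = 0.3                               # made-up coupling between the old states
    (ea, va), (eb, vb) = diagonalize_symmetric_2x2(0.0, v, 0.0)
    print("first order energy changes:", ea, eb)
    print("replacement eigenfunctions:", va, vb)
    print("new off-diagonal coefficient:", coefficient(va, vb, 0.0, v, 0.0))
```

The new off-diagonal coefficient comes out zero to rounding error, which is what makes the equation for the $c_{\underline{\vec n},1}$ solvable again.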

With the coefficients $c_{\underline{\vec n},1}$ as found, or not found, the sum for the first order perturb­ation $\psi_{{\vec n},1}$ in the wave function becomes

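\begin{displaymath}
\psi_{{\vec n},1} = - \hspace{-5pt} \sum_{E_{\underline{\vec n},0}\ne E_{{\vec n},0}}
\frac{H_{\underline{\vec n}{\vec n},1}}{E_{\underline{\vec n},0} - E_{{\vec n},0}}
\psi_{\underline{\vec n},0}
+ \hspace{-5pt}
\sum_{\textstyle{E_{\underline{\vec n},0}= E_{{\vec n},0}\atop\underline{\vec n}\ne{\vec n}}}
c_{\underline{\vec n},1} \psi_{\underline{\vec n},0}
\end{displaymath}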

The entire process repeats for higher order. In particular, to second order (D.57) gives, writing $\psi_{{\vec n},2}$ also in terms of the unperturbed eigen­functions,

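\begin{eqnarray*}
\sum_{\underline{\vec n}}
c_{\underline{\vec n},2}
(E_{\underline{\vec n},0} - E_{{\vec n},0})
\psi_{\underline{\vec n},0}
& = &
\sum_{E_{\underline{\vec n},0}\ne E_{{\vec n},0}}
\frac{H_{\underline{\vec n}{\vec n},1}}{E_{\underline{\vec n},0} - E_{{\vec n},0}}
(H_1 - E_{{\vec n},1})\psi_{\underline{\vec n},0} \\
&&
- \sum_{\textstyle{E_{\underline{\vec n},0}= E_{{\vec n},0}\atop\underline{\vec n}\ne{\vec n}}}
c_{\underline{\vec n},1}
(H_1 - E_{{\vec n},1})\psi_{\underline{\vec n},0}
+ E_{{\vec n},2}\psi_{{\vec n},0}
\end{eqnarray*}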

To get the second order contribution to the energy, take again an inner product with $\langle\psi_{{\vec n},0}\vert$. That produces, again using ortho­normality, (and diagonality of the submatrix discussed above if degenerate),

\begin{displaymath}
0 =
\sum_{E_{\underline{\vec n},0}\ne E_{{\vec n},0}}
\frac{H_{\underline{\vec n}{\vec n},1}H_{{\vec n}\underline{\vec n},1}}
{E_{\underline{\vec n},0} - E_{{\vec n},0}}
+ E_{{\vec n},2}
\end{displaymath}

This gives the second order change in the energy stated in {A.37.1}, if $\psi_{{\vec n},0}$ is good. Note that since $H_1$ is Hermitian, the product of the two Hamiltonian perturb­ation coefficients in the expression is just the square magnitude of either.
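A minimal Python sketch can check this numerically for a nondegenerate two-state example in the basis of the unperturbed eigenfunctions. The numbers and function names are assumptions made for illustration. Including the second order term, the error in the energy should be proportional to $\varepsilon^3$, so halving $\varepsilon$ should reduce it by roughly a factor eight.

```python
import math

def exact_energy(e1, e2, h11, h12, h22, eps):
    """Exact eigenvalue of diag(e1, e2) + eps*[[h11, h12], [h12, h22]] continuing e1."""
    a = e1 + eps * h11
    d = e2 + eps * h22
    b = eps * h12
    half_gap = 0.5 * (d - a)
    root = math.sqrt(half_gap**2 + b**2)
    # Branch that reduces to e1 at eps = 0 (valid while the levels do not cross).
    return 0.5 * (a + d) - math.copysign(root, half_gap)

def second_order_energy(e0, h1, n):
    """E_{n,0} + E_{n,1} + E_{n,2} for a nondegenerate state n.

    E_{n,2} = - sum over k != n of H_{kn,1} H_{nk,1} / (E_{k,0} - E_{n,0}).
    """
    e1 = h1[n][n]
    e2 = -sum(h1[k][n] * h1[n][k] / (e0[k] - e0[n])
              for k in range(len(e0)) if k != n)
    return e0[n] + e1 + e2

def second_order_error(eps, e1=1.0, e2=3.0, h11=0.5, h12=1.0, h22=-0.2):
    """Exact energy minus second order energy for the made-up example values."""
    h1 = [[eps * h11, eps * h12], [eps * h12, eps * h22]]
    return exact_energy(e1, e2, h11, h12, h22, eps) - second_order_energy([e1, e2], h1, 0)

if __name__ == "__main__":
    for eps in (0.1, 0.05, 0.025):
        print(eps, second_order_error(eps))
```

The sketch handles real symmetric perturbation matrices only; for a complex Hermitian $H_1$ the product in the sum would be the square magnitude of either coefficient.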

In the degenerate case, when taking an inner product with a $\langle\psi_{\underline{\vec n},0}\vert$ for which $E_{\underline{\vec n},0} = E_{{\vec n},0}$, the equation can be satisfied through the still indeterminate $c_{\underline{\vec n},1}$ provided that the corresponding diagonal coefficient $H_{\underline{\vec n}\underline{\vec n},1}$ of the diagonalized submatrix is unequal to $E_{{\vec n},1} = H_{{\vec n}{\vec n},1}$. In other words, provided that the first order energy change is not degenerate. If that is untrue, the higher order submatrix

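\begin{displaymath}
\mbox{all }
\sum_{E_{\underline{\vec n},0}\ne E_{{\vec n},0}}
- \frac{H_{{\vec n}_i\underline{\vec n},1}H_{\underline{\vec n}{\vec n}_j,1}}
{E_{\underline{\vec n},0} - E_{{\vec n},0}}
\quad\mbox{with}\quad
E_{{\vec n}_i,0}=E_{{\vec n}_j,0}=E_{{\vec n},0}
\quad
E_{{\vec n}_i,1}=E_{{\vec n}_j,1}=E_{{\vec n},1}
\end{displaymath}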

will need to be diagonalized (the rest of the equation needs to be zero). Its eigenvalues give the correct second order energy changes. To proceed to still higher order, reselect the eigenfunctions following the same general lines as before. Obviously, in the degenerate case the entire process can become very messy. And you may never become sure about the good eigenfunction.

This problem can often be eliminated or greatly reduced if the eigen­functions of $H_0$ are also eigen­functions of another operator $A$, and $H_1$ commutes with $A$. Then you can arrange the eigen­functions $\psi_{\underline{\vec n},0}$ into sets that have the same value for the “good” quantum number $a$ of $A$. You can analyze the perturbed eigen­functions in each of these sets while completely ignoring the existence of eigen­functions with different values for quantum number $a$.

To see why, consider two example eigen­functions $\psi_1$ and $\psi_2$ of $A$ that have different eigen­values $a_1$ and $a_2$. Since $H_0$ and $H_1$ both commute with $A$, their sum $H$ does, so

\begin{displaymath}
0 = \langle\psi_2\vert(H A - A H)\psi_1\rangle
= \langle\psi_2\vert H A\psi_1\rangle
- \langle A\psi_2\vert H\psi_1\rangle
= (a_1-a_2)\langle\psi_2\vert H\vert\psi_1\rangle
\end{displaymath}

and since $a_1-a_2$ is not zero, $\langle\psi_2\vert H\vert\psi_1\rangle$ must be. Now $\langle\psi_2\vert H\vert\psi_1\rangle$ is the amount of eigen­function $\psi_2$ produced by applying $H$ on $\psi_1$. It follows that applying $H$ on an eigen­function with an eigenvalue $a_1$ does not produce any eigen­functions with different eigen­values $a$. Thus an eigen­function of $H$ satisfying

\begin{displaymath}
H \left(\sum_{a=a_1}c_{\vec n}\psi_{{\vec n},0}
+ \sum_{a\ne a_1}c_{\vec n}\psi_{{\vec n},0}\right)
= E_{\vec n} \left(\sum_{a=a_1}c_{\vec n}\psi_{{\vec n},0}
+ \sum_{a\ne a_1}c_{\vec n}\psi_{{\vec n},0}\right)
\end{displaymath}

can be replaced by just $\sum_{a=a_1}c_{\vec n}\psi_{{\vec n},0}$, since this by itself must satisfy the eigenvalue problem: the Hamiltonian of the second sum does not produce any amount of eigenfunctions in the first sum and vice-versa. (There must always be at least one value of $a_1$ for which the first sum at $\varepsilon = 0$ is independent of the other eigenfunctions of $H$.) Reduce every eigenfunction of $H$ to an eigenfunction of $A$ in this way. Now the existence of eigenfunctions with different values of $a$ than the one being analyzed can be ignored since the Hamiltonian does not produce them. In terms of linear algebra, the Hamiltonian has been reduced to block diagonal form, with each block corresponding to a set of eigenfunctions with a single value of $a$. If the Hamiltonian also commutes with another operator $B$ that the $\psi_{{\vec n},0}$ are eigenfunctions of, the argument repeats for the subsets with a single value for $b$.

The Hamiltonian perturb­ation coefficient $\langle\psi_2\vert H_1\vert\psi_1\rangle$ is zero whenever two good quantum numbers $a_1$ and $a_2$ are unequal. The reason is the same as for $\langle\psi_2\vert H\vert\psi_1\rangle$ above. Only perturb­ation coefficients for which all good quantum numbers are the same can be non­zero.
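The selection rule can be illustrated with a small Python sketch. It assumes a basis of three unperturbed eigenfunctions in which $A$ is diagonal; the values of $a$ and the two perturbation matrices are invented, and the function names are hypothetical. In such a basis the matrix of $HA-AH$ has elements $(a_j-a_i)H_{ij}$, so it can only vanish if every coefficient between states with different $a$ is zero.

```python
def commutator_with_diagonal(h, a):
    """Matrix elements of H A - A H when A is diagonal with entries a[i].

    (H A)_ij = h[i][j] * a[j] and (A H)_ij = a[i] * h[i][j].
    """
    n = len(a)
    return [[(a[j] - a[i]) * h[i][j] for j in range(n)] for i in range(n)]

def is_zero(matrix, tol=1e-12):
    """True if every element of the matrix is zero within the tolerance."""
    return all(abs(x) <= tol for row in matrix for x in row)

def group_by_quantum_number(a):
    """Indices of the states, collected into sets with the same value of a."""
    groups = {}
    for index, value in enumerate(a):
        groups.setdefault(value, []).append(index)
    return groups

A_VALUES = [1, -1, 1]                     # made-up good quantum numbers
H1_COMMUTING = [[0.2, 0.0, 0.5],
                [0.0, 0.1, 0.0],
                [0.5, 0.0, -0.3]]         # couples only states with equal a
H1_NOT_COMMUTING = [[0.2, 0.4, 0.5],
                    [0.4, 0.1, 0.0],
                    [0.5, 0.0, -0.3]]     # couples states 0 and 1 with different a

if __name__ == "__main__":
    print(group_by_quantum_number(A_VALUES))
    print(is_zero(commutator_with_diagonal(H1_COMMUTING, A_VALUES)))
    print(is_zero(commutator_with_diagonal(H1_NOT_COMMUTING, A_VALUES)))
```

For the commuting matrix, the states with $a$ equal to 1 and the state with $a$ equal to $-1$ form separate blocks that can be analyzed independently.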