You don't need a lot of math to appreciate how X-ray crystallography works, but knowing a bit about three-dimensional vectors and about complex numbers helps in understanding some of the fundamentals and many of the details.
If we place a three-dimensional cartesian coordinate system into space, we can describe every point in space by its x, y, and z coordinates. That's what is used in the Protein Data Bank to describe the position of each atom in a biomolecular structure. Here is an excerpt of the structure 1HHP; the coordinates are the three numbers that follow the residue number on each line:
ATOM 1 N PRO 1 52.574 58.851 -7.646 1.00 34.60 1HHP 44
ATOM 2 CA PRO 1 51.842 59.784 -6.815 1.00 34.88 1HHP 45
ATOM 3 C PRO 1 52.146 59.438 -5.356 1.00 36.01 1HHP 46
ATOM 4 O PRO 1 53.031 58.612 -5.150 1.00 35.05 1HHP 47
ATOM 5 CB PRO 1 50.391 59.581 -7.189 1.00 34.07 1HHP 48
ATOM 6 CG PRO 1 50.353 58.166 -7.724 1.00 32.00 1HHP 49
ATOM 7 CD PRO 1 51.621 58.242 -8.540 1.00 31.72 1HHP 50
ATOM 8 N GLN 2 51.488 60.047 -4.359 1.00 35.47 1HHP 51
Let's take the coordinates of atom 1 (the nitrogen of the first amino acid, which happens to be a proline) and call that point in space P1; similarly, P2 is the point in space occupied by the C-alpha atom of that same residue.
A vector has a length and a direction (imagine a straight line with an arrow tip at its end, which is how vectors are usually represented in figures). One way of describing a vector is by placing its starting point at the origin of the coordinate system and its tip at some point in space given by the coordinates of the vector. For example, the vector a1 from the origin to point P1 is the following:
a1 = (52.574, 58.851, -7.646)
To distinguish vectors from scalars (simple numbers), you either write them in bold or put an arrow above them (that's easy on the blackboard, but often cumbersome on the computer). Here is the vector a2 between the origin and P2:
a2 = (51.842, 59.784, -6.815)
While we're not really interested in the distance and direction between the origin and some atom, we do care about the distance between atoms. We can compute the bond vector dNCa between the nitrogen and the C-alpha atom (between a1 and a2) by subtracting a1 from a2:
dNCa = a2 – a1
If we know where the nitrogen atom is, and we know the vector dNCa between the nitrogen and the C-alpha atom (i.e. the bond distance and in which direction the bond lies), we can calculate the position of the C-alpha atom:
a2 = a1 + dNCa
How do we get the value of dNCa (what are the rules of addition and subtraction for vectors)? You just add or subtract the three components separately. So for calculating the bond vector, we have:
dNCa = a2 – a1
= (51.842, 59.784, -6.815) – (52.574, 58.851, -7.646)
= (51.842 – 52.574, 59.784 – 58.851, -6.815 – (-7.646))
= (-0.732, 0.933, 0.831)
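If you would rather let a computer do the subtraction, here is a minimal Python sketch. The helper name `subtract` is made up for illustration; the coordinates are those of atoms 1 and 2 in the 1HHP excerpt.

```python
# Coordinates (in Angstroms) of the N and C-alpha atoms of PRO 1,
# copied from the 1HHP excerpt.
a1 = (52.574, 58.851, -7.646)  # nitrogen, point P1
a2 = (51.842, 59.784, -6.815)  # C-alpha, point P2

def subtract(v, w):
    """Return v - w, computed component by component (illustrative helper)."""
    return tuple(vi - wi for vi, wi in zip(v, w))

d_NCa = subtract(a2, a1)  # bond vector pointing from N to C-alpha
print([round(c, 3) for c in d_NCa])  # [-0.732, 0.933, 0.831]
```

Adding `d_NCa` back onto `a1`, again component by component, recovers `a2`.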
The following figure illustrates the relationship between a1, a2 and dNCa (and some other vectors mentioned below). It is difficult to draw 3-dimensional vectors on a 2-dimensional screen, so you have to use some imagination. Also, no attempt was made to draw exactly the vectors given numerically in the example.
How long is the bond dNCa between the nitrogen and the C-alpha carbon? The recipe to get the length of a vector from its cartesian coordinates x, y, z is to square all the coordinates, add those up and take the square root. (This is related to Pythagoras' theorem that in a right-angled triangle, the square of the hypotenuse is equal to the sum of the squares of the two other sides of the triangle. In a two-dimensional cartesian coordinate system, the distance is the hypotenuse and the sides are along the perpendicular coordinate axes. Thus in two-dimensional space, the theorem directly gives a recipe for calculating the length of vectors. For more dimensions, the recipe is a generalization of the simple case.) The length of a vector a, also called its magnitude, is symbolized by |a|. Let's calculate the bond length (all lengths in the PDB are in Angstroms):
dNCa = (-0.732, 0.933, 0.831)
|dNCa| = sqrt (0.732 x 0.732 + 0.933 x 0.933 + 0.831 x 0.831) = sqrt (2.097) = 1.45 Angstroms
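The length recipe is short enough to write as a Python function. This is only a sketch; the name `length` is an assumption, and the bond vector is recomputed from the N and C-alpha coordinates in the 1HHP excerpt.

```python
import math

a1 = (52.574, 58.851, -7.646)  # N of PRO 1, in Angstroms
a2 = (51.842, 59.784, -6.815)  # C-alpha of PRO 1

def length(v):
    """Magnitude of a vector: square the components, add them up, take the root."""
    return math.sqrt(sum(c * c for c in v))

d_NCa = tuple(q - p for p, q in zip(a1, a2))  # a2 - a1
bond_length = length(d_NCa)
print(round(bond_length, 2))  # 1.45
```

On Python 3.8 or later, `math.dist(a1, a2)` should give the same distance in one call.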
Sometimes crystallographers work in coordinate systems that are not cartesian, with axes that are not perpendicular to each other, and different length scales along the axes. One popular system amongst crystallographers is called fractional coordinates, in which points in the crystallographic unit cell all have coordinates between 0 and 1, and in which you obtain the coordinates of symmetry-related points in neighboring unit cells simply by adding 1 to one of the coordinates. In these systems, distances can't be determined by simply squaring the coordinates; in fact, most often one would switch to a cartesian system before calculating distances.
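To make the switch to a cartesian system concrete, here is a hedged Python sketch of the conversion. It assumes one common convention (cell axis a along x, b in the xy-plane); the function name `frac_to_cart` and the cell dimensions are made up for illustration and do not belong to 1HHP.

```python
import math

def frac_to_cart(frac, a, b, c, alpha, beta, gamma):
    """Fractional -> cartesian coordinates (cell lengths in Angstroms, angles in degrees).

    Assumes cell axis a lies along x and b lies in the xy-plane.
    """
    cos_a, cos_b, cos_g = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    sin_g = math.sin(math.radians(gamma))
    # Unit cell volume divided by a*b*c.
    vol = math.sqrt(1 - cos_a**2 - cos_b**2 - cos_g**2 + 2 * cos_a * cos_b * cos_g)
    u, v, w = frac
    x = a * u + b * cos_g * v + c * cos_b * w
    y = b * sin_g * v + c * (cos_a - cos_b * cos_g) / sin_g * w
    z = c * vol / sin_g * w
    return (x, y, z)

def length(vec):
    return math.sqrt(sum(q * q for q in vec))

# Hypothetical cell with a 120 degree angle between the a and b axes.
cell = (10.0, 10.0, 15.0, 90.0, 90.0, 120.0)
p = frac_to_cart((1.0, 1.0, 0.0), *cell)
print(round(length(p), 3))  # 10.0, not 10 x sqrt(2)
```

In this made-up cell, the point with fractional coordinates (1, 1, 0) is 10 Angstroms from the origin, whereas naively squaring the fractional coordinates and scaling by the axis length would suggest about 14.1.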
As an exercise, why don't you calculate the distance of the carbon-carbon bond between the C-alpha and the carbonyl carbon in the first residue? To save you some work, the bond vector is given below:
dCaC = (0.304, -0.346, 1.459)
|dCaC| = ... your answer here ... (is this a single or a double bond, you think?)
There are different types of vector multiplication, so it's important to be specific when talking about vector products. Some result in another vector, and others result in a simple number, a scalar. The scalar product is so called because it yields a scalar (it is also sometimes called the dot product because of its notation, a · b). We'll start by giving the recipe for calculating the scalar product, and then look at what we can do with it. To form the scalar product, you multiply vectors component-wise, and add up the products. Here is an example:
dNCa · dCaC =
= (-0.732, 0.933, 0.831) · (0.304, -0.346, 1.459)
= (-0.732 x 0.304) + (0.933 x -0.346) + (0.831 x 1.459)
= -0.2225 – 0.3228 + 1.2124 = 0.667
If we calculate the scalar product of a vector with itself, the calculation is very similar to the recipe of calculating a vector's length. In fact, the scalar product gives the square of the length, as the following example illustrates:
dNCa · dNCa =
= (-0.732, 0.933, 0.831) · (-0.732, 0.933, 0.831)
= (0.732 x 0.732 + 0.933 x 0.933 + 0.831 x 0.831)
= 0.536 + 0.870 + 0.691 = 2.097
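Both scalar products can be checked with a few lines of Python. This is a sketch with made-up helper names (`subtract`, `dot`); the bond vectors are computed from the N, CA and C coordinates of PRO 1 in the 1HHP excerpt.

```python
n = (52.574, 58.851, -7.646)   # N of PRO 1
ca = (51.842, 59.784, -6.815)  # C-alpha of PRO 1
c = (52.146, 59.438, -5.356)   # carbonyl C of PRO 1

def subtract(v, w):
    return tuple(vi - wi for vi, wi in zip(v, w))

def dot(v, w):
    """Scalar product: multiply component-wise and add up the products."""
    return sum(vi * wi for vi, wi in zip(v, w))

d_NCa = subtract(ca, n)  # bond vector N -> C-alpha
d_CaC = subtract(c, ca)  # bond vector C-alpha -> C

print(round(dot(d_NCa, d_CaC), 3))  # 0.667
print(round(dot(d_NCa, d_NCa), 3))  # 2.097, the square of the bond length
```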
Why is the scalar product useful? It's a convenient way to calculate the distance along a certain direction. As an example, you are planning a hike and know the vector d = (dx, dy, dz) from the trail head (i.e. parking lot) to the mountain top. Because you are really worried about the climb, you would like to figure out how much higher the mountain top is than the trail head. So you want to figure out the distance in one specific direction (let's say that the z-axis of your coordinate system is pointing in the up-direction). The recipe for this is to calculate the scalar product between the vector and the unit-length vector of the direction you're interested in. The unit-length vector u in this case is:
u = (0, 0, 1)
It points along the z-direction (up) and has a length of one (that makes it a unit vector). Let's try our recipe:
"how much you have to climb" = d · u = (dx, dy, dz) · (0, 0, 1) = 0 x dx + 0 x dy + 1 x dz = dz
This might have been a bit of a disappointing example (the answer was right there from the start), but it turns out this recipe works for calculating distances along arbitrary directions, not only along the axes of the coordinate system. This comes in handy in crystallography when setting up the Laue conditions that govern in which directions diffracted X-rays are observed for a given X-ray source and a given unit cell of the crystalline sample.
Related to finding distances along a certain direction, you can use the scalar product to calculate projections of one vector in the direction of a second vector (which is necessary when you want to switch to a different coordinate system, and for many other tasks outside of the scope of this treatment).
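Here is a small Python sketch of the "distance along a direction" recipe for a direction that is not a coordinate axis. The hike vector and the helper names (`unit`, `distance_along`) are made up for illustration.

```python
import math

def dot(v, w):
    return sum(vi * wi for vi, wi in zip(v, w))

def unit(v):
    """Scale v to length one."""
    norm = math.sqrt(dot(v, v))
    return tuple(c / norm for c in v)

def distance_along(d, direction):
    """Distance covered by d along the given direction (need not be unit length)."""
    return dot(d, unit(direction))

d = (3000.0, 4000.0, 1200.0)  # made-up hike vector in metres, z pointing up

print(round(distance_along(d, (0, 0, 1)), 6))  # 1200.0, the climb
print(round(distance_along(d, (3, 4, 0)), 6))  # 5000.0, along the horizontal heading
```

Dividing the direction by its own length first is what makes the result a true distance; with a longer direction vector the scalar product would be scaled up accordingly.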
There is a second, more general definition of the scalar product that involves the lengths of the vectors and the angle alpha between them (it is more general because it does not rely on a cartesian coordinate system to calculate the product):
a · b = |a| x |b| x cos(alpha)
Let's take some simple examples to calculate scalar products using this definition between the vectors r = (2, 0, 0), s = (0, 1, 0), t = (-3, 0, 0). The angle between r and s is 90 degrees (they are along two axes of the cartesian coordinate system, which is right-angled), that between r and t is 180 degrees (they point in opposite directions), and that between r and r is 0 degrees (they are parallel). Your job is to calculate the scalar products by component-wise multiplication to check the results.
r · r = 2 x 2 x cos(0°) = 2 x 2 x 1 = 4
r · s = 2 x 1 x cos(90°) = 2 x 1 x 0 = 0
r · t = 2 x 3 x cos(180°) = 2 x 3 x (-1) = -6
Notice that the scalar product r · s is zero because the two vectors are perpendicular to each other, giving zero for the cosine term. This gives a handy test: if the scalar product of two vectors (with lengths greater than zero) is zero, the vectors are perpendicular.
How can we use the scalar product to calculate angles? We solve for the angle and figure out the lengths of the vectors and their scalar product from their coordinates (i.e. by component-wise multiplication):
alpha = arccos (a · b / |a| / |b|)
Let's try that for the angle between the two bond vectors originating from the C-alpha atom, dCaC and dCaN:
dCaC = (0.304, -0.346, 1.459)
dCaN = -dNCa = (0.732, -0.933, -0.831)
alpha = arccos (dCaC · dCaN / |dCaC| / |dCaN|) = arccos (-0.667 / 1.53 / 1.45) = arccos (-0.301) = 107.5 degrees
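The same angle can be computed directly from the PDB coordinates. The following Python sketch uses made-up helper names (`subtract`, `dot`, `length`, `angle_deg`) and the N, CA and C coordinates of PRO 1 from the 1HHP excerpt.

```python
import math

n = (52.574, 58.851, -7.646)   # N of PRO 1
ca = (51.842, 59.784, -6.815)  # C-alpha of PRO 1
c = (52.146, 59.438, -5.356)   # carbonyl C of PRO 1

def subtract(v, w):
    return tuple(vi - wi for vi, wi in zip(v, w))

def dot(v, w):
    return sum(vi * wi for vi, wi in zip(v, w))

def length(v):
    return math.sqrt(dot(v, v))

def angle_deg(v, w):
    """Angle between v and w in degrees, from a.b = |a| |b| cos(alpha)."""
    return math.degrees(math.acos(dot(v, w) / (length(v) * length(w))))

d_CaN = subtract(n, ca)  # both bond vectors start at C-alpha
d_CaC = subtract(c, ca)
angle = angle_deg(d_CaC, d_CaN)
print(round(angle, 1))  # about 107.5
```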
Vectors have a length and a direction. If you know the cartesian coordinates x, y, z of three-dimensional vectors, it is easy to add and subtract them, and use the scalar product to figure out their lengths and the angles between them. For more advanced calculations outside the scope of this treatment, you would have to learn about a second type of product, the vector (or cross) product, which yields a vector and is useful for describing planes and calculating torsion angles.
Complex numbers are a superset of real numbers that historically arose as solutions to equations that could not be solved with the real numbers. (This motivation to create a new type of number is quite typical. If you have positive integers only, you can add any two numbers, but get into trouble when subtracting large numbers from smaller ones; introducing negative numbers helps with that. If you have integers only, you can add, subtract and multiply any two numbers, but you get into trouble when trying to divide, say, 2 by 3; introducing fractions helps with that. Now you can add, subtract, multiply and divide any two numbers, but you can't state the ratio between the radius and the circumference of a circle, or state the side length of a square with an area of 2; introducing real numbers helps with that.) The equation that led to complex numbers was of the type
square(i) = -1
This problem is easy to solve by introducing an imaginary number i that solves this equation. A complex number, in its most general form, is a real number plus i times a real number, C = a + bi. We will use capital letters for all complex variables used here.
Addition, subtraction, multiplication and division of complex numbers follow the known rules of real-number arithmetic (Associative Law: a + (b + c) = (a + b) + c, Commutative Law: n+m = m+n, Distributive Law: a (b+c) = ab + ac); whenever you see the term ii, you can replace it by -1. Let's try this with a couple of complex numbers R, S, and T:
R = 3.1 + 5i
S = -4 + 2.1i
T = 5i
Here are some example calculations (in some special cases, you end up with a real number as a result):
R + S = (3.1 + 5i) + (-4 + 2.1i) = (3.1-4) + (5+2.1)i = -0.9 + 7.1i
R – T = (3.1 + 5i) – (5i) = (3.1-0) + (5-5)i = 3.1
R x S = (3.1 + 5i) (-4 + 2.1i) = 3.1 x (-4) + (3.1 x 2.1 + 5 x (-4))i + (5 x 2.1)ii
= -12.4 + (6.51 – 20)i + (10.5)(-1) = -22.9 – 13.49i
R / T = (3.1 + 5i) / 5i = 3.1/5i + 5i/5i = 0.62 x 1/i + 1 (oops, how do we calculate 1/i? ... see below)
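Python has complex numbers built in, so the sum, difference and product are easy to check. Note that Python writes the imaginary unit as `j` rather than i. The helper `rounded` is made up here just to hide floating-point noise.

```python
R = 3.1 + 5j
S = -4 + 2.1j
T = 5j

def rounded(z, digits=2):
    """Round real and imaginary parts separately (illustrative helper)."""
    return complex(round(z.real, digits), round(z.imag, digits))

print(rounded(R + S))  # (-0.9+7.1j)
print(rounded(R - T))  # (3.1+0j)
print(rounded(R * S))  # (-22.9-13.49j)
```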
For the following, it helps to visualize a complex number in a two-dimensional coordinate system with a real axis along x and an imaginary axis along y. This is sometimes called an Argand diagram. All complex numbers will be points on this graph, with real numbers appearing as a subset along the real axis. Some of this will sound a lot like vector algebra, but remember that the real and the imaginary axis are fundamentally different from each other, while axes in n-dimensional space are all the same.
[Figure: Argand diagram]
Here are some new definitions for the real part, the imaginary part, the conjugate complex and the magnitude of a complex number C = a + bi:
Re(C) = a
Im(C) = b
C* = a – bi
|C| = sqrt(C x C*) = sqrt ((a+bi)(a–bi)) = sqrt(a x a – b x bii) = sqrt (a x a + b x b)
Comparing the magnitude of a vector vs. that of a complex number, in both cases the interpretation is as the distance between the origin and the point representing the vector or number in a graph, and in both cases you sum the squares of the components and take the square root. However, we saw that for vectors, the more general recipe is to take the square root of the scalar product of the vector with itself, whereas for complex numbers, you multiply the number with its complex conjugate before taking the square root.
A couple more comments on the complex conjugate. In the Argand diagram, C and C* are related by mirror symmetry across the real axis. If you multiply a complex number with its complex conjugate, you always get a real number. This is helpful in simplifying fractions with complex values in the denominator, such as the expression 0.62/i encountered above. To simplify, we can multiply numerator and denominator with the complex conjugate of the denominator, in this case -i, to yield -0.62i/-ii = -0.62i. If you add a complex number to its complex conjugate, you double the real component while the imaginary part cancels out. If you subtract them, the real part cancels out. So this would be another way to get the real or imaginary part of a complex number:
Re(C) = 0.5 (C + C*) = 0.5 (2a) = a
Im(C) = -0.5i (C – C*) = -0.5i (2bi) = b
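These definitions are easy to try out in Python. The sketch below uses an arbitrary example value for C; the variable names are made up for illustration.

```python
import math

C = 3.1 + 5j              # arbitrary example, a = 3.1 and b = 5
C_star = C.conjugate()    # a - bi

magnitude = math.sqrt((C * C_star).real)    # sqrt(a*a + b*b)
real_part = (0.5 * (C + C_star)).real       # Re(C) = 0.5 (C + C*)
imag_part = (-0.5j * (C - C_star)).real     # Im(C) = -0.5i (C - C*)

# 1/i: multiply numerator and denominator by the conjugate of i, which is -i.
one_over_i = (1 * -1j) / (1j * -1j)

print(round(magnitude, 3) == round(abs(C), 3))  # True
print(real_part, imag_part)
print(one_over_i == -1j)                        # True
```

The built-in `abs(C)`, `C.real` and `C.imag` give the same values directly; the longer expressions are only there to mirror the formulas.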
In solving a biomolecular structure by X-ray crystallography, we calculate a three-dimensional function describing the electron density of the structure from complex numbers called structure factors. The electron density is obtained by summing up terms containing the structure factors; these terms are complex functions, but the electron density has real values. The reason we get real values is that for every complex term originating from a structure factor F(h k l), there is a complex conjugate term originating from its so-called Friedel mate, F(-h -k -l). As we saw, adding the complex conjugate to a number results in a sum that is real. If Friedel's law breaks down because of inelastic scattering (resulting in anomalous diffraction), the electron density map would have an imaginary component at the positions of the inelastic scatterers.
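The cancellation for one Friedel pair can be illustrated numerically. This is only a sketch under stated assumptions: the structure factor value, the number `h_dot_x`, and the form and sign convention of the exponential term are all made up for illustration and are not taken from this text.

```python
import cmath
import math

# Made-up structure factor F(h k l): amplitude 12, phase 50 degrees.
F_hkl = cmath.rect(12.0, math.radians(50.0))
# Friedel's law: F(-h -k -l) is the complex conjugate of F(h k l).
F_mate = F_hkl.conjugate()

h_dot_x = 0.137  # made-up scalar product of (h, k, l) with a position x

# One term of the sum and the term from its Friedel mate (assumed form).
term = F_hkl * cmath.exp(-2j * math.pi * h_dot_x)
mate_term = F_mate * cmath.exp(2j * math.pi * h_dot_x)

pair_sum = term + mate_term
print(abs(pair_sum.imag) < 1e-12)  # True: the imaginary parts cancel
```

Each term on its own is complex, but the pair adds up to twice the real part of either term.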
Just as we can describe vectors by their magnitude and their direction, we can describe complex numbers by their magnitude and their direction. We already introduced the magnitude of a complex number. To describe the direction of a complex number in the Argand diagram, we can use a complex number of magnitude one, similar to a unit vector. To get a complex number of magnitude one, we simply divide the number by its magnitude:
Uc = C / |C| = (a + bi) / sqrt (a x a + b x b)
That's not very beautiful. Another way of describing a unit-length complex number is by the exponential of an imaginary number i phi, the value of which is given by Euler's formula:
U(phi) = exp(i phi) = cos phi + i sin phi
In the Argand diagram above, phi is the angle between the x-axis and the line connecting the complex number to the origin. There are unique unit-length complex numbers for angles between 0 and 360 degrees, and the function U(phi) repeats itself every full turn. This is similar to the dial of a washing machine:
[Figure: washer dial. Labels: Quarter turn; Rinse, Spin, Stop, Rinse, Spin, Stop]
You might want to try this at home. The U(phi) dial works the same way, only it's counter-clockwise, while most washer dials operate clockwise and might break if you force them to go counter-clockwise.
We can now write any complex number in so-called polar form as the product of its magnitude (a scalar) and an exponential function containing the angle phi (giving the direction):
C = |C| exp(i phi)
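Python's standard `cmath` module converts between the two forms, which gives a quick check of Euler's formula and of the polar form. The example value of C is arbitrary.

```python
import cmath
import math

C = 3.1 + 5j                      # arbitrary example value
magnitude, phi = cmath.polar(C)   # |C| and the angle phi in radians

U = cmath.exp(1j * phi)                        # unit-length number exp(i phi)
euler = complex(math.cos(phi), math.sin(phi))  # cos phi + i sin phi
rebuilt = magnitude * U                        # |C| exp(i phi)

print(abs(U - euler) < 1e-12)        # True: Euler's formula
print(abs(U - C / abs(C)) < 1e-12)   # True: same as dividing C by its magnitude
print(abs(rebuilt - C) < 1e-12)      # True: polar form gives C back
```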
The rules for multiplying complex exponentials are the same as those for real numbers, i.e. exp(a) exp(b) = exp(a+b). What happens when we multiply C by exp(i alpha), i.e. a complex number of magnitude 1?
C exp(i alpha) = |C| exp(i phi) exp (i alpha) = |C| exp (i phi + i alpha) = |C| exp (i (phi + alpha))
So multiplying a number by exp(i alpha) changes its direction (by an angle alpha), but not its magnitude. In effect, it amounts to a rotation about the origin by an angle alpha. Multiplication of two complex numbers is easier in the polar form:
A = |A| exp (i alpha)
B = |B| exp (i beta)
A x B = |A| x |B| exp (i (alpha + beta))
So to multiply two numbers, you multiply the magnitudes and add up the angles.
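A short Python check of "multiply the magnitudes, add the angles", using made-up magnitudes and angles:

```python
import cmath
import math

A = cmath.rect(2.0, math.radians(30.0))  # |A| = 2, alpha = 30 degrees
B = cmath.rect(3.0, math.radians(45.0))  # |B| = 3, beta = 45 degrees

product = A * B
print(round(abs(product), 6))                        # 6.0  (2 x 3)
print(round(math.degrees(cmath.phase(product)), 6))  # 75.0 (30 + 45)

# Multiplying by a unit-length number only rotates: here by 90 degrees.
rotated = A * cmath.exp(1j * math.radians(90.0))
print(round(abs(rotated), 6))                        # 2.0
print(round(math.degrees(cmath.phase(rotated)), 6))  # 120.0
```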
Electromagnetic waves have an electric and a magnetic component that vary over space according to a sine function. In a freely travelling wave the two components rise and fall together, so it is enough to follow one of them. It is convenient to describe the wave by a complex function: the real part represents the field itself, and the imaginary part is the same wave shifted by a quarter period, which keeps track of the phase. For a wave in one dimension x, we can write:
f(x) = A exp (2 pi i x k)
This function is periodic (f(x) = f(x + 1/k), try it out). The complex number A = |A| exp (i alpha) gives the amplitude |A| and the phase alpha. Different values of alpha will shift the positions of peaks and valleys, but not change anything else.
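If you want to "try it out" numerically, here is a minimal Python sketch. The amplitude, phase, k and the test position x are made-up values.

```python
import cmath
import math

def f(x, A, k):
    """One-dimensional wave f(x) = A exp(2 pi i x k)."""
    return A * cmath.exp(2j * math.pi * x * k)

A = cmath.rect(1.5, math.radians(40.0))  # amplitude 1.5, phase 40 degrees
k = 0.25                                 # so the period 1/k is 4
x = 1.3                                  # arbitrary position

print(abs(f(x, A, k) - f(x + 1 / k, A, k)) < 1e-9)  # True: periodic
print(round(abs(f(x, A, k)), 6))                    # 1.5: magnitude stays |A|
```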

Expanding on section 2.4, we can combine 3d-vectors and complex numbers to represent waves in three dimensions by substituting the vector r for x (describing a point in space) and the vector k for k (describing direction and wavelength of the wave):
W(r) = A exp (2 pi i r · k)
The complex number A describes the amplitude and the phase of the wave. k is a three-dimensional vector whose magnitude is the inverse of the wavelength lambda (i.e. the distance between wave peaks is 1/|k|) and whose direction determines the direction of the wave (k is perpendicular to the planes formed by the wave peaks). In the figure below, the scalar product of k with each of the shown four points r1..r4 gives the same value, so they form a common wave front.

If we want to know whether a point in space r is at a peak or a valley of the wave (the phase at the point r), we can express the complex amplitude A in polar form and combine the exponentials:
A = |A| exp (alpha i)
W(r) = A exp (2 pi i r · k) = |A| exp (alpha i) exp (2 pi i r · k) = |A| exp ((alpha+2 pi r · k)i)
phase = alpha+2 pi r · k
If the phase is 0 (or a whole number of full turns), we're at a peak; if it's 180°, we're in a valley.
In crystallography, we are interested in interference of waves, specifically how two waves of identical wavelength and direction, but not necessarily the same amplitude and phase, add up. Here are two such waves and their summation:
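The phase formula translates directly into Python. In this sketch the wave vector k, the amplitude A and the test points are made up, and the phase is reduced to the range 0 to 360 degrees.

```python
import cmath
import math

def dot(v, w):
    return sum(vi * wi for vi, wi in zip(v, w))

def phase_deg(r, k, A):
    """Phase of W(r) = A exp(2 pi i r.k) at the point r, in degrees (0 to 360)."""
    alpha = cmath.phase(A)
    return math.degrees((alpha + 2 * math.pi * dot(r, k)) % (2 * math.pi))

k = (0.0, 0.0, 0.5)  # made-up wave travelling along z, wavelength 1/|k| = 2
A = 1.0 + 0j         # amplitude 1, phase alpha = 0

print(phase_deg((3.0, 7.0, 0.0), k, A))  # 0.0   -> peak
print(phase_deg((0.0, 0.0, 1.0), k, A))  # 180.0 -> valley
```

Any two points with the same scalar product r · k get the same phase, which is what puts them on a common wave front.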
W1(r) = A1 exp (2 pi i r · k)
W2(r) = A2 exp (2 pi i r · k)
Wadd(r) = W1(r) + W2(r) = A1 exp (2 pi i r · k) + A2 exp (2 pi i r · k) = (A1 + A2) exp (2 pi i r · k)
Depending on the relative phases of A1 and A2, the resulting wave will be stronger or weaker than the components. There are two extreme cases. Identical phases result in constructive interference, with the magnitude of the summed wave given by the sum of the magnitudes of the individual waves; opposite phases (i.e. different by 180 degrees) will result in destructive interference, with the magnitude of the summed wave given by the difference of the magnitudes of the individual waves.
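Since the summed wave has the complex amplitude A1 + A2, the two extreme cases (and one in between) can be checked by adding complex numbers. The magnitudes and phases in this Python sketch are made up.

```python
import cmath
import math

def summed_magnitude(mag1, phase1_deg, mag2, phase2_deg):
    """Magnitude of A1 + A2 for two amplitudes given in polar form."""
    A1 = cmath.rect(mag1, math.radians(phase1_deg))
    A2 = cmath.rect(mag2, math.radians(phase2_deg))
    return abs(A1 + A2)

print(round(summed_magnitude(2.0, 30.0, 1.5, 30.0), 6))   # 3.5: constructive
print(round(summed_magnitude(2.0, 30.0, 1.5, 210.0), 6))  # 0.5: destructive
print(round(summed_magnitude(2.0, 30.0, 1.5, 120.0), 6))  # 2.5: in between
```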