Derivation and Implementation of Generalized Quaternion Rotation Functions

I want to preface this post by saying that I am not a mathematician. This is a story about using some relatively arcane math to solve a problem I encountered. Whether it reflects anything close to an optimized solution I'll leave to the determination of people with graduate degrees in math, computer science, or other relevant fields. I just enjoy it when high-level theory finds practical use and thought this was a story worth sharing.

Motivation

A couple years back, I was working on a modified Keplerian model of the Solar System for a larger project and needed to position the orbits of various bodies in 3D space. The true anomalies \((\theta)\) and radii of points along the orbits were first calculated in the plane of the orbit solving Kepler's equation: $$M = E - e\sin(E)$$ using Newton's method: $$x_{n+1} = x_{n} - \frac{f(x_{n})}{f'(x_{n})}$$ but still needed to be positioned in 3D space according to the longitude of ascending node \((\Omega)\), inclination \((i)\), and argument of periapsis \((\omega)\). For objects considered to be orbiting the Sun, this was the extent of the processing required to get their orbital trajectories correctly positioned, but additional transformations were needed for the natural satellites to align their orbits in their respective local coordinate frames with the global coordinate space. In the process of formalizing these transformation procedures into workable code, it became clear that the best way to achieve this would involve rotations around axes that weren't the cardinal axes of the coordinate system, so I reached deep into my memory of the spacecraft design class I took senior year of college, where my professor covered a brief but memorable unit on attitude determination and control. During this unit, I was formally introduced to the concept of quaternions and how they were often preferred over Euler rotations due to their mathematical efficiency and prevention of gimbal lock. Many years later when I became interested in graphics programming, I also encountered discussion about quaternions being used in computer graphics for the same reason and was aware that they could be used to rotate a vector around any other arbitrary (normalized) vector.

Research

Scouring the internet for simple functions that would use quaternion rotations and take three arguments, a vector to rotate, a vector to rotate around, and the rotation angle, I turned up a big goose egg. Maybe my googling wasn't very good, or maybe I didn't have enough patience to sift through some lengthy texts, but I found a lot of theory and less practical implementation, at least in the form I needed it. It had been a long time since I'd done a good mathematical derivation, though, so with all the underlying theory at my disposal and a mind for some math, I thought it would be worthwhile to do the hard work of deriving the functions myself. Without getting too deep into theory, a quaternion is a four-component number similar to two-component complex numbers and takes the following form: $$a + b\mathbf{i} + c\mathbf{j} + d\mathbf{k}$$ where \(a\), \(b\), \(c\), and \(d\) are real numbers, and \(\mathbf{i}\), \(\mathbf{j}\), and \(\mathbf{k}\) are the basis vectors and can be thought of as something like imaginary numbers that adhere to the following set of rules: $$\mathbf{i} ^2 = \mathbf{j}^2 = \mathbf{k}^2 = -1$$ $$\mathbf{ij} = -\mathbf{ji} = \mathbf{k}$$ $$\mathbf{jk} = -\mathbf{kj} = \mathbf{i}$$ $$\mathbf{ki} = -\mathbf{ik} = \mathbf{j}$$ Again, refraining from getting into the weeds about what all this means, these identities will be critically important for the mathematical derivation. Notice in particular how the multiplication of these basis vectors is noncommutative (order matters!). I'll also skip some more theory and get straight to the math relevant specifically to quaternion rotations, since quaternions can be used for scaling, too. Taking the four-component form from above, the quaternion constructed for rotation, \(q\), and its conjugate, \(q^*\), are described by the following equations: $$q = \cos{\textstyle\frac{\theta}{2}} + \vec{u}\cdot\sin{\textstyle\frac{\theta}{2}}$$ $$q^* = \cos{\textstyle\frac{\theta}{2}} - \vec{u}\cdot\sin{\textstyle\frac{\theta}{2}}$$ where \(\vec{u}\) is the rotational axis unit vector that can be rewritten in the form, \(u_x\mathbf{i} + u_y\mathbf{j} + u_z\mathbf{k}\), and \(\theta\) in this context represents the angle of rotation ( \(0 \leq \theta \leq \pi\)). The quaternion, \(q\), and its conjugate, \(q^*\), are then multiplied in a noncommutative manner to the vector we want to rotate, \(\vec{p}\), to obtain the rotated vector, \(\vec{p}\,'\), as follows: $$\vec{p}\,' = q\vec{p}q^*$$

Derivation

With the equation I was looking for right in front of my face, it was time to get it into a form that would be usable output to return from a function, namely, a three-component vector. Expansion of the right-hand side of the expression yields the following portent to what will become an Eldritch horror of multiplicative (noncommutative!) distribution and sign errors just waiting to creep out of nightmares and onto paper: $$\begin{aligned}&\vec{p}\,' =\\[2ex]&(\cos{\textstyle\frac{\theta}{2}} + (u_x\mathbf{i} + u_y\mathbf{j} + u_z\mathbf{k})\sin{\textstyle\frac{\theta}{2}})\,\cdot\\&(p_x + p_y + p_z)\,\cdot\\&(\cos{\textstyle\frac{\theta}{2}} - (u_x\mathbf{i} + u_y\mathbf{j} + u_z\mathbf{k})\sin{\textstyle\frac{\theta}{2}})\end{aligned}$$ I will spare you the numerous intermediary steps that took multiple sheets of paper to work through by hand, but if I can briefly quantify the ardor, that is 4 \(\times\) 3 \(\times\) 4 or 48 terms that pop out initially from distributing everything inside the three major sets of parentheses. Meticulously ensuring the negative signs are correctly distributed and keeping the basis vectors in the correct order within each term, we can refer back to the identities from earlier and convert each \(\mathbf{ij}\), \(\mathbf{ji}\), \(\mathbf{i}^2\), etc... into either the single basis vector equivalent or \(-1\) and then flip the sign in front of each term where necessary. At the end of this tedious and error-prone process, the proverbial storm clouds part and rainbows appear in a beautiful turn of events almost too profound to put into words when all the real terms (i.e. terms without \(\mathbf{i}\), \(\mathbf{j}\), or \(\mathbf{k}\) in them) cancel out, leaving only terms with a single basis vector in them that will form the long-sought-after three-component vector output after a bit of simplification. When all is said and done, the results of this reorganization yield the following expression: $$\begin{aligned}&\vec{p}\,' =\\[2ex] &(\sin^2{\textstyle\frac{\theta}{2}}\cdot(2u_x(u_{y}p_y + u_{z}p_z)\>+ \\ &p_{x}(u_{x}^2 - u_{y}^2 - u_{z}^2)) + p_x\cos^2{\textstyle\frac{\theta}{2}}\>+\\ &2\sin{\textstyle\frac{\theta}{2}}\cos{\textstyle\frac{\theta}{2}}\cdot(u_{y}p_z - u_{z}p_y))\cdot\mathbf{i}\>+\\[2ex] &(\sin^2{\textstyle\frac{\theta}{2}}\cdot(2u_y(u_{x}p_x + u_{z}p_z)\>+ \\ &p_{y}(u_{y}^2 - u_{z}^2 - u_{x}^2)) + p_y\cos^2{\textstyle\frac{\theta}{2}}\>+\\ &2\sin{\textstyle\frac{\theta}{2}}\cos{\textstyle\frac{\theta}{2}}\cdot(u_{z}p_x - u_{x}p_z))\cdot\mathbf{j}\>+\\[2ex] &(\sin^2{\textstyle\frac{\theta}{2}}\cdot(2u_z(u_{x}p_x + u_{y}p_y)\>+ \\ &p_{z}(u_{z}^2 - u_{x}^2 - u_{y}^2)) + p_z\cos^2{\textstyle\frac{\theta}{2}}\>+\\ &2\sin{\textstyle\frac{\theta}{2}}\cos{\textstyle\frac{\theta}{2}}\cdot(u_{x}p_y - u_{y}p_x))\cdot\mathbf{k}\\[1ex]\end{aligned}$$ It's hardly concise, but a close inspection of each of the three major terms shows a clear pattern between the vector components, and everything is now organized as a function of the three desired arguments: \(\vec{p}\), \(\vec{u}\), and \(\theta\). What it lacks in brevity it retains in elegance.

Implementation

With the mathematical work out of the way, it was time to translate it to code. I originally implemented this function in Python since this was meant to perform calculations offline behind the scenes, but here I will give an example of an implementation in TypeScript. I will, however, skip the type/class declarations/definitions for the sake of brevity:
// Quaternion rotation
function quatRotate(p: Vector3, u: Vector3, t: number): Vector3 {

   function cFinal(c1: 'x' | 'y' | 'z'): number {

      const c2 = c1 === 'x' ? 'y' : c1 === 'y' ? 'z' : 'x';
      const c3 = c1 === 'x' ? 'z' : c1 === 'y' ? 'x' : 'y';

      const halfTSin = Math.sin(t/2);
      const halfTCos = Math.cos(t/2);

      const term1 = 2 * u[c1] * (u[c2] * p[c2] + u[c3] * p[c3]);
      const term2 = p[c1] * (u[c1]**2 - u[c2]**2 - u[c3]**2);
      const term3 = 2 * halfTSin * halfTCos * (u[c2] * p[c3] - u[c3] * p[c2]);

      return (term1 + term2) * halfTSin**2 + p[c1] * halfTCos**2 + term3;

   }

   return new Vector3(cFinal('x'), cFinal('y'), cFinal('z'));

}

Conclusion

And that's it! After about six sheets of paper and having to correct a handful of sign errors along the way, I distilled some fairly esoteric mathematical theory into a usable function that did exactly what I needed. I'd be lying if I said I fully understood the math as well as William Hamilton did when he first devised/discovered quaternions in the mid-19th century, but after fairly extensive testing, this function has proven to be accurate and useful for my purposes. I also realize it's entirely possible there is a more computationally efficient way to do this that I'm unaware of, or that the final equation could potentially be further simplified through mathematical identities. For now, though, I have a very straightforward way to rotate any 3D vector around any 3D unit vector by an angle of my choosing. There are many things in life far less efficient, convenient, or sensible, so I'll take the win.