Consider the p-norm $$ || \boldsymbol{x} ||_p = \left( \sum_i |x_i|^p \right)^{\frac{1}{p}} $$
In python this translates to:
from numpy import array
def norm1(x, p):
"First-pass implementation of p-norm."
return (np.abs(x)**p).sum() ** (1./p)
Now, suppose $|x_i|^p$ causes overflow (for some $i$). This will occur for sufficiently large $p$ or sufficiently large $x_i$—even if $x_i$ is representable (i.e., not NaN or $\infty$).
For example:
>>> big = 1e300
>>> x = array([big])
>>> norm1(x, p=2)
[ inf] # expected: 1e+300
This fails because we can't square big
>>> np.array([big])**2
[ inf]
A Little Math
There is a way to avoid overflowing because of a few large $x_i$.
Here's a little fact about p-norms: for any $p$ and $\boldsymbol{x}$ $$ || \alpha \cdot \boldsymbol{x} ||_p = |\alpha| \cdot || \boldsymbol{x} ||_p $$
We'll use the following version (harder to remember) $$ || \boldsymbol{x} ||_p = |\alpha| \cdot || \boldsymbol{x} / \alpha ||_p $$
Don't believe it? Here's some algebra: $$ \begin{eqnarray*} || \boldsymbol{x} ||_p &=& \left( \sum_i |x_i|^p \right)^{\frac{1}{p}} \\ &=& \left( \sum_i \frac{|\alpha|^p}{|\alpha|^p} \cdot |x_i|^p \right)^{\frac{1}{p}} \\ &=& |\alpha| \cdot \left( \sum_i \left( \frac{|x_i| }{|\alpha|} \right)^p \right)^{\frac{1}{p}} \\ &=& |\alpha| \cdot \left( \sum_i \left| \frac{x_i }{\alpha} \right|^p \right)^{\frac{1}{p}} \\ &=& |\alpha| \cdot || \boldsymbol{x} / \alpha ||_p \end{eqnarray*} $$
Back to Numerical Stability
Suppose we pick $\alpha = \max_i |x_i|$. Now, the largest number we have to take the power of is one—making it very difficult to overflow on the account of $\boldsymbol{x}$. This should remind you of the infamous log-sum-exp trick.
def robust_norm(x, p):
a = np.abs(x).max()
return a * norm1(x / a, p)
Now, our example from before works :-)
>>> robust_norm(x, p=2)
1e+300
Remarks
-
It appears as if
scipy.linalg.normis robust to overflow, whilenumpy.linalg.normis not. Note thatscipy.linalg.normappears to be a bit slower. -
The
logsumexptrick is nearly identical, but operates in the log-domain, i.e., $\text{logsumexp}(\log(|x|) \cdot p) / p = \log || x ||_p$. You can implement both tricks with the same code, if you use different number classes for log-domain and real-domain—a trick you might have seen before.
from arsenal.math import logsumexp
from numpy import log, exp, abs
>>> p = 2
>>> x = array([1,2,4,5])
>>> logsumexp(log(abs(x)) * p) / p
1.91432069824
>>> log(robust_norm(x, p))
1.91432069824