Definition 2: Absolute Error

The absolute error is the absolute value of the error: \(|\tilde{x} - x|\). For vector quantities this means the norm \(\| \tilde{x} - x \|\), and it can be any norm, so long as we again choose one and use it consistently. Two favorites are the Euclidean norm \(\| x \| = \sqrt{\sum_i x_i^2}\), denoted \(\| x \|_2\), and the maximum norm (also mysteriously known as the infinity norm):
\[\| x \|_{\text{max}} = \| x \|_\infty = \max_i |x_i|.\]

For real-valued quantities, the absolute error is related to the number of correct decimal places: \(p\) decimal places of accuracy corresponds roughly to absolute error no more than \(0.5 \times 10^{-p}\).
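As a small illustration (a minimal sketch using NumPy, with made-up example values), the absolute error of a vector approximation can be measured in either of these norms:

```python
import numpy as np

# Hypothetical exact and approximate vectors, just for illustration.
x = np.array([1.0, -2.0, 3.0])
x_approx = np.array([1.001, -1.998, 3.0005])

error = x_approx - x
abs_error_euclidean = np.linalg.norm(error, 2)    # Euclidean norm ||.||_2
abs_error_max = np.linalg.norm(error, np.inf)     # maximum ("infinity") norm

print(f"Euclidean norm of the error: {abs_error_euclidean:.3e}")
print(f"Maximum norm of the error:   {abs_error_max:.3e}")
```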
Definition 3: Relative Error

The relative error is the ratio of the absolute error to the size of the exact quantity: \(\displaystyle\frac{| \tilde{x} - x |}{|x|}\).
When working with computer arithmetic, \(p\) significant bits corresponds to relative error no more than \(2^{-p}\).
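As a rough illustration (a minimal sketch with assumed example values), the relative error and the corresponding number of significant bits can be estimated as follows:

```python
import math

# Hypothetical exact and approximate values, just for illustration.
x = math.pi
x_approx = 3.141592

abs_error = abs(x_approx - x)
rel_error = abs_error / abs(x)

# Number of significant bits p such that the relative error is about 2**(-p).
significant_bits = -math.log2(rel_error)

print(f"Absolute error:   {abs_error:.3e}")
print(f"Relative error:   {rel_error:.3e}")
print(f"Significant bits: about {significant_bits:.1f}")
```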
An obvious problem is that we usually do not know the exact solution \(x\), and so cannot evaluate any of these; instead we typically seek upper bounds on the absolute or relative error. Thus, when talking of approximate solutions to an equation \(f(x) = 0\), the concept of backward error can be very useful, for example as a step in getting bounds on the size of the error.
Definition 4: Backward Error
The Backward Error in \(\tilde{x}\) as an approximate solution to the equation \(f(x) = 0\) is \(f(\tilde{x})\); the amount by which the function \(f\) would have to be changed in order for \(\tilde{x}\) to be an exact root.
For the case of solving simultaneous linear equations in matrix-vector form \(A x = b\), this is \(b - A \tilde{x}\), also known as the residual.
Definition 5: Absolute Backward Error
The absolute backward error is — as you might have guessed — the absolute value of the backward error: \(|f(\tilde{x})|\). This is sometimes also called simply the backward error. (The usual disclaimer about vector quantities applies.)
For the case of solving simultaneous linear equations in matrix-vector form \(A x = b\), this is \(\| b - A \tilde{x} \|\), also known as the residual norm.
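For instance (a minimal sketch with a small made-up system), the residual and its norm can be computed directly, even when the exact solution is unknown:

```python
import numpy as np

# A small made-up system A x = b, just for illustration.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

x_approx = np.linalg.solve(A, b)   # stand-in for some approximate solution
x_approx += 1e-8                   # perturb slightly so the residual is nonzero

residual = b - A @ x_approx
print("Residual:      ", residual)
print("Residual norm: ", np.linalg.norm(residual, np.inf))
```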
Notes
We have seen that for the sequence of approximations \(x_k\) to a quantity \(x\) given by the fixed point iteration \(x_{k+1} = g(x_k)\), the absolute errors \(E_k := |x_k - x|\) typically have
\[\frac{E_{k+1}}{E_k} \to C = |g'(x)|,\]

so that eventually the errors diminish in a roughly geometric fashion: \(E_k \approx K C^k\). This is called linear convergence.

Aside: Why “linear” rather than “geometric”? Because there is an approximately linear relationship between consecutive error values,
\[E_{k+1} \approx C E_k.\]

This is a very common behavior for iterative numerical methods, but we will also see that a few methods do even better; for example, when Newton’s method converges to a simple root \(r\) of \(f\) (one with \(f'(r) \neq 0\)),
\[E_{k+1} \approx C E_k^2.\]

This is called quadratic convergence. More generally, convergence of order \(p\) means that
\[E_{k+1} \approx C E_k^p, \text{ or more precisely, } \lim_{k \to \infty} \frac{E_{k+1}}{E_k^p} = C.\]

We have already observed experimentally the intermediate result that “\(C = 0\)” for Newton’s method in this case; that is,

\[\lim_{k \to \infty} \frac{E_{k+1}}{E_k} = 0.\]
This is called super-linear convergence and includes any situation with order of convergence \(p > 1\) .
For most practical purposes, if you have established super-linear convergence, you can be happy, and not worry much about refinements like the particular order \(p\) .
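To illustrate these convergence rates (a minimal sketch, using \(f(x) = x^2 - 2\) and its root \(\sqrt{2}\) as an assumed example), one can compare the error ratios for a fixed point iteration and for Newton’s method:

```python
import math

# Assumed example: solve f(x) = x^2 - 2 = 0, exact root sqrt(2).
root = math.sqrt(2.0)

def fixed_point_step(x):
    # A fixed point form x = g(x) with 0 < |g'(root)| < 1: linear convergence.
    return x - 0.25 * (x**2 - 2.0)

def newton_step(x):
    # Newton's method for f(x) = x^2 - 2: quadratic convergence to the simple root.
    return x - (x**2 - 2.0) / (2.0 * x)

for name, step in [("fixed point", fixed_point_step), ("Newton", newton_step)]:
    x = 1.5
    errors = []
    for _ in range(6):
        x = step(x)
        errors.append(abs(x - root))
    print(name)
    for E_prev, E_next in zip(errors, errors[1:]):
        if E_prev > 0 and E_next > 0:
            print(f"  E_(k+1)/E_k = {E_next/E_prev:.3e},  "
                  f"E_(k+1)/E_k^2 = {E_next/E_prev**2:.3e}")
```

The fixed point iteration shows \(E_{k+1}/E_k\) settling near a constant, while for Newton’s method it is \(E_{k+1}/E_k^2\) that settles, as the definitions above predict.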
Consider the error formula for approximation of a function \(f\) with the Taylor polynomial of degree \(n\) , center \(a\) :
\[ | f(a+h) - T_n(h) | \leq \frac{M_{n+1}}{(n+1)!} |h|^{n+1}, \text{ where } M_{n+1} = \max |f^{(n+1)}(x)|. \]

Since the coefficient of \(h^{n+1}\) is typically not known in practice, it is wise to focus on the power law part, and for this the “big-O” and “little-o” notation is convenient.

If a function \(E(h)\) goes to zero at least as fast as \(h^p\), we say that it is of order \(h^p\), written \(O(h^p)\).
More precisely, \(E(h)\) is no bigger than a multiple of \(h^p\) for \(h\) small enough; that is, there is a constant \(C\) such that for some positive number \(\delta\)
\[\frac{|E(h)|}{|h|^p} \leq C \text{ for } |h| < \delta.\]

Another way to say this is in terms of the lim-sup, if you have seen that jargon:
\[\limsup_{h \to 0} \frac{|E(h)|}{|h|^p} \leq C < \infty.\]

This can be used to rephrase the above Taylor’s theorem error bound as
\[f(x) - T_n(x) = O(|x-a|^{n+1})\]

\[f(a+h) - T_n(h) = O(h^{n+1}),\]

and for the case of the linearization,
\[ f(a+h) - L(a+h) = f(a+h) - (f(a) + f'(a) h) = O(h^2). \]

Sometimes it is enough to say that some error term is small enough to be neglected, at least when \(h\) is close enough to zero. For example, with a Taylor series we might be able to neglect the powers of \(x-a\) or of \(h\) higher than \(p\).
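As a quick numerical check (a minimal sketch using \(f(x) = e^x\) at \(a = 0\) as an assumed example), the ratio \(|f(a+h) - L(a+h)|/h^2\) stays roughly constant as \(h\) shrinks, consistent with the \(O(h^2)\) behavior of the linearization error:

```python
import math

# Assumed example: f(x) = exp(x), linearized at a = 0, so L(a+h) = 1 + h.
def f(x):
    return math.exp(x)

a = 0.0
for h in [0.1, 0.01, 0.001, 0.0001]:
    linearization = f(a) + math.exp(a) * h   # f(a) + f'(a) h, with f' = exp here
    error = abs(f(a + h) - linearization)
    print(f"h = {h:>7}:  error = {error:.3e},  error / h^2 = {error / h**2:.4f}")
```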
We will thus say that a quantity \(E(h)\) is small of order \(h^p\) , written \(o(h^p)\) when
\[\lim_{h \to 0} \frac{|E(h)|}{|h|^p} = 0.\]

Note the addition of the word small compared to the above description of the big-O case!
With this, the Taylor’s theorem error bound can be stated as
\[f(x) - T_n(x) = o(|x-a|^n),\]

\[f(a+h) - T_n(h) = o(h^n),\]

and for the case of the linearization,
\[f(a+h) - L(a+h) = f(a+h) - (f(a) + f'(a) h) = o(h).\]
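Similarly (continuing the assumed \(f(x) = e^x\), \(a = 0\) example), the ratio \(|f(a+h) - L(a+h)|/h\) can be seen to go to zero, which is the little-o statement for the linearization:

```python
import math

# Assumed example, continued: f(x) = exp(x), linearized at a = 0.
a = 0.0
for h in [0.1, 0.01, 0.001, 0.0001]:
    linearization = math.exp(a) + math.exp(a) * h   # f(a) + f'(a) h
    error = abs(math.exp(a + h) - linearization)
    print(f"h = {h:>7}:  error / h = {error / h:.3e}")
```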
By Brenton LeMesurier, College of Charleston and University of Northern Colorado

© Copyright 2020–2021.