? estimate newton step
|
|
Thread rating:  |
Cheng Cosine - 02 Jul 2009 16:51 GMT Hi:
Newton method is an old and famous algorithm to solve equations. But in general we do not have an estimate on the steps we will get the convergent result. Nevertheless, will we be able to estimate this in the case that the problem at hand is linear or only 2nd order?
At the 1st though it seems that we can finish iteration in 1 step if f is s linear function.
Use Newton's method to solve f(x) = 0 -> x(k+1) = x(k)-df(k)'*f(k)
Since f(x) = A*x, then df(x) = A'. So we have:
x(k+1) = x(k)-A'*A*x(k) = ( I-A'*A )*x(k)
Don't see how we will go.
Did I do anything wrong?
Peter Spellucci - 02 Jul 2009 17:01 GMT >Hi: > [quoted text clipped - 9 lines] > > Since f(x) = A*x, then df(x) = A'. wrong
>So we have: > > x(k+1) = x(k)-A'*A*x(k) = ( I-A'*A )*x(k) even more wrong
> Don't see how we will go. > > Did I do anything wrong? yes , almost anything which could be done wrong peter
CCC - 02 Jul 2009 17:48 GMT On Jul 2, 12:01 pm, spellu...@fb04814.mathematik.tu-darmstadt.de
> In article <31a630a8-aa04-4c7c-857f-30d834331...@d4g2000yqa.googlegroups.com>, > [quoted text clipped - 25 lines] > > - Show quoted text - Well, where did I do wrong and how to fix it then?
The very first question is: can we always get correct result using Newton method IF the function is linear.
Peter Spellucci - 02 Jul 2009 20:09 GMT >On Jul 2, 12:01=A0pm, spellu...@fb04814.mathematik.tu-darmstadt.de >> In article <31a630a8-aa04-4c7c-857f-30d834331...@d4g2000yqa.googlegroups.= [quoted text clipped - 11 lines] >> >> > Use Newton's method to solve f(x) =3D 0 -> x(k+1) =3D x(k)-df(k)'*f(k) is this the zero of the tangent???
>> > Since f(x) =3D A*x, then df(x) =3D A'. >> wrong A' : you mean the transpose: no !
f is a vector function with components f(1), f(n)
f(i) = sum a(i,j)*x(j) - b(i) then df(i)/dx(j) = a(i,j)
>> >So we have: >> [quoted text clipped - 14 lines] >The very first question is: can we always get correct result using >Newton method IF the function is linear. hth peter
CCC - 02 Jul 2009 21:23 GMT f = A*x, df = A
Newton algorithm:
x(k+1) = x(k)-df'*f = x(k)-A'*A*x(k)
The remaining question is that can we estimate the steps required to converge in advance.
Torsten Hennig - 03 Jul 2009 07:24 GMT > f = A*x, df = A > [quoted text clipped - 5 lines] > steps required to > converge in advance. f(x) = A*x (df/dx)^(-1) = A^(-1)
Newton's algorithm: x_1 = x_0 - A^(-1)*A*x_0 = 0
Since A*x = 0 has solution x=0, convergence to the correct solution is achieved in one step.
Best wishes Torsten.
CCC - 03 Jul 2009 13:54 GMT On Jul 3, 2:24 am, Torsten Hennig <Torsten.Hen...@umsicht.fhg.de> wrote:
> > f = A*x, df = A > [quoted text clipped - 14 lines] > Since A*x = 0 has solution x=0, convergence to > the correct solution is achieved in one step. Thank you Torsten :)
How about in the case that A is not a square matrix?
In that case we do not have x_0 - inv(A)*A*x_0 = x_0 - x_0 = 0.
Peter Spellucci - 03 Jul 2009 14:18 GMT >On Jul 3, 2:24=A0am, Torsten Hennig <Torsten.Hen...@umsicht.fhg.de> >wrote: [quoted text clipped - 22 lines] > > In that case we do not have x_0 - inv(A)*A*x_0 =3D x_0 - x_0 =3D 0. an overdetermined system will not have a solution in general, neither in the linear nor in the nonlinear case. then you will have the problem ||F(x)||_2^2 = min_x a (non)linear least squares problem. Newton's method turns into Gauss-Newton-method: given x_k either solve
||J_F(x_k) d_k + F(x_k)||_2^2 = min_d_k subject to ||D_k|| <= \delta_k
and adapt \delta_k if necessary such that
0 < rho <= (||F(x_k)||_2^2 - ||F(x_k+d_k)||_2^2)/(||F(x_k)||_2^2-||||J_F(x_k) d_k + F(x_k)||_2^2) (i.e. use the trust region method)
or
solve ||J_F(x_k) d_k + F(x_k)||_2^2 = min_d_k (no constraints) and then compute the stepsize \sigma_k such that
||F(x_k)||_2^2 - ||F(x_k+\sigma_k d_k)||_2^2 >= \rho \sigma_k |F(x_k)^T J_F(x_k)^Td_k |
the stepsize approach.
This asumes thta the Jacobian of F J_F is of full rank. More tricks are required should J_F be rank deficient. hth peter
CCC - 03 Jul 2009 15:05 GMT Well, what if we simply use pseudoinverse in this case?
CCC - 03 Jul 2009 15:10 GMT > Well, what if we simply use pseudoinverse in this case? That is, we know in advanced that we do not have a solution to the original
problem when A is a rectangular matrix. So we re-define a problem that has
a solution. We add an additional constraint by minimizing the norm of the
best approximate solution we are to seek. Then we "always" have a solution?
I mean, will it always converge in this case when we still use Newton algorithm?
Chip Eastham - 04 Jul 2009 03:45 GMT > The very first question is: can we always get correct result using > Newton method IF the function is linear. Yes, but there's nothing magic going on here. If the function is linear, f(x) = Ax, then the "gradient" of f is A, and the Newton step will be to "solve" a linear system having the coefficient matrix A, which is tantamount to solving the original equation f(x) = c.
regards, chip
CCC - 04 Jul 2009 04:15 GMT > > The very first question is: can we always get correct result using > > Newton method IF the function is linear. [quoted text clipped - 5 lines] > coefficient matrix A, which is tantamount to > solving the original equation f(x) = c. Ah, indeed, Chip:
Understadning enough linear problem, let's move the nonlienar one.
Now, let's ask this question: do we have a way to estimate the maximum
iterations for a given nonlinear function solved by Newton's method?
What are the properties of a nonlinear that can help us to ensure the
convergence of Newton's method?
A very first one I can think of is a simple 2nd order function. In this
case, we are sure that using Newton's method will converge. But can we
estimate the max iterations, if yes how?
Thanks,
Peter Spellucci - 04 Jul 2009 13:10 GMT >> > The very first question is: can we always get correct result using >> > Newton method IF the function is linear. [quoted text clipped - 26 lines] > > Thanks, you had several questions: what if we use the pseudoinverse? the pseudoinverse is in use if you take the Gauss/Newton resp. Levenberg/Marquardt there is _no_ convergence theory for a Newton method (for a problem R^n->R^n with a zero solution) other than by Ben Israel but this one had the unrealistic assumption that the rank of the Jacobian is constant in a complete neighborhood of the solution. And this is a result of the type "provided that x0 is sufficiently near to the solution"
minimizing the sum of squares with a norm constraint on x has of course always a solution but this has nothing to do with guarantee of convergence of the methods mentioned.
For Gauss-Newton resp. Levenberg-Marquardt there is a complete convergence theory which guarantees convergence to _one of the possibly many_ stationary points. look at the classic text by dennis and schnabel: "numerical methods for unconstrained minimization" (SIAM reprint)
for the ordinary Newton's method there is a complexity result which says that this is proportional to the reciprocal of the distance of the initial point to the next point where the Jacobian is rank deficient -- but nothing quantitiative. (-> results by Smale)
hth peter
MeAmI.org - 05 Jul 2009 10:57 GMT MeAmI.org wrote:
> >> > The very first question is: can we always get correct result using > >> > Newton method IF the function is linear. [quoted text clipped - 53 lines] > hth > peter Qualitative results (-> further by Musatov):
In particular, if k := max(\D(z,x) : x GV — z), then every x belongs to \jj(z ... 6.10] Let D = (V, A) be a k-arc-strong directed multigraph and let x,y be...Slide 1Max-Cut Max-Directed-Cut Min- Bisection Sparsest-Cut Balanced-Separator .... technique of using Fourier transforms to analyze [Dictator Tests] seems very strong. ....
Test: Choose triple (x, y, z) from Dn. D = w. prob. = ). ) ) ) z. y. x ...
6 d(G) 6 2r(G) 6 2r(D) > 1 > 2rfrom u to v is the length of a shortest (directed) u v path. The eccentricity of a ...
The sought strong. digraph D is constructed in Fig. 1. z ...
min e(u) = e(y) = r = r(D);. max e(u) = e(z ....
x. n. = b) be a shortest oriented walk from a to b through w. and let the node x ...
going from x. i. = w to b is. 6 d ..
Methods for statistical data analysis of multivariate observations:
For step 3a, however, the following is substituted: if x and y are entities clustered in C]+ , but not in Cj, define d([x, y], z) = max{<f (x, z), d(y, z)}
Directed tree-width examples:
A set S ⊆ D \ Z is Z-normal, if there is no directed walk in D \ Z with ....
Let Y be the first position of the four cops, in ....
t > e} is the union of strong components in D \ X ...
-- Musatov
MeAmI.org - 05 Jul 2009 10:57 GMT MeAmI.org wrote:
> >> > The very first question is: can we always get correct result using > >> > Newton method IF the function is linear. [quoted text clipped - 53 lines] > hth > peter Qualitative results (-> further by Musatov):
In particular, if k := max(\D(z,x) : x GV — z), then every x belongs to \jj(z ... 6.10] Let D = (V, A) be a k-arc-strong directed multigraph and let x,y be...Slide 1Max-Cut Max-Directed-Cut Min- Bisection Sparsest-Cut Balanced-Separator .... technique of using Fourier transforms to analyze [Dictator Tests] seems very strong. ....
Test: Choose triple (x, y, z) from Dn. D = w. prob. = ). ) ) ) z. y. x ...
6 d(G) 6 2r(G) 6 2r(D) > 1 > 2rfrom u to v is the length of a shortest (directed) u v path. The eccentricity of a ...
The sought strong. digraph D is constructed in Fig. 1. z ...
min e(u) = e(y) = r = r(D);. max e(u) = e(z ....
x. n. = b) be a shortest oriented walk from a to b through w. and let the node x ...
going from x. i. = w to b is. 6 d ..
Methods for statistical data analysis of multivariate observations:
For step 3a, however, the following is substituted: if x and y are entities clustered in C]+ , but not in Cj, define d([x, y], z) = max{<f (x, z), d(y, z)}
Directed tree-width examples:
A set S ⊆ D \ Z is Z-normal, if there is no directed walk in D \ Z with ....
Let Y be the first position of the four cops, in ....
t > e} is the union of strong components in D \ X ...
-- Musatov
|
|
|