Now, since we know partial derivatives and the direction of
maximal growth, the essential idea is to understand how to find the points where
our function has a maximal or minimal value, that is, how to define
a function's extremum. So basically, we are
going to generalize everything I said for the single-variable function. Let us start with the basics. Let's start with the definition. What point is called extremal? A point is called extremal
if, in some neighborhood of this point, the function has its greatest or lowest
value at this very point. That's quite the same
as it was before. So what do we remember? We remember that the extremum is also connected
with the first derivatives, here of course those of a
multivariate function. First, let us remind
ourselves that we've covered the concept
of stationary points. A stationary point
is a point where all first partial derivatives
are equal to zero. Put another way, at such a point the function has zero rate of change in any
possible direction. As you may remember, stationarity is a necessary but not a sufficient
condition for an extremum. It basically means that if the function has an
extremum at a point and is differentiable there,
then this point
must be stationary. But it is not a sufficient condition, and we're going to show it
in the following manner. Well, consider two functions: x squared plus y squared, which is a paraboloid,
and x squared minus y squared,
which is a hyperbolic paraboloid. So first, let us
compute the gradient for
both of these functions. For the first function, it is (2x, 2y). For the second function, it is (2x, minus 2y). So basically, if we're looking for all the possible
stationary points, here in both cases the stationary point is
unique, and it is (0, 0). But as we clearly understand, for the first function it is certainly a minimum, and
a global minimum, because neither x squared nor y squared can
be less than zero, and at the point (0, 0) the function has the value zero, which is its global minimum. But as for the second function, if we, for example, take the restriction of the
function to y equal to zero, then we get f, or z, equal to x
squared, and for x squared the point (0, 0) is a minimum. But if we consider another restriction,
for example x equal to zero, we get
f equal to minus y squared, which has a
maximum at the point (0, 0). So basically, for
one restriction it is a maximum, and for
the other it is a minimum. Actually, we have already
seen this surface. Remember, we had one
parabola opening upwards and another
parabola opening downwards. So basically,
this is a case where a stationary point is
not an extremum. So if we have already established
a necessary condition, then we need to think
about a sufficient one. Let's follow pretty much the same
rule here. For the single-variable function, the sufficient
condition was all about the concavity or
convexity of the function. If you remember,
if the function was convex at the stationary point, then it was a minimum. In order to remember it, we draw the function [inaudible]. Here is a convex function and
a concave function [inaudible]. For the convex one, this is a minimum.
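As a side note, the single-variable rule recalled here can be sketched numerically. This is only an illustration under stated assumptions: the helper names `second_derivative` and `classify_1d`, the step `h`, and the tolerance `tol` are hypothetical and not part of the lecture. The sketch applies the rule to the two restrictions of x squared minus y squared from the earlier example.

```python
def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x) (h is an assumed step size)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def classify_1d(f, x0, tol=1e-6):
    """Classify a stationary point x0 of f by the sign of f''(x0)."""
    d2 = second_derivative(f, x0)
    if d2 > tol:
        return "minimum"       # convex near x0
    if d2 < -tol:
        return "maximum"       # concave near x0
    return "inconclusive"      # the second derivative gives no answer


def saddle(x, y):
    return x**2 - y**2


def along_x(t):
    """Restriction of the saddle to y = 0, which is t squared."""
    return saddle(t, 0.0)


def along_y(t):
    """Restriction of the saddle to x = 0, which is minus t squared."""
    return saddle(0.0, t)


print(classify_1d(along_x, 0.0))  # minimum
print(classify_1d(along_y, 0.0))  # maximum
```

One restriction reports a minimum and the other a maximum at the same point, which is exactly why (0, 0) is stationary but not an extremum for that function.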
For the concave one, this is a maximum. So convexity is closely linked with
optimization in our case. So let's just try to use
the same principle here. If the function is convex,
that's a minimum. If the function is concave, that's a maximum. For the multivariate case this is extremely useful, because we have already established how to find out whether a function is
convex or concave in the case of multiple variables. In order to do so,
we considered whether the
second differential of the function is positive
or negative semidefinite. I'm going to remind you. Firstly, we've looked
at the full formula of the second differential, and
we've looked at it as a quadratic function of dx and dy, and we
considered the value D, which is
the second derivative with respect to
x two times, multiplied by the second derivative
with respect to y two times, minus the squared mixed second
derivative with respect to x and y. The idea was that if this value, which is,
up to a positive factor, minus the discriminant of the quadratic function in
the second differential, is positive, then by checking whether the first coefficient is
positive or negative we can distinguish between
the convex and concave cases, and if D
is negative, then basically we have
the non-extremum case. So let us try to use this
rule and show on an example how all this works,
from start to finish. Firstly, our procedure
works as follows. Find the partial
derivatives, then find the stationary points by solving
the system of equations, then find the second
derivatives and decide whether the
second differential is positive or negative
semidefinite. So the example we're going
to look at is the following: x to the fourth plus y to the fourth minus x squared
minus 2xy minus y squared. So let us start with
our procedure. Firstly, let us start
with the partial derivatives. I'm
going to write them down here,
all of them actually,
because we need the second partial derivatives afterwards to decide whether the second differential
is semidefinite. So the first one, the partial derivative with respect
to x, is 4x cubed minus 2x minus 2y. With respect to y it is pretty much the same: 4y cubed minus 2x minus 2y. We need to find stationary points afterwards
but I'm just going to find the second derivatives here first. So if we differentiate the first partial derivative
with respect to x one more time
with respect to x, we get 12x
squared minus 2. The same applies to the second partial
derivative with respect to y two times: it is 12y squared minus 2. For the mixed second
derivative with respect to x and y, we simply get minus 2. So the next thing to do
is to solve our system of equations, with both first partial
derivatives equal to zero. As you can see, in both equations there is the block
minus 2x minus 2y. Basically this means that, since those blocks
coincide,
the remaining terms are
equal, that is, 4x cubed equals 4y cubed, because
the right-hand sides of both equations are the same. So this implies that
at stationary points x equals y, thus
we get the equation 4x cubed minus 4x equals zero. We can divide by four, thus we get x cubed
minus x equals zero. Thus we get all
possible solutions: x equals y equals 0, x equals y equals 1, and x equals
y equals minus 1. So we get basically (0, 0), (1, 1), and (minus 1, minus 1) as our
stationary points. So let us start with (1, 1) and (minus 1, minus 1), simply because, as you can see from
the second derivatives,
the values of the second
derivatives are the same
at both of them. I'm going to use green here to stress that I'm working
with these two points. So basically, if we
substitute our stationary point for x and y here, we get 12 multiplied by
1 minus 2, which is 10. This is still minus 2, and this is 10. So we need to decide what to do with our second differential. In order to do it,
we need to find our negative discriminant, the
capital D value, which is 10, the second partial
derivative with respect to x two times, multiplied by 10, the second partial
derivative with respect to y two times, minus the square of minus 2,
which is 96, which is greater than zero. So our second differential is
definite, and since our second derivative with respect to x two times is greater than zero, it is positive definite and we get the convex case: both of these stationary
points are minimums. So the only thing we need
to decide is whether or not this works for
our (0, 0) point. As for the (0, 0) point, everything is just a bit more complicated because,
as you can see, the values of all second partial
derivatives here are minus 2. Thus our D value is
minus 2 multiplied by minus 2 minus the square of minus 2, which is zero, and our rule for
convexity doesn't work. More importantly, let us
look at this very function. As you remember, we've said that if the
convexity rule does not work, then we do not know
anything about convexity, and we should probably look
at the function itself. Basically,
consider for
example two things. Firstly, the last
three terms can easily be written as
minus (x plus y) squared. So let us do it. We're going
to write x to the fourth plus y to the fourth minus
(x plus y) squared. So for the direction, for example, x equals minus y, this last term actually
disappears and we get the quite nice 2x to the fourth, which has a minimum
at the zero point. So at least in one direction, our (0, 0) point is a minimum. But it's easy to understand
that if we consider, for example, not x equals
minus y but x equals y, our function turns into 2x to the fourth minus 4x squared, which, as is easy to show, has a local maximum
at the zero point. So as a result,
at the (0, 0) point, well, we get nothing. It is not an extremum. That was the full scheme
of exploring whether or not this function
has extrema and finding out what
those extrema are.
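The whole scheme for this example can be sketched in a few lines of Python. This is a minimal sketch, not a general solver: the derivatives are typed in by hand from the formulas above, the stationary points are the three found above, and the names `grad`, `hessian_entries`, and `classify` are assumptions introduced only for illustration.

```python
def f(x, y):
    return x**4 + y**4 - x**2 - 2*x*y - y**2


def grad(x, y):
    """First partial derivatives (f_x, f_y), entered by hand."""
    return (4*x**3 - 2*x - 2*y, 4*y**3 - 2*x - 2*y)


def hessian_entries(x, y):
    """Second partial derivatives (f_xx, f_yy, f_xy), entered by hand."""
    return (12*x**2 - 2, 12*y**2 - 2, -2)


def classify(x, y):
    """Apply the D rule at a stationary point."""
    fxx, fyy, fxy = hessian_entries(x, y)
    d = fxx * fyy - fxy**2
    if d > 0:
        return "minimum" if fxx > 0 else "maximum"
    if d < 0:
        return "not an extremum"
    return "inconclusive"  # D = 0: the rule does not decide


stationary = [(0, 0), (1, 1), (-1, -1)]
for p in stationary:
    assert grad(*p) == (0, 0)  # each point really is stationary
    print(p, classify(*p))

# D = 0 at (0, 0), so look at the function itself along two directions.
t = 0.1
print(f(t, -t) > f(0, 0))  # True: along x = -y the value rises
print(f(t, t) < f(0, 0))   # True: along x = y the value falls
```

The rule reports a minimum at (1, 1) and at (-1, -1) and is inconclusive at (0, 0). The two directional comparisons then show values both above and below f(0, 0) nearby, so (0, 0) is not an extremum.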