Welcome to Optimization 101. We need to understand what we're looking for and how to characterize it. The first step is understanding what is called an extremum point. Note that "extremum" is the singular form; its plural is "extrema", and we're going to work with both. A point A is called a local extremum of a function f if f(A) is the greatest or the lowest value in some neighborhood of A. Read this definition carefully: it only asks whether there exists some neighborhood that satisfies our demand for the greatest or the lowest value. You don't need the condition to hold for every neighborhood; any single neighborhood will suffice. Let's look at the most common example. I've taken a cubic polynomial here to represent the basic cases. We have three points, A, B, and C. First, point A is a local extremum because f(A) is the greatest value in this neighborhood. Point B is a local extremum because f(B) is the lowest value in this neighborhood. Point A is called a local maximum, point B is called a local minimum, and point C is neither a maximum nor a minimum. Whatever neighborhood we take around C, on the left side of C we get greater values than f(C), and on the right side we get lower values than f(C). So it's not an extremum at all. I'm going to spend some time on this graph, because you need to see one key difference here. I'm not speaking about global extrema, the greatest and lowest values overall. The point is that "local maximum" and "local minimum" do not imply that the value at a point called a maximum is greater than the value at a point called a minimum.
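The neighborhood-based definition above can be sketched numerically. The cubic f(x) = x³ − 3x below is my own illustrative choice (not necessarily the one on the lecture's graph); it has a local maximum at x = −1, a local minimum at x = +1, and a point at x = 0 where, like point C, values on one side are greater and on the other side lower.

```python
# Hypothetical cubic chosen for illustration: f(x) = x**3 - 3*x.
# Local maximum at x = -1, local minimum at x = +1.
def f(x):
    return x**3 - 3*x

def is_local_max(g, a, radius=0.1, samples=1000):
    """Check numerically that g(a) is the greatest value
    on the neighborhood [a - radius, a + radius]."""
    xs = [a - radius + 2*radius*i/samples for i in range(samples + 1)]
    return all(g(a) >= g(x) for x in xs)

def is_local_min(g, a, radius=0.1, samples=1000):
    """Check numerically that g(a) is the lowest value
    on the neighborhood [a - radius, a + radius]."""
    xs = [a - radius + 2*radius*i/samples for i in range(samples + 1)]
    return all(g(a) <= g(x) for x in xs)

print(is_local_max(f, -1.0))  # True: one small neighborhood suffices
print(is_local_min(f, 1.0))   # True
# x = 0 behaves like point C: neither a maximum nor a minimum.
print(is_local_max(f, 0.0) or is_local_min(f, 0.0))  # False
```

Note the check only inspects one neighborhood, matching the "any neighborhood will suffice" part of the definition.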
Since all these points are local, the relation between them is simply not defined. Imagine a graph shaped like an extremely long staircase-like curve, and let us mark some maxima and minima on it. Let me take a red color: that's a maximum, that's a minimum, that's a maximum, that's a minimum. And here's the trouble: we have a maximum and a minimum, and the minimum is actually greater than the maximum. That's exactly the point. All these notions are local, so we cannot draw any conclusions just from the words "maximum" and "minimum" here. That's a bit sad, but it's just the case. So what we're going to do next is understand how extrema are connected with derivatives. For that, we recall the mean value theorem. As you remember, the mean value theorem tells us that the change of a function is proportional to the derivative and the change of the argument: f(b) − f(a) = f′(c)(b − a) for some point c between a and b. So if, for example, the derivative is positive on the whole segment, or negative on the whole segment, the function changes monotonically on that segment: if the derivative is positive, the function monotonically grows, and if the derivative is negative, the function monotonically falls. In other words, the sign of the derivative determines the monotonicity of the function, and this monotonicity is crucial for us.
Now assume, for example, that to the left of point A the function we are considering is growing, and to the right of A it is falling. Then the derivative at point A can be neither positive nor negative; it must necessarily equal zero. This is the necessary condition for extrema. The rule looks as follows: if a function has an extremum at a given point and is differentiable at that point, then the derivative at that point equals zero. Okay, that's understandable. But how is this connected with convexity and the second derivative? To understand this, we need to move further and see how the second derivative is connected with convexity.
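The necessary condition above can be sketched with a finite-difference derivative. Using the same illustrative cubic f(x) = x³ − 3x (an assumption of mine, not the lecture's exact graph), the derivative at both of its local extrema, x = −1 and x = +1, comes out numerically zero.

```python
# Necessary condition for an extremum, checked numerically:
# at a differentiable local extremum, the derivative is zero.
def f(x):
    return x**3 - 3*x

def numerical_derivative(g, x, h=1e-6):
    """Central difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2*h)

for extremum in (-1.0, 1.0):  # local max and local min of f
    print(abs(numerical_derivative(f, extremum)) < 1e-6)  # True
```

Note the condition is only necessary, not sufficient: a point like C on the lecture's graph can also have a zero derivative without being an extremum, which is why we'll need the second derivative and convexity next.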