So what we want to find out about convexity is the link between convexity and
the second derivatives. Let us look at what the single-variable case, with its
link between the second derivative and convexity, has in common with the
multivariate case. The connection is not straightforward. What you expect to
see in the single-variable case is that the second derivative should be
positive for the function to be convex. Of course, there is also the concave
case, as you all remember, but it's the same [inaudible] So we are not
discussing it. Well, it's obvious. Now, for a multivariate function we do not
have a single second
derivative here. We do not know what that concept would be, just as we do not
know what the first derivative of the whole multivariate function is. But here
is what we can do. Go back to one variable and assume that the second
derivative is positive for all x on, for example, some segment. If we multiply
it by dx squared, which is non-negative, the inequality still holds. And what
we have actually got on the left-hand side of the inequality is the second
differential of the function f, which is positive for every nonzero change of
the argument. That is a concept which does exist for multivariate functions,
and we are going to generalize our convexity rule from it. So we are going to
say pretty much the same thing: a function is convex if its second
differential is positive for all possible changes of the variables. That's
fine. That's nice. But we have a problem here, because the change of variables
is now a vector. The change in x and the change in y together make up
that vector. So we need to understand how to properly decide whether or not
this second differential is always positive. First, let us look at the second
differential in its full form, and let us look at it the way we did at school,
because what we see here is a quadratic function of, for example, dx, or dy,
or maybe their ratio. We are looking at something that resembles ax squared
plus bx plus c. What do we want from it? We want this function to be always
positive. How can that happen? As you remember, the graph of this function is
a parabola, and it looks like this, or like this, or maybe like this one. So
what should we demand from our second differential, from our quadratic
function, in order for it to be always positive? Well, first of all, it should
have a negative discriminant. Because if it has any roots, it is zero there,
and with two roots it takes both positive and negative values, so we do not
get the strict plus sign from our proposed rule. So we need to look at the
discriminant of our function, remember, b squared minus 4ac, and then just ask it to
be negative: no roots. For the other part, if a parabola has no roots, it
opens either upwards or downwards, and that is defined by the leading
coefficient, which here is the second derivative with respect to x twice. So
what do we get? First, we rely on the discriminant of the parabola. In our
case it is the capital D here, which tells us whether or not the parabola has
roots. If this D, which in our case is the discriminant with the opposite
sign, up to a positive factor, is positive, then there are no roots, and the
function is either convex or concave. The rest is quite easy. If the leading
coefficient is positive, then it's convex; if it's negative, it's concave.
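
The rule just described can be written out as a short derivation. This is only a sketch, assuming that f is twice continuously differentiable so that the two mixed partial derivatives are equal; the ratio t = dx/dy and the subscript notation are introduced here for illustration and are not taken from the lecture slides.

```latex
\begin{align*}
d^2 f &= f_{xx}\,dx^2 + 2 f_{xy}\,dx\,dy + f_{yy}\,dy^2 \\
      &= dy^2 \left( f_{xx}\,t^2 + 2 f_{xy}\,t + f_{yy} \right),
         \qquad t = \frac{dx}{dy},\ dy \neq 0 \\[4pt]
\text{discriminant in } t &= (2 f_{xy})^2 - 4 f_{xx} f_{yy} = -4D,
         \qquad D = f_{xx} f_{yy} - f_{xy}^2 \\[4pt]
D > 0,\ f_{xx} > 0 &\;\Longrightarrow\; d^2 f > 0
         \text{ for all } (dx, dy) \neq (0, 0) \quad \text{(convex)} \\
D > 0,\ f_{xx} < 0 &\;\Longrightarrow\; d^2 f < 0
         \text{ for all } (dx, dy) \neq (0, 0) \quad \text{(concave)}
\end{align*}
```

When dy is zero the form reduces to f_xx times dx squared, which has the sign of f_xx, so that direction is covered by the same conclusion.
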
And that's all. Using only our school algebra, we have just come up with a way
to determine multivariate convexity. The thing is that our convexity rule
really comes from linear algebra, because it is the concept of positive
semidefiniteness of the quadratic form of the second differential, but you are
going to learn more about that in the linear algebra course or, as always, in
our additional materials. For now, to make sure that we understand what is
going on, let us consider the following example. Let us look at the function
ax squared plus by squared plus cxy. First, compute the second differential,
and then determine under what conditions on a, b, c the function is convex or
concave. I am just going to say that a, b, c are real and nonzero, to simplify
the case. So what are we going to do first? We are going to start with
the partial derivatives, which is the easiest part: with respect to x it is
2ax plus cy, and with respect to y it is 2by plus cx. Then we just write down
the second partial derivatives. I think we are even going to skip writing down
the second differential and just apply our convexity and concavity rules
directly. The second derivative with respect to x twice is the derivative of
the first partial derivative with respect to x, so it is 2a. The same applies
to the second partial derivative with respect to y twice, which is 2b. As for
the mixed derivative, with respect to x and y, we write c, because we need to,
for example, differentiate the derivative with respect to x by y, which gives
c. So what is our D, the discriminant with the opposite sign? We need to
multiply the second partial derivatives with respect to x twice and with
respect to y twice, and subtract the mixed partial derivative squared. So we
write 4ab minus c squared. As you can all see, it is related to the
discriminant itself, and that is pretty much the reason why we are using it.
So let us consider some cases. To get convexity or concavity here, we need D
to be positive, so 4ab should be larger than c squared. What can we expect
from there? If a is positive, then we should expect that the function is
convex. If a is negative, then we should expect that it is concave.
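
The worked example can be turned into a few lines of code. This is a minimal sketch of the D rule for this one family of functions; the function name `classify_quadratic` and the label returned when D is not positive are choices made here for illustration, not something from the lecture.

```python
def classify_quadratic(a: float, b: float, c: float) -> str:
    """Classify f(x, y) = a*x**2 + b*y**2 + c*x*y with the D rule."""
    f_xx = 2 * a   # second partial derivative with respect to x twice
    f_yy = 2 * b   # second partial derivative with respect to y twice
    f_xy = c       # mixed partial derivative
    d = f_xx * f_yy - f_xy ** 2   # D = 4ab - c^2

    if d > 0 and f_xx > 0:
        return "convex"
    if d > 0 and f_xx < 0:
        return "concave"
    # D <= 0: this simple rule gives no convex/concave verdict
    return "undecided by this rule"


print(classify_quadratic(1, 1, 1))    # D = 3, a > 0
print(classify_quadratic(-1, -2, 1))  # D = 7, a < 0
print(classify_quadratic(1, 1, 3))    # D = -5
```

The three calls cover one case of each kind: D positive with positive a, D positive with negative a, and D negative, where the rule does not apply.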
That's the result. But the question we should finish with is whether or not
we actually understand why we are talking about the coefficient for x and not
the coefficient for y. Because, well, you can easily see that x and y are
interchangeable here. So you could look not at the second derivative with
respect to x twice but at the second derivative with respect to y twice. This
is where our D rule comes into play. Assume that a is positive, so the
function should be convex. By our rule, 4ab is larger than c squared, and c
squared is positive, so with a positive, b must also be positive. The same
applies for negative a: there b must always be negative. So it does not
actually matter which one we look at, the second derivative with respect to x
twice or with respect to y twice: they always have the same sign, and our
conclusion should
always be the same. The last thing to note here is that the cases where we
get zeros in our calculations are tricky and need more attention than is
provided by this simple rule that we have just proved and built. We are going
to elaborate on this when we talk about our optimization routine, in the last
week, I believe. For now, we have just come up with the idea of how to
determine convexity on the [inaudible]
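
As a sanity check on the idea that the second differential must keep one sign for all possible changes of the variables, one can sample many directions (dx, dy) and look at the signs that appear. This is a rough numerical sketch for the example function ax squared plus by squared plus cxy only; the helper names, the sample count and the fixed seed are assumptions made here, and sampling illustrates the rule rather than proving it.

```python
import math
import random


def second_differential(a, b, c, dx, dy):
    """d2f for f(x, y) = a*x**2 + b*y**2 + c*x*y."""
    return 2 * a * dx ** 2 + 2 * c * dx * dy + 2 * b * dy ** 2


def sampled_signs(a, b, c, samples=1000, seed=0):
    """Collect the signs d2f takes over random unit directions (dx, dy)."""
    rng = random.Random(seed)
    signs = set()
    for _ in range(samples):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        value = second_differential(a, b, c, math.cos(angle), math.sin(angle))
        signs.add(1 if value > 0 else -1 if value < 0 else 0)
    return signs


print(sampled_signs(1, 1, 1))    # D = 3 > 0 and a > 0: only positive values
print(sampled_signs(-1, -2, 1))  # D = 7 > 0 and a < 0: only negative values
print(sampled_signs(1, 1, 3))    # D = -5 < 0: the sign is not constant
```

With D positive, every sampled direction gives the same sign as a, which matches the rule; with D negative, both signs show up, so the function is neither convex nor concave.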