# Solution to a Function Equation and Divergence Measures

## Abstract

We investigate the solution to a function equation which arises from the theory of divergence measures. Moreover, new results on divergence measures are given.

## 1. Introduction

As early as 1952, Chernoff [1] used the $\alpha$-divergence to evaluate classification errors. Since then, the study of various divergence measures has attracted many researchers. It is now known that the Csiszár $f$-divergence is the unique class of divergences having information monotonicity, from which the dual $\alpha$ geometrical structure with the Fisher metric is derived, and that the Bregman divergence is another class of divergences, which gives a dually flat geometrical structure different from the $\alpha$-structure in general. Divergence measures between two probability distributions or positive measures have proved to be useful tools for solving problems in optimization, signal processing, machine learning, and statistical inference. For more information on the theory of divergence measures, see, for example, [2-5] and references therein.

Motivated by these studies, we investigate in this paper the solution to the following function equation

(1.1)

which arises from the discussion of the theory of divergence measures, and show that for , if , , and satisfy

(1.2)

then is the solution of a linear homogeneous differential equation with constant coefficients. Moreover, new results on divergence measures are given.

Throughout this paper, we let be the set of real numbers and are a convex set.

Basic notations: ; is strictly convex and twice differentiable; is a differentiable injective map; is the general vector Bregman divergence; is a strictly convex, twice continuously differentiable function satisfying ; is the vector -divergence.

If for every ,

(1.3)

then we say the or is in the intersection of -divergence and general Bregman divergence.
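As a concrete illustration of a divergence lying in both classes, the following minimal sketch checks numerically that the Kullback-Leibler divergence can be written both as a Csiszár divergence and as a Bregman divergence. It uses the standard textbook definitions, not the vector and general versions defined by the equations of this paper, and the helper names `f_divergence` and `bregman_divergence` as well as the sample distributions are hypothetical choices made for the example.

```python
import numpy as np

def f_divergence(p, q, f):
    """Csiszar f-divergence in the textbook form sum_i q_i * f(p_i / q_i)."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(q * f(p / q)))

def bregman_divergence(p, q, phi, grad_phi):
    """Bregman divergence phi(p) - phi(q) - <grad phi(q), p - q>."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(phi(p) - phi(q) - np.dot(grad_phi(q), p - q))

# Generator of the f-divergence: f(t) = t log t (strictly convex, f(1) = 0).
f_kl = lambda t: t * np.log(t)

# Generator of the Bregman divergence: negative Shannon entropy.
neg_entropy = lambda x: float(np.sum(x * np.log(x)))
grad_neg_entropy = lambda x: np.log(x) + 1.0

# Two strictly positive probability vectors (arbitrary sample data).
p = np.array([0.2, 0.3, 0.5])
q = np.array([0.4, 0.4, 0.2])

d_f = f_divergence(p, q, f_kl)
d_b = bregman_divergence(p, q, neg_entropy, grad_neg_entropy)
print(d_f, d_b)  # the two values agree up to rounding
```

The agreement relies on both vectors summing to one: for general positive measures the Bregman form carries the extra term sum(q) - sum(p).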

For more information on some basic concepts of divergence measures, we refer the reader to, for example, [2-5] and references therein.

## 2. Main Results

Theorem 2.1.

Assume that there are differentiable functions

(2.1)

and such that

(2.2)

Then and

(2.3)

for some .

Proof.

Since is differentiable functions, it is clear that

(2.4)

Let

(2.5)

Then is a finite-dimensional space. So we can find differentiable functions

(2.6)

as an orthonormal basis of , where . Observing that

(2.7)

where

(2.8)

we have

(2.9)

Clearly,

(2.10)

Next we prove that

(2.11)

It is easy to see that we only need to prove the following fact:

(2.12)

Actually, if this is not true, that is,

(2.13)

then there exists such that

(2.14)

Therefore

(2.15)

Because

(2.16)

we get

(2.17)

that is,

(2.18)

Since is linearly independent, we see that

(2.19)

So

(2.20)

This is a contradiction. Hence (2.12) holds, and so does (2.11). Thus, there are such that

(2.21)

Therefore,

(2.22)

So we have

(2.23)

Define

(2.24)

Then

(2.25)

Let , and

(2.26)

Then

(2.27)

Since is a symmetric matrix, we have

(2.28)

for an orthogonal matrix , and a diagonal matrix

(2.29)
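The diagonalization step used here is the spectral theorem for real symmetric matrices. A minimal numerical sketch, assuming a hypothetical 3 x 3 symmetric matrix `S` in place of the matrix of the proof (whose entries are not reproduced here), is:

```python
import numpy as np

# Hypothetical symmetric matrix standing in for the one in the proof.
S = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])

# eigh is specialised to symmetric matrices: it returns real eigenvalues
# and an orthogonal matrix Q whose columns are eigenvectors.
eigenvalues, Q = np.linalg.eigh(S)
D = np.diag(eigenvalues)

# S = Q D Q^T with Q^T Q = I, up to rounding.
reconstruction_error = np.max(np.abs(Q @ D @ Q.T - S))
orthogonality_error = np.max(np.abs(Q.T @ Q - np.eye(3)))
print(reconstruction_error, orthogonality_error)
```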

Write

(2.30)

Then

(2.31)

So, for all ,

(2.32)

Without loss of generality, we can assume that

(2.33)

Thus, for all ,

(2.34)

By arguments similar to those above, we can prove

(2.35)

So there is a matrix satisfying

(2.36)

Thus,

(2.37)

By mathematical induction we obtain

(2.38)

So .

Let

(2.39)

be the annihilating polynomial of . Then

(2.40)

Since , we can find such that

(2.41)

The proof is then complete.
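The last step of the proof rests on a general fact: if a vector function satisfies F'(x) = A F(x) for a constant matrix A, then every annihilating polynomial of A yields a linear homogeneous differential equation with constant coefficients satisfied by each component of F. The sketch below checks this numerically under stated assumptions: the matrix `A`, the vector `v`, and the evaluation point `x` are hypothetical, and the characteristic polynomial (annihilating by the Cayley-Hamilton theorem) stands in for the polynomial of the proof.

```python
import numpy as np

# Hypothetical constant matrix; symmetric only so that exp(A x) is easy to form.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
n = A.shape[0]

# Coefficients of the characteristic polynomial, monic, highest degree first.
coeffs = np.poly(A)

def poly_at_matrix(coeffs, M):
    """Evaluate the polynomial at the matrix M by Horner's scheme."""
    result = np.zeros_like(M)
    identity = np.eye(M.shape[0])
    for c in coeffs:
        result = result @ M + c * identity
    return result

# Cayley-Hamilton: the characteristic polynomial annihilates A.
cayley_hamilton_residual = np.max(np.abs(poly_at_matrix(coeffs, A)))

eigenvalues, Q = np.linalg.eigh(A)

def F(x, v):
    """F(x) = exp(A x) v, which solves F' = A F with F(0) = v."""
    return Q @ (np.exp(eigenvalues * x) * (Q.T @ v))

v = np.array([1.0, -2.0, 0.5])
x, h = 0.5, 1e-5

# Central-difference check that F' = A F at the point x.
derivative_gap = np.max(np.abs((F(x + h, v) - F(x - h, v)) / (2 * h) - A @ F(x, v)))

# By induction F^(k) = A^k F, so sum_k a_k F^(k)(x) = p(A) F(x) = 0.
derivatives = [np.linalg.matrix_power(A, k) @ F(x, v) for k in range(n + 1)]
ode_residual = np.max(np.abs(sum(c * derivatives[n - i] for i, c in enumerate(coeffs))))
print(cayley_hamilton_residual, derivative_gap, ode_residual)
```

All three printed quantities should be zero up to rounding and finite-difference error.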

Theorem 2.2.

Let the -divergence be in the intersection of -divergence and general Bregman divergence. Then satisfies

(2.42)

for some .

Proof.

If are in the intersection of -divergence and general Bregman divergence, then we have

(2.43)

where

(2.44)

Let

(2.45)

Then

(2.46)

Hence

(2.47)

Let

(2.48)

Then

(2.49)

Thus, a modification of Theorem 2.1 implies the conclusion.

Moreover, it is not hard to deduce the following theorem.

Theorem 2.3.

Let a vector -divergence be in the intersection of vector -divergence and general Bregman divergence and satisfy

(2.50)

where is strictly monotone twice-continuously differentiable functions. Then the divergence is -divergence or vector -divergence times a positive constant .

## References

1. Chernoff H: A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Annals of Mathematical Statistics 1952, 23: 493-507. 10.1214/aoms/1177729330

2. Amari S: Information geometry and its applications: convex function and dually flat manifold. In Emerging Trends in Visual Computing, Lecture Notes in Computer Science. Volume 5416. Edited by: Nielsen F. Springer, Berlin, Germany; 2009:75-102. 10.1007/978-3-642-00826-9_4

3. Brègman LM: The relaxation method of finding a common point of convex sets and its application to the solution of problems in convex programming. Computational Mathematics and Mathematical Physics 1967, 7: 200-217.

4. Cichocki A, Zdunek R, Phan AH, Amari S: Non-Negative Matrix and Tensor Factorizations: Applications to Explanatory Multi-Way Data Analysis and Blind Source Separation. Wiley, New York, NY, USA; 2009.

5. Csiszár I: Why least squares and maximum entropy? An axiomatic approach to inference for linear inverse problems. The Annals of Statistics 1991, 19(4): 2032-2066. 10.1214/aos/1176348385

## Acknowledgments

This work was partially supported by the NSF of China and the Specialized Research Fund for the Doctoral Program of Higher Education of China.

## Author information

### Corresponding author

Correspondence to Jin Liang.

Dong, CL., Liang, J. Solution to a Function Equation and Divergence Measures. Adv Differ Equ 2011, 617564 (2011). https://doi.org/10.1155/2011/617564

### Keywords

• Machine Learning
• Functional Equation
• Orthonormal Base
• Divergence Measure
• Statistical Inference