Estimation of divergences on time scales via the Green function and Fink’s identity

The aim of the present paper is to obtain new generalizations of an inequality for n-convex functions involving Csiszár divergence on time scales using the Green function along with Fink’s identity. Some new results in h-discrete calculus and quantum calculus are also presented. Moreover, inequalities for some divergence measures are also deduced.


Introduction
The theory of time scales was initiated by Hilger in 1988 to unify difference and differential calculus within a single consistent framework. The books of Bohner and Peterson [8,9] give a compact and extensive treatment of time scale calculus. This theory gives insight into the precise differences between discrete and continuous systems.
In recent years, new developments in the theory and applications of dynamic derivatives on time scales have emerged. Many results from the continuous case carry over to the discrete one very easily, but others turn out to be completely different. The study of time scales reveals such discrepancies and helps explain the difference between the two cases. Results in time scale calculus unify and extend their continuous and discrete counterparts. This hybrid theory is also extensively used for dynamic inequalities.
Various linear and nonlinear integral inequalities on time scales have been established by many authors [3,4,32,35].
Quantum calculus, or q-calculus, is often called calculus without limits. In 1910, Jackson [18] described the q-analogues of the derivative and integral operators along with their applications, and he was the first to develop q-calculus in a systematic form. Quantum integral inequalities are often regarded as more significant and constructive than their classical counterparts, primarily because they can describe the hereditary properties of the phenomena and processes under consideration.
Recently, there has been a rapid development in q-calculus. Consequently, new generalizations of the classical approach of quantum calculus have been proposed and analyzed in the literature, see [10,17,27,44] and the references therein. The concepts of quantum calculus on finite intervals were given by Tariboon and Ntouyas [37,38], who obtained certain q-analogues of classical mathematical objects, which motivated numerous researchers to explore the subject in detail. Subsequently, several new results on quantum counterparts of classical mathematical results have been established, see [7,29,34].
A divergence measure quantifies the distance between two probability distributions, and the idea is used to solve a variety of problems in probability theory. Several types of divergence measures exist in the literature; they compare two probability distributions and are used in statistics and information theory. Information and divergence measures are very useful and play a vital part in various areas, namely sensor networks [24], testing the order in a Markov chain [26], finance [33], economics [39], and approximation of probability distributions [14]. Shannon entropy and related measures are often used in fields such as information theory, molecular ecology, population genetics, statistical physics, and dynamical systems (see [13,25]). The Kullback-Leibler divergence is one of the best known information divergences; it is used in information theory, mathematical statistics, and signal processing (see [42]). Jeffreys distance and triangular discrimination have many applications in statistics, information theory, and pattern recognition (see [23,40,41]).
Recently, various types of bounds on distance, divergence, and information measures have been obtained (see [2,6,12,15,19,22,36] and the references therein). In [1], Adeel et al. generalized Levinson's inequality for 3-convex functions by using two Green functions and applied the results to information theory via f-divergence, Rényi divergence, and Shannon entropy. In [21], Khan et al. introduced a new functional based on the classical f-divergence functional and obtained estimates for the new functional, the f-divergence, and the Rényi divergence. In [11], Butt et al. established new refinements of Popoviciu's inequality for higher order convex functions by utilizing Abel-Gontscharoff interpolation in combination with new Green functions, obtaining new inequalities for n-convex functions. They also gave applications in information theory by finding new estimates for relative, Shannon, and Zipf-Mandelbrot entropies.
Motivated by the above discussion, we generalize an inequality involving Csiszár divergence on time scales for n-convex functions by using the Green function along with Fink's identity. In addition, we estimate Kullback-Leibler divergence, differential entropy, Shannon entropy, Jeffreys distance, and triangular discrimination on time scales by using the obtained results.
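For finite discrete distributions, which correspond to the case T = Z of the time-scale setting, the divergences named above have well-known closed forms. The following is a minimal sketch, assuming strictly positive probability vectors of equal length and the natural logarithm; the function names are illustrative and do not come from the paper.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], r: Sequence[float]) -> float:
    """Kullback-Leibler divergence: sum of p_i * log(p_i / r_i)."""
    return sum(pi * math.log(pi / ri) for pi, ri in zip(p, r))


def jeffreys_distance(p: Sequence[float], r: Sequence[float]) -> float:
    """Jeffreys distance: sum of (p_i - r_i) * log(p_i / r_i)."""
    return sum((pi - ri) * math.log(pi / ri) for pi, ri in zip(p, r))


def triangular_discrimination(p: Sequence[float], r: Sequence[float]) -> float:
    """Triangular discrimination: sum of (p_i - r_i)^2 / (p_i + r_i)."""
    return sum((pi - ri) ** 2 / (pi + ri) for pi, ri in zip(p, r))


# Hypothetical example distributions, chosen only for illustration.
p = [0.5, 0.5]
r = [0.25, 0.75]
kl_pr = kl_divergence(p, r)
jeffreys = jeffreys_distance(p, r)
triangular = triangular_discrimination(p, r)
```

In this discrete form the Jeffreys distance is the symmetrized Kullback-Leibler divergence, that is, the sum of the two one-sided divergences.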

Preliminaries
Throughout this paper, assume that T is a time scale, a, b ∈ T with a < b. The following definitions and results are given in [8].
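For $\zeta \in \mathbb{T}$, the forward jump operator $\sigma : \mathbb{T} \to \mathbb{T}$ is defined as follows:
\[ \sigma(\zeta) = \inf\{\tau \in \mathbb{T} : \tau > \zeta\}. \]
A function $g : \mathbb{T} \to \mathbb{R}$ is known as right-dense continuous (rd-continuous), provided it is continuous at right-dense points in $\mathbb{T}$ and its left-sided limits exist (finite) at left-dense points in $\mathbb{T}$. The set of all rd-continuous functions will be denoted in this paper by $C_{rd}$. $\mathbb{T}^k$ is defined as follows:
\[ \mathbb{T}^k = \begin{cases} \mathbb{T} \setminus (\rho(\sup \mathbb{T}), \sup \mathbb{T}], & \sup \mathbb{T} < \infty, \\ \mathbb{T}, & \sup \mathbb{T} = \infty, \end{cases} \]
where $\rho(\zeta) = \sup\{\tau \in \mathbb{T} : \tau < \zeta\}$ is the backward jump operator.
Suppose that $g : \mathbb{T} \to \mathbb{R}$ and $\zeta \in \mathbb{T}^k$. The delta derivative $g^{\Delta}(\zeta)$ is defined to be the number (provided it exists) such that for each $\varepsilon > 0$ there exists a neighborhood $U$ of $\zeta$ such that
\[ \left| g(\sigma(\zeta)) - g(\lambda) - g^{\Delta}(\zeta)\,(\sigma(\zeta) - \lambda) \right| \le \varepsilon \left| \sigma(\zeta) - \lambda \right| \]
holds for all $\lambda \in U$. Then $g$ is said to be delta differentiable at $\zeta$. For $\mathbb{T} = \mathbb{R}$, $g^{\Delta}$ is the usual derivative $g'$, and for $\mathbb{T} = \mathbb{Z}$, $g^{\Delta}$ turns into the forward difference operator $\Delta g(\zeta) = g(\zeta + 1) - g(\zeta)$.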
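The forward jump operator and the delta derivative can be made concrete on a time scale with only isolated points, where the delta derivative reduces to the difference quotient (g(σ(ζ)) - g(ζ)) / (σ(ζ) - ζ). The sketch below assumes the time scale is given as a finite sorted list of sample points; `forward_jump` and `delta_derivative` are illustrative names, not notation from the paper.

```python
from bisect import bisect_right
from typing import Callable, List


def forward_jump(ts: List[float], t: float) -> float:
    """sigma(t) = inf{tau in T : tau > t}; sigma(max T) = max T by convention."""
    i = bisect_right(ts, t)  # index of the first point strictly greater than t
    return ts[i] if i < len(ts) else ts[-1]


def delta_derivative(g: Callable[[float], float], ts: List[float], t: float) -> float:
    """Delta derivative at a right-scattered point of the sampled time scale."""
    s = forward_jump(ts, t)
    if s == t:
        raise ValueError("t is the maximum of the sampled time scale, so it is not in T^k")
    return (g(s) - g(t)) / (s - t)


def square(t: float) -> float:
    return t * t


h = 0.5
h_scale = [k * h for k in range(9)]    # finite piece of hZ: 0, 0.5, ..., 4
q = 2.0
q_scale = [q ** k for k in range(6)]   # finite piece of q^{N_0}: 1, 2, ..., 32

dh = delta_derivative(square, h_scale, 1.0)  # on hZ, (t^2)^Delta = 2t + h
dq = delta_derivative(square, q_scale, 2.0)  # on q^{N_0}, (t^2)^Delta = (q + 1) t
```

The two sample scales correspond to the h-discrete and quantum settings; in both cases the delta derivative of t² differs from the classical value 2t by a term that depends on the graininess σ(t) - t.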

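Theorem A (Existence of antiderivatives) Every rd-continuous function has an antiderivative. In particular, if $\zeta_0 \in \mathbb{T}$, then $F$ defined by
\[ F(\zeta) := \int_{\zeta_0}^{\zeta} f(\tau)\,\Delta\tau, \quad \zeta \in \mathbb{T}, \]
is an antiderivative of $f$.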
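On a time scale consisting of isolated points, the delta integral over [a, b) is the finite sum of μ(t) f(t) with graininess μ(t) = σ(t) - t, so Theorem A can be checked by hand. The following sketch assumes a finite sorted list of points and endpoints a, b that belong to the list; all names are illustrative.

```python
from typing import Callable, List


def delta_integral(f: Callable[[float], float], ts: List[float], a: float, b: float) -> float:
    """Delta integral over [a, b) on an isolated time scale: sum of mu(t) * f(t)."""
    total = 0.0
    for left, right in zip(ts, ts[1:]):
        if a <= left < b:
            total += (right - left) * f(left)  # mu(left) = sigma(left) - left
    return total


def f(t: float) -> float:
    return 2.0 * t + 1.0


ts = [0.0, 1.0, 2.0, 3.0, 4.0]  # finite piece of T = Z


def F(t: float) -> float:
    """Candidate antiderivative F(t) = integral of f from ts[0] to t."""
    return delta_integral(f, ts, ts[0], t)


# Delta derivative of F at t = 2, where sigma(2) = 3 and mu(2) = 1.
derivative_at_2 = (F(3.0) - F(2.0)) / (3.0 - 2.0)
```

Here F(t) = t² on the sample points, and its delta derivative at 2 recovers f(2) = 5, as the theorem predicts.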

Improvement of the inequality involving Csiszár divergence
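Assume $\mathbb{T}$ to be a time scale and consider the set of all probability densities on $\mathbb{T}$ to be
\[ \Omega := \left\{ p \;\middle|\; p : \mathbb{T} \to [0, \infty),\ \int_a^b p(\zeta)\,\Delta\zeta = 1 \right\}. \]
Consider the Green function $G : [\zeta_1, \zeta_2] \times [\zeta_1, \zeta_2] \to \mathbb{R}$ defined by
\[ G(x, s) = \begin{cases} \dfrac{(x - \zeta_2)(s - \zeta_1)}{\zeta_2 - \zeta_1}, & \zeta_1 \le s \le x, \\[2mm] \dfrac{(s - \zeta_2)(x - \zeta_1)}{\zeta_2 - \zeta_1}, & x \le s \le \zeta_2, \end{cases} \tag{1} \]
where $G$ is convex and continuous with respect to both $x$ and $s$. It is notable that (see for example [20,28,30,43]) any function $\Psi \in C^2([\zeta_1, \zeta_2], \mathbb{R})$ can be written as
\[ \Psi(x) = \frac{\zeta_2 - x}{\zeta_2 - \zeta_1}\,\Psi(\zeta_1) + \frac{x - \zeta_1}{\zeta_2 - \zeta_1}\,\Psi(\zeta_2) + \int_{\zeta_1}^{\zeta_2} G(x, s)\,\Psi''(s)\,ds, \tag{2} \]
where $G(x, s)$ is defined in (1). In [5], Ansari et al. proved the following inequality.
Motivated by inequality (3), we begin with the following result.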
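The representation of a twice continuously differentiable function through the Green function can be checked numerically. The sketch below assumes the standard Green function of the two-point boundary value problem, G(x, s) = (x - ζ2)(s - ζ1)/(ζ2 - ζ1) for s ≤ x and (s - ζ2)(x - ζ1)/(ζ2 - ζ1) for s ≥ x, and tests the identity "function = chord + integral of G times the second derivative" for the function -log x. The midpoint rule and all names are illustrative choices.

```python
import math
from typing import Callable


def green(x: float, s: float, z1: float, z2: float) -> float:
    """Assumed Green function on [z1, z2] x [z1, z2]; piecewise linear and convex in x."""
    if s <= x:
        return (x - z2) * (s - z1) / (z2 - z1)
    return (s - z2) * (x - z1) / (z2 - z1)


def midpoint_integral(func: Callable[[float], float], lo: float, hi: float, steps: int = 20000) -> float:
    """Composite midpoint rule on [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (k + 0.5) * width) for k in range(steps))


def green_representation(psi, d2psi, x: float, z1: float, z2: float) -> float:
    """Chord through (z1, psi(z1)), (z2, psi(z2)) plus the integral of G(x, s) * psi''(s)."""
    chord = ((z2 - x) * psi(z1) + (x - z1) * psi(z2)) / (z2 - z1)

    def integrand(s: float) -> float:
        return green(x, s, z1, z2) * d2psi(s)

    # Split at s = x, where G(x, .) has a kink, to keep the quadrature accurate.
    integral = midpoint_integral(integrand, z1, x) + midpoint_integral(integrand, x, z2)
    return chord + integral


def neg_log(x: float) -> float:
    return -math.log(x)


def neg_log_second_derivative(x: float) -> float:
    return 1.0 / (x * x)


reconstructed = green_representation(neg_log, neg_log_second_derivative, 1.2, 0.5, 2.0)
```

Since this G is nonpositive, the identity also shows that a function with nonnegative second derivative lies below its chord, which is the mechanism behind the convexity arguments used in this section.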
In addition, if we reverse the inequality in both statements (c 1 ) and (c 2 ), then again (c 1 ) and (c 2 ) are equivalent.
Theorem 2 Under the conditions of Theorem 1, we define the following functional: if the inequality in (4) holds for all $s \in [\zeta_1, \zeta_2]$.
Remark 2 Suppose that all the assumptions of Theorem 2 hold. If $\Psi$ is continuous and convex, then $J_1(\Psi) \ge 0$.
The following theorem is proved by Fink in [16]. where
Example 1 Choose T = R in Theorem 4, to get the same result as one can obtain from [15, (2.1)] by utilizing (1) and (7).
Example 2 Put $\mathbb{T} = h\mathbb{Z}$ ($h > 0$) in Theorem 4 to obtain a new identity in h-discrete calculus with the following values: and
Remark 3 Choose $h = 1$ in Example 2. Suppose that $a = 0$, $b = n$, $p_1(j) = (p_1)_j$, and $p_2(j) = (p_2)_j$ to get a new identity in the discrete case with the following values:
Example 3 Use $\mathbb{T} = q^{\mathbb{N}_0}$ ($q > 1$), $a = q^l$, and $b = q^n$ with $l < n$ in Theorem 4 to obtain a new identity in q-calculus with the following values: and
As a result of the identities obtained above, the following theorem yields a generalization of inequalities involving Csiszár divergence on time scales for n-convex ($n \ge 3$) functions.

Theorem 6
Suppose that all the assumptions of Theorem 4 hold. Let $\Psi \in C^n[\zeta_1, \zeta_2]$ be such that $\Psi^{(n-1)}$ is absolutely continuous. Moreover, for the functional $J_1(\cdot)$ given in (6), we get the following:
(i) Inequality (17) holds provided that $n$ is even and $n \ge 4$.
(ii) Let inequality (17) be satisfied and for all $s \in [\zeta_1, \zeta_2]$. Then
Proof It is obvious that the Green function $G(\cdot, s)$ given in (1) is convex. Therefore, by applying Theorem 2 and by using Remark 2, one has $J_1(G(\cdot, s)) \ge 0$.
Remark 4 Grüss, Čebyšev, and Ostrowski-type bounds corresponding to the obtained generalizations can also be deduced.
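The discrete setting of Remark 3 (T = Z, h = 1) allows a quick numerical illustration of a convexity bound for the Csiszár divergence. The sketch assumes the discrete Csiszár functional has the form Σ (p2)_j Ψ((p1)_j / (p2)_j) and checks it against the chord bound ((ζ2 - 1)Ψ(ζ1) + (1 - ζ1)Ψ(ζ2)) / (ζ2 - ζ1), which follows from convexity of Ψ on [ζ1, ζ2] together with Σ (p1)_j = Σ (p2)_j = 1. It is not claimed to reproduce inequality (3) or (17) of the paper, and the distributions are hypothetical.

```python
import math
from typing import Callable, Sequence


def csiszar_divergence(psi: Callable[[float], float], p1: Sequence[float], p2: Sequence[float]) -> float:
    """Assumed discrete Csiszar functional: sum of p2_j * psi(p1_j / p2_j)."""
    return sum(b * psi(a / b) for a, b in zip(p1, p2))


def chord_bound(psi: Callable[[float], float], z1: float, z2: float) -> float:
    """Value at x = 1 of the chord of psi through z1 and z2 (requires z1 < z2)."""
    return ((z2 - 1.0) * psi(z1) + (1.0 - z1) * psi(z2)) / (z2 - z1)


def neg_log(x: float) -> float:
    return -math.log(x)


# Hypothetical strictly positive distributions on three points.
p1 = [0.2, 0.3, 0.5]
p2 = [0.25, 0.25, 0.5]
ratios = [a / b for a, b in zip(p1, p2)]
z1, z2 = min(ratios), max(ratios)  # every ratio p1_j / p2_j lies in [z1, z2]

divergence = csiszar_divergence(neg_log, p1, p2)
bound = chord_bound(neg_log, z1, z2)
```

With Ψ(x) = -log x the functional equals the Kullback-Leibler divergence of p2 from p1, so the same computation gives an upper estimate for that divergence in terms of the extreme likelihood ratios.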

Application to information theory
Shannon entropy is a fundamental notion in information theory and is commonly regarded as a measure of uncertainty. The entropy of a random variable is characterized by its probability distribution, and it serves as a measure of uncertainty or predictability. The Shannon entropy gives an estimate of the average minimum number of bits needed to encode a string of symbols, based on the alphabet size and the frequency of the symbols.
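The bit-count interpretation can be illustrated with empirical symbol frequencies. The following minimal sketch computes the Shannon entropy in bits per symbol, H = -Σ p_i log2 p_i, of a string; the function name is illustrative.

```python
import math
from collections import Counter


def shannon_entropy_bits(text: str) -> float:
    """Entropy in bits per symbol of the empirical symbol distribution of text."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


uniform_four = shannon_entropy_bits("abcd")   # four equally likely symbols
two_symbols = shannon_entropy_bits("aabb")    # two equally likely symbols
constant = shannon_entropy_bits("aaaa")       # a single symbol carries no uncertainty
```

Four equally frequent symbols need two bits each, two need one bit each, and a constant string needs none, which matches the dependence on alphabet size and symbol frequency described above.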

Differential entropy on time scales
Consider a positive density function $p$ on a time scale $\mathbb{T}$ of a continuous random variable $X$ with $\int_a^b p(\zeta)\,\Delta\zeta = 1$, whenever the integral exists. In [4], Ansari et al. defined the so-called differential entropy on a time scale by
Theorem 7 Let $X$ be a continuous random variable and $p_1, p_2 \in \Omega$ with $\zeta_1 \le \frac{p_1(y)}{p_2(y)} \le \zeta_2$ for all $y \in \mathbb{T}$. If $n$ is even ($n = 6, 8, \ldots$), then
where
and
Proof It is obvious that the Green function $G(\cdot, s)$ given in (1) is convex, therefore by using
Consequently,
Since $\Psi$ is n-convex for even $n$, where $n > 4$, (16) holds for even values of $n \ge 6$. The function $\Psi(x) = -\log x$ is n-convex for $n = 6, 8, \ldots$. Use $\Psi(x) = -\log x$ in Theorem 5 to get (21), where $\bar{h}_b(X)$ is given in (20).
Example 4 Choose T = R in Theorem 7 to have a new inequality with the following values: , s dy.
Example 5 Choose $\mathbb{T} = h\mathbb{Z}$, $h > 0$, in Theorem 7 to get a new inequality for the Shannon entropy in h-discrete calculus with the following values:
Remark 5 Choose $h = 1$ in Example 5. Suppose that $a = 0$, $b = n$, $p_1(j) = (p_1)_j$, and $p_2(j) = (p_2)_j$ to get a new inequality involving the discrete Shannon entropy with the following values:
Example 6 Choose $\mathbb{T} = q^{\mathbb{N}_0}$ ($q > 1$), $a = q^l$, and $b = q^n$ with $l < n$ in Theorem 7 to obtain a new inequality for the Shannon entropy in q-calculus with the following values: where
\[ S_q := \sum_{j=l}^{n-1} q^{j+1}\, p_2(q^j) \log \frac{1}{p_2(q^j)} \]
and
\[ \sum_{j=l}^{n-1} q^{j+1}\, p_2(q^j)\, G\!\left( \frac{p_1(q^j)}{p_2(q^j)}, s \right). \]
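The q-sum S_q of Example 6 is straightforward to evaluate once a density is fixed. The sketch below takes the formula at face value, S_q = Σ_{j=l}^{n-1} q^{j+1} p2(q^j) log(1 / p2(q^j)), with the natural logarithm; the weights q^{j+1} are used exactly as they appear in the text. The constant function used for p2 is a hypothetical placeholder that only exercises the formula and is not a normalized density.

```python
import math
from typing import Callable


def s_q(p2: Callable[[float], float], q: float, l: int, n: int) -> float:
    """Evaluate sum_{j=l}^{n-1} q^(j+1) * p2(q^j) * log(1 / p2(q^j))."""
    total = 0.0
    for j in range(l, n):
        value = p2(q ** j)  # density evaluated at the time-scale point q^j
        total += q ** (j + 1) * value * math.log(1.0 / value)
    return total


def constant_half(t: float) -> float:
    """Hypothetical placeholder for p2; not a normalized density."""
    return 0.5


example_value = s_q(constant_half, 2.0, 0, 2)  # terms for j = 0 and j = 1
```

For this placeholder the two terms are log 2 and 2 log 2, so the sum equals 3 log 2; replacing `constant_half` by an actual density on the points q^l, ..., q^{n-1} gives the quantity entering the inequality.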