An Empirical Evaluation of the Approximation of Subjective Logic Operators Using Monte Carlo Simulations

Fabio Massimo Zennaro fabiomz@ifi.uio.no Magdalena Ivanovska magdalei@ifi.uio.no Audun Jøsang josang@mn.uio.no Department of Informatics, University of Oslo
PO Box 1080 Blindern
0316 Oslo, Norway

Abstract

In this paper we analyze the use of subjective logic as a framework for performing approximate transformations over probability distribution functions. As for any approximation, we evaluate subjective logic in terms of computational efficiency and bias. However, while the computational cost may be easily estimated, the bias of subjective logic operators has not yet been investigated. In order to evaluate this bias, we propose an experimental protocol that exploits Monte Carlo simulations and their properties to assess the distance between the result produced by subjective logic operators and the true result of the corresponding transformation over probability distributions. This protocol allows a modeler to get an estimate of the degree of approximation she must be ready to accept as a trade-off for the computational efficiency and the interpretability of the subjective logic framework. Concretely, we apply our method to the relevant case study of the subjective logic operators for binomial multiplication and fusion, and we study empirically their degree of approximation.

keywords:

subjective logic, Monte Carlo simulation, Beta distributions, binomial product, subjective logic fusion

1 Introduction

Subjective logic (SL) [4] defines a framework for expressing uncertain probabilistic statements in the form of subjective opinions. A subjective opinion allows a modeler to state probabilities over a set of alternative events along with a measure of the global uncertainty of such modeling. Subjective opinions thus integrate a form of first-order uncertainty, relative to the distribution of probability mass over events, and a form of second-order uncertainty, due to the incertitude in distributing the probability mass. Subjective opinions provide a simple, clean and interpretable way to encode and manipulate uncertainty; as such, they constitute a useful modeling tool in sensitive scenarios in which statistical models cannot be inferred from data, but must be built relying on the domain knowledge or the intuition of experts. In this fashion, SL has been extensively adopted to model uncertainty in several fields such as trust modeling, biomedical data analysis or forensic analysis [4].

From a purely statistical point of view, subjective opinions can be seen as an alternative representation for standard probability distribution functions (pdfs), such as Beta pdfs or Dirichlet pdfs. Indeed, under certain assumptions, it is possible to define a unique mapping between subjective opinions and probability distribution functions [4]. This means that subjective opinions may be interpreted as a re-parametrization of standard distributions from the statistical literature.

SL also defines several operators over subjective opinions. These operators make it possible to carry out transformations over subjective opinions in a very efficient way. With respect to the underlying probability distributions, SL operators provide an extremely quick approximation of operations over probability distributions that would otherwise be very difficult or impossible to evaluate analytically.

Thus, beyond its original application, SL may also be seen as an effective statistical tool to compute approximate probability distributions generated by the transformations encoded into the SL operators. However, while the efficiency of SL operators may be easily evaluated, estimates of their bias are lacking. This shortcoming may limit the adoption of SL in favor of other better-studied approaches, such as Monte Carlo (MC) simulations. Modern probabilistic programming languages [2] provide a versatile language in which operations over probability distributions may be easily defined and evaluated using pre-coded inference algorithms. While computationally more expensive, these techniques provide comforting guarantees on the convergence of the algorithms as a function of the number of sampling iterations. These guarantees, in contrast with the lack of formal bounds for SL operators, may be a strong argument for many researchers to overlook SL and the related set of operators.

In this paper, we propose a protocol to address numerically the problem of characterizing the approximation of SL operators by offering an empirical analysis of their bias with respect to MC simulations. SL operators and MC simulations are taken as two distinct frameworks to approximate operations over pdfs, each one with its strengths and limitations. Our analysis defines a quantitative comparison in which SL operators and MC simulations are contrasted in terms of the trade-off between computational efficiency and bias. More specifically, our approach allows us to answer the question: What amount of approximation should we be ready to accept in exchange for the computational efficiency of subjective logic?

To show the usefulness of our protocol, we consider the specific case of binomial multiplication and fusion. Binomial multiplication is a simple SL operator that returns the approximation of the product of two Beta pdfs. Computing the product of independent Beta pdfs is a non-trivial problem [1] with relevant applications in fields such as reliability analysis and operations research [10]. Binomial multiplication in SL may then be seen as a simple and effective algorithm to compute an approximate solution to the problem of multiplying together two Beta pdfs. Fusion is an SL operator used for merging the opinions of different agents. This operator has been studied and applied in the context of second-order Bayesian networks [6]. For both operators, we compare the approximation obtained using SL to the moment-matching and kernel-density approximations produced via MC simulations. In this way, we are able to get an understanding of the amount of approximation that we should be ready to accept if we want to work in the framework of SL.

The rest of the paper is organized as follows. Section 2 reviews the basics of subjective logic and Section 3 presents the main aspects of computational statistics relevant to this work. Section 4 describes the computational complexity of SL approximations and MC approximations, while Section 5 discusses the bias of the same techniques. Section 6 proposes a grounded framework for evaluating the degree of approximation of SL operators in relation to MC simulations. Section 7 makes this framework concrete by applying it to the case study of the product of Beta pdfs, and it presents a set of empirical simulations to validate our approach; similarly, Section 8 applies our framework to another case study, the fusion of Beta pdfs, and it validates our methodology via empirical simulations. Finally, Section 9 summarizes the results and discusses possible directions for future work. For convenience and reference, Table 1 summarizes the notation that will be used throughout this paper.

$\Omega$	Collection of mutually exclusive events
$M$	Number of mutually exclusive events
$X,Y,Z...$	Random variables over $\Omega$
$x,y,z...$	Sample of a random variable
$\omega_{X}$	Subjective logic opinion
$\mathbf{b},b$	Belief (vector and scalar)
$d$	Disbelief (scalar)
$\mathbf{a},a$	Prior probability (vector, scalar)
$u$	Uncertainty (scalar)
$p_{X}$	Probability distribution function (pdf) of X
$M_{i}\left[X\right]$	$i$ -th moment of the pdf of X
$E\left[X\right]$ , $Var\left[X\right]$	Expected value and variance of the pdf of X
$D_{A}\left[p_{X},p_{Y}\right]$	Distance $A$ between the pdf of X and the pdf of Y
$\hat{p}_{X}$	Empirical pdf for X estimated from samples
$N$	Number of samples
$p_{X}^{SL}$	Pdf underlying a subjective logic opinion
$\hat{p}_{X}^{\mathrm{MC}}$	Empirical pdf for X estimated via Monte Carlo (MC) sampling
$\hat{p}_{X}^{\mathrm{KDE}}$	Empirical pdf for X estimated via MC and kernel density estimation (KDE)
$\hat{p}_{X}^{\mathrm{MM}}$	Empirical pdf for X estimated via MC and moment matching (MM)
$\hat{p}_{X}^{\mathrm{GAUSS}}$	Empirical pdf for X estimated via MC and MM with a Gaussian approximation
$\hat{p}_{X}^{\mathrm{BETA}}$	Empirical pdf for X estimated via MC and MM with a Beta approximation
${p}_{X}^{\mathrm{GAUSS}}$	Pdf for X estimated via analytic MM with a Gaussian approximation
${p}_{X}^{\mathrm{BETA}}$	Pdf for X estimated via analytic MM with a Beta approximation
$\mathcal{P}$	Space of probability distribution functions
$\mathcal{S}$	Space of subjective opinions
$\circ_{P}:\mathcal{P}\times\mathcal{P}\rightarrow\mathcal{P}$	Binary operator on the space of probability distribution functions
$\circ_{SL}:\mathcal{S}\times\mathcal{S}\rightarrow\mathcal{S}$	Binary operator on the space of subjective logic opinions

Table 1: Summary of notation.

2 Subjective Logic

In this section, we present the fundamentals of SL. We start with a formalization of subjective opinions and we show how they may be mapped to probability distributions.

Subjective opinions

Let $\Omega$ be a discrete collection of $M$ mutually exclusive and exhaustive events. A subjective opinion $\omega$ is a triple:

\left(\mathbf{b},u,\mathbf{a}\right),

(1)

such that

\sum_{i=1}^{M}b_{i}+u=1,

(2)

where $\mathbf{b}\in\mathbb{R}^{M}$ , with $b_{i}\in\mathbb{R}_{\geq 0}$ , is the belief vector expressing the probability mass that the modeler places on each event $x_{i}$ in $\Omega$ , $u\in\mathbb{R}_{\geq 0}$ is the uncertainty scalar quantifying the uncertainty of the modeler in its definition of $\mathbf{b}$ , and $\mathbf{a}\in\mathbb{R}^{M}$ is the prior vector encoding a prior probability distribution over the events in $\Omega$ . This subjective opinion is called a multinomial opinion.
Notice that the constraint in Equation 2 limits the degrees of freedom of $\mathbf{b}$ and $u$ to $M$ and, consequently, defines an $M$-dimensional simplex on which subjective opinions may be represented.

The limit-case multinomial opinion is the binomial opinion for $M=2$ . In this case $\Omega=\{x,\overline{x}\}$ and the subjective opinion in Equation 1 may be re-written for simplicity as:

\left(b,d,u,a\right),

(3)

such that

b+d+u=1,

(4)

where $b\in\mathbb{R}_{\geq 0}$ is the belief scalar expressing the probability of $x$ , $d\in\mathbb{R}_{\geq 0}$ is the disbelief scalar expressing the probability of $\overline{x}$ , $u\in\mathbb{R}_{\geq 0}$ is the uncertainty scalar and $a\in\mathbb{R}_{\geq 0}$ is a scalar expressing the prior probability of $x$ .
Having only two degrees of freedom, binomial opinions in the form $\left(b,d,u,a\right)$ belong to a two-dimensional simplex and may be visualized together with $a$ in a barycentric coordinate system (see http://folk.uio.no/josang/sl/BV.html for an illustration).

Mapping of subjective opinions

In order to ground SL, a mapping has been defined between multinomial opinions and Dirichlet pdfs and between binomial opinions and Beta pdfs.

Given a mapping constant $W\in\mathbb{R}_{\geq 0}$ , it is possible to define a unique mapping from opinions to pdfs. Let $\omega=\left(\mathbf{b},u,\mathbf{a}\right)$ be a multinomial opinion with $u\neq 0$ ; $\omega$ can be mapped to a Dirichlet pdf $p$ with distribution $\mathtt{Dir}\left(\boldsymbol{\alpha}\right)$ , where the vector of parameters $\boldsymbol{\alpha}$ is defined as:

\boldsymbol{\alpha}=W\left(\frac{\mathbf{b}}{u}+\mathbf{a}\right).

(5)

For binomial opinions, specifically, given a mapping constant $W\in\mathbb{R}_{\geq 0}$ , it is possible to define a unique mapping from opinions to pdfs. Let $\omega=\left(b,d,u,a\right)$ be a binomial opinion with $u\neq 0$ ; $\omega$ can be mapped to a Beta pdf $p$ with distribution $\mathtt{Beta}\left(\alpha,\beta\right)$ , where $\alpha$ and $\beta$ are parameters defined as:

\begin{cases}\alpha=W\left(\frac{b}{u}+a\right)\\ \beta=W\left(\frac{d}{u}+(1-a)\right).\end{cases}

(6)

Notice that, for reasons of consistency, $W$ is usually fixed to $2$ [4]. We then have a mapping $s$ from opinion $\omega$ to pdf $p$ :

s:\omega\mapsto p.
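As an illustration, the binomial mapping of Equation 6 can be sketched in a few lines of Python. This is a minimal sketch, not a reference implementation: the names `BinomialOpinion` and `opinion_to_beta` are hypothetical, and the numerical tolerance used to check Equation 4 is an arbitrary choice.

```python
from typing import NamedTuple, Tuple


class BinomialOpinion(NamedTuple):
    """Binomial opinion (b, d, u, a) as in Equation 3 (hypothetical container)."""
    b: float  # belief
    d: float  # disbelief
    u: float  # uncertainty
    a: float  # prior probability of x


def opinion_to_beta(op: BinomialOpinion, W: float = 2.0) -> Tuple[float, float]:
    """Map a binomial opinion to Beta parameters (alpha, beta) via Equation 6."""
    # Equation 4: b + d + u = 1 (checked up to an arbitrary tolerance).
    if abs(op.b + op.d + op.u - 1.0) > 1e-9:
        raise ValueError("b + d + u must equal 1")
    # The mapping is defined only for opinions with u != 0.
    if op.u <= 0.0:
        raise ValueError("the mapping requires u != 0")
    alpha = W * (op.b / op.u + op.a)
    beta = W * (op.d / op.u + (1.0 - op.a))
    return alpha, beta


# Example: a fairly confident opinion with a uniform prior.
alpha, beta = opinion_to_beta(BinomialOpinion(b=0.6, d=0.2, u=0.2, a=0.5))
print(alpha, beta)  # approximately 7 and 3, up to floating-point rounding
```

A smaller uncertainty $u$ yields larger parameters and hence a more concentrated Beta pdf, which matches the reading of $u$ as second-order uncertainty.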

Vice versa, given a mapping constant $W\in\mathbb{R}_{\geq 0}$ and a fixed prior distribution $\mathbf{a}$ , it is possible to define a unique mapping from pdfs to opinions. Let $p$ be a Dirichlet pdf with distribution $\mathtt{Dir}\left(\boldsymbol{\alpha}\right)$ with $\alpha_{i}>1$ ; $p$ can be mapped to a multinomial opinion $\omega=\left(\mathbf{b},u,\mathbf{a}\right)$ , where the parameters are computed as:

\begin{cases}\mathbf{b}=\frac{\boldsymbol{\alpha}-W\mathbf{a}}{W+\sum_{i}\left(\alpha_{i}-Wa_{i}\right)}=\frac{\boldsymbol{\alpha}-W\mathbf{a}}{\sum_{i}\alpha_{i}}\\ u=\frac{W}{W+\sum_{i}\left(\alpha_{i}-Wa_{i}\right)}=\frac{W}{\sum_{i}\alpha_{i}}\\ \mathbf{a}=\mathbf{a}\end{cases}

(7)

Again, for a binomial opinion, given a mapping constant $W\in\mathbb{R}_{\geq 0}$ and a fixed prior distribution $a$ , it is possible to define a unique mapping from pdfs to opinions. Let $p$ be a Beta pdf with distribution $\mathtt{Beta}\left(\alpha,\beta\right)$ with $\alpha,\beta>1$ ; $p$ can be mapped to a binomial opinion $\omega=\left(b,d,u,a\right)$ , where the parameters are computed as:

\begin{cases}b=\frac{\alpha-Wa}{\alpha+\beta}\\ d=\frac{\beta-W(1-a)}{\alpha+\beta}\\ u=\frac{W}{\alpha+\beta}\\ a=a\end{cases}

(8)

For reasons of consistency, $W$ is usually fixed to $2$ [4]. Given a prior distribution $a$ , this generates the mapping $t$ from pdf $p$ to opinion $\omega$ :

t:p\mapsto\omega.
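The inverse mapping of Equation 8 can be sketched in the same way. The function name `beta_to_opinion` is hypothetical, and the check that rejects negative belief or disbelief is an assumption added here for safety rather than part of the definition above. The last lines verify numerically that applying Equation 6 to the resulting opinion recovers the original Beta parameters.

```python
from typing import Tuple


def beta_to_opinion(alpha: float, beta: float, a: float,
                    W: float = 2.0) -> Tuple[float, float, float, float]:
    """Map Beta(alpha, beta) to a binomial opinion (b, d, u, a) via Equation 8."""
    total = alpha + beta
    b = (alpha - W * a) / total
    d = (beta - W * (1.0 - a)) / total
    u = W / total
    # Assumption of this sketch: reject parameters that would produce
    # a negative belief or disbelief for the chosen prior a.
    if b < 0.0 or d < 0.0:
        raise ValueError("alpha and beta are too small for this prior and W")
    return b, d, u, a


b, d, u, a = beta_to_opinion(7.0, 3.0, a=0.5)
print(b, d, u, a)

# Round trip: Equation 6 applied to (b, d, u, a) should return (7, 3).
W = 2.0
alpha_back = W * (b / u + a)
beta_back = W * (d / u + (1.0 - a))
print(alpha_back, beta_back)
```

By construction $b+d+u=\left(\alpha+\beta-W+W\right)/\left(\alpha+\beta\right)=1$, so the output always satisfies Equation 4.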

Subjective opinion operators

SL defines several operators over subjective opinions, such as addition, product or fusion [4]. In general, these operators are computed over the parameters of subjective opinions. Let $\omega_{X}=\left(\mathbf{b}_{X},u_{X},\mathbf{a}_{X}\right)$ and $\omega_{Y}=\left(\mathbf{b}_{Y},u_{Y},\mathbf{a}_{Y}\right)$ be two subjective opinions and let $\circ_{SL}:\mathcal{S}\times\mathcal{S}\rightarrow\mathcal{S}$ be a generic operator over the space of subjective opinions $\mathcal{S}$ . Then, $\omega_{Z}=\left(\mathbf{b}_{Z},u_{Z},\mathbf{a}_{Z}\right)$ resulting from the application of the operator to $\omega_{X}$ and $\omega_{Y}$ is given as:

\omega_{Z}=\omega_{X}\circ_{SL}\omega_{Y}=\begin{cases}\mathbf{b}_{Z}=f_{b}\left(\omega_{X},\omega_{Y}\right)\\ u_{Z}=f_{u}\left(\omega_{X},\omega_{Y}\right)\\ \mathbf{a}_{Z}=f_{a}\left(\omega_{X},\omega_{Y}\right),\end{cases}

(9)

where $f_{b},f_{u},f_{a}:\mathcal{S}\times\mathcal{S}\rightarrow\mathbb{R}_{\geq 0}$ are operator-specific functions returning the values of belief, uncertainty and prior for the opinion $\omega_{Z}$ .

Subjective opinion operators for evaluating operations over pdfs

When properly defined, SL operators can be used to approximate operations over probability distribution functions. Suppose we are given two pdfs, $p_{X}$ and $p_{Y}$ , and we want to compute a generic operation over them, $\circ_{P}:\mathcal{P}\times\mathcal{P}\rightarrow\mathcal{P}$ over the space of probability distributions $\mathcal{P}$ . Computing this operation over probability distributions may be very complex. However, if we have an SL operator $\circ_{SL}:\mathcal{S}\times\mathcal{S}\rightarrow\mathcal{S}$ that approximates $\circ_{P}$ , we may work around the computation of $p_{X}\circ_{P}p_{Y}$ by projecting the two distributions onto the opinions $\omega_{X}$ and $\omega_{Y}$ , computing the resulting opinion $\omega_{Z}=\omega_{X}\circ_{SL}\omega_{Y}$ , and then mapping the result back onto a probability distribution function $p_{Z}^{SL}$ . In this way, the resulting pdf $p_{Z}^{SL}$ provides an easy-to-compute approximation of the real pdf $p_{Z}$ (see Figure 1).

Figure 1: If the application of the operator $\circ_{P}$ to two pdfs $p_{X}$ and $p_{Y}$ cannot be solved analytically, we can map $p_{X}$ and $p_{Y}$ to the opinions $\omega_{X}$ and $\omega_{Y}$ and apply the SL operator $\circ_{SL}$ to compute the opinion $\omega_{Z}$. The pdf $p_{Z}^{SL}$ associated with $\omega_{Z}$ provides an approximation of $p_{Z}$.

3 Computational Statistics

In this section, we review some elements of computational statistics that are relevant to our work. We describe how sampling is used in MC simulations; we show how unbiased estimators can be built via MC integration; we discuss how unbiased estimators can be used to build moment-matching approximation; we show how pdfs may be reconstructed through kernel density estimation; and, finally, we bring these parts together to show how MC simulations may be used to compute the product of pdfs via moment-matching or kernel-density estimation.

Monte Carlo sampling

MC simulations are stochastic numerical algorithms designed to find approximate solutions through repeated random sampling. This paradigm has been applied in many areas of research to solve problems whose exact analytical solution is impossible or too difficult to derive. In statistics, MC simulations are widely used to evaluate probability distributions whose analytical form cannot be explicitly expressed. Let $X$ be a random variable with a probability distribution $p_{X}$ on the support $\Omega$ ; let us also assume that the analytical form of $p_{X}$ is unknown but that we can sample realizations $x_{i}$ of the random variable $X$ ; then, MC simulations allow us to draw a large number of independent samples $x_{i}$ and use them to (i) compute useful empirical statistical descriptors $\hat{S}_{X}$ of the probability distribution $p_{X}$ , or, ultimately, (ii) reconstruct the approximate shape of the probability distribution $p_{X}$ .

Monte Carlo integration

In order to compute useful empirical statistical descriptors $\hat{S}_{X}$ of the probability distribution $p_{X}$ , MC simulations rely on integration and on the law of large numbers. Let $S_{X}$ be a statistic of the probability distribution $p_{X}$ that can be expressed as the expected value of a function $f\left(\cdot\right)$ of the random variable $X$ . The statistic $S_{X}$ is then defined as:

S_{X}=\int_{\Omega}f(x)p_{X}(x)dx.

(10)

By the law of large numbers, an estimator of $S_{X}$ can be computed using $N$ samples $x_{i}$ as:

\hat{S}_{X}=\frac{1}{N}\sum_{i=1}^{N}f\left(x_{i}\right).

(11)

It is easy to see that, using Equation 11 and choosing an appropriate function $f(\cdot)$ , we can directly estimate useful statistics of the distribution $p_{X}$ , such as moments and quantiles. Thus, through an MC simulation we can sample points from $p_{X}$ and compute informative estimator statistics $\hat{S}_{X}$ .
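The estimator in Equation 11 can be illustrated with a short Python sketch. The helper name `mc_estimate`, the choice of a Beta(2, 5) distribution as the sampled pdf, the seed and the sample size are all assumptions made for this example; only the standard library is used.

```python
import random
from typing import Callable, Sequence


def mc_estimate(f: Callable[[float], float], samples: Sequence[float]) -> float:
    """Equation 11: the sample average of f(x_i) over the N samples."""
    return sum(f(x) for x in samples) / len(samples)


# Draw N samples from a pdf we pretend not to know analytically.
rng = random.Random(0)
samples = [rng.betavariate(2.0, 5.0) for _ in range(100_000)]

mean_hat = mc_estimate(lambda x: x, samples)        # first moment
second_hat = mc_estimate(lambda x: x * x, samples)  # second raw moment
var_hat = second_hat - mean_hat ** 2                # variance from raw moments

# For Beta(2, 5) the exact values are 2/7 and 10/392, which lets us
# check the estimates in this toy setting.
print(mean_hat, 2 / 7)
print(var_hat, 10 / 392)
```

Choosing a different $f(\cdot)$ changes the estimated statistic without changing the sampling step, which is why the same samples can be reused for several descriptors.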

Moment-matching approximation

A probability distribution $p_{X}$ is completely characterized by the collection of all its moments; if we know the parametric form of the function $p_{X}$ from which we are sampling, but we do not know the exact value of its parameters, we can compute an estimate $\hat{p}_{X}$ by setting the moments to the estimated values $\hat{M}_{i}\left[X\right]$ . In several scenarios of interest it may actually be possible to compute analytically the value of a few lower-order moments $M_{i}\left[X\right]$ (such as mean and variance); this approach is well known and it has been used in the study of SL operators as well (see, for instance, [7]). In general, though, MC simulation and integration provide an empirical and robust way to compute estimators $\hat{M}_{i}\left[X\right]$ of the $i$-th moments of a probability distribution $p_{X}$ , even when no exact analytical formula for computing the moments $M_{i}\left[X\right]$ of interest is available.
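For a Beta parametric form, matching the first two moments has a closed-form solution: since $E\left[X\right]=\alpha/(\alpha+\beta)$ and $Var\left[X\right]=E\left[X\right](1-E\left[X\right])/(\alpha+\beta+1)$ , the parameters follow directly from the mean and the variance. The sketch below implements this standard inversion; the function name `beta_from_moments` is hypothetical and the input checks are assumptions of this example.

```python
from typing import Tuple


def beta_from_moments(mean: float, var: float) -> Tuple[float, float]:
    """Return (alpha, beta) of the Beta pdf with the given mean and variance."""
    if not 0.0 < mean < 1.0:
        raise ValueError("the mean of a Beta pdf lies strictly in (0, 1)")
    if not 0.0 < var < mean * (1.0 - mean):
        raise ValueError("the variance must be in (0, mean * (1 - mean))")
    # alpha + beta = mean * (1 - mean) / var - 1
    concentration = mean * (1.0 - mean) / var - 1.0
    return mean * concentration, (1.0 - mean) * concentration


# Sanity check with the exact moments of Beta(2, 5): mean 2/7, variance 10/392.
alpha, beta = beta_from_moments(2 / 7, 10 / 392)
print(alpha, beta)  # close to 2 and 5
```

In practice the exact moments would be replaced by their MC estimates, so the matched parameters inherit the sampling error of those estimates.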

Kernel density estimation

Beyond computing statistics, it is possible to use samples $x_{i}$ generated in an MC simulation to reconstruct the actual probability distribution $p_{X}$ . A standard approach to reconstruct a continuous function $p_{X}$ from a finite set of points $x_{i}$ is kernel density estimation (KDE). Any function may be expressed as a convolution with a kernel function $\kappa(\cdot)$ :

p_{X}(x)=\int_{\Omega}\kappa(x-y)\,p_{X}(y)\,dy.

(12)

Practically, it is possible to get an empirical approximation using only a finite set of points $x_{i}$ :

\hat{p}_{X}(x)=\frac{1}{Nw}\sum_{i=1}^{N}\kappa\left(\frac{x-x_{i}}{w}\right),

(13)

where the kernel $\kappa(\cdot)$ is a symmetric function, like a triangular function or a Gaussian, and $w$ denotes the width of the kernel; empirical rules are available to select an optimal value for this parameter in relation to the number of samples available [11]. Thus, using the same MC simulation procedure to sample points from $p_{X}$ it is possible also to estimate an approximate probability distribution $\hat{p}_{X}$ .
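Equation 13 translates almost literally into code. The following sketch uses a Gaussian kernel; the function names, the Beta(2, 5) sampling distribution, the fixed width $w=0.05$ and the evaluation grid are assumptions chosen for illustration. The final Riemann sum checks that the estimate integrates to approximately one.

```python
import math
import random
from typing import Sequence


def gaussian_kernel(t: float) -> float:
    """Standard Gaussian kernel, symmetric around zero."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def kde_estimate(x: float, samples: Sequence[float], w: float) -> float:
    """Equation 13: kernel density estimate at x with kernel width w."""
    n = len(samples)
    return sum(gaussian_kernel((x - xi) / w) for xi in samples) / (n * w)


rng = random.Random(1)
samples = [rng.betavariate(2.0, 5.0) for _ in range(2_000)]

# The grid extends beyond [0, 1] because the Gaussian kernel places
# a small amount of mass outside the support of the sampled pdf.
step = 1.0 / 200
grid = [i * step for i in range(-100, 301)]
density = [kde_estimate(x, samples, w=0.05) for x in grid]

area = sum(density) * step  # Riemann sum of the estimated pdf
print(area)
```

Each evaluation of the estimate visits all $N$ samples, so evaluating it on a grid costs a number of kernel evaluations equal to the grid size times $N$.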

Monte Carlo simulation for evaluating operations over pdfs

Suppose we are given two probability distributions, $p_{X}$ and $p_{Y}$ , and suppose we want to compute the distribution $p_{Z}$ determined by the application of operation $\circ_{P}:\mathcal{P}\times\mathcal{P}\rightarrow\mathcal{P}$ , that is, $p_{Z}=p_{X}\circ_{P}p_{Y}$ . If the pdf $p_{Z}$ cannot be computed analytically, MC simulations may be used to sample from $p_{Z}$ and to estimate a pdf that approximates $p_{Z}$ . As a first solution, we could rely on the samples $\{z_{1},z_{2}\dots z_{N}\}$ obtained by sampling from $p_{X}$ and $p_{Y}$ to estimate the moments $\hat{M}_{i}\left[Z\right]$ and then instantiate a moment-matching approximation $\hat{p}_{Z}^{\mathrm{MM}}$ (see Figure 2). Alternatively, we could use the same samples $\{z_{1},z_{2}\dots z_{N}\}$ from $p_{Z}$ to perform a kernel-density estimation and compute the KDE approximation $\hat{p}_{Z}^{\mathrm{KDE}}$ (see Figure 3). Notice that, unlike the SL approximation $p_{Z}^{SL}$ , we decorate the approximations computed via MC simulations $\hat{p}_{Z}^{\mathrm{MM}}$ and $\hat{p}_{Z}^{\mathrm{KDE}}$ with a hat to underline that they are empirical statistics.

Figure 2: Moment-matching approximation via MC simulation. If the product of two pdfs $p_{X}$ and $p_{Y}$ cannot be solved analytically, we can sample, integrate and estimate the pdf $\hat{p}_{Z}^{\mathrm{MM}}$, which provides an approximation of $p_{Z}$. MCS stands for MC sampling, MCI stands for MC integration.

Figure 3: Kernel-density approximation via MC simulation. If the product of two pdfs $p_{X}$ and $p_{Y}$ cannot be solved analytically, we can sample and estimate the pdf $\hat{p}_{Z}^{\mathrm{KDE}}$, which provides an approximation of $p_{Z}$. MCS stands for MC sampling, KDE stands for kernel-density estimation.
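The moment-matching path described above can be sketched as follows, under the assumption that the operation $\circ_{P}$ is the product $Z=XY$ of two independent Beta-distributed variables, as in the case study mentioned in the introduction. The Beta parameters, the seed, the sample size and the function names are hypothetical choices made for this example.

```python
import random
from typing import List, Sequence, Tuple


def sample_product(ax: float, bx: float, ay: float, by: float,
                   n: int, seed: int = 0) -> List[float]:
    """MC sampling: draw x ~ Beta(ax, bx), y ~ Beta(ay, by) and return z = x * y."""
    rng = random.Random(seed)
    return [rng.betavariate(ax, bx) * rng.betavariate(ay, by) for _ in range(n)]


def moment_matched_beta(samples: Sequence[float]) -> Tuple[float, float]:
    """MC integration plus moment matching: fit a Beta pdf to mean and variance."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((z - mean) ** 2 for z in samples) / (n - 1)
    concentration = mean * (1.0 - mean) / var - 1.0
    return mean * concentration, (1.0 - mean) * concentration


z = sample_product(2.0, 5.0, 3.0, 3.0, n=50_000)
alpha_hat, beta_hat = moment_matched_beta(z)
mean_z = sum(z) / len(z)

# For independent variables E[Z] = E[X] E[Y] = (2/7) * (1/2) = 1/7,
# which gives a simple check on the sampled mean.
print(mean_z, 1 / 7)
print(alpha_hat, beta_hat)
```

The same list `z` could be passed unchanged to a kernel density estimator to obtain the KDE approximation, so the two MC paths differ only in how the samples are post-processed.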

4 Computational Complexity

In this section, we discuss and compare the computational complexity of SL operators and MC simulations. We will evaluate the computational complexity using the $\mathcal{O}\left(\cdot\right)$ notation as the time complexity of running a given algorithm as a function of its input.

Subjective logic

SL operators are defined to be extremely efficient. Indeed, given two opinions $\omega_{X}$ and $\omega_{Y}$ and the generic operator $\circ_{SL}:\mathcal{S}\times\mathcal{S}\rightarrow\mathcal{S}$ , the computation of $\omega_{Z}=\omega_{X}\circ_{SL}\omega_{Y}$ usually requires only a limited number of function evaluations, as shown in Equation 9. The number of evaluations is $2\cdot M+1$ , where $M$ is the number of events over which the opinions are defined. Thus, the overall complexity is $\mathcal{O}\left(M\right)$ : it depends only on the number of events considered, and it is independent of the actual form of the mapped distributions. This makes SL operators an attractive choice especially when working in lower dimensions.

Monte Carlo simulation

The MC approach is, by definition, computationally intensive. The computational complexity of an MC simulation scales as a function of the number $N$ of samples that must be produced. Each iteration requires random sampling and the execution of all the operations necessary to sample from $p_{Z}$ . Overall, the computational complexity of the MC simulation is $\mathcal{O}\left(N\right)$ . If we are using MC simulations to estimate a moment-matching approximation $\hat{p}_{Z}^{\mathrm{MM}}$ , MC integration allows us to compute statistics from the samples generated during the MC simulation without increasing the overall order of complexity. However, if we want to estimate the actual pdf via KDE we have to take into account an increase in the overall computational complexity from the linear order to the quadratic order $\mathcal{O}\left(N^{2}\right)$ . Computing the pdf $\hat{p}_{Z}^{\mathrm{KDE}}$ is then a computationally expensive procedure.

It is evident that, taking into account computational complexity only, SL operators dominate MC simulations, with or without KDE, especially considering that the number $N$ of samples in an MC simulation is required to grow large in order to return reliable results even in low dimensions.

5 Bias

In this section, we start analyzing the degree of approximation of SL operators and MC simulations. We will evaluate the degree of approximation in terms of bias of the estimator $\hat{p}_{Z}$ , that is, as the expected value of the difference between the true distribution and the estimated approximation: $E\left[p_{Z}-\hat{p}_{Z}\right]$ .

Subjective logic

The bias of SL operators depends on the definition of the specific operator, and a generic theoretical treatment is not possible. Moreover, an analytic study of the bias is not available for every SL operator. In Section 7 we will consider the case study of the binomial multiplication operator of subjective logic and we will analyze its specific bias in more detail.

Monte Carlo simulation

MC simulations are known to provide asymptotically unbiased estimators. If we estimate a statistic $S_{X}$ of the pdf $p_{X}$ using MC integration as in Equation 11, then $\hat{S}_{X}$ is an asymptotically unbiased estimator, that is, in the limit of infinite samples, it converges to the true quantity it approximates:

\lim_{N\rightarrow\infty}\hat{S}_{X}(N)=S_{X},

(14)

where we made explicit the dependence of $\hat{S}_{X}$ on the number of samples $N$ .

If we use MC integration to estimate the moments $\hat{M}\left[Z\right]$ for a moment-matching approximation $\hat{p}_{Z}^{\mathrm{MM}}$ , the MC simulation provides us with asymptotically unbiased estimators of the moments; this means that, by increasing the number of samples generated in an MC simulation, we can get arbitrarily close to the true value of the estimated quantity. However, notice that while the estimated moments $\hat{M}\left[Z\right]$ are asymptotically unbiased, the approximation $\hat{p}_{Z}^{\mathrm{MM}}$ itself is biased; this bias is due to the limited set of moments $\hat{M}\left[Z\right]$ used to approximate $p_{Z}$ .

If we use an MC simulation to estimate the true pdf directly via KDE, the empirical pdf $\hat{p}_{Z}^{\mathrm{KDE}}$ is biased. In this case, it is known that the width parameter $w$ of KDE regulates the trade-off between bias and variance. In general, the bias can be shown to be proportional to the square of the width $w$ of the kernel $\kappa(\cdot)$ :

E_{KDE}\left[p_{Z}-\hat{p}_{Z}^{\mathrm{KDE}}\right]\propto w^{2},

(15)

under the constraint that $w$ cannot be reduced to zero, for statistical and computational reasons [11]. When using a Gaussian kernel, the widely adopted Silverman rule suggests a kernel width of the following size:

w=1.06\hat{\sigma}\frac{1}{\sqrt[5]{N}},

(16)

where $\hat{\sigma}$ is the empirical standard deviation computed from the samples:

\hat{\sigma}=\sqrt{\frac{\sum_{i=1}^{N}\left(x_{i}-\hat{\mu}\right)^{2}}{N-1}},

(17)

where $\hat{\mu}$ is the empirical mean. It follows, then, that the bias of the KDE approximation is proportional to:

E_{KDE}\left[p_{Z}-\hat{p}_{Z}^{\mathrm{KDE}}\right]\propto 1.06^{2}\hat{\sigma}^{2}N^{-\frac{2}{5}}.

(18)

As noted above, this bias can never be reduced to zero. However, in specific computational settings, this bias may be bounded by finding an optimal trade-off between the number of samples $N$ and the empirical standard deviation $\hat{\sigma}$ . In particular, if the domain of $p_{Z}$ is a discrete domain, as in the case of multinomial opinions and Dirichlet pdfs which underlie subjective opinions, then the empirical standard deviation $\hat{\sigma}$ may be bounded and it may be possible to estimate the magnitude of the bias as a function of the number of samples $N$ .
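To give a feeling for the orders of magnitude involved, the sketch below computes the Silverman width of Equation 16 from the empirical standard deviation of Equation 17, together with its square, which by Equation 15 is the factor that drives the KDE bias. The Beta(2, 5) sampling distribution, the seed and the sample sizes are assumptions of this example, and the proportionality constant of Equation 15 is not estimated.

```python
import math
import random
from typing import Dict, Sequence


def silverman_width(samples: Sequence[float]) -> float:
    """Equation 16, with the empirical standard deviation of Equation 17."""
    n = len(samples)
    mu = sum(samples) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in samples) / (n - 1))
    return 1.06 * sigma * n ** (-1.0 / 5.0)


rng = random.Random(2)
widths: Dict[int, float] = {}
for n in (100, 10_000, 100_000):
    samples = [rng.betavariate(2.0, 5.0) for _ in range(n)]
    widths[n] = silverman_width(samples)
    # w**2 is the quantity the bias is proportional to (Equation 15).
    print(n, widths[n], widths[n] ** 2)
```

Because the width decays only as the fifth root of $N$ , a tenfold increase in the number of samples reduces $w^{2}$ by a factor of roughly $10^{2/5}\approx 2.5$ , which is why large sample sizes are needed before the bias becomes negligible.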

In summary, from the point of view of approximation, MC simulations represent a safer choice than SL operators, as they are grounded in solid theory and they allow us to quantify and to control the bias. The lack of any bound for SL operators may be seen as an obstacle to adopting them when working in critical domains where precise approximations are required. In the next section, we will introduce our protocol to solve this problem and estimate the degree of approximation of SL operators.

6 Computational Evaluation of the Degree of Approximation of Subjective Logic Operators

In this section we present a framework to evaluate the bias of an SL operator. We start by discussing how MC approximations may be related to SL approximations using a distance measure; then, we define which distance measure we will use and how it relates to bias.

Relating subjective logic approximation and Monte Carlo approximations via a distance measure

In the previous sections we illustrated two methodologies for finding an approximation of the pdf $p_{Z}$ , one based on SL operators ( $p_{Z}^{SL}$ ) and one relying on MC simulations ( $\hat{p}_{Z}^{\mathrm{KDE}},\hat{p}_{Z}^{\mathrm{MM}}$ ). Figure 4 merges the graphs in Figures 1, 2 and 3 to illustrate the alternative computational paths that are available to compute an approximation of $p_{Z}$ ; starting from the distributions $\left(p_{X},p_{Y}\right)$ , the upper path represents the SL approach to finding an approximation of $p_{Z}=p_{X}\circ_{P}p_{Y}$ , while the lower paths represent MC approaches to finding an approximation of the same quantity $p_{Z}$ .

Figure 4: Approximations of $p_{Z}$. MCS stands for MC sampling, MCI stands for MC integration, KDE stands for kernel-density estimation.

Now, approximate methods trade off precision in the results for simplicity in computation. In order to make a grounded decision on which approximation path in Figure 4 to use, it is necessary to quantify the trade-off between computational complexity and bias. As discussed in Sections 4 and 5, in the case of KDE approximation via MC simulations, both complexity and bias are known. However, in the case of SL operators, we may easily derive their computational complexity, but we have no simple way of evaluating their bias. Exploiting the properties of MC integration and the idea of distance between pdfs, it is possible to assess the degree of approximation of SL operators in a computational fashion by relating them to MC simulations.

A simple way to evaluate how well a pdf $p$ approximates another pdf $q$ is to estimate the distance between them, $D\left[p,q\right]$ , where $D\left[\cdot,\cdot\right]$ is a measure of distance or divergence between pdfs [8]. The degree of approximation of $p_{Z}^{SL}$ could then be obtained by measuring the distance from the true pdf $p_{Z}$ :

D\left[p_{Z},p_{Z}^{SL}\right].

(19)

However, since the true pdf $p_{Z}$ is taken to be unknown or hard to compute, it is challenging to get a direct estimate of this quantity. Since we cannot rely directly on $p_{Z}$ , we can instead exploit MC simulations and their properties.

From Equation 15 in Section 5, we know that the KDE estimation $\hat{p}_{Z}^{\mathrm{KDE}}$ is biased and we know how to evaluate this bias. Moreover, from Equation 18 in Section 5, we see that this bias depends on the number of samples $N$ and the standard deviation $\hat{\sigma}$ . Now, if the domain of $p_{Z}$ is a discrete domain, as in the case of multinomial opinions and Dirichlet pdfs, then the empirical standard deviation $\hat{\sigma}$ may be bounded and it may be possible to estimate the magnitude of the bias as a function of the number of samples $N$ . It may be possible to select a number of samples $N$ that shrinks the bias to a negligible quantity; in such a case, we can then accept the KDE estimation $\hat{p}_{Z}^{\mathrm{KDE}}$ as a close approximation of the true pdf $p_{Z}$ :

D\left[p_{Z},\hat{p}_{Z}^{\mathrm{KDE}}(N)\right]\approx 0.

(20)

We underline that this approximation holds only under the assumption that, for an increasing number of samples $N$ , the bias of the estimate $\hat{p}_{Z}^{\mathrm{KDE}}(N)$ tends, if not to zero, to a quantity whose order of magnitude is negligible with respect to further analysis; in other words, the validity of the approximation in Equation 20 is conditional on the pdf $p_{Z}$ we are considering, the analysis we will be carrying out, and the number of samples we can produce (for an example of an evaluation of these conditions, see the application to the case study of the product of Beta pdfs in Section 7 and Section 7.1).

The approximation in Equation 20 is extremely useful because it means that while we can not evaluate absolute distances with respect to the true distribution $p_{Z}$ , we can still evaluate the relative distance between the KDE approximation and the SL approximation, and use it as a proxy for the distance between the SL approximation $p_{Z}^{SL}$ and the true distribution $p_{Z}$ :

D\left[\hat{p}_{Z}^{\mathrm{KDE}}(N),p_{Z}^{SL}\right]\approx D\left[p_{Z},p_{Z}^{SL}\right].

(21)

Thus, given only a finite number of samples $N$ we can obtain an empirical statistic of the distance as:

\hat{D}\left[p_{Z},p_{Z}^{SL}\right]\hat{=}D\left[\hat{p}_{Z}^{\mathrm{KDE}}(N),p_{Z}^{SL}\right].

(22)

If the condition in Equation 20 holds, we expect the distance $D\left[\hat{p}_{Z}^{\mathrm{KDE}}(N),p_{Z}^{SL}\right]$ to be orders of magnitude greater than $D\left[p_{Z},\hat{p}_{Z}^{\mathrm{KDE}}(N)\right]$ ; this would indeed confirm that the bias of $\hat{p}_{Z}^{\mathrm{KDE}}(N)$ is negligible and that the computation of $\hat{D}\left[p_{Z},p_{Z}^{SL}\right]$ (using a finite number of samples) provides a good estimate of the degree of approximation of the SL approximation.

Relating distance measure to bias

So far, we have discussed distance measures in abstract terms. The quantity $\hat{D}\left[p_{Z},p_{Z}^{SL}\right]$ may indeed be computed using different distances between pdfs, such as $\phi$ -divergences or integral probability metrics [13].

In this paper, we will rely on computing a simple integral distance, defined as:

D_{I}\left[p,q\right]=\int_{-\infty}^{+\infty}\left|p(x)-q(x)\right|dx.

(23)

This distance $D_{I}\left[p,q\right]$ is the same as the total variation distance except for the scaling constant:

D_{TV}\left[p,q\right]=\frac{1}{2}\int_{-\infty}^{+\infty}\left|p(x)-q(x)\right|dx.

(24)

The constant $\frac{1}{2}$ rescales the distance to the interval $[0,1]$ . However, in order to get an absolute evaluation of how the mass of the two distributions $p$ and $q$ overlaps, we drop the scaling constant, so that $D_{I}\left[p,q\right]$ takes values in $[0,2]$ .

The choice of an integral distance $D_{I}\left[\cdot,\cdot\right]$ is justified for three reasons. First, from a conceptual point of view, an integral distance allows us to get a complete picture of the difference between two pdfs. While measures based on the evaluation of a limited set of synthetic statistics such as moments would provide us with a rough evaluation of the difference between two distributions, an integral distance provides a more precise way to assess the distribution of the mass of probability, taking into account, for instance, the potential presence of multiple modes or how mass subtly distributes on the tails.

Second, from a computational point of view, the integral distance $D_{I}\left[\cdot,\cdot\right]$ allows us, once again, to exploit MC integration. Recall that we want to get an estimation of $\hat{D}\left[p_{Z},p_{Z}^{SL}\right]$ through the approximation $D\left[\hat{p}_{Z}^{\mathrm{KDE}}(N),p_{Z}^{SL}\right]$ . Now, if we reconstruct $\hat{p}_{Z}^{\mathrm{KDE}}$ via KDE, we can estimate the integral distance via MC integration over the domain $\Omega$ of the events as:

\int_{\Omega}\left|\hat{p}_{Z}^{\mathrm{KDE}}(z)-p_{Z}^{SL}(z)\right|dz\hat{=}\frac{1}{N}\sum_{i=1}^{N}\left|\hat{p}_{Z}^{\mathrm{KDE}}(z_{i})-p_{Z}^{SL}(z_{i})\right|.

(25)

Third, from a theoretical point of view, the integral in Equation 25 is related to the bias:

$\displaystyle\int_{\Omega}\left|\hat{p}_{Z}^{\mathrm{KDE}}(z)-p_{Z}^{SL}(z)\right|dz$	$\displaystyle\hat{=}\frac{1}{N}\sum_{i=1}^{N}\left|\hat{p}_{Z}^{\mathrm{KDE}}(z_{i})-p_{Z}^{SL}(z_{i})\right|$	(26)
	$\displaystyle\hat{=}E\left[\left|\hat{p}_{Z}^{\mathrm{KDE}}(Z)-p_{Z}^{SL}(Z)\right|\right]$	(27)
	$\displaystyle\geq E\left[\hat{p}_{Z}^{\mathrm{KDE}}(Z)-p_{Z}^{SL}(Z)\right].$	(28)

Thus, using the integral distance $D_{I}\left[\hat{p}_{Z}^{\mathrm{KDE}},p_{Z}^{SL}\right]$ we can obtain an estimation of the distance $\hat{D}\left[p_{Z},p_{Z}^{SL}\right]$ as well as an upper bound on the bias of $p_{Z}^{SL}$ . Notice that the absolute value in the integral distance provides a more honest evaluation of the difference between pdfs: without the absolute value operator, positive and negative differences would cancel out when averaged.
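As an illustration of the MC estimate in Equation 25, the following minimal Python sketch evaluates the integral distance between two pdfs on $[0,1]$ using uniformly sampled evaluation points. The function names `beta_pdf` and `integral_distance_mc`, the fixed seed, and the use of two Beta pdfs as inputs are assumptions made for this example only; they are not taken from the scripts accompanying the paper.

```python
import math
import random


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at a point x strictly inside (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def integral_distance_mc(p, q, n_points=1000, rng=None):
    """Estimate D_I[p, q] = int_0^1 |p(z) - q(z)| dz by MC integration.

    The evaluation points are drawn uniformly on (0, 1); since the domain
    has unit length, the estimate is the plain average of |p - q|.
    """
    rng = rng or random.Random(0)  # fixed seed: an assumption, for repeatability
    total = 0.0
    for _ in range(n_points):
        z = rng.random()
        while z == 0.0:  # random() may return 0.0, where log(z) is undefined
            z = rng.random()
        total += abs(p(z) - q(z))
    return total / n_points
```

Because the scaling constant $\frac{1}{2}$ is dropped, the returned value lies in $[0,2]$ up to sampling error; dividing it by two gives an estimate of the fraction of probability mass that does not overlap.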

Figure 5 summarizes our overall framework to evaluate the degree of approximation of the SL approximation as the integral distance $\int\left|p_{Z}^{SL}-\hat{p}_{Z}^{\mathrm{KDE}}\right|$ , under the assumption that $D\left[p_{Z},\hat{p}_{Z}^{\mathrm{KDE}}(N)\right]\approx 0$ . This approach is generic and it is not tied to the SL approximation. If the condition in Equation 20 can be guaranteed, the same approach may be used to get an estimation of the distance between the true pdf $p_{Z}$ and other potential approximations. For instance, Figure 5 shows our methodology applied also to the problem of estimating the distance from the true pdf of the moment-matching approximation $D\left[p_{Z},\hat{p}_{Z}^{\mathrm{MM}}(N)\right]$ by computing the distance $\int\left|\hat{p}_{Z}^{\mathrm{MM}}-\hat{p}_{Z}^{\mathrm{KDE}}\right|$ .

Figure 5: Evaluations of the distances between $p_{Z}$ and its approximations based on the assumption that $D\left[p_{Z},\hat{p}_{Z}^{\mathrm{KDE}}(N)\right]\approx 0$ . MCS stands for MC sampling, MCI stands for MC integration, KDE stands for kernel-density estimation.

7 Case Study: Product of Beta Distributions

In this section, we show how our framework may be applied to the problem of computing the product of Beta distributions. We first recall the definition of a Beta distribution and the definition of the product of Beta distributions; we then introduce the SL operator for binomial multiplication and we discuss how it can be used for approximating the distribution of the random variable given by the product of two independent random variables with Beta distributions; we work out the computational complexity of binomial multiplication and show the lack of a generic estimate of its degree of approximation; to solve this problem, we apply our framework to get an evaluation of the degree of approximation of binomial multiplication; finally, we run an extensive set of empirical simulations to validate our theoretical results.

Beta pdf

Let $X$ be a random variable on the support $[0,1]$ ; we say that $X$ follows a Beta distribution $X\sim\mathtt{Beta}\left(\alpha,\beta\right)$ with parameters $\alpha\in\mathbb{R}_{>0}$ and $\beta\in\mathbb{R}_{>0}$ when its probability density function $p_{X}$ has the following form:

p_{X}\left(x;\alpha,\beta\right)=\frac{1}{B\left(\alpha,\beta\right)}x^{\alpha-1}\left(1-x\right)^{\beta-1},

(29)

where $B\left(\alpha,\beta\right)$ is the Beta function.

Product of Beta pdfs

Let $X\sim\mathtt{Beta}\left(\alpha_{X},\beta_{X}\right)$ and $Y\sim\mathtt{Beta}\left(\alpha_{Y},\beta_{Y}\right)$ be two independent Beta random variables with associated pdfs $p_{X}$ and $p_{Y}$ . Let us define a third random variable $Z$ as the product of the two Beta random variables $Z=X\cdot Y$ . The probability density function $p_{Z}$ of $Z$ does not follow a Beta distribution anymore, and its precise analytical form can not be easily expressed using elementary functions [9]. An analytical solution to the evaluation of the pdf of the product of two Beta distributions has been offered in [10] (this paper actually presents the more generic solution to the problem of multiplying two general Beta distributions, which subsumes the multiplication of two simple Beta distributions as defined above):

\begin{array}[]{c}p_{Z}\left(z;\alpha_{X},\alpha_{Y},\beta_{X},\beta_{Y}\right)=\\ B\left(\beta_{X},\beta_{Y}\right)\cdot z^{-\beta_{X}}\cdot\left(1-z\right)^{\beta_{X}+\beta_{Y}-1}\cdot z^{\alpha_{Y}}\cdot\\ \cdot\frac{F_{D}^{(3)}\left(\beta_{X};1-\alpha_{X},1-\alpha_{Y},\alpha_{X}+\beta_{X}-1;\beta_{X}+\beta_{Y};0,\frac{z-1}{z},\frac{z-1}{z}\right)}{B\left(\alpha_{X},\beta_{X}\right)B\left(\alpha_{Y},\beta_{Y}\right)},\end{array}

(30)

where $F_{D}^{(3)}$ is the Lauricella D hyper-geometric series. While this formula provides an elegant solution to the problem of finding the pdf of the product of two Beta pdfs, its straightforward evaluation is challenging as the Lauricella function requires the computation of factorial products and series.

Other analytical approaches to evaluate the product of two or more Beta distributions have been proposed, including methods relying on high-order functions, such as the Meijer G-function or Fox’s H function, or modeling the pdf of the product using an infinite mixture of simpler distributions [14, 1]. These approaches also present computational challenges, although more efficient solutions have been investigated [14, 1].

Finally, common approaches rely on MC simulations to sample points from the probability distribution of $Z$ and to compute statistics of the pdf by matching the moments or the quantiles of $Z$ [9], as we reviewed in Section 3. Notice that when considering the product of two independent random variables $Z=X\cdot Y$ , it is straightforward to compute the mean $E\left[Z\right]$ and the variance $Var\left[Z\right]$ of $p_{Z}$ analytically as:

\begin{array}[]{c}E\left[Z\right]=E\left[X\right]\cdot E\left[Y\right]\\ Var\left[Z\right]=E\left[X\right]^{2}\cdot Var\left[Y\right]+E\left[Y\right]^{2}\cdot Var\left[X\right]+Var\left[X\right]\cdot Var\left[Y\right].\end{array}

(31)

Thus, if we were to perform a moment-matching approximation considering only the first two moments, we could instantiate such an approximation without running any MC simulation with a constant computational complexity of $\mathcal{O}\left(1\right)$ .
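The constant-time computation of Equation 31 can be sketched in a few lines of Python. This is only an illustrative sketch: the helper names `beta_mean_var` and `product_mean_var` are hypothetical, and the closed-form mean and variance of a Beta distribution used in the first helper are the standard textbook expressions rather than formulas stated in this section.

```python
def beta_mean_var(alpha, beta):
    """Mean and variance of Beta(alpha, beta) (standard closed forms)."""
    total = alpha + beta
    mean = alpha / total
    var = alpha * beta / (total ** 2 * (total + 1.0))
    return mean, var


def product_mean_var(alpha_x, beta_x, alpha_y, beta_y):
    """Mean and variance of Z = X * Y for independent Beta X and Y (Equation 31)."""
    mean_x, var_x = beta_mean_var(alpha_x, beta_x)
    mean_y, var_y = beta_mean_var(alpha_y, beta_y)
    mean_z = mean_x * mean_y
    var_z = mean_x ** 2 * var_y + mean_y ** 2 * var_x + var_x * var_y
    return mean_z, var_z
```

No sampling is involved: the cost is a fixed number of arithmetic operations regardless of the parameters.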

Subjective logic binomial multiplication

An alternative solution to compute the product of Beta distributions is based on the use of the SL operator for binomial multiplication. Given two binomial opinions $\omega_{X}$ and $\omega_{Y}$ defined on different domains, the binomial opinion $\omega_{Z}$ resulting from the multiplication $\omega_{X}\cdot\omega_{Y}$ is computed as [4]:

\omega_{Z}=\begin{cases}b_{Z}=b_{X}b_{Y}+\frac{\left(1-a_{X}\right)a_{Y}b_{X}u_{Y}+a_{X}\left(1-a_{Y}\right)u_{X}b_{Y}}{1-a_{X}a_{Y}}\\ d_{Z}=d_{X}+d_{Y}-d_{X}d_{Y}\\ u_{Z}=u_{X}u_{Y}+\frac{\left(1-a_{Y}\right)b_{X}u_{Y}+(1-a_{X})u_{X}b_{Y}}{1-a_{X}a_{Y}}\\ a_{Z}=a_{X}a_{Y}.\end{cases}

(32)

Practically, a binomial product operator allows us to evaluate the combination of two opinions over two different facts. In the domain of probability distributions, the multiplication of opinions $\omega_{Z}=\omega_{X}\cdot\omega_{Y}$ translates into the multiplication of the mapped pdfs $p_{Z}=p_{X}\cdot p_{Y}$ .
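Equation 32 translates directly into code. The sketch below is a minimal Python transcription; the `Opinion` container and the function name `multiply` are naming choices made for this example, not part of the original scripts.

```python
from typing import NamedTuple


class Opinion(NamedTuple):
    b: float  # belief
    d: float  # disbelief
    u: float  # uncertainty
    a: float  # base rate


def multiply(x: Opinion, y: Opinion) -> Opinion:
    """Binomial multiplication of two opinions, term by term as in Equation 32."""
    denom = 1.0 - x.a * y.a  # vanishes only when both base rates are equal to 1
    b = x.b * y.b + ((1.0 - x.a) * y.a * x.b * y.u
                     + x.a * (1.0 - y.a) * x.u * y.b) / denom
    d = x.d + y.d - x.d * y.d
    u = x.u * y.u + ((1.0 - y.a) * x.b * y.u
                     + (1.0 - x.a) * x.u * y.b) / denom
    a = x.a * y.a
    return Opinion(b, d, u, a)
```

A quick sanity check on any implementation is that the resulting belief, disbelief and uncertainty still sum to one, which follows algebraically from Equation 32 whenever the two input opinions satisfy the same constraint.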

Approximating the product of Beta pdfs

Now, assume we are interested in computing the product $Z=X\cdot Y$ , where $X$ and $Y$ are two independent Beta random variables. Since an analytic solution is hard to compute, we may decide to rely either on the SL approximation or on a MC approximation.

Concerning moment-matching approximations we may consider a Gaussian pdf and a Beta pdf. Using a Gaussian pdf is a choice motivated by the simplicity and the ubiquity of this distribution; however, this is clearly a naive choice, as a Gaussian pdf has an unbounded support, is symmetrical, and has all cumulants of order greater than two equal to zero. Using a Beta distribution is a more prudent choice: even if it is known that the product of two Betas is not, in general, a Beta distribution, a Beta pdf still fits the right support and it may have non-zero higher-order cumulants (for instance, a non-zero skewness). In order to evaluate the parameters of our Gaussian and Beta approximation, we may rely on MC simulations or on an analytic evaluation. If we opt for MC simulations, we can use MC integration to estimate the mean $\hat{\mu}$ and the variance $\hat{\sigma}^{2}$ of $p_{Z}$ ; the Gaussian approximation $\hat{p}_{Z}^{\mathrm{GAUSS}}$ is then instantiated as $\mathtt{N}\left(\hat{\mu},\hat{\sigma}^{2}\right)$ , while the Beta approximation $\hat{p}_{Z}^{\mathrm{BETA}}$ is defined as $\mathtt{Beta}\left(\frac{-\hat{\mu}\left(\hat{\sigma}^{2}+\hat{\mu}^{2}-\hat{\mu}\right)}{\hat{\sigma}^{2}},\frac{\left(\hat{\mu}-1\right)\left(\hat{\sigma}^{2}+\hat{\mu}^{2}-\hat{\mu}\right)}{\hat{\sigma}^{2}}\right)$ , thus guaranteeing that $p_{Z}$ , $\hat{p}_{Z}^{\mathrm{GAUSS}}$ and $\hat{p}_{Z}^{\mathrm{BETA}}$ have the same mean and variance. If we rely on an analytic approach, we can easily compute the mean $\mu$ and the variance $\sigma^{2}$ of $p_{Z}$ using Equation 31; as before, the Gaussian approximation ${p}_{Z}^{\mathrm{GAUSS}}$ is then instantiated as $\mathtt{N}\left({\mu},{\sigma}^{2}\right)$ , while the Beta approximation ${p}_{Z}^{\mathrm{BETA}}$ is defined as $\mathtt{Beta}\left(\frac{-{\mu}\left({\sigma}^{2}+{\mu}^{2}-{\mu}\right)}{{\sigma}^{2}},\frac{\left({\mu}-1\right)\left({\sigma}^{2}+{\mu}^{2}-{\mu}\right)}{{\sigma}^{2}}\right).$
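The Beta moment-matching step described above can be sketched as follows. The function name `beta_from_moments` and the input check are assumptions added for this illustration; the two parameter expressions are copied from the text, and the same function applies to the empirical pair $(\hat{\mu},\hat{\sigma}^{2})$ and to the analytic pair $(\mu,\sigma^{2})$ .

```python
def beta_from_moments(mu, var):
    """Parameters of the Beta pdf with mean `mu` and variance `var`.

    A valid Beta pdf exists only if 0 < mu < 1 and 0 < var < mu * (1 - mu);
    in that case the shared factor (var + mu**2 - mu) is negative and both
    returned parameters are positive.
    """
    common = var + mu ** 2 - mu
    if not (0.0 < mu < 1.0) or var <= 0.0 or common >= 0.0:
        raise ValueError("no Beta distribution matches these moments")
    alpha = -mu * common / var
    beta = (mu - 1.0) * common / var
    return alpha, beta
```

For example, plugging in the analytic moments of the product of two uniform variables on $[0,1]$ (mean $\frac{1}{4}$ and variance $\frac{7}{144}$ from Equation 31) yields a Beta pdf whose mean and variance reproduce these two values.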

Figure 6 provides a concrete instantiation of the diagram in Figure 4, in which the generic operators $\circ_{P}$ and $\circ_{SL}$ have been substituted with multiplication and the generic moment-matching approximation $\hat{p}_{Z}^{\mathrm{MM}}$ has been replaced by the empirical Gaussian approximation $\hat{p}_{Z}^{\mathrm{GAUSS}}$ , the empirical Beta approximation $\hat{p}_{Z}^{\mathrm{BETA}}$ , the analytic Gaussian approximation ${p}_{Z}^{\mathrm{GAUSS}}$ , and the analytic Beta approximation ${p}_{Z}^{\mathrm{BETA}}$ .

Figure 6: Approximations of the product of two Beta pdfs, $p_{X}$ and $p_{Y}$ . MCS stands for MC sampling, MCI stands for MC integration, KDE stands for kernel-density estimation, AMC stands for analytic moment computation.

As discussed earlier, choosing which path to take, whether to follow the SL approximation path in the upper part of the graph or opt for one of the MC approximations in the lower part, requires evaluating the trade-off between computational complexity and degree of approximation of the different approaches. As these parameters are known in the case of MC simulations, we will review here the computational complexity and the approximation of the binomial multiplication.

Computational complexity of the binomial product operator

Binomial multiplication is extremely efficient. Given two binomial opinions $\omega_{X}$ and $\omega_{Y}$ it is possible to compute their product $\omega_{Z}$ through a fixed and finite number of arithmetic operations. Independently from the actual form of the mapped distributions, the product is always computed in the same amount of time. As such, the computational complexity of this SL operator is constant, $\mathcal{O}\left(1\right)$ .

Approximation of the binomial operator

The original paper that introduced the SL operator for binomial multiplication [5] proposed a first qualitative analysis of the degree of approximation of this operator. In particular, it considered the specific instance of the multiplication of two Beta pdfs of the form $X,Y\sim\mathtt{Beta}\left(1,1\right)$ ; the pdf of $X$ and $Y$ reduces to a uniform distribution over $[0,1]$ , which is taken to be a worst-case scenario with maximal variance and entropy. The analytical solution $Z=X\cdot Y$ to this particular case was then computed and graphically compared to the pdf associated with the product $\omega_{Z}=\omega_{X}\cdot\omega_{Y}$ . This study provided a clear visual appraisal of the difference between the exact pdf and the SL-approximated pdf, but no quantitative estimation was provided for more general cases.

Relating the binomial multiplication and Monte Carlo approximations

In order to compute a numerical estimation of the degree of approximation of the SL operator for binomial multiplication we want to rely on the framework described in Section 6.

The basic condition expressed in Equation 20 requires the bias of $\hat{p}_{Z}^{\mathrm{KDE}}$ to be bounded and negligible. Recall that this bias, using a Gaussian kernel with width computed using the Silverman rule, is $E_{KDE}\left[p_{Z}-\hat{p}_{Z}^{\mathrm{KDE}}\right]\propto 1.06^{2}\hat{\sigma}^{2}N^{-\frac{2}{5}}$ , where

\hat{\sigma}=\sqrt{\frac{\sum_{i=1}^{N}\left(x_{i}-\hat{\mu}\right)^{2}}{N-1}}.

(33)

Now, notice that on our bounded support $\left[0,1\right]$ we can expect the difference $\left(x_{i}-\hat{\mu}\right)$ to be, at most, in the order of $10^{-1}$ . This implies that, in the worst case, the order of magnitude of $\hat{\sigma}$ may be estimated as:

\hat{\sigma}<\sqrt{\frac{{N}\left(10^{-1}\right)^{2}}{N-1}}

(34)

\hat{\sigma}<10^{-1}.

(35)

Consequently, relying on the Silverman rule in Equation 16, the order of magnitude of the largest kernel width $w$ may be bounded as:

w<1.06\cdot 10^{-1}\frac{1}{\sqrt[5]{N}}

(36)

w<10^{-1}N^{-\frac{1}{5}}.

(37)

As such, from Equation 18, the bias will be proportional to this upper bound:

E_{KDE}\left[p_{Z}-\hat{p}_{Z}^{\mathrm{KDE}}\right]\propto\left(10^{-1}N^{-\frac{1}{5}}\right)^{2}.

(38)

Thus, for instance, if we were to run our MC simulation drawing $N=10^{5}$ samples, then we can expect the bias of $\hat{p}_{Z}^{\mathrm{KDE}}$ to be in the order of $10^{-4}$ . This analysis of the bias allows us to consider the bias negligible if we are comparing it with quantities, such as $D\left[\hat{p}_{Z}^{\mathrm{KDE}}(N),p_{Z}^{SL}\right]$ , that are orders of magnitude greater than $10^{-4}$ . If this condition is met, then we can estimate the degree of approximation of binomial multiplication adopting the framework illustrated in Figure 5 and instantiated for this specific SL operator as in Figure 7.
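The order-of-magnitude argument of Equations 34 to 38 can be reproduced numerically with the short sketch below. The function name `kde_bias_order` and its default arguments are hypothetical; the default `constant=1.0` mirrors the step from Equation 36 to Equation 37, where the factor $1.06$ is absorbed into the order of magnitude.

```python
def kde_bias_order(n_samples, sigma_bound=0.1, constant=1.0):
    """Order of magnitude of the KDE bias for a Silverman-type kernel width.

    The width is bounded as w < constant * sigma_bound * N**(-1/5)
    (Equations 36-37) and the bias is taken proportional to w**2
    (Equation 38). The proportionality factor is not modelled, so only
    the order of magnitude of the result is meaningful.
    """
    width_bound = constant * sigma_bound * n_samples ** (-1.0 / 5.0)
    return width_bound ** 2


if __name__ == "__main__":
    for n in (10 ** 3, 10 ** 4, 10 ** 5, 10 ** 6):
        print(n, kde_bias_order(n))
```

Since the bound decays only as $N^{-\frac{2}{5}}$ , each additional order of magnitude in the number of samples lowers it by a factor of about $2.5$ .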

Figure 7: Evaluations of the distances between $p_{Z}$ and its approximations based on the assumption that $D\left[p_{Z},\hat{p}_{Z}^{\mathrm{KDE}}(N)\right]\approx 0$ . MCS stands for MC sampling, MCI stands for MC integration, KDE stands for kernel-density estimation, AMC stands for analytic moment computation.

7.1 Empirical Evaluation

In this section we describe our experimental simulations for the evaluation of the degree of approximation of the binomial product. We will first offer a qualitative analysis of the SL approximation $p_{Z}^{SL}$ and the approximation generated via MC simulation $\hat{p}_{Z}^{\mathrm{MC}}$ ; then, we will provide a quantitative statistical assessment of the distance $D\left[p_{Z},p_{Z}^{SL}\right]$ ; next, we will analyze a specific case study concerning the worst-case scenario of the product of two degenerate Beta random variables; finally, we will assess quantitatively the degree of approximation in the product of multiple opinions.

In our simulations starting in the domain of subjective logic, opinions $\omega=\left(b,d,u,a\right)$ are sampled randomly. The parameters $b$ , $d$ and $u$ must be sampled from a simplex defined by the constraint $b+d+u=1$ ; therefore we sample them from a Dirichlet distribution $\mathtt{Dir}\left(\boldsymbol{\alpha}\right)$ with $\boldsymbol{\alpha}=\left[1,1,1\right]$ , which guarantees a uniform sampling over the simplex. The parameter $a$ , instead, is sampled from a uniform distribution $\mathtt{Unif}\left(0,1\right)$ . In the simulations starting in the domain of probability distributions, Beta distributions $\mathtt{Beta}\left(\alpha,\beta\right)$ are sampled randomly; both parameters $\alpha$ and $\beta$ are drawn from a uniform pdf on a bounded domain, $\mathtt{Unif}\left(0,10\right)$ .
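The two sampling schemes can be sketched in Python as follows. This is an illustration only, not the WebPPL code used for the experiments. Two assumptions are worth flagging: the Dirichlet draw is obtained by normalising three independent exponential variates (a standard construction for $\mathtt{Dir}\left([1,1,1]\right)$ ), and the projection `opinion_to_beta` assumes the usual subjective logic mapping with a non-informative prior weight $W=2$ , which is defined outside this section and should be checked against the definition given earlier in the paper. All names are hypothetical.

```python
import random
from typing import NamedTuple, Tuple


class Opinion(NamedTuple):
    b: float  # belief
    d: float  # disbelief
    u: float  # uncertainty
    a: float  # base rate


def sample_opinion(rng: random.Random) -> Opinion:
    """Draw (b, d, u) uniformly on the simplex and a from Unif(0, 1)."""
    # Normalised i.i.d. Exp(1) variates are distributed as Dir([1, 1, 1]).
    draws = [rng.expovariate(1.0) for _ in range(3)]
    total = sum(draws)
    b, d, u = (g / total for g in draws)
    return Opinion(b, d, u, rng.random())


def sample_beta_params(rng: random.Random, upper: float = 10.0) -> Tuple[float, float]:
    """Draw alpha and beta independently from Unif(0, upper)."""
    return rng.uniform(0.0, upper), rng.uniform(0.0, upper)


def opinion_to_beta(op: Opinion, prior_weight: float = 2.0) -> Tuple[float, float]:
    """ASSUMED mapping: alpha = W*b/u + W*a, beta = W*d/u + W*(1 - a).

    Requires u > 0; dogmatic opinions (u = 0) have no Beta counterpart here.
    """
    alpha = prior_weight * op.b / op.u + prior_weight * op.a
    beta = prior_weight * op.d / op.u + prior_weight * (1.0 - op.a)
    return alpha, beta
```

Under the assumed mapping, the vacuous opinion $\left(0,0,1,\frac{1}{2}\right)$ is projected onto $\mathtt{Beta}\left(1,1\right)$ , i.e. the uniform pdf on $[0,1]$ .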

All the simulations are carried out using the WebPPL probabilistic programming language [3] and the scripts are available online (https://github.com/FMZennaro/SLMC/BinomialProduct).

7.1.1 Qualitative simulations

In the qualitative simulations we aim at getting a first intuitive feeling about the approximation of $p_{Z}^{SL}$ .

Protocol

In order to compare the SL approximation $p_{Z}^{SL}$ and the MC approximation $\hat{p}_{Z}^{\mathrm{MC}}$ we adopt the following protocol: (i) we sample two random opinions $\omega_{X}$ and $\omega_{Y}$ ; (ii) we compute their product $\omega_{Z}=\omega_{X}\cdot\omega_{Y}$ ; (iii) we project $\omega_{Z}$ onto the distribution $p_{Z}^{SL}$ ; (iv) we project the opinions $\omega_{X}$ and $\omega_{Y}$ onto the Beta distributions $p_{X}$ and $p_{Y}$ ; (v) we re-create $\hat{p}_{Z}^{\mathrm{MC}}$ using MC simulation to draw $N$ samples $\{z_{1},z_{2},\dots,z_{N}\}$ from $p_{Z}$ ; finally, (vi) we plot $p_{Z}^{SL}$ against $\hat{p}_{Z}^{\mathrm{MC}}$ numerically, without any smoothing or interpolation. On the side, (vii) we use the samples $\{z_{1},z_{2},\dots,z_{N}\}$ to estimate moments via MC integration and then instantiate the moment-matching approximations $\hat{p}_{Z}^{\mathrm{GAUSS}}$ and $\hat{p}_{Z}^{\mathrm{BETA}}$ ; (viii) we use Equation 31 to compute the analytic moment-matching approximations ${p}_{Z}^{\mathrm{GAUSS}}$ and ${p}_{Z}^{\mathrm{BETA}}$ ; (ix) we plot the moment-matching approximations against $\hat{p}_{Z}^{\mathrm{MC}}$ numerically.

Results

Figures 8 and 9 illustrate the difference between the SL binomial multiplication $p_{Z}^{SL}$ and the approximation of the true pdf $p_{Z}$ plotted via MC. In some instances, $p_{Z}^{SL}$ seems to provide a very good approximation of $p_{Z}$ , as shown in Figure 8. In other instances, as shown in Figure 9, this approximation is coarser, especially for values of the support near the extremes.

Figure 8: Qualitative simulation. The first opinion $\omega_{X}$ has parameters $b=0.61$ , $d=0.30$ , $u=0.09$ and $a=0.79$ , the second opinion $\omega_{Y}$ has parameters $b=0.28$ , $d=0.66$ , $u=0.06$ and $a=0.46$ . The number of samples is $N=10^{5}$ .

The discrepancy shown in Figure 9 might, in principle, be attributed to a poor approximation by the MC simulation due to a limited number of samples. In order to rule out this hypothesis, another identical simulation with a number of samples one order of magnitude larger was run. Figure 10 shows that this simulation returned the same qualitative result. This suggests that the gap between $\hat{p}_{Z}^{\mathrm{MC}}$ and $p_{Z}^{SL}$ may not be attributed to a poor MC approximation.

Figure 11 offers a visual comparison of the approximations offered by $\hat{p}_{Z}^{\mathrm{MC}}$ and $p_{Z}^{SL}$ contrasted now with the Gaussian $\hat{p}_{Z}^{\mathrm{GAUSS}}$ , ${p}_{Z}^{\mathrm{GAUSS}}$ and the Beta $\hat{p}_{Z}^{\mathrm{BETA}}$ , ${p}_{Z}^{\mathrm{BETA}}$ approximations. The analytic and empirical (via MC) approximations behave in a very similar fashion. Overall, the Gaussian approximations are the farthest from $p_{Z}$ , while the Beta approximations follow $p_{Z}^{SL}$ and $\hat{p}_{Z}^{\mathrm{MC}}$ very closely; in particular, the analytic Beta approximation ${p}_{Z}^{\mathrm{BETA}}$ almost overlaps $p_{Z}^{SL}$ , because of a similar approach in evaluating the moments of $p_{Z}$ .

Discussion

This analysis suggests that the SL approximation $p_{Z}^{SL}$ may provide a good and useful estimation of the true pdf $p_{Z}$ ; indeed, $p_{Z}^{SL}$ follows very closely the shape of $\hat{p}_{Z}^{\mathrm{MC}}$ which, in turn, is close to $p_{Z}$ . Given that $p_{Z}^{SL}$ consists of a smooth Beta distribution, it is not surprising that the approximation suffers the most near the boundaries where the MC estimate $\hat{p}_{Z}^{\mathrm{MC}}$ diverges from $p_{Z}^{SL}$ , as shown in Figure 9. The results also discourage the naive possibility of using a Gaussian approximation, since $\hat{p}_{Z}^{\mathrm{GAUSS}}$ mismodels the true pdf $p_{Z}$ by centering the mean but spreading probability mass too widely beyond the domain $[0,1]$ . Instead, a Beta approximation $\hat{p}_{Z}^{\mathrm{BETA}}$ or ${p}_{Z}^{\mathrm{BETA}}$ provides an approximation qualitatively very close to $p_{Z}^{SL}$ and $\hat{p}_{Z}^{\mathrm{MC}}$ ; also, notice that the Beta approximation ${p}_{Z}^{\mathrm{BETA}}$ can be computed as cheaply as the SL approximation, with complexity $\mathcal{O}\left(1\right)$ , by evaluating its mean and variance analytically.

7.1.2 Quantitative simulations

Quantifying the gap between $p_{Z}$ and $p_{Z}^{SL}$ that we observed in the qualitative study above is the aim of the quantitative simulations.

Protocol

The first part of our quantitative protocol is the same as the qualitative protocol: (i-a) we sample two random opinions $\omega_{X}$ and $\omega_{Y}$ ; (ii-a) we compute their product $\omega_{Z}=\omega_{X}\cdot\omega_{Y}$ ; (iii-a) we project $\omega_{Z}$ onto the distribution $p_{Z}^{SL}$ ; (iv-a) we project the opinions $\omega_{X}$ and $\omega_{Y}$ onto the Beta distributions $p_{X}$ and $p_{Y}$ ; (v-a) we re-create $\hat{p}_{Z}^{\mathrm{MC}}$ using MC simulation to draw $N$ samples $\{z_{1},z_{2},\dots,z_{N}\}$ from $p_{Z}$ . Then, instead of plotting our results, (vi-a) we use a KDE to explicitly estimate $\hat{p}_{Z}^{\mathrm{KDE}}$ ; and, (vii-a) we compute via MC integration the area determined by the integral $\int_{0}^{1}\left|p_{Z}^{SL}(z)-\hat{p}_{Z}^{\mathrm{KDE}}(z)\right|dz$ . On the side, we compute moment-matching approximations as before: (viii) we use the samples $\{z_{1},z_{2},\dots,z_{N}\}$ to estimate moments via MC integration and then instantiate $\hat{p}_{Z}^{\mathrm{GAUSS}}$ and $\hat{p}_{Z}^{\mathrm{BETA}}$ ; (ix) we use Equation 31 to compute the analytic approximations ${p}_{Z}^{\mathrm{GAUSS}}$ and ${p}_{Z}^{\mathrm{BETA}}$ ; (x) we compute via MC integration the area determined by the absolute difference between $\hat{p}_{Z}^{\mathrm{KDE}}$ and each moment-matching approximation.

For completeness, we also run a simulation starting in the domain of pdfs: (i-b) we sample two random Beta pdfs $p_{X}$ and $p_{Y}$ ; (ii-b) we re-create $\hat{p}_{Z}^{\mathrm{MC}}$ using MC simulation to draw $N$ samples $\{z_{1},z_{2},\dots,z_{N}\}$ from $p_{Z}$ ; (iii-b) we use a KDE to explicitly estimate $\hat{p}_{Z}^{\mathrm{KDE}}$ ; (iv-b) we map the Beta distributions $p_{X}$ and $p_{Y}$ onto the opinions $\omega_{X}$ and $\omega_{Y}$ ; (v-b) we compute their product $\omega_{Z}=\omega_{X}\cdot\omega_{Y}$ ; (vi-b) we project $\omega_{Z}$ onto the distribution $p_{Z}^{SL}$ ; and, (vii-b) we compute via MC integration the area determined by the integral $\int_{0}^{1}\left|p_{Z}^{SL}(z)-\hat{p}_{Z}^{\mathrm{KDE}}(z)\right|dz$ . As before, we also compute distances between $\hat{p}_{Z}^{\mathrm{KDE}}$ and empirical (via MC) and analytical moment-matching approximations as explained in the steps (viii)-(x) above.

In order to get statistically meaningful results, we repeat each simulation $100$ times and we compute the mean and the standard deviation of the distance $\hat{D}_{I}\left[p_{Z},p_{Z}^{SL}\right]$ .

Notice that, since the pdf $\hat{p}_{Z}^{\mathrm{KDE}}$ that we are trying to estimate is defined on a bounded interval, using a Gaussian kernel for KDE is a sub-optimal choice. The Gaussian kernel distributes the mass of probability over the entire real line, and thus we would inevitably spill part of the probability mass beyond the domain $[0,1]$ . To solve this problem we adopt the logit trick [12]: instead of applying a Gaussian KDE to estimate $\hat{p}_{Z}^{\mathrm{KDE}}$ directly from the samples $\{z_{1},z_{2},\dots,z_{N}\}$ , we use a logit transform $\textnormal{logit}(x)=\log\frac{x}{1-x}$ to project the sample $\{z_{1},z_{2},\dots,z_{N}\}$ onto the entire real line; we then apply a Gaussian KDE to the projected samples and rescale back the learned pdf to $\hat{p}_{Z}^{\mathrm{KDE}}$ .
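A minimal sketch of the logit trick is given below. It assumes a Gaussian kernel with the Silverman-type width $w=1.06\,\hat{\sigma}N^{-\frac{1}{5}}$ computed on the transformed samples, and it interprets the rescaling step as the change-of-variables factor $\frac{1}{z(1-z)}$ , i.e. the derivative of the logit; both choices, like the function names, are assumptions of this example rather than details confirmed by the text.

```python
import math
import statistics


def logit(x):
    """Map a point of (0, 1) onto the real line."""
    return math.log(x / (1.0 - x))


def logit_kde(samples):
    """Return a pdf on (0, 1) estimated with a Gaussian KDE in logit space."""
    # Samples exactly equal to 0 or 1 cannot be transformed and are dropped.
    ys = [logit(z) for z in samples if 0.0 < z < 1.0]
    n = len(ys)
    # statistics.stdev uses the (N - 1) denominator, as in Equation 33.
    width = 1.06 * statistics.stdev(ys) * n ** (-1.0 / 5.0)
    norm = 1.0 / (n * width * math.sqrt(2.0 * math.pi))

    def pdf(z):
        y = logit(z)
        density_y = norm * sum(math.exp(-0.5 * ((y - yi) / width) ** 2) for yi in ys)
        # Change of variables: p_Z(z) = p_Y(logit(z)) * |d logit / dz|.
        return density_y / (z * (1.0 - z))

    return pdf
```

By construction the returned function places no mass outside $(0,1)$ . Each evaluation costs $\mathcal{O}\left(N\right)$ in this naive form, so evaluating the density on many points for large $N$ would call for a vectorised or binned implementation.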

Refer to Figure 7 for the diagram of the experimental protocol for the quantitative simulations.

Results

Figure 12 shows the variation in the distances $\hat{D}\left[p_{Z},\cdot\right]$ estimated as $D\left[\hat{p}_{Z}^{\mathrm{KDE}}(N),\cdot\right]$ as a function of the number of samples $N$ generated in the MC simulation. All the statistics are computed from $100$ repetitions and using $10^{3}$ uniformly sampled points on the support $[0,1]$ to perform MC integration.

The stable trend of all the distances $D\left[\hat{p}_{Z}^{\mathrm{KDE}}(N),\cdot\right]$ suggests that the MC simulations sampled enough points, for all the values of $N$ that we considered, to return a good approximation.

More importantly, recall that our whole analysis holds only if Equation 20 is satisfied. Using $N=10^{5}$ , we know from Equation 38 that the bias in evaluating $\hat{p}_{Z}^{\mathrm{KDE}}(N)$ is in the order of $10^{-4}$ . Thus, compared to the mean and variance of the distances in our results, which are in the order of $10^{-1}$ , we can confirm that the bias is negligible. We can then state that $D\left[\hat{p}_{Z}^{\mathrm{KDE}}(N),\cdot\right]$ does indeed provide a good estimate of $\hat{D}\left[p_{Z},\cdot\right]$ .

Consistently with the previous experiments, the Gaussian approximations provide the worst approximation. Indeed, with distances $D\left[\hat{p}_{Z}^{\mathrm{KDE}}(N),\hat{p}_{Z}^{\mathrm{GAUSS}}\right]$ and $D\left[\hat{p}_{Z}^{\mathrm{KDE}}(N),{p}_{Z}^{\mathrm{GAUSS}}\right]$ averaging around $0.30-0.35$ , we can expect about one sixth of the probability mass of a Gaussian approximation not to overlap with the true distribution $p_{Z}$ .

The Beta approximations $\hat{p}_{Z}^{\mathrm{BETA}}$ and ${p}_{Z}^{\mathrm{BETA}}$ clearly offer a better solution. Even if the product of two Beta distributions $p_{Z}$ is not a Beta distribution, it is clear from these results that the shape of $p_{Z}$ is in general very close to a Beta pdf. Indeed, the expected values of $D\left[\hat{p}_{Z}^{\mathrm{KDE}}(N),\hat{p}_{Z}^{\mathrm{BETA}}\right]$ and $D\left[\hat{p}_{Z}^{\mathrm{KDE}}(N),{p}_{Z}^{\mathrm{BETA}}\right]$ indicate that $95\%$ of the mass of a Beta approximation and $p_{Z}$ overlap, with very limited variance.

The SL approximation $p_{Z}^{SL}$ also offers a good solution. The results of the simulation in which we started from opinions and of the one in which we started from Beta pdfs are extremely close. This offers a confirmation of the robustness of the transformations between the domain of opinions and the domain of pdfs. Overall, the expected value of $D\left[\hat{p}_{Z}^{\mathrm{KDE}}(N),p_{Z}^{SL}\right]$ suggests that the typical overlap between the mass of $p_{Z}^{SL}$ and $p_{Z}$ settles around $90\%$ , slightly worse than the Beta approximation. The high variance points to a strong case-by-case variability: in certain scenarios $p_{Z}^{SL}$ may provide a model as good as or better than $\hat{p}_{Z}^{\mathrm{BETA}}$ or ${p}_{Z}^{\mathrm{BETA}}$ , but in other instances its quality may degrade further.

Discussion

The results of our quantitative analysis agree with the qualitative study. A Gaussian approximation $\hat{p}_{Z}^{\mathrm{GAUSS}}$ or ${p}_{Z}^{\mathrm{GAUSS}}$ was shown to be a poor choice for modelling the product of two Beta distributions (and, for this reason, we will drop this approximation from the next simulations). Instead, the SL approximation $p_{Z}^{SL}$ and the Beta approximations $\hat{p}_{Z}^{\mathrm{BETA}}$ or ${p}_{Z}^{\mathrm{BETA}}$ are both good approximations, assuming that we can accept a difference between the true pdf and the approximation of up to $5\%-10\%$ of the probability mass. With limited computational resources, ${p}_{Z}^{\mathrm{BETA}}$ seems to be, on average, the best bet.

7.1.3 Limit-case Study

In this limit-case study, we consider the worst-case scenario considered in [5]. This study provides a way to enrich the previous study and reconnect this paper to it.

Protocol

We quantitatively analyze the case in which both opinions $\omega_{X}$ and $\omega_{Y}$ are degenerate Beta pdfs of the form $\mathtt{Beta}\left(1,1\right)$ with $a=\frac{1}{2}$ . To provide a quantitative analysis we follow the same protocol used in Section 7.1.2: first, we derive the MC approximation $\hat{p}_{Z}^{\mathrm{KDE}}$ (using the KDE algorithm), the SL approximation $p_{Z}^{SL}$ , the empirical (via MC) Beta approximation $\hat{p}_{Z}^{\mathrm{BETA}}$ and the analytic Beta approximation ${p}_{Z}^{\mathrm{BETA}}$ ; then, we compute the distance between the aforementioned distributions and the true pdf $p_{Z}$ , whose exact form, $-\log(z)$ , is given in [5].
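The protocol above can be sketched in a few lines of code. The following is a minimal illustration, not the authors' WebPPL implementation: all function names are hypothetical, the bandwidth uses the common rule-of-thumb variant $1.06\,\sigma N^{-1/5}$ (an assumption, since Equation 16 is not reproduced here), the distance is assumed to be the integral of the absolute difference between the two pdfs, and the sample sizes are far smaller than $N=10^{6}$ to keep the pure-Python KDE fast.

```python
import math
import random
import statistics


def sample_product_of_uniforms(n, rng):
    """Draw n samples of Z = X * Y with X, Y ~ Beta(1, 1), i.e. Uniform(0, 1)."""
    return [rng.random() * rng.random() for _ in range(n)]


def silverman_bandwidth(samples):
    """Rule-of-thumb bandwidth h = 1.06 * sigma * n^(-1/5) (assumed variant)."""
    return 1.06 * statistics.stdev(samples) * len(samples) ** (-0.2)


def gaussian_kde(samples, h):
    """Return a function that evaluates the Gaussian-kernel density estimate."""
    norm = 1.0 / (len(samples) * h * math.sqrt(2.0 * math.pi))

    def pdf(z):
        return norm * sum(math.exp(-0.5 * ((z - s) / h) ** 2) for s in samples)

    return pdf


def true_pdf(z):
    """Exact pdf of the product of two independent Uniform(0, 1) variables."""
    return -math.log(z)


def l1_distance(p, q, n_points, rng):
    """MC integration of |p - q| over points sampled uniformly in (0, 1]."""
    total = 0.0
    for _ in range(n_points):
        z = 1.0 - rng.random()  # lies in (0, 1], so log(z) is always defined
        total += abs(p(z) - q(z))
    return total / n_points


rng = random.Random(0)  # fixed seed so the sketch is reproducible
samples = sample_product_of_uniforms(2000, rng)
kde = gaussian_kde(samples, silverman_bandwidth(samples))
distance = l1_distance(true_pdf, kde, 200, rng)
print(f"estimated distance between -log(z) and the KDE: {distance:.3f}")
```

With so few samples the estimate is expected to be noisier than the values reported for $N=10^{6}$; the sketch is only meant to make the sequence of steps (sampling, KDE, MC integration of the distance) concrete.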

Results

Table 2 shows the evaluation of the distance $\hat{D}\left[p_{Z},\cdot\right]$ with respect to the true pdf $p_{Z}=-\log(z)$ , when performing MC simulations with $N=10^{6}$ points and using $10^{3}$ uniformly sampled points on the support $[0,1]$ to perform MC integration.

The results show that the distances achieved by the different approximations are separated by up to about one order of magnitude. The MC approximation is, as expected, very close to the true pdf $p_{Z}$, with a distance averaging around $9\cdot 10^{-3}$. This is higher than the theoretically computed value, likely due to the fact that we are evaluating a limit case; however, this difference is still small enough to allow a comparison with the other approximations. The Beta approximations have a slightly higher distance, around $3\cdot 10^{-2}$. Finally, the SL approximation has the highest distance, at around $2\cdot 10^{-1}$, meaning that about $90\%$ of the probability mass of $p_{Z}^{SL}$ overlaps with that of $p_{Z}$. All the results also show a high variance, which is caused by the difficulty in numerically approximating values near $0$, where the true pdf $p_{Z}=-\log(z)$ diverges.
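The conversion from a distance to an overlapping share of probability mass can be made explicit. The short derivation below assumes that $D\left[p,q\right]$ is the integral of the absolute difference between the two pdfs; this assumption is consistent with the figures quoted in the text, but the symbols $p$ and $q$ are used here only for illustration. Since $\min(p,q)=\frac{1}{2}\left(p+q-\left|p-q\right|\right)$ and both pdfs integrate to one,

\int\min\left(p(z),q(z)\right)dz=1-\frac{1}{2}\int\left|p(z)-q(z)\right|dz=1-\frac{D\left[p,q\right]}{2}.

For the SL approximation in Table 2 this gives

1-\frac{0.20793}{2}\approx 0.896,

that is, an overlap of about $90\%$; for the empirical Beta approximation the same computation gives $1-\frac{0.03070}{2}\approx 0.985$.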

	$\hat{D}\left[p_{Z},\cdot\right]$
KDE	0.00871 $\pm$ 0.01242
SL	0.20793 $\pm$ 0.93301
Beta (MC)	0.03070 $\pm$ 0.10018
Beta (Analytic)	0.03635 $\pm$ 0.14617

Table 2: Limit case study. Evaluation of the distance $\hat{D}\left[p_{Z},\cdot\right]$ when using a MC approximation $\hat{p}_{Z}^{\mathrm{MC}}$, a SL approximation $p_{Z}^{SL}$, an empirical Beta approximation $\hat{p}_{Z}^{\mathrm{BETA}}$, and an analytic Beta approximation ${p}_{Z}^{\mathrm{BETA}}$.

Discussion

The results are consistent with our previous results obtained in the quantitative analysis in Section 7.1.2 and they confirm that the scenario considered in [5], with two degenerate Beta pdfs of the form $\mathtt{Beta}\left(1,1\right)$ and $a=\frac{1}{2}$, is indeed a hard case for the SL approximation. The MC approximation performs better in modeling the true form of the pdf $-\log(z)$ and, compared to it, the SL approximation is more than one order of magnitude less precise in terms of integral distance (about $2\cdot 10^{-1}$ against $9\cdot 10^{-3}$ in Table 2). This simulation thus clearly highlights the cost in terms of accuracy that the computational simplicity of SL implies.

7.1.4 Multiple Products

In this last experimental section we consider the product of multiple opinions and we examine how the approximation error propagates.

Protocol

We quantitatively evaluate the product of multiple opinions $\omega_{X_{1}}$, $\omega_{X_{2}}$, …, $\omega_{X_{L}}$ by randomly sampling $L$ opinions and then defining the product $\omega_{Z}=\left(\left(\left(\omega_{X_{1}}\cdot\omega_{X_{2}}\right)\cdot\omega_{X_{3}}\right)\dots\cdot\omega_{X_{L}}\right)$ and $p_{Z}=p_{X_{1}}\cdot p_{X_{2}}\dots\cdot p_{X_{L}}$. The following analysis adopts the same protocol used in the quantitative simulations in Section 7.1.2 in order to compute the SL approximation $p_{Z}^{SL}$ and the analytic Beta approximation ${p}_{Z}^{\mathrm{BETA}}$, and then evaluate the integral distances $\hat{D}_{I}\left[p_{Z},\cdot\right]\hat{=}D\left[\hat{p}_{Z}^{\mathrm{KDE}}(N),\cdot\right]$, where now the final pdf over $Z$ is given by the product of multiple opinions. Notice that, while the convergence properties of the MC simulation remain the same, we may expect the precision of the SL approximation and the Beta approximation to degrade over multiple products as successive approximations accumulate.
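As an illustration of the quantities involved, the sketch below draws MC samples of a product of $L$ independent Beta variables and computes the Beta pdf matching the first two moments of the product. It relies on the fact that the raw moments of independent factors multiply, so the two moments are computed for the whole product at once. The function names and the Beta parameters are hypothetical choices made for this example, and the code is not taken from the authors' scripts.

```python
import random


def beta_moments(alpha, beta):
    """First and second raw moments of a Beta(alpha, beta) variable."""
    m1 = alpha / (alpha + beta)
    m2 = alpha * (alpha + 1.0) / ((alpha + beta) * (alpha + beta + 1.0))
    return m1, m2


def product_moments(params):
    """Exact first two raw moments of a product of independent Beta variables."""
    m1, m2 = 1.0, 1.0
    for alpha, beta in params:
        f1, f2 = beta_moments(alpha, beta)
        m1 *= f1  # E[XY] = E[X] E[Y] for independent X, Y
        m2 *= f2  # E[(XY)^2] = E[X^2] E[Y^2] for independent X, Y
    return m1, m2


def match_beta(m1, m2):
    """Parameters of the Beta pdf with the given first two raw moments."""
    var = m2 - m1 * m1
    strength = m1 * (1.0 - m1) / var - 1.0  # equals alpha + beta
    return m1 * strength, (1.0 - m1) * strength


def sample_product(params, n, rng):
    """MC samples of Z = X_1 * ... * X_L."""
    samples = []
    for _ in range(n):
        z = 1.0
        for alpha, beta in params:
            z *= rng.betavariate(alpha, beta)
        samples.append(z)
    return samples


rng = random.Random(1)  # fixed seed so the sketch is reproducible
params = [(2.0, 3.0), (4.0, 2.0), (3.0, 3.0)]  # illustrative values, L = 3
m1, m2 = product_moments(params)
alpha_z, beta_z = match_beta(m1, m2)
samples = sample_product(params, 20000, rng)
mc_mean = sum(samples) / len(samples)
print(f"exact mean {m1:.4f}, MC mean {mc_mean:.4f}")
print(f"moment-matched Beta({alpha_z:.3f}, {beta_z:.3f})")
```

The MC samples play the role of the input to the KDE, while the moment-matched Beta is one of the candidate approximations whose distance from the KDE is then evaluated.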

Results

Figure 13 shows the variation of the distances $D\left[\hat{p}_{Z}^{\mathrm{KDE}}(N),p_{Z}^{SL}\right]$ and $D\left[\hat{p}_{Z}^{\mathrm{KDE}}(N),{p}_{Z}^{\mathrm{BETA}}\right]$ as a function of the number $L$ of opinions that are multiplied together to determine $Z$. A slight increase in both distances may be observed as the number of factors increases from $2$ to $5$.

Discussion

The hypothesis that the degree of approximation of the SL operator and of the analytical Beta approximation degrades over multiple products because of the accumulation of approximation errors appears to be correct. As more factors are taken into consideration, both $p_{Z}^{SL}$ and ${p}_{Z}^{\mathrm{BETA}}$ slowly diverge from the true pdf $p_{Z}$. A modeler should be aware of these dynamics in case she were to use SL to approximate the product of multiple Beta random variables.

8 Case Study: Fusion of Beta Distributions

In this section, we further showcase the versatility of our framework by applying it to yet another subjective logic operator. We first provide the definition of fusion of Beta distributions; we then introduce the SL operator for fusion and we discuss its approximation and complexity; finally, we apply our framework to the problem of getting an approximation of fusion and we run empirical simulations to validate our theoretical results.

Fusion of Beta pdfs

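Given two independent Beta random variables $X$ and $Y$, their fusion $Z=X\circledcirc Y$ is defined as:

X\circledcirc Y=\frac{x\cdot y}{x\cdot y+(1-x)\cdot(1-y)},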

where $x$ and $y$ are realizations of the random variables $X$ and $Y$ . As in the case of the product of Beta random variables, determining the shape of $p_{Z}$ is a non-trivial problem. MC simulations offer a robust method to sample points from the probability distribution of $Z$ and to estimate $p_{Z}$ by moment-matching or kernel density estimation. Notice that, differently from the previous case study, we do not have an exact analytical solution for computing the first two moments of $p_{Z}$ [6].
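The MC route described above can be sketched as follows: sample pairs of realizations, apply the fusion formula, estimate mean and variance from the samples, and match a Beta pdf to them. This is a minimal illustration under stated assumptions: the function names and the Beta parameters are hypothetical and chosen only for the example, and the code is not taken from the authors' WebPPL scripts.

```python
import random
import statistics


def fuse(x, y):
    """Fusion of two realizations x and y, following the definition above."""
    return x * y / (x * y + (1.0 - x) * (1.0 - y))


def sample_fusion(alpha_x, beta_x, alpha_y, beta_y, n, rng):
    """MC samples of Z obtained by fusing independent Beta draws."""
    return [
        fuse(rng.betavariate(alpha_x, beta_x), rng.betavariate(alpha_y, beta_y))
        for _ in range(n)
    ]


def match_beta(mean, var):
    """Parameters of the Beta pdf with the given mean and variance."""
    strength = mean * (1.0 - mean) / var - 1.0  # equals alpha + beta
    return mean * strength, (1.0 - mean) * strength


rng = random.Random(2)  # fixed seed so the sketch is reproducible
samples = sample_fusion(2.0, 5.0, 3.0, 3.0, 20000, rng)  # illustrative parameters
mean = statistics.fmean(samples)
var = statistics.variance(samples)
alpha_z, beta_z = match_beta(mean, var)
print(f"MC mean {mean:.4f}, MC variance {var:.5f}")
print(f"empirical Beta approximation: Beta({alpha_z:.3f}, {beta_z:.3f})")
```

A quick sanity check on the formula is that fusing with $x=\frac{1}{2}$ leaves $y$ unchanged, since the factor $\frac{1}{2}$ cancels between numerator and denominator.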

Subjective logic fusion

A SL operator may be instantiated to compute an approximate fusion over two binomial opinions. Given two binomial opinions $\omega_{X}$ and $\omega_{Y}$ defined on the same domain $\Omega$ and with the same prior $a$ , we define the binomial opinion $\omega_{Z}$ resulting from the fusion $\omega_{X}\circledcirc\omega_{Y}$ as:

\omega_{Z}=\begin{cases}b_{Z}=\frac{m_{Z}s_{Z}-Wa}{s_{Z}}\\ d_{Z}=\frac{(1-m_{Z})s_{Z}-W(1-a)}{s_{Z}}\\ u_{Z}=\frac{W}{s_{Z}}\\ a_{Z}=a,\end{cases}

(39)

where

m_{Z}=\frac{b_{X}b_{Y}+b_{Y}au_{X}+b_{X}au_{Y}+a^{2}u_{X}u_{Y}}{2\left(b_{X}b_{Y}+b_{Y}au_{X}+b_{X}au_{Y}+a^{2}u_{X}u_{Y}\right)+1-b_{Y}-au_{Y}-b_{X}-au_{X}}

(40)

\begin{split}s_{Z}=&\max\left\{\frac{Wa}{m_{Z}},\frac{W(1-a)}{(1-m_{Z})},\right.\\ &\left.\frac{\left(\frac{W}{u_{X}}+1\right)\left(\frac{W}{u_{Y}}+1\right)\left(b_{X}-b_{X}^{2}-au_{X}-a^{2}u_{X}^{2}\right)\left(b_{Y}-b_{Y}^{2}-au_{Y}-a^{2}u_{Y}^{2}\right)}{m_{Z}(1-m_{Z})\left[\left(\frac{W}{u_{Y}}+1\right)\left(b_{Y}-b_{Y}^{2}-au_{Y}-a^{2}u_{Y}^{2}\right)+\left(\frac{W}{u_{X}}+1\right)\left(b_{X}-b_{X}^{2}-au_{X}-a^{2}u_{X}^{2}\right)\right]}-1\right\}.\end{split}

This formula expresses in the subjective logic formalism the moment-matching approximation of fusion defined in [6]. Practically, a fusion operator allows us to evaluate the aggregation of two different opinions on the same fact.
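Two properties of Equations 39 and 40 can be checked numerically. The sketch below assumes the standard SL projected probability $m=b+au$ and a non-informative prior weight $W=2$ (both are assumptions here, since neither is defined in this section); the function names and opinion values are hypothetical, and the strength $s_{Z}$ is passed in as a plain number rather than computed from the full maximum in the formula above.

```python
def projected_mean(b, u, a):
    """Projected probability m = b + a * u of a binomial opinion (assumed)."""
    return b + a * u


def fused_mean_expanded(bx, ux, by, uy, a):
    """m_Z written term by term, as in Equation 40."""
    num = bx * by + by * a * ux + bx * a * uy + a * a * ux * uy
    return num / (2.0 * num + 1.0 - by - a * uy - bx - a * ux)


def fused_mean_compact(bx, ux, by, uy, a):
    """The same quantity as the fusion of the two projected means."""
    mx = projected_mean(bx, ux, a)
    my = projected_mean(by, uy, a)
    return mx * my / (mx * my + (1.0 - mx) * (1.0 - my))


def min_strength(m, a, W=2.0):
    """Smallest s keeping b and d non-negative (first two terms of the max)."""
    return max(W * a / m, W * (1.0 - a) / (1.0 - m))


def opinion_from_mean(m, s, a, W=2.0):
    """Equation 39: opinion (b, d, u, a) from mean m and strength s."""
    b = (m * s - W * a) / s
    d = ((1.0 - m) * s - W * (1.0 - a)) / s
    u = W / s
    return b, d, u, a


# Illustrative opinions with a shared prior a = 0.5 (hypothetical values).
a = 0.5
m_z = fused_mean_expanded(0.6, 0.2, 0.3, 0.4, a)
s_z = 10.0  # hypothetical strength, chosen above min_strength(m_z, a)
b_z, d_z, u_z, a_z = opinion_from_mean(m_z, s_z, a)
print(f"m_Z = {m_z:.4f}, opinion = ({b_z:.4f}, {d_z:.4f}, {u_z:.4f}, {a_z})")
```

Under these assumptions, Equation 40 is the fusion formula applied to the projected means of the two opinions, and the opinion returned by Equation 39 sums to one and has projected mean $b_{Z}+au_{Z}=m_{Z}$ for any admissible $s_{Z}$.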

Approximating the fusion of Beta pdfs

Now, if we are interested in computing the fusion $Z=X\circledcirc Y$ , where $X$ and $Y$ are two independent Beta random variables, we may rely either on the SL approximation defined in Equation 39 or on approximations computed via MC simulations; as before we will consider the following approximations for the true pdf $p_{Z}$ : a SL pdf $p_{Z}^{SL}$ , a KDE estimation $\hat{p}_{Z}^{\mathrm{KDE}}$ , a Beta and a Gaussian moment-matching approximation $\hat{p}_{Z}^{\mathrm{BETA}}$ , $\hat{p}_{Z}^{\mathrm{GAUSS}}$ computed by evaluating mean and variance of $p_{Z}$ via MC integration. Differently from before, we do not consider an exact analytical moment-matching approximation ( ${p}_{Z}^{\mathrm{BETA}}$ or ${p}_{Z}^{\mathrm{GAUSS}}$ ) because no exact analytical formula exists. Notice, also, that since we used the moment-matching approximation provided in [6] to define the SL operator in Equation 39, our results on the degree of approximation of the SL operator immediately extend to Operator 1 defined in [6].

Figure 14 provides the concrete instantiation of the diagram in Figure 4 for the SL operator of fusion and illustrates the alternative between the path of MC simulations and SL approximation.

Figure 14: Approximations of the fusion of two Beta pdfs, $p_{X}$ and $p_{Y}$. MCS stands for MC sampling, MCI stands for MC integration, KDE stands for kernel-density estimation.

Computational complexity of the fusion operator

Despite the more involved expressions in Equations 39 and 40, the computational complexity of evaluating $\omega_{Z}$ is still constant with respect to the given opinions $\omega_{X}$ and $\omega_{Y}$; as such, the asymptotic complexity of the SL operator for fusion is $\mathcal{O}\left(1\right)$.

Approximation of the fusion operator

The degree of approximation of the SL operator for fusion is equivalent to the approximation of the moment-matching solution on which it is defined; [6] offers an evaluation of this approximation within the wider context of belief propagation in second-order Bayesian networks. In the following paragraphs, we will instead aim at estimating, in a more direct way, the approximation of the SL operator via the computation of the distance $\hat{D}_{I}\left[p_{Z},p_{Z}^{SL}\right]$.

Relating fusion and Monte Carlo approximations

Following the approach described in the previous section, we will compute a numerical estimation of the degree of approximation of the SL operator for fusion relying on the framework described in Section 6.

Once again, we need to check that the basic condition expressed in Equation 20 requiring the bias of $\hat{p}_{Z}^{\mathrm{KDE}}$ to be negligible is satisfied. Given that we will compute KDE using a Gaussian kernel with width defined by the Silverman rule (Equation 16), and given that the support of $Z$ is bounded on $\left[0,1\right]$ , we can again expect the bias of our KDE estimator to be in the order of $\left(10^{-1}N^{-\frac{1}{5}}\right)^{2}$ (Equation 38). Thus, the KDE bias may be considered negligible in the estimation of distances, if the distances we are considering are orders of magnitude greater than this bias. If this condition is met, then we can estimate the degree of approximation of fusion adopting the framework defined in Figure 5 and instantiated in Figure 15.
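To give a sense of the magnitudes involved, the bias bound can be evaluated for a concrete sample size. Taking $N=10^{6}$ purely as an illustrative value:

\left(10^{-1}N^{-\frac{1}{5}}\right)^{2}=\left(10^{-1}\cdot 10^{-\frac{6}{5}}\right)^{2}=10^{-4.4}\approx 4\cdot 10^{-5}.

A distance of the order of $10^{-1}$ would therefore be more than three orders of magnitude larger than this bias, which is the kind of separation the condition in Equation 20 requires.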

Figure 15: Evaluations of the distances between $p_{Z}$ and its approximations based on the assumption that $D\left[p_{Z},\hat{p}_{Z}^{\mathrm{KDE}}(N)\right]\approx 0$. MCS stands for MC sampling, MCI stands for MC integration, KDE stands for kernel-density estimation.

8.1 Empirical Evaluation

In this section we describe our experimental simulations for the evaluation of the degree of approximation of fusion. We will first provide a qualitative assessment of the SL approximation $p_{Z}^{SL}$ and the approximations generated via MC simulations; then, we will provide a quantitative statistical evaluation of the distance $D\left[p_{Z},\cdot\right]$ for the SL approximation and for Beta and Gaussian moment-matching approximations.

In our simulations we randomly generate opinions and pdfs using the same protocol defined in Section 7.1. These simulations are also run using the WebPPL probabilistic programming language [3] and the scripts are available online (https://github.com/FMZennaro/SLMC/Fusion).

8.1.1 Qualitative simulations

In the qualitative simulations we try to offer a first visual assessment of the quality of the approximation of $p_{Z}^{SL}$ .

Protocol

We follow the same protocol defined for the qualitative simulations in Section 7.1, just replacing the operations for binomial multiplication with the operations for fusion.

Results

Figures 16 and 17 offer a visual comparison of the approximations offered by $\hat{p}_{Z}^{\mathrm{MC}}$ and $p_{Z}^{SL}$ along with the Gaussian $\hat{p}_{Z}^{\mathrm{GAUSS}}$ and the Beta $\hat{p}_{Z}^{\mathrm{BETA}}$ moment-matching approximations computed via MC integration. While the pdf underlying the fusion of two random variables is not Beta distributed, both the SL approximation $p_{Z}^{SL}$ and the moment-matching approximation $\hat{p}_{Z}^{\mathrm{BETA}}$ seem to offer a good coarse approximation of the true distribution $p_{Z}$; both approximations match the true distribution very well in the interior of the support $[0,1]$, but they may show problems modelling the behaviour of $p_{Z}$ near the boundaries of the domain. As before, the Gaussian approximation $\hat{p}_{Z}^{\mathrm{GAUSS}}$ offers the worst match, as its shape is not suited to modelling a pdf on a bounded domain.

Discussion

This qualitative assessment suggests that the SL approximation $p_{Z}^{SL}$ may provide a good enough approximation of the true pdf $p_{Z}$ at a very low computational cost. The Beta moment-matching approximation evaluated computing mean and variance via MC simulations seems to behave in a very similar fashion; however, in this case where no exact analytical moment-matching approximation is possible, the cost of evaluating $\hat{p}_{Z}^{\mathrm{BETA}}$ corresponds to the cost of running the whole MC simulation. Last, the Gaussian moment-matching approximation $\hat{p}_{Z}^{\mathrm{GAUSS}}$ constitutes a sub-optimal choice, as its fit is worse than the alternative approximations and its computational cost is equivalent to the Beta moment-matching approximation.

8.1.2 Quantitative simulations

We now move to quantifying the gap between $p_{Z}$ and $p_{Z}^{SL}$ observed in the above simulations.

Protocol

We follow the same protocol defined for the quantitative simulations in Section 7.1, just replacing the operations for binomial multiplication with the operations for fusion. As before, Figure 7 offers a diagram of our experimental protocol.

Results

Figure 18 presents the estimation of the distances $\hat{D}\left[p_{Z},\cdot\right]$ as a function of the number of samples $N$ generated in the MC simulation. These statistics are computed performing $100$ repetitions and using $10^{3}$ uniformly sampled points on the support $[0,1]$ for MC integration.

In general, as in the previous simulation, we can make two preliminary observations: (i) all the distances show a stable trend, thus suggesting that our MC results are approaching their asymptotic limit; (ii) the order of magnitude of the distances is significantly greater than the KDE bias (Equation 38), thus meaning that this bias is negligible with respect to the distances.

Confirming the previous qualitative investigation, the distance between the true pdf $p_{Z}$ and the Gaussian approximation $\hat{p}_{Z}^{\mathrm{GAUSS}}$ reaches values as high as $0.5$, meaning that up to one fourth of the probability mass of $\hat{p}_{Z}^{\mathrm{GAUSS}}$ does not overlap with the true distribution $p_{Z}$.

The Beta approximation $\hat{p}_{Z}^{\mathrm{BETA}}$ models the true pdf $p_{Z}$ better, with a distance around $0.10-0.15$ , suggesting a good approximation in which $95\%$ of the mass of $\hat{p}_{Z}^{\mathrm{BETA}}$ overlaps with $p_{Z}$ .

Similarly, the SL approximation $p_{Z}^{SL}$ provides an equally good solution. The values of distance for $p_{Z}^{SL}$ are well within the range of the standard deviation of $\hat{p}_{Z}^{\mathrm{BETA}}$, suggesting that the degrees of approximation of SL and Beta moment-matching are very close.

Discussion

The results of this analysis agree with the previous qualitative simulations. Moreover, even if these results are quantitatively different, they are qualitatively in line with our study of the binomial product operator. The quantitative difference between the approximation of the binomial product and the fusion may likely be ascribed to the fact that it is easier to model the product of independent random variables than an arbitrary operation like fusion. From a qualitative point of view, though, the Gaussian approximation $\hat{p}_{Z}^{\mathrm{GAUSS}}$ again ranks last among the modeling options, while the SL approximation $p_{Z}^{SL}$ and the Beta approximation $\hat{p}_{Z}^{\mathrm{BETA}}$ offer better solutions, at a computational cost that is constant (in the case of SL) or linear in the number of MC samples (in the case of the Beta approximation).

9 Conclusion and Future Work

In this paper we studied the use of subjective logic as a framework for approximating operations over probability distributions. As in the case of any approximation, we considered SL operators from the perspective of the trade-off between the computational simplicity they guarantee and the precision they sacrifice. We proposed a protocol based on MC simulations to evaluate quantitatively this trade-off, estimating the distance between the SL approximation and a KDE estimation, under the assumption of a negligible bias between the KDE reconstruction and the true probability distribution.

We applied our protocol to the case study of the product and the fusion of two independent Beta distributions. The first case is relevant to fields like reliability analysis, while the second one is used in the field of subjective logic. In general, SL operators guarantee the preservation of the first moment, but do not strictly preserve higher moments or quantiles. To quantify the degree of approximation of the SL operators, we compared them with other standard approximations, such as moment-matching with a Gaussian pdf, moment-matching with a Beta pdf, and KDE via MC. Our simulations showed that, at the cost of accepting a difference between the SL approximation and the true pdf, SL offers a computationally efficient approximation. Both in the case of binomial products and in the case of fusion, the degree of approximation can be quantified as a mismatch between the SL approximation and the true pdf of up to $10\%$ of the probability mass. In general, the KDE approximation and the Beta approximation provided better estimates; KDE, however, has a computational cost that is quadratic in the number of samples generated via MC; moment-matching has a computational cost that can be constant and comparable to SL when the moments of interest can be computed analytically, or, otherwise, linear in the number of samples generated via MC.

In summary, it is possible to enjoy the computational efficiency and the interpretability of SL if the modeling scenario allows room for approximations up to the amount estimated using our protocol. The recommendation is that, were SL operators to be used to model critical systems (as in the case of reliability analysis or when higher-order moments are critical), this divergence between the true pdf and the SL approximation that we highlighted should be factored in the analysis.

Future work will aim at better characterizing the difference between true pdfs and SL approximations; in particular, at understanding how the mass is differently allocated with respect to the overall shape of the pdf and whether, for instance, these differences are more accentuated near the mode (assuming one exists) or around the tails. Depending on the way in which probability mass is misplaced in SL approximations, different forms of correction may then be considered.

Acknowledgment

The authors would like to express their gratitude to the anonymous reviewers who commented on the first version of this article and helped improve it.

References

[1] CA Coelho and RP Alberto. On the distribution of the product of independent beta random variables–applications. Technical report, CMA 12, 2012.
[2] Zoubin Ghahramani. Probabilistic machine learning and artificial intelligence. Nature, 521(7553):452, 2015.
[3] Noah D Goodman and Andreas Stuhlmüller. The design and implementation of probabilistic programming languages, 2014.
[4] A. Jøsang. Subjective Logic: A Formalism for Reasoning Under Uncertainty. Artificial Intelligence: Foundations, Theory, and Algorithms. Springer International Publishing, 2016.
[5] Audun Jøsang and David McAnally. Multiplication and comultiplication of beliefs. International Journal of Approximate Reasoning, 38(1):19–51, 2005.
[6] Lance Kaplan and Magdalena Ivanovska. Efficient belief propagation in second-order Bayesian networks for singly-connected graphs. International Journal of Approximate Reasoning, 93:132–152, 2018.
[7] Lance M Kaplan, Murat Şensoy, Yuqing Tang, Supriyo Chakraborty, Chatschik Bisdikian, and Geeth de Mel. Reasoning under uncertainty: Variations of subjective logic deduction. In Information Fusion (FUSION), 2013 16th International Conference on, pages 1910–1917. IEEE, 2013.
[8] David J.C. MacKay. Information theory, inference, and learning algorithms. Cambridge University Press, 2003.
[9] TG Pham and N Turkkan. Reliability of a standby system with beta-distributed component lives. IEEE Transactions on Reliability, 43(1):71–75, 1994.
[10] Thu Pham-Gia and Noyan Turkkan. The product and quotient of general beta distributions. Statistical Papers, 43(4):537–550, 2002.
[11] Jose C. Principe. Information theoretic learning: Rényi’s entropy and kernel perspectives. Springer, 2010.
[12] Cosma Shalizi. Advanced data analysis from an elementary point of view, 2013.
[13] Bharath K Sriperumbudur, Kenji Fukumizu, Arthur Gretton, Bernhard Schölkopf, and Gert RG Lanckriet. On integral probability metrics, $\phi$-divergences and binary classification. arXiv preprint arXiv:0901.2698, 2009.
[14] Jen Tang and AK Gupta. On the distribution of the product of independent beta random variables. Statistics & Probability Letters, 2(3):165–168, 1984.