Suppose \(X\) has \(f\) as pdf. For any realization \(x\) of \(X\), produce a random variable \(Y\) with conditional distribution \[Y\sim\text{Bernoulli}\left(\frac{(1-a)h(x)}{ag(x)+(1-a)h(x)}\right).\] Then \(X|(Y=0)\sim g\) and \(X|(Y=1)\sim h\), since \begin{align} p(x|0) &= \frac{p(0|x)p(x)}{\sum_{x'}p(0|x')p(x')} = \frac{\frac{ag(x)}{f(x)}f(x)}{\sum_{x'}\frac{ag(x')}{f(x')}f(x')}\\ &= \frac{ag(x)}{\sum_{x'}ag(x')} = \frac{a\cdot g(x)}{a\cdot1} = g(x)\\ p(x|1) &= \frac{p(1|x)p(x)}{\sum_{x'}p(1|x')p(x')} = \frac{\frac{(1-a)h(x)}{f(x)}f(x)}{\sum_{x'}\frac{(1-a)h(x')}{f(x')}f(x')}\\ &= \frac{(1-a)h(x)}{\sum_{x'}(1-a)h(x')} = \frac{(1-a)\cdot h(x)}{(1-a)\cdot1} = h(x). \end{align}
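The Bayes computation above can be checked exactly on a small discrete case. The following is only a sketch: it assumes \(f = ag + (1-a)h\) with \(0 < a < 1\), and the function name `split_mixture` and the particular values of `g`, `h` and `a` are illustrative choices, not part of the construction itself.

```python
from fractions import Fraction as F

def split_mixture(g, h, a):
    """Return (f, p(x|Y=0), p(x|Y=1)) for the mixture f = a*g + (1-a)*h.

    Assumes 0 < a < 1 so that both values of Y have positive probability.
    """
    f = [a * gx + (1 - a) * hx for gx, hx in zip(g, h)]
    # P(Y=1 | X=x) = (1-a) h(x) / f(x); points with f(x) = 0 never occur.
    p1 = [(1 - a) * hx / fx if fx else F(0) for hx, fx in zip(h, f)]
    # Bayes' rule: p(x|y) = p(y|x) f(x) / sum over x' of p(y|x') f(x')
    joint0 = [(1 - p) * fx for p, fx in zip(p1, f)]
    joint1 = [p * fx for p, fx in zip(p1, f)]
    post0 = [j / sum(joint0) for j in joint0]
    post1 = [j / sum(joint1) for j in joint1]
    return f, post0, post1

# Illustrative values (an assumption made for this sketch)
g = [F(2, 5), F(3, 5), F(0)]
h = [F(0), F(3, 5), F(2, 5)]
a = F(1, 2)

f, post0, post1 = split_mixture(g, h, a)
print(f)      # the mixture
print(post0)  # should reproduce g
print(post1)  # should reproduce h
```

Exact fractions are used so that the comparison with `g` and `h` is an equality rather than a floating-point approximation.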
The utility of this construction is as follows: It lets you turn a probabilistic experiment with definite outcomes (say the probabilities are \([0.2,\,0.6,\,0.2]\)) into an experiment with outcomes which are again probability distributions, such as \([0.4,\,0.6,\,0]\) and \([0,\,0.6,\,0.4]\), each with probability \(\frac12\). This blurs the distinction between deterministic variables (or often equivalently zero variance, zero entropy) and nondeterministic variables (positive variance, positive entropy).
In fact, it suggests that entropy can be calculated from the coefficients of an arbitrary convex combination. In the example just given, the entropy of \([0.2,\,0.6,\,0.2]\) with respect to \([0.4,\,0.6,\,0]\) and \([0,\,0.6,\,0.4]\) is the entropy of \([0.5,\,0.5]\) with respect to \([1,\,0]\) and \([0,\,1]\), i.e. (in base 2) \(\frac12\log2+\frac12\log2=1\).
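As a rough sketch of this idea, the entropy "with respect to" a set of component distributions can be computed from the mixing weights alone. The helper name `mixing_entropy` is hypothetical. Decomposing into point masses recovers the ordinary Shannon entropy, since then the weights are the probabilities themselves.

```python
import math

def mixing_entropy(weights):
    """Entropy in bits of the weights of a convex combination."""
    return -sum(w * math.log2(w) for w in weights if w > 0)

# [0.2, 0.6, 0.2] = 0.5*[0.4, 0.6, 0] + 0.5*[0, 0.6, 0.4]
two_part = mixing_entropy([0.5, 0.5])

# The same distribution written as a mixture of the point masses
# [1,0,0], [0,1,0], [0,0,1]: the weights are 0.2, 0.6, 0.2.
point_mass = mixing_entropy([0.2, 0.6, 0.2])

print(two_part, point_mass)
```

The two numbers differ, which illustrates that the value depends on the chosen decomposition and not only on the distribution being decomposed.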
And why not try affine combinations! \[[0.2,\,0.6,\,0.2] = 2\cdot[\frac13,\,\frac13,\,\frac13]-1\cdot[\frac7{15},\,\frac1{15},\,\frac7{15}].\] The associated entropy in base 2, taking the principal value of the logarithm so that \(\ln(-1)=\pi i\), is \[-2\log2+1\log(-1) = -2+\pi i\log e = -2 + 4.53236i.\]
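A small numerical check of both claims, offered as a sketch: it first verifies the affine combination with exact fractions, then evaluates \(-\sum_i c_i\log_2 c_i\) over the coefficients using the principal branch of the complex logarithm from Python's `cmath`. The function name `affine_entropy_bits` is made up for this illustration.

```python
import cmath
import math
from fractions import Fraction as F

coeffs = [2, -1]
parts = [
    [F(1, 3), F(1, 3), F(1, 3)],
    [F(7, 15), F(1, 15), F(7, 15)],
]

# Affine combination 2*parts[0] - 1*parts[1], computed exactly
combo = [sum(c * p[i] for c, p in zip(coeffs, parts)) for i in range(3)]

def affine_entropy_bits(cs):
    """-sum c*log2(c), with the principal branch of log for negative c."""
    return -sum(c * cmath.log(c) / math.log(2) for c in cs if c != 0)

value = affine_entropy_bits(coeffs)
print(combo)  # the three entries are 1/5, 3/5, 1/5
print(value)  # approximately -2 + 4.53236j
```

A different branch of the logarithm would shift the imaginary part by a multiple of \(2\pi\log_2 e\), so the value is tied to the principal-branch convention.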