
Entropic probability and context states

Benjamin Schumacher
Department of Physics, Kenyon College
Corresponding author: Department of Physics, Kenyon College, Gambier, OH 43022 USA. E-mail: schumacherb@kenyon.edu
   Michael D. Westmoreland
Department of Mathematics, Denison University
Abstract

In a previous paper, we introduced an axiomatic system for information thermodynamics, deriving an entropy function that includes both thermodynamic and information components. From this function we derived an entropic probability distribution for certain uniform collections of states. Here we extend the concept of entropic probability to more general collections, augmenting the states by reservoir and context states. This leads to an abstract concept of free energy and establishes a relation between free energy, information erasure, and generalized work.

1 Introduction

In [1], we developed an axiomatic system for thermodynamics that incorporated information as a fundamental concept. This system was inspired by previous axiomatic approaches [2, 3] and discussions of Maxwell’s demon [4, 5]. The basic concept of our system is the eidostate, which is a collection of possible states from the point of view of some agent. A review of our axioms and a few of their consequences can be found in the Appendix. The axioms imply the existence of additive conserved quantities called components of content and an entropy function $\mathbb{S}$ that identifies reversible and irreversible processes. The entropy includes both thermodynamic and information components.

One of the surprising things about this axiomatic system is that, despite the absence of probabilistic ideas in the axioms, a concept of probability emerges from the entropy $\mathbb{S}$. If state $e$ is an element of a uniform eidostate $E$, then we can define

$$P(e|E)=\frac{2^{\mathbb{S}(e)}}{2^{\mathbb{S}(E)}}. \tag{1}$$

States in $E$ with higher entropy are assigned higher probability. As we will review below, this distribution has a uniquely simple relationship to the entropies of the individual states and the overall eidostate $E$.
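For readers who want to experiment, here is a minimal Python sketch of Equation 1 (the function name and the sample entropy values are our own, chosen purely for illustration); it uses the normalization $2^{\mathbb{S}(E)}=\sum_{e\in E}2^{\mathbb{S}(e)}$, which we review in Section 3:

```python
import math

def entropic_probability(state_entropies):
    """Entropic probabilities P(e|E) = 2^S(e) / 2^S(E), where
    2^S(E) = sum over e of 2^S(e) for a uniform eidostate E."""
    total = sum(2.0 ** s for s in state_entropies)  # this is 2^S(E)
    return [2.0 ** s / total for s in state_entropies]

# A hypothetical eidostate with three states of entropy 0, 1, and 1 bit:
probs = entropic_probability([0.0, 1.0, 1.0])
print(probs)                          # [0.2, 0.4, 0.4]
print(math.isclose(sum(probs), 1.0))  # True
```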

The emergence of an entropic probability distribution motivates us to ask several questions. Can this idea be extended beyond uniform eidostates? Can we interpret an arbitrary probability distribution over a set of states as an entropic distribution within a wider context? What does the entropic probability tell us about probabilistic processes affecting the states within our axiomatic system? In this paper we will address these questions.

2 Coin-and-box model

We first review a few of the ideas of the system in [1] by introducing a simple model of the axioms. None of our later results depend on this model, but a definite example will be convenient for explanatory purposes. Our theory deals with configurations of coins and boxes; as we will see below, the states are arrangements of coins, memory records, and closed boxes containing coins. States are combined together using the $+$ operation, which simply stands for ordered pairing of two states. If $a\neq b$, $a+b$ is not the same as $b+a$, and $a+(a+a)$ is distinct from $(a+a)+a$. Thus, the combination operation $+$ is neither commutative nor associative.

We construct our states from some elementary pieces:

  • Coin states, which can be either $h$ (heads) or $t$ (tails), or combinations of these. It is also convenient to define a stack state $s_{n}$ to be a particular combination of $n$ coin states $h$: $s_{n}=h+(h+(h+\cdots))$. The coin value $Q$ of a compound of coin states is just the total number of coins involved. A finite set $K$ of coin states is said to be $Q$-uniform if every element has the same $Q$-value.

  • Record states $r$. As the name suggests, these should be interpreted as specific values in some available memory register. The combination of two record states is another record state. Thus, $r$, $r+r$, $r+(r+r)$, etc., are all distinct record states. Record states are not coins, so $Q(r)=0$.

  • Box states. For any $Q$-uniform set of coin states $K$, there is a sequence of box states $b^{K}_{n}$. Intuitively, such a state represents a kind of closed box containing $nQ(K)$ coins, so that $Q\left(b^{K}_{n}\right)=nQ(K)$. If $K=\{h,t\}$, then we denote the corresponding “basic” box states by $b_{n}$.

An eidostate is any finite, non-empty, $Q$-uniform set of states. The $+$ operation on eidostates is just the Cartesian product of the sets, and always yields another eidostate. For convenience, we identify the state $s$ with the singleton eidostate $\{s\}$.

We now must define the relation $\rightarrow$, which tells us which states can be transformed into which other states. We will first give some elementary relations:

  • Two eidostates are similar (written $A\sim B$) if they are composed of the same Cartesian factors, perhaps combined in a different way. If $A\sim B$, then $A\leftrightarrow B$. (The notation $\leftrightarrow$ means $A\rightarrow B$ and $B\rightarrow A$.) As far as the $\rightarrow$ relation is concerned, we can freely rearrange the “pieces” in a compound eidostate.

  • For coin states, $h\leftrightarrow t$.

  • If $r$ is a record state, $a+r\leftrightarrow a$ for any $a$. In a similar way, for an empty box state, $a+b^{K}_{0}\leftrightarrow a$.

  • If $K$ is a $Q$-uniform eidostate of coin states, $b^{K}_{n}+K\leftrightarrow b^{K}_{n+1}$.

Now we add some rules that allow us to extend these to more complex situations. In what follows, $A$, $A^{\prime}$, $B$, etc., are eidostates, and $s$ is a state.

Transitivity.

If $A\rightarrow B$ and $B\rightarrow C$, then $A\rightarrow C$.

Augmentation.

If $A\rightarrow B$, then $A+C\rightarrow B+C$.

Cancelation.

If $A+s\rightarrow B+s$, then $A\rightarrow B$.

Subset.

If $A\rightarrow s$ and $A^{\prime}\subseteq A$, then $A^{\prime}\rightarrow s$.

Disjoint union.

If $A$ and $B$ are both disjoint unions, $A=A_{1}\cup A_{2}$ and $B=B_{1}\cup B_{2}$, and both $A_{1}\rightarrow B_{1}$ and $A_{2}\rightarrow B_{2}$, then $A\rightarrow B$.

Using these rules we can prove a lot of $\rightarrow$ relations. For example, for a basic box state we have $b_{n}+\{h,t\}\leftrightarrow b_{n+1}$. From the subset rule we have $b_{n}+h\rightarrow b_{n+1}$ (but not the reverse). Then we can say,

$$b_{n}+h\rightarrow b_{n+1}\rightarrow b_{n}+\{h,t\}, \tag{2}$$

from which we can conclude (via transitivity and cancelation) that $h\rightarrow\{h,t\}$. The use of a basic box allows us to “randomize” the state of one coin.

Or consider two coin states and distinct record states $r_{0}$ and $r_{1}$. Then

$$h\rightarrow h+r_{0}\qquad\mbox{and}\qquad t\rightarrow t+r_{1}\rightarrow h+r_{1}, \tag{3}$$

from which we can show that $\{h,t\}\rightarrow h+\{r_{0},r_{1}\}$. That is, we can set an unknown coin state to $h$, provided we also make a record of which state it was. A similar argument establishes the following:

$$(h+\{r_{0},r_{1}\})+b_{n}\rightarrow\{h+r_{0},\,t+r_{1}\}+b_{n}\rightarrow(\{h,t\}+r_{0})+b_{n}\rightarrow\{h,t\}+b_{n}\rightarrow b_{n+1}. \tag{4}$$

The eidostate $\{r_{0},r_{1}\}$ (called a bit state) can be deleted at the cost of a coin absorbed by the basic box. The basic box is a coin-operated deletion device; and since each step above is reversible, we can also use it to dispense a coin together with a bit state (that is, an unknown bit in a memory register).

These examples help us to clarify an important distinction. What is the difference between the box state $b_{1}$ and the eidostate $\{h,t\}$? Could we simply replace all box states $b^{K}_{n}$ with a simple combination $K+(K+\cdots)$ of possible coin eidostates? We cannot, because such a replacement would preclude us from using the subset rule to obtain Equation 2. The whole point of the box state is that the detailed state of its contents is entirely inaccessible for determining possible processes. Putting a coin in a box effectively randomizes it.

It is not difficult to show that our model satisfies all of the axioms presented in the Appendix, with the mechanical states in $\mathscr{M}$ identified as coin states. The key idea in the proof is that we can reversibly reduce any eidostate to one with a special form:

$$A\leftrightarrow s_{q}+I_{k}, \tag{5}$$

where $s_{q}$ is a stack state of $q=Q(A)$ coins and $I_{k}$ is an information state containing $k$ possible record states. Relations between eidostates are thus reduced to relations between states of this form. We note that the coin value $q$ is conserved in every $\rightarrow$ relation, and no relation allows us to decrease the value of $k$. In our model, there is just one independent component of content ($Q$ itself), and the entropy function is $\mathbb{S}(A)=\log k$. (We use base-2 logarithms throughout.)
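The canonical form of Equation 5 suggests a compact bookkeeping picture of the model. The Python sketch below is our own illustration, not part of the formal development: it represents a reduced eidostate by the pair $(q,k)$ and tests the criterion just stated, namely that a transformation must conserve $q$ and must not decrease $k$. (It uses the entropy $\mathbb{S}(b_{n})=n$ of a basic box state, computed in Section 4.)

```python
from typing import NamedTuple

class Reduced(NamedTuple):
    """Canonical form s_q + I_k of an eidostate."""
    q: int  # conserved coin value Q(A)
    k: int  # number of possible record states; entropy S(A) = log2(k)

def arrow_possible(a: Reduced, b: Reduced) -> bool:
    """A -> B iff the coin value is conserved and k does not decrease
    (equivalently, Q(A) = Q(B) and S(A) <= S(B))."""
    return a.q == b.q and a.k <= b.k

# Coin-operated deletion (Equation 4): (h + {r0, r1}) + b_n <-> b_{n+1}.
# With n = 3, the left side has Q = 1 + n and S = 1 + n (a bit state of
# entropy 1 plus a box of entropy n), so k = 2^(n+1); the right side has
# Q = n + 1 and S(b_{n+1}) = n + 1, so k = 2^(n+1) as well.
n = 3
left = Reduced(q=1 + n, k=2 ** (n + 1))
right = Reduced(q=n + 1, k=2 ** (n + 1))
print(arrow_possible(left, right), arrow_possible(right, left))  # True True
```

Since both directions are possible, the deletion process is reversible, as claimed above.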

3 The entropy formula and entropic probability

Now let us return to the general axiomatic system. A uniform eidostate $E$ is one for which, given any two states $e,f\in E$, either $e\rightarrow f$ or $f\rightarrow e$. (We may write this disjunction as $e\rightleftharpoons f$.) The set of all uniform eidostates is called $\mathcal{U}$. Then the axioms imply the following theorem (Theorem 8 in [1]):

Theorem.

There exist an entropy function $\mathbb{S}$ and a set of components of content $Q$ on $\mathcal{U}$ with the following properties:

(a) For any $E,F\in\mathcal{U}$, $\mathbb{S}(E+F)=\mathbb{S}(E)+\mathbb{S}(F)$.

(b) For any $E,F\in\mathcal{U}$ and component of content $Q$, $Q(E+F)=Q(E)+Q(F)$.

(c) For any $E,F\in\mathcal{U}$, $E\rightarrow F$ if and only if $\mathbb{S}(E)\leq\mathbb{S}(F)$ and $Q(E)=Q(F)$ for every component of content $Q$.

(d) $\mathbb{S}(m)=0$ for all $m\in\mathscr{M}$.

The entropy function $\mathbb{S}$ is determined (up to a non-mechanical component of content) by the $\rightarrow$ relations among the eidostates.

We can compute the entropy of a uniform eidostate $E$ in terms of the entropies of its elements $e$:

$$\mathbb{S}(E)=\log\left(\sum_{e\in E}2^{\mathbb{S}(e)}\right). \tag{6}$$

It is this equation that motivates our definition of the entropic probability of $e$ within the eidostate $E$:

$$P(e|E)=\frac{2^{\mathbb{S}(e)}}{2^{\mathbb{S}(E)}}. \tag{7}$$

Then $P(e|E)\geq 0$ and the probabilities sum over $E$ to 1. As we have mentioned, the entropy function $\mathbb{S}$ may not be quite unique; nevertheless, two different admissible entropy functions lead to the same entropic probability distribution. Even better, our definition gives us a very suggestive formula for the entropy of $E$:

$$\mathbb{S}(E)=\sum_{e\in E}P(e|E)\,\mathbb{S}(e)-\sum_{e\in E}P(e|E)\log P(e|E) \tag{8}$$

$$\phantom{\mathbb{S}(E)}=\bigl\langle\mathbb{S}(e)\bigr\rangle+H(\vec{P}), \tag{9}$$

where the mean $\langle\cdots\rangle$ is taken with respect to the entropic probability, and $H(\vec{P})$ is the Shannon entropy of the distribution $\vec{P}$ [6, 7].

Equation 9 is very special. If we choose an arbitrary distribution (say $P^{\prime}(e|E)$), then with respect to this probability we find

$$\mathbb{S}(E)\geq\bigl\langle\mathbb{S}(e)\bigr\rangle_{P^{\prime}}+H(\vec{P}^{\prime}), \tag{10}$$

with equality if and only if $P^{\prime}$ is the entropic distribution [7]. Therefore we might define the entropic probability to be the distribution that maximizes the sum of the average state entropy and the Shannon entropy: a kind of “maximum entropy” characterization.
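This maximization property is easy to probe numerically. In the Python sketch below (the state entropies and helper names are our own, chosen for illustration), the entropic distribution attains $\mathbb{S}(E)$ exactly, while randomly chosen competitors fall short, in line with Equations 9 and 10:

```python
import math, random

def objective(entropies, probs):
    """Average state entropy plus Shannon entropy: <S>_P' + H(P')."""
    avg_s = sum(p * s for p, s in zip(probs, entropies))
    shannon = -sum(p * math.log2(p) for p in probs if p > 0)
    return avg_s + shannon

entropies = [0.0, 1.0, 2.5]             # arbitrary illustrative values
total = sum(2.0 ** s for s in entropies)
entropic = [2.0 ** s / total for s in entropies]
s_E = math.log2(total)                  # S(E), per Equation 6

assert math.isclose(objective(entropies, entropic), s_E)
for _ in range(1000):                   # random competing distributions
    w = [random.random() for _ in entropies]
    p = [x / sum(w) for x in w]
    assert objective(entropies, p) <= s_E + 1e-9
print("entropic distribution attains the maximum, S(E) =", s_E)
```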

4 Uniformization

A unique entropic probability rule arises from our $\rightarrow$ relations among eidostates, which in the real world might summarize empirical data about possible state transformations. But so far, this entropic probability distribution $P(e|E)$ is only defined within a uniform eidostate $E$.

In part this makes sense. An eidostate represents the knowledge of an agent: the state must be one of those included in the set. This is the knowledge upon which the agent will assign probabilities, which is why we have indicated the eidostate $E$ as the condition for the distribution. Furthermore, these might be the only eidostates, since the axioms themselves do not guarantee that any non-uniform eidostates exist. (Some models of the axioms have them, and some do not.) But can we generalize the probabilities to distributions over non-uniform collections of states?

Suppose $A=\{a,a^{\prime},\ldots\}$ is a finite set of states, possibly not uniform. Then we say that $A$ is uniformizable if there exists a uniform eidostate $\hat{A}=\{a+m_{a},\,a^{\prime}+m_{a^{\prime}},\ldots\}$, where the states $m_{a}$ are mechanical states in $\mathscr{M}$. The idea is that the states in $A$, which vary in their components of content, can be extended by mechanical states that “even out” these variations. Since $\hat{A}$ is uniform, $a+m_{a}\rightleftharpoons a^{\prime}+m_{a^{\prime}}$ for any $a,a^{\prime}\in A$. The abstract process $\left\langle a,a^{\prime}\right\rangle$ is said to be adiabatically possible [2]. Mechanical states have $\mathbb{S}(m_{a})=0$, so the entropy of the extended $\hat{A}$ is just

$$\mathbb{S}(\hat{A})=\log\left(\sum_{a\in A}2^{\mathbb{S}(a)}\right), \tag{11}$$

which is independent of our choice of the uniformizing mechanical states.

What is the significance of this entropy? Suppose $A$ and $B$ are not themselves uniform, but their union $A\cup B$ is uniformizable. Then we may construct uniform eidostates $\hat{A}=\{a+m_{a},\ldots\}$ and $\hat{B}=\{b+m_{b},\ldots\}$ such that either $\hat{A}\rightarrow\hat{B}$ or $\hat{B}\rightarrow\hat{A}$, depending on whether $\mathbb{S}(\hat{A})\leq\mathbb{S}(\hat{B})$ or the reverse. In short, the entropies of the extended eidostates determine whether the set of states $A$ can be turned into the set $B$, if we imagine that these states can be augmented by mechanical states, embedding them in a larger, uniform context.

Given the entropy of the extended state, we can define

$$P(a|A)=P(a+m_{a}|\hat{A})=\frac{2^{\mathbb{S}(a)}}{2^{\mathbb{S}(\hat{A})}}. \tag{12}$$

This extends the entropic probability to the uniformizable set AA.
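A short Python sketch of Equations 11 and 12 (the function name is ours); the box example that follows below can serve as a check:

```python
import math

def uniformized(entropies):
    """Entropy S(A-hat) and entropic probabilities for a uniformizable
    set A, given the entropies S(a) of its elements (Equations 11-12).
    The mechanical uniformizing states contribute zero entropy."""
    total = sum(2.0 ** s for s in entropies)
    return math.log2(total), [2.0 ** s / total for s in entropies]

# B = {b_n, b_{n+1}} with S(b_n) = n, here with n = 4:
n = 4
S_hat, probs = uniformized([n, n + 1])
print(S_hat)   # n + log2(3), about 5.585
print(probs)   # [1/3, 2/3]
```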

Let us consider an example from our coin-and-box model. We start out with the non-uniform set $B=\{b_{n},b_{n+1}\}$. These two basic box states have different numbers of coins. But we can uniformize this set by adding stack states, so that $\hat{B}=\{b_{n}+s_{m+1},\,b_{n+1}+s_{m}\}$ is a uniform eidostate. The entropy of a basic box state is $\mathbb{S}(b_{n})=n$, so we have

$$\mathbb{S}(\hat{B})=\log\left(2^{n}+2^{n+1}\right)=\log\left(3\cdot 2^{n}\right)=n+\log 3. \tag{13}$$

The entropic probabilities are thus

$$P(b_{n}|B)=\frac{1}{3}\qquad\mbox{and}\qquad P(b_{n+1}|B)=\frac{2}{3}. \tag{14}$$

5 Reservoir states

So far, we have uniformized a non-uniform set $A$ by augmenting its elements with mechanical states, which act as a sort of “reservoir” of components of content. These mechanical states have no entropy of their own. But we can also consider a procedure in which the augmenting states act more like the states of a thermal reservoir in conventional thermodynamics.

We begin with a mechanical state $\mu$, and posit a sequence of reservoir states $\theta_{n}$, which have the following properties.

  • For any $n$, $\theta_{n}+\mu\rightarrow\theta_{n+1}$.

  • $\theta_{k}+\theta_{l}\leftrightarrow\theta_{m}+\theta_{n}$ if and only if $k+l=m+n$.

The reservoir states $\theta_{n}$ form a ladder. We can ascend one rung of the ladder by “dissolving” the mechanical state $\mu$ into the reservoir. If we have more than one reservoir, we can ascend one ladder provided we descend another by the same number of rungs.

For any $n$ and $m$, we have $\mathbb{S}(\theta_{n})+\mathbb{S}(\theta_{m+1})=\mathbb{S}(\theta_{n+1})+\mathbb{S}(\theta_{m})$, so that

$$\sigma=\mathbb{S}(\theta_{n+1})-\mathbb{S}(\theta_{n})=\mathbb{S}(\theta_{m+1})-\mathbb{S}(\theta_{m}) \tag{15}$$

is a non-negative constant for the particular sequence of reservoir states. This sequence $\{\theta_{n}\}$ is characterized by the state $\mu$ and the entropy increment $\sigma$. Note that we can write $\mathbb{S}(\theta_{n})=n\sigma+S_{0}$, where $S_{0}=\mathbb{S}(\theta_{0})$.

For example, in our coin-and-box model, the basic box states $b_{n}$ act as a sequence of reservoir states with a mechanical (coin) state $\mu=h$ and an entropy increment $\sigma=\log 2=1$. The more general box states $b^{K}_{n}$ form a reservoir state sequence with $\mu=s_{q}$ and $\sigma=\log k$, where $q=Q(K)$ and $k$ is the number of states in $K$. For each of these box-state reservoir sequences, $S_{0}=\mathbb{S}(\theta_{0})=0$.

One particular type of reservoir is a mechanical reservoir consisting of the states $\mu$, $\mu+\mu$, $\mu+(\mu+\mu)$, etc. We denote the $n$th such state by $\mu_{n}$. For the $\mu_{n}$ reservoir states, $\sigma=0$. If a finite set of states $A=\{a,a^{\prime},\ldots\}$ can be uniformized by the addition of the $\mu_{n}$ states, it can also be uniformized by a corresponding set of non-mechanical reservoir states $\theta_{n}$:

$$\hat{A}=\{a+\theta_{n_{a}},\,a^{\prime}+\theta_{n_{a^{\prime}}},\ldots\}. \tag{16}$$

As before, we can find the entropy of this uniform eidostate and define entropic probabilities. But the $\theta$ reservoir states now contribute to the entropy and affect the probabilities.

First, the entropy:

$$\mathbb{S}(\hat{A})=\log\left(\sum_{a\in A}2^{\mathbb{S}(a)+\mathbb{S}(\theta_{n_{a}})}\right)=S_{0}+\log\left(\sum_{a\in A}2^{\mathbb{S}(a)}2^{n_{a}\sigma}\right). \tag{17}$$

The entropic probability, which now depends on the choice of reservoir states, is

$$P_{\theta}(a|A)=P(a+\theta_{n_{a}}|\hat{A})=\frac{2^{\mathbb{S}(a)+n_{a}\sigma}}{\displaystyle\sum_{a^{\prime}\in A}2^{\mathbb{S}(a^{\prime})+n_{a^{\prime}}\sigma}}. \tag{18}$$

The reservoir states affect the relative probabilities of the states. For example, suppose $\mathbb{S}(a)=\mathbb{S}(a^{\prime})$ for a pair of states in $A$. We might naively think that these states would end up with the same entropic probability, as they would if we uniformized $A$ by mechanical states. But since we are uniformizing using the $\theta$ reservoir states, it may be that $\theta_{n_{a}}$ and $\theta_{n_{a^{\prime}}}$ have different entropies. Then the ratio of the probabilities is

$$\frac{P_{\theta}(a|A)}{P_{\theta}(a^{\prime}|A)}=\frac{2^{n_{a}\sigma}}{2^{n_{a^{\prime}}\sigma}}=2^{(n_{a}-n_{a^{\prime}})\sigma}, \tag{19}$$

which may be very different from 1.

Again, let us consider our coin-and-box model. We begin with the non-uniform set $A=\{h,t,h+t\}$. Each of these states has the same entropy $\mathbb{S}$, that is, zero. We choose to uniformize using basic box states $b_{n}$. For instance, we might have

$$\hat{A}=\{h+b_{1},\,t+b_{1},\,(h+t)+b_{0}\}. \tag{20}$$

Recalling that $\sigma=1$, the entropy is

$$\mathbb{S}(\hat{A})=\log\left(2^{1}+2^{1}+2^{0}\right)=\log 5. \tag{21}$$

This yields probabilities

$$P_{b}(h|A)=\frac{2}{5}\qquad P_{b}(t|A)=\frac{2}{5}\qquad P_{b}(h+t|A)=\frac{1}{5}. \tag{22}$$
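The same numbers fall out of Equation 18 directly, as in this short Python sketch (the function name is ours):

```python
def reservoir_probabilities(state_entropies, ladder_indices, sigma):
    """Entropic probabilities P_theta(a|A) of Equation 18: each state a
    is weighted by 2^(S(a) + n_a * sigma)."""
    weights = [2.0 ** (s + n * sigma)
               for s, n in zip(state_entropies, ladder_indices)]
    total = sum(weights)
    return [w / total for w in weights]

# A = {h, t, h+t}, all with S = 0, uniformized by basic boxes (sigma = 1)
# with ladder indices 1, 1, 0 as in Equation 20:
print(reservoir_probabilities([0, 0, 0], [1, 1, 0], sigma=1.0))
# [0.4, 0.4, 0.2], i.e., 2/5, 2/5, 1/5
```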

As an illustration of these ideas, consider the version of Maxwell’s demon shown in Figure 1.

Figure 1: A simple Maxwell’s demon.

The demon is a reversible computer with an initial memory state $r_{0}$. It is equipped with a reversible battery for storing energy, initially in mechanical state $m_{0}$. The demon interacts with a one-particle “Szilard” gas, in which the single particle can move freely within its volume (state $s_{0}$). The gas is maintained in thermal equilibrium with a heat reservoir, whose initial state is $\theta_{0}$. We might denote the overall initial state by $((r_{0}+m_{0})+s_{0})+\theta_{0}$.

Now the demon introduces a partition into the gas, separating the enclosure into unequal subvolumes, as in Figure 2. The two resulting states are $s_{a}$ and $s_{b}$, which are not equally probable. The probabilities here are entropic probabilities due to the difference in entropy of $s_{a}$ and $s_{b}$.

Figure 2: Extraction of work by dividing the gas enclosure into unequal volumes.

Now the demon records the location of the particle in its memory and uses this record to control the isothermal expansion of the one-particle gas. The work is stored in the battery. At the end of this process, the demon retains its memory record, and the battery is in one of two mechanical states, $m_{a}$ or $m_{b}$. The gas is again in state $s_{0}$. But different amounts of heat have been extracted from the reservoir during the expansion, so the reservoir has two different possible states, $\theta_{a}$ and $\theta_{b}$.

The overall final eidostate might be represented as

$$F=\{((r_{a}+m_{a})+s_{0})+\theta_{a},\,((r_{b}+m_{b})+s_{0})+\theta_{b}\}. \tag{23}$$

The states of the demon and the gas, $(r_{a}+m_{a})+s_{0}$ and $(r_{b}+m_{b})+s_{0}$, have different energies and the same entropy. It is the reservoir states $\theta_{a}$ and $\theta_{b}$ that (1) make $F$ uniform (constant energy), and (2) introduce the entropy differences leading to different entropic probabilities for the two states.

A conventional view would suppose that the unequal probabilities for the two final demon states come from their history, that is, that the probabilities are inherited from the unequal partition of the gas. In the entropic view, the unequal probabilities are due to differences in the environment of the demon, represented by the different reservoir states $\theta_{a}$ and $\theta_{b}$. The environment, in effect, serves as the “memory” of the history of the process.

6 Context states

When we uniformize a non-uniform set $A$ by means of a sequence of reservoir states, the reservoir states affect the entropic probabilities. We can use this idea more generally.

For example, in our coin-and-box model, suppose we flip a coin but do not know how it lands. This might be represented by the eidostate $F=\{h,t\}$. Without further information, we would assign the coin states equal probability 1/2, which is the simple entropic probability. But suppose we have additional information about the situation that would lead us to assign probabilities 1/3 and 2/3 to the coin states. This additional information, this context, must be reflected in the eidostate. The example in Equation 14 tells us that the following does the job:

$$\hat{F}=\{h+(b_{n}+s_{m+1}),\,t+(b_{n+1}+s_{m})\}. \tag{24}$$

The extended coin-flip state $\hat{F}$ includes extra context, so that the entropic probability reflects our additional information.

In general, we can adjust our entropic probabilities by incorporating context states. Suppose we have a uniform eidostate $E=\{e_{1},e_{2},\ldots\}$, but we wish to specify a particular non-entropic distribution $p_{k}$ over these states. Then for each $e_{k}$ we introduce an eidostate $C_{k}$, leading to an extended eidostate

$$\hat{E}=\bigcup_{k}\left(e_{k}+C_{k}\right), \tag{25}$$

which we assume is uniform. The $C_{k}$’s are the context states. Our challenge is to find a set of context states so that the entropic probability in $\hat{E}$ equals the desired distribution $p_{k}$.

We cannot always do this exactly, but we can always approximate it as closely as we like. First, we note that we can always choose our context eidostates to be information states. The information state $I_{n}$ containing $n$ record states has entropy $\log n$. Now for each $k$, we closely approximate the ratio $p_{k}/2^{\mathbb{S}(e_{k})}$ by a rational number; and since there are finitely many of these numbers, we can represent them using a common denominator $N$. In our approximation,

$$\frac{p_{k}}{2^{\mathbb{S}(e_{k})}}=\frac{n_{k}}{N}. \tag{26}$$

Now choose $C_{k}=I_{n_{k}}$ for each $k$. The entropy of $\hat{E}$ becomes

$$\mathbb{S}(\hat{E})=\log\left(\sum_{k}2^{\mathbb{S}(e_{k})+\log n_{k}}\right)=\log\left(\sum_{k}n_{k}2^{\mathbb{S}(e_{k})}\right)=\log\left(\sum_{k}p_{k}N\right)=\log N. \tag{27}$$

From this, we find that the entropic probability is

$$P(e_{k}+I_{n_{k}}|\hat{E})=\frac{n_{k}2^{\mathbb{S}(e_{k})}}{N}=p_{k}, \tag{28}$$

as desired.
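The construction in Equations 26 through 28 is essentially a rational-approximation scheme, and it is easy to automate. The Python sketch below is our own illustration (the helper name and the sample target distribution are hypothetical; math.lcm requires Python 3.9+):

```python
from fractions import Fraction
import math

def context_sizes(target_probs, state_entropies, max_denominator=10**6):
    """Choose information-state sizes n_k so that C_k = I_{n_k} tunes
    the entropic probability of e_k + C_k toward p_k (Equation 26).
    Returns the sizes (n_1, ..., n_K) and the common denominator N."""
    ratios = [Fraction(p / 2.0 ** s).limit_denominator(max_denominator)
              for p, s in zip(target_probs, state_entropies)]
    N = math.lcm(*(r.denominator for r in ratios))
    n = [r.numerator * (N // r.denominator) for r in ratios]
    return n, N

# Hypothetical target: p = (1/3, 2/3) over two zero-entropy states.
n, N = context_sizes([1 / 3, 2 / 3], [0.0, 0.0])
recovered = [nk * 2.0 ** s / N for nk, s in zip(n, [0.0, 0.0])]
print(n, N, recovered)  # [1, 2] 3 [0.333..., 0.666...] per Equation 28
```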

We find, therefore, that the introduction of context states $C_{k}$ allows us to “tune” the entropic probability to approximate any distribution $p_{k}$ that we like. This is more than a trick. The distribution $p_{k}$ represents additional implicit information (beyond the mere list of states $E=\{e_{k}\}$), and such additional information must have a physical representation. The context states are that representation.

7 Free energy

The tools we have developed can lead to some interesting places. Suppose we have two sets of states, $A=\{a_{i}\}$ and $B=\{b_{j}\}$, endowed with a priori probability distributions $p_{i}$ and $q_{j}$, respectively. We wish to know when the states in $A$ can be turned into the states in $B$, perhaps augmented by reservoir states. That is, we wish to know when $\hat{A}\rightarrow\hat{B}$.

We suppose we have a mechanical state $\mu$, leading to a ladder of mechanical reservoir states $\mu_{n}=\mu+(\mu+\cdots)$. The mechanical state $\mu$ is non-trivial, in the sense that $s+\mu\not\rightarrow s$ for any $s$. This means that there is a component of content $Q$ such that $Q(\mu)\neq 0$. The set $A\cup B$ can be uniformized by augmenting the $a_{i}$ and $b_{j}$ states by $\mu_{n}$ mechanical reservoir states.

However, we still need to realize the $p_{i}$ and $q_{j}$ probabilities. We do this by introducing as context states a corresponding ladder of reservoir states $\theta_{n}$ such that $\sigma=\mathbb{S}(\theta_{n+1})-\mathbb{S}(\theta_{n})$ is very small. Essentially, we assume that the reservoir states are “fine-grained” enough that we can approximate any positive number by $2^{n\sigma}$ for some positive or negative integer $n$. Then, if we augment the $a_{i}$ and $b_{j}$ states by combinations of $\mu_{n}$ and $\theta_{n}$ states, we can uniformize $A\cup B$ and also tune the entropic probabilities to match the a priori $p_{i}$ and $q_{j}$. The final overall uniform eidostate is

$$\{a_{i}+(\mu_{l_{i}}+\theta_{k_{i}}),\,b_{j}+(\mu_{h_{j}}+\theta_{n_{j}})\}, \tag{29}$$

for integers $l_{i}$, $k_{i}$, $h_{j}$, and $n_{j}$. The uniformized $\hat{A}$ and $\hat{B}$ eidostates are subsets of this, and thus are themselves uniform eidostates. The entropic probabilities have been adjusted so that

$$p_{i}=\frac{2^{\mathbb{S}(a_{i})+k_{i}\sigma+\mathbb{S}(\theta_{0})}}{2^{\mathbb{S}(\hat{A})}}\quad\mbox{and}\quad q_{j}=\frac{2^{\mathbb{S}(b_{j})+n_{j}\sigma+\mathbb{S}(\theta_{0})}}{2^{\mathbb{S}(\hat{B})}}. \tag{30}$$

We now choose a component of content $Q$ such that $Q(\mu)=\varepsilon>0$. Since the overall eidostate is uniform, it must be true that

$$Q(a_{i})+l_{i}\varepsilon+k_{i}\varepsilon=Q(b_{j})+h_{j}\varepsilon+n_{j}\varepsilon=\mbox{constant} \tag{31}$$

for all choices of $i,j$. Of course, since all of these values are the same, we can average them together and obtain

$$\bigl\langle Q(a_{i})\bigr\rangle_{p}+\bigl\langle l_{i}\bigr\rangle_{p}\varepsilon+\bigl\langle k_{i}\bigr\rangle_{p}\varepsilon=\bigl\langle Q(b_{j})\bigr\rangle_{q}+\bigl\langle h_{j}\bigr\rangle_{q}\varepsilon+\bigl\langle n_{j}\bigr\rangle_{q}\varepsilon. \tag{32}$$

We can write the average change in the QQ-value of the mechanical state as

$$\left(\bigl\langle h_{j}\bigr\rangle_{q}-\bigl\langle l_{i}\bigr\rangle_{p}\right)\varepsilon=\left(\bigl\langle k_{i}\bigr\rangle_{p}-\bigl\langle n_{j}\bigr\rangle_{q}\right)\varepsilon+\bigl\langle Q(a_{i})\bigr\rangle_{p}-\bigl\langle Q(b_{j})\bigr\rangle_{q}. \tag{33}$$

Since all of the states lie within the same uniform eidostate, $\hat{A}\rightarrow\hat{B}$ if and only if $\mathbb{S}(\hat{A})\leq\mathbb{S}(\hat{B})$, that is,

$$H(\vec{p})+\bigl\langle\mathbb{S}(a_{i})\bigr\rangle_{p}+\bigl\langle k_{i}\bigr\rangle_{p}\sigma\leq H(\vec{q})+\bigl\langle\mathbb{S}(b_{j})\bigr\rangle_{q}+\bigl\langle n_{j}\bigr\rangle_{q}\sigma. \tag{34}$$

From this it follows that

$$\bigl\langle k_{i}\bigr\rangle_{p}-\bigl\langle n_{j}\bigr\rangle_{q}\leq\frac{1}{\sigma}\left(H(\vec{q})-H(\vec{p})+\bigl\langle\mathbb{S}(b_{j})\bigr\rangle_{q}-\bigl\langle\mathbb{S}(a_{i})\bigr\rangle_{p}\right). \tag{35}$$

If we substitute this inequality into Equation 33, we obtain

$$\left(\bigl\langle h_{j}\bigr\rangle_{q}-\bigl\langle l_{i}\bigr\rangle_{p}\right)\varepsilon-\frac{\varepsilon}{\sigma}\left(H(\vec{q})-H(\vec{p})\right)\leq\frac{\varepsilon}{\sigma}\left(\bigl\langle\mathbb{S}(b_{j})\bigr\rangle_{q}-\bigl\langle\mathbb{S}(a_{i})\bigr\rangle_{p}\right)-\left(\bigl\langle Q(b_{j})\bigr\rangle_{q}-\bigl\langle Q(a_{i})\bigr\rangle_{p}\right). \tag{36}$$

We can get insight into this expression as follows. Given the process $\hat{A}\rightarrow\hat{B}$:

  • $\left(\bigl\langle h_{j}\bigr\rangle_{q}-\bigl\langle l_{i}\bigr\rangle_{p}\right)\varepsilon$ is the average increase in the $Q$-value of the mechanical state, which we can call $\left\langle\Delta Q_{\mu}\right\rangle$. Intuitively, this might be regarded as the “work” stored in the $\hat{A}\rightarrow\hat{B}$ process.

  • We can denote the change in the Shannon entropy of the probabilities by $\Delta H=H(\vec{q})-H(\vec{p})$. Since each $a_{i}$ or $b_{j}$ state could be augmented by a corresponding record state, this is the change in the information entropy of the stored record.

  • For each state $a$, we can define the free energy $F(a)=Q(a)-\frac{\varepsilon}{\sigma}\mathbb{S}(a)$. We call this free “energy”, even though $Q$ does not necessarily represent energy, because of the analogy with the familiar expression $F=E-TS$ for the Helmholtz free energy in conventional thermodynamics. The average change in the free energy $F$ is

    $$\left\langle\Delta F\right\rangle=\left(\bigl\langle Q(b_{j})\bigr\rangle_{q}-\bigl\langle Q(a_{i})\bigr\rangle_{p}\right)-\frac{\varepsilon}{\sigma}\left(\bigl\langle\mathbb{S}(b_{j})\bigr\rangle_{q}-\bigl\langle\mathbb{S}(a_{i})\bigr\rangle_{p}\right). \tag{37}$$

    The free energy $F$ depends on the particular reservoir states $\theta_{n}$ only via the ratio $\varepsilon/\sigma$. Given this value, $\left\langle\Delta F\right\rangle$ depends only on the $a_{i}$ and $b_{j}$ states, together with their a priori probabilities.

    To return to our coin-and-box example, suppose we use the basic box states $b_{n}$ as reservoir states $\theta_{n}$, and we choose the coin number $Q$ as our component of content. Then $\varepsilon=1$ and $\sigma=1$, so that the free energy function is $F(a)=Q(a)-\mathbb{S}(a)$. (If we use different box states $b^{K}_{n}$ as reservoir states, the ratio $\varepsilon/\sigma$ is different.)

With these definitions, Equation 36 becomes

$$\left\langle\Delta Q_{\mu}\right\rangle-\frac{\varepsilon}{\sigma}\Delta H\leq-\left\langle\Delta F\right\rangle. \tag{38}$$

Increases in the average stored mechanical work, and decreases in the stored information, must be paid for by a corresponding decrease in the average free energy.

Many useful inferences can be drawn from this. For example, the erasure $Q$-cost of one bit of information in the presence of the $\theta$-reservoir is $\varepsilon/\sigma$. This cost can be paid from either the mechanical $Q$-reservoir state or the average free energy, or from a combination of these. This amounts to a very general version of Landauer’s principle [8], one that involves any type of mechanical component of content.
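As a closing numerical illustration of Equation 38, the Python sketch below computes the work budget for the coin-and-box values $\varepsilon=\sigma=1$ (the function name and the scenarios are our own):

```python
def extractable_work_bound(delta_H, delta_F_avg, eps_over_sigma=1.0):
    """Upper bound on the average stored work <Delta Q_mu> implied by
    Equation 38:  <Delta Q_mu> <= (eps/sigma) * Delta H - <Delta F>."""
    return eps_over_sigma * delta_H - delta_F_avg

# Erasing one bit (Delta H = -1) with no free-energy change requires at
# least one coin of work input (the extractable work is negative):
print(extractable_work_bound(delta_H=-1.0, delta_F_avg=0.0))   # -1.0

# Alternatively, the erasure can be paid for by an average free-energy
# decrease of one coin, leaving no net work cost:
print(extractable_work_bound(delta_H=-1.0, delta_F_avg=-1.0))  # 0.0
```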

Appendix

In this appendix we review some of the main definitions and axioms of the theory, as well as some of its key results. For more details, please see [1].

The theory is built from a few essential elements:

  • A set $\mathscr{S}$ of states and a set $\mathscr{E}$ of eidostates. Each eidostate is a finite collection of states. Without too much confusion, we may identify a state $a\in\mathscr{S}$ with the singleton eidostate $\{a\}\in\mathscr{E}$, so that $\mathscr{S}$ can be regarded as a subset of $\mathscr{E}$.

  • An operation $+$ by which eidostates are combined. This is just the Cartesian product of the sets. Two eidostates $A$ and $B$ are similar ($A\sim B$) if they are formed from the same Cartesian factors, perhaps put together in a different way.

  • A relation $\rightarrow$ on $\mathscr{E}$. We interpret $A\rightarrow B$ to mean that eidostate $A$ may be transformed into eidostate $B$. A process is a pair $\left\langle A,B\right\rangle$ of eidostates, and it is said to be possible if either $A\rightarrow B$ or $B\rightarrow A$. An eidostate $A$ is uniform if, for all $a,b\in A$, the process $\left\langle a,b\right\rangle$ is possible.

  • Special states in $\mathscr{S}$ called record states. State $r$ is a record state if there exists a state $a$ such that $a+r\leftrightarrow a$. An information state is an eidostate containing only record states; the set of these is called $\mathscr{I}$. A bit state $I_{\mathrm{b}}$ is an information state with exactly two elements, and a bit process is a process of the form $\left\langle r,I_{\mathrm{b}}\right\rangle$.

Given this background, we can present our axioms.

Axiom I

(Eidostates.) $\mathscr{E}$ is a collection of sets called eidostates such that:

(a) Every $A\in\mathscr{E}$ is a finite nonempty set with a finite prime Cartesian factorization.

(b) $A+B\in\mathscr{E}$ if and only if $A,B\in\mathscr{E}$.

(c) Every nonempty subset of an eidostate is also an eidostate.

Axiom II

(Processes.) Let $A,B,C\in\mathscr{E}$ be eidostates and $s\in\mathscr{S}$ a state.

(a) If $A\sim B$, then $A\rightarrow B$.

(b) If $A\rightarrow B$ and $B\rightarrow C$, then $A\rightarrow C$.

(c) If $A\rightarrow B$, then $A+C\rightarrow B+C$.

(d) If $A+s\rightarrow B+s$, then $A\rightarrow B$.

Axiom III

If $A,B\in\mathscr{E}$ and $B$ is a proper subset of $A$, then $A\nrightarrow B$.

Axiom IV

(Conditional processes.)

(a) Suppose $A,A^{\prime}\in\mathscr{E}$ and $b\in\mathscr{S}$. If $A\rightarrow b$ and $A^{\prime}\subseteq A$, then $A^{\prime}\rightarrow b$.

(b) Suppose $A$ and $B$ are uniform eidostates that are each disjoint unions of eidostates: $A=A_{1}\cup A_{2}$ and $B=B_{1}\cup B_{2}$. If $A_{1}\rightarrow B_{1}$ and $A_{2}\rightarrow B_{2}$, then $A\rightarrow B$.

Axiom V

(Information.) There exist a bit state and a possible bit process.

Axiom VI

(Demons.) Suppose $a,b\in\mathscr{S}$ and $J\in\mathscr{I}$ are such that $a\rightarrow b+J$.

(a) There exists $I\in\mathscr{I}$ such that $b\rightarrow a+I$.

(b) For any $I\in\mathscr{I}$, either $a\rightarrow b+I$ or $b+I\rightarrow a$.

Axiom VII

(Stability.) Suppose $A,B\in\mathscr{E}$ and $J\in\mathscr{I}$. If $nA\rightarrow nB+J$ for arbitrarily large values of $n$, then $A\rightarrow B$.

Axiom VIII

(Mechanical states.) There exists a subset $\mathscr{M}\subseteq\mathscr{S}$ of mechanical states such that:

(a) If $l,m\in\mathscr{M}$, then $l+m\in\mathscr{M}$.

(b) For $l,m\in\mathscr{M}$, if $l\rightarrow m$ then $m\rightarrow l$.

Axiom IX

(State equivalence.) If $E$ is a uniform eidostate, then there exist states $e,x,y\in\mathscr{S}$ such that $x\rightarrow y$ and $E+x\leftrightarrow e+y$.

A component of content $Q$ is a real-valued additive function on the set of states $\mathscr{S}$. (Additive in this context means that $Q(a+b)=Q(a)+Q(b)$.) Components of content represent quantities that are conserved in every possible process. In a uniform eidostate $E$, every element has the same values of all components of content, so we can without ambiguity refer to the value $Q(E)$. The set of uniform eidostates is denoted $\mathcal{U}$. This set includes all singleton states in $\mathscr{S}$, all information states in $\mathscr{I}$, and so forth, and it is closed under the $+$ operation.

References

  • [1] Austin Hulse, Benjamin Schumacher, and Michael D. Westmoreland. Axiomatic information thermodynamics. Entropy, 20(4):237, 2018.
  • [2] R. Giles. Mathematical Foundations of Thermodynamics. Pergamon Press Ltd., Oxford, 1964.
  • [3] Elliott H. Lieb and Jakob Yngvason. A guide to entropy and the second law of thermodynamics. Notices of the American Mathematical Society, 45:571–581, 1998.
  • [4] Leo Szilard. On the decrease of entropy in a thermodynamic system by the intervention of intelligent beings. Zeitschrift für Physik, 53:840–856, 1929. (English translation in Behavioral Science 1964, 9, 301–310.)
  • [5] Charles H. Bennett. The thermodynamics of computation—a review. International Journal of Theoretical Physics, 21:905–940, 1982.
  • [6] C. E. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27:379–423, 623–656, 1948.
  • [7] Thomas M. Cover and Joy A. Thomas. Elements of Information Theory (Second Edition). John Wiley and Sons, Hoboken, 2006.
  • [8] R. Landauer. Irreversibility and heat generation in the computing process. IBM Journal of Research and Development, 5:183–191, 1961.