the rise (and fall) of entropy
Revised version of an Angelfire page published ca. 2010
Tip: use "control f" to quickly access footnotes and other significant items
Consider using a service such as ChangeDetection.com to keep track of revisions to this page.
Please let me know about errors or points in need of clarifying at krypto78 attt gmaaail dottt commm.
Please see The many worlds of probability, reality and cognition; Part V focuses on the topic of entropy.
http://randompaulr.blogspot.com/2013/11/the-many-worlds-of-probability-reality.html
My thoughts on probability in the current paper should be taken as provisional.
By PAUL CONANT
One might describe the increase of the entropy (FN 0) of a gas as meaning that the net vector -- the sum of the vectors of all particles -- tends, between times t0 and tn, toward some constant, such as 0, and that once this equilibrium is reached at tn, the net vector stays near 0 at any subsequent time.
One would expect a nearly 0 net vector if the individual particle vectors are random. This randomness is exactly what one would find in an asymmetrical n-body scenario, where the bodies are close together and about the same size. The difference is that gravity isn't the determinant, but rather collisional kinetic energy. It has been demonstrated that n-body problems can yield orbits that become extraordinarily tangled. The randomness is then of the Chaitin-Kolmogorov variety: determining future position of a particular particle becomes computationally very difficult. And usually, over some time interval, the calculation errors increase to the point that all predictability for a specific particle is lost.
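As a rough numerical illustration of the near-zero net vector, here is a minimal sketch. It assumes, purely for illustration, that each particle has a unit-length velocity in a plane with a uniformly random direction; the function name and the seed are my own choices. It reports the length of the net vector divided by the number of particles, which shrinks as the number of particles grows.

```python
import math
import random

def mean_net_vector(n_particles, seed=0):
    """Length of the sum of n random unit vectors, divided by n.

    Assumption: directions are independent and uniform in the plane.
    """
    rng = random.Random(seed)
    sum_x = 0.0
    sum_y = 0.0
    for _ in range(n_particles):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        sum_x += math.cos(theta)
        sum_y += math.sin(theta)
    return math.hypot(sum_x, sum_y) / n_particles

if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, mean_net_vector(n))
```

The raw sum itself typically grows like the square root of the particle count, so it is the per-particle average that approaches 0 in this toy model.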
But there is also quantum randomness at work. The direction in which a photon exits an excited atom is probabilistic only, meaning that the recoil is random. This recoil vector must be added to the other electric charge recoil vector associated with particle collision -- though its effect is very slight and usually ignored.
Further, if one were to observe one or more of the particles, the observation would affect the knowledge of the momentum or position of the observed particles.
Now suppose we keep the gas at a single temperature in a closed container attached via a closed valve to another, evacuated container. When we open the valve, the gas expands to fill both containers. This expansion is a consequence of the effectively random behavior of the particles, which on average "find less resistance" in the direction of the vacuum.
In general, gases tend to expand by inverse square, or that is spherically (or really, as a ball), which implies randomization of the molecules.
The drunkard's walk
Consider a computerized random walk (aka "drunkard's walk") in a plane. As n increases, the area covered by the walk tends toward that of a circle. In the infinite limit, there is probability 1 that a perfect circle has been covered (though probability 1 in such cases does not exclude exceptions).
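A computerized walk of this kind can be sketched in a few lines. The version below is only an illustration under my own assumptions: the walker moves one unit north, south, east or west on a square lattice with equal probability, and the function name, number of walks and seed are arbitrary. It estimates the mean squared distance from the starting point, which for this kind of walk is expected to be close to the number of steps, so the typical radius reached grows like the square root of n.

```python
import random

def mean_squared_displacement(n_steps, n_walks=2000, seed=1):
    """Average squared distance from the origin after n_steps lattice moves."""
    rng = random.Random(seed)
    moves = ((1, 0), (-1, 0), (0, 1), (0, -1))
    total = 0
    for _ in range(n_walks):
        x = y = 0
        for _ in range(n_steps):
            dx, dy = rng.choice(moves)
            x += dx
            y += dy
        total += x * x + y * y
    return total / n_walks

if __name__ == "__main__":
    for n in (100, 400, 1600):
        print(n, mean_squared_displacement(n))
```

This does not by itself demonstrate the circle claim; it only shows that the spread of the walk has a characteristic radius with no built-in preferred direction.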
So the real question is: what about the n-body problem yields pi-randomness? It is really a statistical question. When enough collisions occur in a sufficiently small volume (or area), the particle vectors tend to cancel each other out.
Let's go down to the pool hall and break a few racks of balls. It is possible to shoot the cue ball in such a way that the rack of balls scatters symmetrically. But in most situations, the cue ball strikes the triangular array at a point that yields an asymmetrical scattering. This is the sensitive dependence on initial conditions associated with mathematical chaos. We also see Chaitin-Kolmogorov complexity enter the picture, because the asymmetry means that for most balls predicting where one will be after a few ricochets is computationally very difficult.
Now suppose we have perfectly elastic, perfectly spherical pool balls that encounter idealized banks. We also neglect friction. After a few minutes, the asymmetrically scattered balls are "all over the place" in effectively random motion. Now such discrete systems eventually return to their original state: the balls coalesce back into a triangle and then repeat the whole cycle over again, which implies that in fact such a closed system, left to its own devices, requires entropy to decrease, a seeming contradiction of the second law of thermodynamics. But the time scales required mean we needn't hold our breaths waiting. Also, in nature, there are darned few closed systems (and as soon as we see one, it's no longer closed at the quantum level), allowing us to conclude that in the ideal of zero friction, the pool ball system may become aperiodic, implying the second law in this case holds.
Maxwell's demon
And now, let us exorcize Maxwell's demon, which, though meant to elucidate, to this day bedevils discussions of entropy with outlandish "solutions" to the alleged "problem." Maxwell gave us a thought experiment whereby he posited a little being controlling the valve between canisters. If (in this version of his thought experiment) the gremlin opened the valve to let speedy particles past in one direction only, the little imp could divide the gas into a hot cloud in one canister and a cold cloud in the other. Obviously the energy the gremlin adds is equivalent to adding energy via a heating/cooling system, but Maxwell's point was about the very, very minute possibility that such a bizarre division could occur randomly (or, some would say, pseudo-randomly).
This possibility exists. In fact, as said, in certain idealized closed systems, entropy decrease MUST happen. Such a spontaneous division into hot and cold clouds would also probably happen quite often at the nano-nano-second level. That is, when time intervals are short enough, quantum physics tells us the usual rules go out the window. However, observation of such actions won't occur for such quantum intervals (so there is no change in information or entropy), and as for the "random" chance of observing an extremely high-ordering of gas molecules, even if someone witnessed such an occurrence, not only does the event not conform to a repeatable experiment, no one is likely to believe the report, even if true.
Truly universal?
Can we apply the principle of entropy to the closed system of the universe? A couple of points: We're not absolutely sure the cosmos is a closed system (perhaps, for example, "steady state" creation supplements "big bang" creation). If there is a "big crunch," then, some have speculated, we might expect complete devolution to original states (people would reverse grow from death to birth, for example). If space curvature implies otherwise, the system remains forever open or asymptotically forever open.
However, quantum fuzziness probably rules out such an idealization. Are quantum systems precisely reversible? Yes and no. When one observes a particle collision in an accelerator, one can calculate the reverse paths. However, in line with the Heisenberg uncertainty principle one can never be sure of observing a collision with precisely identical initial conditions. And if we can only rarely, very rarely, replicate the exact initial conditions of the collision, then the same holds for its inverse.
Then there is the question of whether perhaps a many worlds (aka parallel universes) or many histories interpretation of quantum weirdness holds. In the event of a collapse back toward a big crunch, would the cosmos tend toward the exact quantum fluctuations that are thought to have introduced irregularities in the early universe that grew into star and galactic clustering? Or would a different set of fluctuations serve as the attractor, on grounds both sets were and are superposed and one fluctuation is as probable as the other? And, do these fluctuations require a conscious observer, as in John von Neumann's interpretation?
Thinking in terms of computer-like algorithms, Stephen Wolfram writes in A New Kind of Science that it is unclear whether the "basic rules of the universe are really reversible," arguing that it could be that apparent reversibility arises due to effects of an attractor (he does not specify gravitational). He writes that "if pieces of the universe can break off but not reconnect, then there will be inevitably loss of information," thus increasing entropy.
Of course, we face such difficulties when trying to apply physical or mathematical concepts to the entire cosmos. It seems plausible that any system of relations we devise to examine properties of space and time may act like a lens that increases focus in one area while losing precision in another. I.e., a cosmic uncertainty principle.
Conservation of information?
A cosmic uncertainty principle would make information fuzzy. As the Heisenberg uncertainty principle shows, information about a particle's momentum is gained at the expense of information about its position. But, you may respond, the total information is conserved.
But wait! Is there a law about the conservation of information? In fact, information cannot be conserved -- in fact can't exist -- without memory, which in the end requires the mind of an observer. In fact, the "law" of increase of entropy says that memories fade and available information decreases. In terms of pure Shannon information, entropy expresses the probability of what we know or don't know (FN 2). Thus entropy is introduced by noise entering the signal. In realistic systems, supposing enough time elapses, noise eventually overwhelms the intended signal. For example, what would you say is the likelihood that this essay will be accessible two centuries from now? (I've already lost a group of articles I had posted on the now defunct Yahoo Geocities site.) Or consider Shakespeare's plays. We cannot say with certainty exactly how the original scripts read.
In fact, can we agree with some physicists that a specified volume of space contains a specific quantity of information? I wonder. A Shannon transducer is said to contain a specific quantity of information, but no one can be sure of that, prior to someone reading the message and measuring the signal-to-noise ratio.
And quantum uncertainty qualifies as a form of noise, not only insofar as random jiggles in the signal, but also insofar as what signal was sent. If two signals are "transmitted" in quantum superposition, observation randomly determines which signal is read.
So one may set up a quantum measurement experiment and say that for a specific volume, the prior information describes the experiment. But quantum uncertainty still says that the experiment cannot be exactly described in a scientifically sensible way. So if we try to extrapolate information about a greater volume from the experiment volume, we begin to lose accuracy until the uncertainty reaches maximum. We see that quantum uncertainty can progressively change the signal-to-noise ratio, meaning entropy increases until the equilibrium level of no knowledge.
This of course would suggest that, from a human vantage point, there can be no exact information quantity for the cosmos.
So this brings us to the argument about whether black holes decrease the entropy of the universe by making it more orderly (i.e., simpler). My take is that a human observer in principle can never see anything enter a black hole. If one were to detect, at a safe distance, an object approaching a black hole, one would observe that its time pulses (its Doppler shift) would get slower and slower. In fact, the time pulses slow down asymptotic to eternity.
So the information represented by the in-falling object is, from this perspective, never lost.
But suppose we agree to an abstraction that eliminates the human observer -- as opposed to a vastly more gifted intelligence. In that case, perhaps the cosmos has an exact quantity of information at ta. It then makes sense to talk about whether a black hole affects that quantity.
Consider a particle that falls into a black hole. It is said that all the information available about a black hole is comprised of the quantities for its mass and its surface area. Everything this super-intelligence knew about the particle, or ever could know, seemingly, is gone. Information is lost and the cosmos is a simpler, more orderly place, higher in information and in violation of the second law... maybe.
But suppose the particle is a twin of an entangled pair. One particle stays loose while the other is swallowed by the black hole. If we measure, say, the spin of one such particle we would ordinarily automatically know the spin of the other. But who's to tell what the spin is of a particle headed for the gravitational singularity at the black hole's core? So the information about the particle vanishes and entropy increases. This same event however means the orderliness of the universe increases and the entropy decreases. So, which is it? Or is it both? Have no fear, this issue is addressed in the next section.
Oh, and of course we mustn't forget Hawking radiation, whereby a rotating black hole slowly leaks radiation as particles every now and then "tunnel" through the gravitational energy barrier and escape into the remainder of the cosmos. The mass decreases over eons and eons until -- having previously swallowed everything available -- it eventually evaporates, Hawking conjectures. Actually, we needn't invoke tunneling; the condition that the object is rotating means that it has kinetic energy; some quanta of energy associated with rotational acceleration are at the event horizon and are energetic enough to escape the gravity field, assuming they are vectored appropriately.
Hawking's updated black hole view
http://www.nature.com/news/2004/040712/full/news040712-12.html
In 2005, Hawking revived a long-simmering argument about black holes and entropy.
"I'm sorry to disappoint science fiction fans, but if information is preserved, there is no possibility of using black holes to travel to other universes. If you jump into a black hole, your mass energy will be returned to our universe but in a mangled form which contains the information about what you were like but in a state where it can not be easily recognized. It is like burning an encyclopedia. Information is not lost, if one keeps the smoke and the ashes. But it is difficult to read. In practice, it would be too difficult to re-build a macroscopic object like an encyclopedia that fell inside a black hole from information in the radiation, but the information preserving result is important for microscopic processes involving virtual black holes."
Information loss in black holes
http://arxiv.org/pdf/hepth/0507171.pdf
A question: suppose an entangled particle escapes the black hole? Is the cosmic information balance sheet rectified? Perhaps, supposing it never reached the singularity. But, what of particles down near the singularity? They perhaps morph as the fields transform into something that existed close to the cosmic big bang. So it seems implausible that the spin information is retained. But, who knows?
Where's that ace?
There is a strong connection between thermodynamic entropy and Shannon information entropy (FN 0). Consider the randomization of the pool break on the frictionless table after a few minutes. This is the equivalent of shuffling a deck of cards.
Suppose we have an especially sharp-eyed observer who watches where the ace of spades is placed in the deck as shuffling starts. We then have a few relatively simple shuffles. After the first shuffle, he knows to within three cards how far down in the deck the ace is. On the next shuffle he knows where it is with less accuracy. Let's say to a precision of (1/3)(1/3) = 1/9. After some more shuffles his potential error has reached 1/52, meaning he has no knowledge of the ace's whereabouts.
The increase in entropy occurs from one shuffle to the next. But at the last shuffle, equilibrium has been reached. Further shuffling can never increase his knowledge of where the ace is, meaning the entropy won't decrease.
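The observer's loss of knowledge can be put in Shannon's terms with a small sketch. The modeling choice here is my own assumption, not something the shuffling story fixes: after each shuffle the ace is taken to be equally likely to sit at any of k candidate positions, with k growing from 1 up to the full 52. The helper names are hypothetical.

```python
import math

def position_entropy_bits(probabilities):
    """Shannon entropy, in bits, of a distribution over deck positions."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def uniform_over(k, deck_size=52):
    """Ace equally likely at k candidate positions, impossible elsewhere."""
    return [1.0 / k] * k + [0.0] * (deck_size - k)

if __name__ == "__main__":
    # k = 1: position known exactly; k = 52: no knowledge at all.
    for k in (1, 3, 9, 27, 52):
        print(k, round(position_entropy_bits(uniform_over(k)), 3))
```

In this model the entropy climbs from 0 bits to log2(52), a little under 6 bits, and cannot go higher for a 52-card deck, which matches the idea of an equilibrium beyond which further shuffling changes nothing.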
The runs test gives a measure of randomness (FN 1) based on the normal distribution of numbers of runs, with the mean at n/2. "Too many" runs are found in one tail and "too few" in another. That is, a high z score implies that the sequence is suspected of being non-random or "highly ordered."
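For readers who want to try the runs test, here is a minimal sketch using the standard Wald-Wolfowitz formulas for the expected number of runs, 2*n1*n2/n + 1 (close to n/2 when heads and tails are about equally common), and its variance. The function name and the H/T string encoding are my own choices for illustration.

```python
import math

def runs_test_z(sequence):
    """Wald-Wolfowitz runs test z score for a two-valued sequence."""
    symbols = sorted(set(sequence))
    if len(symbols) != 2:
        raise ValueError("sequence must contain exactly two distinct symbols")
    n1 = sum(1 for s in sequence if s == symbols[0])
    n2 = len(sequence) - n1
    n = n1 + n2
    # A new run starts wherever a symbol differs from its predecessor.
    runs = 1 + sum(1 for a, b in zip(sequence, sequence[1:]) if a != b)
    mean = 2.0 * n1 * n2 / n + 1.0
    variance = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n * n * (n - 1.0))
    return (runs - mean) / math.sqrt(variance)

if __name__ == "__main__":
    print(runs_test_z("HT" * 20))              # too many runs
    print(runs_test_z("H" * 20 + "T" * 20))    # too few runs
    print(runs_test_z("HHTT" * 10))            # the pattern the test misses
```

With 40 tosses, the alternating pattern and the two-block pattern both land far out in the tails, while HH TT HH TT... has almost exactly the expected number of runs and so slips through with a z score near 0. A sequence made of a single symbol has no variance in its run count, so the sketch rejects it rather than report an infinite score.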
What however is meant by order? (This is where we tackle the conundrum of a decrease in one sort of cosmic information versus an increase in another sort.)
Entropy is often defined as the tendency toward decrease of order, and the related idea of information is sometimes thought of as the surprisal value of a digit string. Sometimes a pattern such as HHHH... is considered to have low information because we can easily calculate the nth value (assuming we are using some algorithm to obtain the string). So the Chaitin-Kolmogorov complexity is low, or that is, the information is low. On the other hand a string that by some measure is effectively random is considered here to be highly informative because the observer has almost no chance of knowing the string in detail in advance.
However, we can also take the opposite tack. Using runs testing, most digit strings (multi-value strings can often be transformed, for test purposes, to bi-value strings) are found under the bulge in the runs test bell curve and represent probable randomness. So it is unsurprising to encounter such a string. It is far more surprising to come across a string with far "too few" or far "too many" runs. These highly ordered strings would then be considered to have high information value.
So, once the deck has been sufficiently shuffled the entropy has reached its maximum (equilibrium). What is the probability of drawing four royal flushes? If we aren't considering entropy, we might say it is the same as that for any other 20-card deal. But, a runs test would give a z score of infinity (probability 1 that the deal is non-random) because drawing all high cards is equivalent to tossing a fair coin and getting 20 heads and no tails. If we don't like the infinitude we can posit 21 cards containing 20 high cards and 1 low card. The z score still implies non-randomness with a high degree of confidence.
Negative entropy?
Our discussion should not ignore the impact of Ramsey theory, an important subdiscipline of network theory. "Self-organizing" possibilities are inevitable with a sufficient number of nodes in a network. In fact, one might argue that Ramsey theory implies negative entropy. Suppose we had n poker players. The probability that among them there is a royal flush skyrockets quite rapidly. So as n increases, the probability of a specific set of cards increases and the information surprisal value decreases (FN 3).
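The poker example can be made concrete with a short calculation. This is a sketch under a simplifying assumption of my own: each player is dealt five cards from a separate, well-shuffled 52-card deck, so that the hands are independent. The function names are hypothetical.

```python
import math

# Four royal flushes (one per suit) among all five-card hands.
P_ROYAL_FLUSH = 4 / math.comb(52, 5)

def prob_at_least_one_royal_flush(n_players):
    """Chance that at least one of n independent hands is a royal flush."""
    return 1.0 - (1.0 - P_ROYAL_FLUSH) ** n_players

def players_needed(target_probability):
    """Smallest n for which the chance reaches the target probability."""
    return math.ceil(math.log(1.0 - target_probability) / math.log(1.0 - P_ROYAL_FLUSH))

if __name__ == "__main__":
    for n in (1, 1_000, 100_000, 1_000_000, 10_000_000):
        surprisal_bits = -math.log2(prob_at_least_one_royal_flush(n))
        print(n, prob_at_least_one_royal_flush(n), round(surprisal_bits, 3))
    print(players_needed(0.5))
```

In this model the probability rises steadily with n and the surprisal falls toward 0 bits, though it takes hundreds of thousands of players before a royal flush somewhere at the table becomes more likely than not.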
0. Taken from a Wikipedia article: The dimension of thermodynamic entropy is energy divided by temperature, and its SI unit is joules per kelvin. In information theory, entropy is a measure of the uncertainty associated with a random variable. In this context, the term usually refers to the Shannon entropy, which quantifies the expected value of the information contained in a message, usually in units such as bits. Equivalently, the Shannon entropy is a measure of the average information content one is missing when one does not know the value of the random variable. The concept was introduced by Claude E. Shannon in his 1948 paper "A Mathematical Theory of Communication."
1. We should caution that the runs test, which works for n1 > 7 and n2 > 7, fails for the pattern HH TT HH TT... This failure seems to be an artifact of the runs test assumption that a usual number of runs is about n/2. I suggest that we simply say that the probability of that pattern is less than or equal to H T H T H T..., a pattern whose z score rises rapidly with n. Other patterns such as HHH TTT HHH... also climb away from the randomness area slowly with n. With these cautions, however, the runs test gives striking results.
2. Taken from Wikipedia: In information theory, entropy is a measure of the uncertainty associated with a random variable. In this context, the term usually refers to the Shannon entropy, which quantifies the expected value of the information contained in a message, usually in units such as bits. Equivalently, the Shannon entropy is a measure of the average information content one is missing when one does not know the value of the random variable. The concept was introduced by Claude E. Shannon in his 1948 paper "A Mathematical Theory of Communication." Shannon's entropy represents an absolute limit on the best possible lossless compression of any communication, under certain constraints: treating messages to be encoded as a sequence of independent and identically-distributed random variables, Shannon's source coding theorem shows that, in the limit, the average length of the shortest possible representation to encode the messages in a given alphabet is their entropy divided by the logarithm of the number of symbols in the target alphabet. A fair coin has an entropy of one bit. However, if the coin is not fair, then the uncertainty is lower (if asked to bet on the next outcome, we would bet preferentially on the most frequent result), and thus the Shannon entropy is lower. Mathematically, a coin flip is an example of a Bernoulli trial, and its entropy is given by the binary entropy function. A long string of repeating characters has an entropy rate of zero, since every character is predictable. The entropy rate of English text is between 1.0 and 1.5 bits per letter, or as low as 0.6 to 1.3 bits per letter, according to estimates by Shannon based on human experiments.
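The binary entropy function mentioned in the passage above is easy to compute directly. The following is a minimal sketch; the function name is my own.

```python
import math

def binary_entropy(p):
    """Shannon entropy, in bits, of a coin that lands heads with probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome carries no uncertainty
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

if __name__ == "__main__":
    for p in (0.5, 0.6, 0.9, 0.99, 1.0):
        print(p, binary_entropy(p))
```

A fair coin gives exactly one bit, and any bias lowers the figure, reaching zero for a coin that always lands the same way.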
3. John Allen Paulos on Ramsey theory: 'A more profound version of this line of thought can be traced back to British mathematician Frank Ramsey, who proved a strange theorem. It stated that if you have a sufficiently large set of geometric points and every pair of them is connected by either a red line or a green line (but not by both), then no matter how you color the lines, there will always be a large subset of the original set with a special property. Either every pair of the subset's members will be connected by a red line or every pair of the subset's members will be connected by a green line. If, for example, you want to be certain of having at least three points all connected by red lines or at least three points all connected by green lines, you will need at least six points. (The answer is not as obvious as it may seem, but the proof isn't difficult.) For you to be certain that you will have four points, every pair of which is connected by a red line, or four points, every pair of which is connected by a green line, you will need 18 points, and for you to be certain that there will be five points with this property, you will need -- it's not known exactly - between 43 and 55. With enough points, you will inevitably find unicolored islands of order as big as you want, no matter how you color the lines.'
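The six-point case in the quotation is small enough to check by brute force. The sketch below, with a function name of my own choosing, tries every red/green coloring of the lines joining n points and asks whether each one contains three points joined entirely in one color. It is an exhaustive check for tiny cases only and is hopeless for the larger numbers Paulos mentions.

```python
from itertools import combinations, product

def every_coloring_has_mono_triangle(n_points):
    """True if every 2-coloring of the lines among n points has a one-color triangle."""
    edges = list(combinations(range(n_points), 2))
    triangles = list(combinations(range(n_points), 3))
    for colors in product((0, 1), repeat=len(edges)):
        color_of = dict(zip(edges, colors))
        has_mono = any(
            color_of[(a, b)] == color_of[(a, c)] == color_of[(b, c)]
            for a, b, c in triangles
        )
        if not has_mono:
            return False  # found a coloring that avoids one-color triangles
    return True

if __name__ == "__main__":
    print(5, every_coloring_has_mono_triangle(5))
    print(6, every_coloring_has_mono_triangle(6))
```

With five points some coloring escapes, while with six points none of the 32,768 colorings does, in agreement with the figure of six given in the quotation.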
Revised Oct. 29, 2013