Entropy function
by RS  admin@robinsnyder.com


1. Physical entropy

2. Physical entropy summary
Entropy is a measure of the disorder of a system.

The second law of thermodynamics states that in a closed system, the entropy of that system increases over time.

The name comes from the Greek for a "turning inward", or transformation. The word was coined in 1865 by the German physicist Rudolf Clausius, who needed a term to describe the "transformational content" of the systems he was studying.

3. Thermodynamics
Two fundamental laws of thermodynamics, the first (conservation of energy) and the second (increase of entropy), govern the physics of the world in which we live.

4. Systems
Consider systems that, if they existed, would violate the second law of thermodynamics. How does entropy relate to software development?

5. Time
Physical entropy is the primary way in which the elusive concept of physical "time" is defined.

It turns out to be very difficult to precisely define digital "time" in a computer. That is a subject for another "time".

6. Time
Einstein had insight into the true nature of time.
People like us, who believe in physics, know that the distinction between past, present, and future is only a stubbornly persistent illusion. (Albert Einstein, physicist)

That is, the laws of physics are reversible.

More: Albert Einstein

7. Software entropy
A software system over time can be thought of in terms of entropy.

Over time, the disorder of your software, as it fits into a larger system, increases. Any change to that system is far more likely to make your software stop working than to make it work better. Explain how entropy fits into the laws of physics and how the concept of entropy fits into software development.

8. Improvements
Does your software ever randomly evolve into better software without a lot of intelligent effort on your part?

9. Random changes
When your program does not work, does making random changes improve the chances of it working, or is it more likely that your program will work less well after the changes?

The statistics are against random changes improving a program or a piece of software.

10. Entropy
Data (and information) can be represented using symbols, which in turn are encoded as bits, the zeros and ones of a binary code.

A set of symbols (data) has a measure called entropy that approximates the lower bound on the number of bits needed to encode those symbols.

This entropy measures a lack of order, or, equivalently, a lack of predictability.

11. Entropy Formula
The following Shannon entropy formula expresses how entropy for a set of symbols is calculated.

H = - sum over all symbols i of p(i) * log2(p(i))  bits per symbol

where p(i) is the probability of symbol i.
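For example, a fair coin with two equally likely outcomes has entropy -(1/2)*log2(1/2) - (1/2)*log2(1/2) = 1 bit, while a biased coin that comes up heads with probability 0.9 has entropy of about 0.47 bits, reflecting its greater predictability.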

12. Claude Shannon
Claude Shannon, in the late 1940s, used the term "information" to describe the statistical properties of data transmitted from one place to another. He was working at Bell Labs, the research arm of the telephone company AT&T, on improving the quality of telephone transmission. He founded the modern field of (statistical) information theory, yet he was not overly concerned with the meaning of the "data" that he called "information". Previously, the term "information" was tied to the verb "to inform": one is "informed" that such and such is true, which is what we now call "information".

13. Von Neumann
When the mathematical genius John Von Neumann (conventional computer architectures are called Von Neumann machines) was asked by Claude Shannon what to call his measure of the uncertainty in data, Von Neumann is reported to have told Shannon to call it "entropy", because then nobody would understand it.

14. Von Neumann quote
Von Neumann to Shannon:

You should call it entropy, for two reasons. In the first place your uncertainty function has been used in statistical mechanics under that name, so it already has a name. In the second place, and more important, no one really knows what entropy really is, so in a debate you will always have the advantage.

15. Statistics
Note that the concept of information developed by Shannon has to do with statistical properties of data.

It does not directly have anything to do with the meaning of the data, which is what is ordinarily meant by "information".

So what is called "information theory" might be better called "data theory".

16. Python code
Here is the Python code [#1]

Here is the output of the Python code.
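A minimal sketch of such an entropy calculation in Python, assuming the goal is to compute the Shannon entropy of the symbols in a string (the names are illustrative, not from the original listing [#1]):

import math
from collections import Counter

def shannon_entropy(text):
    # Count how often each symbol occurs in the text.
    counts = Counter(text)
    total = len(text)
    # H = - sum of p(i) * log2(p(i)) over all symbols i.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

print(shannon_entropy("HELLO, WORLD"))   # entropy, in bits per symbol, of the sample string
print(shannon_entropy("AAAAAAAAAAAA"))   # a single repeated symbol carries no information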


17. Alphabets
Languages vary in how many letters their alphabets contain and in how those letters are used to make words.

18. Frequency
The average frequency with which each letter occurs in words (in a given corpus of texts) can be determined, and that frequency varies from letter to letter.

If each letter appeared with the same frequency, then, for English, the entropy would be as follows.
log2(26) = 4.7004397181411 bits

Does this make sense?
2^4 = 16 and 2^5 = 32

So the log, base 2, of 26 is between 4 and 5 bits.
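Since actual letter frequencies are not equal, the entropy of English letters is lower than 4.7 bits per letter. A minimal sketch of the effect in Python, using a made-up skewed distribution (not real English letter frequencies):

import math

def entropy(probs):
    # Shannon entropy, in bits, of a probability distribution.
    return -sum(p * math.log2(p) for p in probs if p > 0)

uniform = [1 / 26] * 26            # every letter equally likely
skewed = [0.5] + [0.5 / 25] * 25   # one very common letter, 25 rare ones (illustrative only)

print(entropy(uniform))   # about 4.70 bits
print(entropy(skewed))    # about 3.32 bits, lower because the distribution is more predictable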


19. Logarithms
Note that logarithms to different bases are related by a constant factor. Thus, the following holds.
log2(26) = log10(26)/log10(2)

Here is the Python code [#2]

Here is the output of the Python code.
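A minimal sketch of such a change-of-base calculation in Python (the names and values are illustrative, not taken from the original listing [#2]):

import math

# Each expression computes the base-2 logarithm of 26.
print(math.log2(26))                    # direct base-2 logarithm
print(math.log10(26) / math.log10(2))   # change of base from base 10
print(math.log(26) / math.log(2))       # change of base from the natural logarithm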


20. Random guessing
On average, guessing a letter in a puzzle, without any other information, requires about 13 guesses (13.5, for 26 equally likely letters, when no letter is guessed twice).

How many guesses would it take to fill in the missing letters in the following?
H __ L L __ , W __ R L D
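A minimal sketch, assuming the guesser tries letters uniformly at random without repeating a guess, that estimates the average number of guesses needed for one unknown letter:

import random
import string

def guesses_needed(secret):
    # Guess letters in a random order until the secret letter is found.
    letters = list(string.ascii_uppercase)
    random.shuffle(letters)
    return letters.index(secret) + 1

trials = 100_000
total = sum(guesses_needed(random.choice(string.ascii_uppercase)) for _ in range(trials))
print(total / trials)   # close to 13.5 for 26 equally likely letters

With context, as in the H __ L L __ example above, far fewer guesses are needed, which is another way of saying that the letters carry fewer bits of entropy.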


21. Semitic languages
Some languages, such as Hebrew, are normally written without vowels, using only consonants. One learns to infer the vowels from the context in which the consonants appear.

22. English letter guessing
Shannon showed that English text has an entropy of between 0.6 and 1.3 bits per letter.

The redundancy of natural language helps in removing ambiguity when not all of the text can be transmitted accurately, as in listening to a broadcast with a lot of static noise.

23. Google n-grams
Google has collected a large amount of data on letter sequences in the text of various languages.

Such "grams" are known as "bi-grams" (2 letters) , "tri-grams" (3 letters), and so on. The general term is "n-grams" where n is an integer (greater than one).

24. Security
A security key of a given bit length, such as 1024 bits, does not always contain that many bits of entropy.

Comparing the security of keys by their bit lengths is only meaningful when the keys are at (or near) maximum entropy.
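A minimal sketch of the idea, computing the entropy of a key as its length times log2 of the number of equally likely values per position (assuming each position is chosen independently and uniformly):

import math

def key_entropy_bits(length, alphabet_size):
    # Entropy, in bits, of a key whose positions are chosen independently
    # and uniformly from an alphabet of the given size.
    return length * math.log2(alphabet_size)

print(key_entropy_bits(128, 256))   # 128 random bytes: 1024 bits of entropy
print(key_entropy_bits(128, 16))    # 128 hex characters: only 512 bits of entropy

So a key that occupies 1024 bits of storage can have far fewer than 1024 bits of entropy if it is built from a restricted alphabet.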

25. Random numbers
A true random number generator needs a source of real-world entropy (unpredictable physical noise).

26. Perfect passwords
The GRC web site, by Steve Gibson (well-known security podcaster), provides passwords that have high entropy. See https://www.grc.com/passwords.htm. There are more details there on how maximum entropy is achieved.

27. Imperfect random numbers
Pseudo-random numbers appear random but actually follow a deterministic pattern. True random numbers cannot be generated by purely algorithmic (software-only) methods.
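A minimal sketch showing the deterministic pattern: seeding Python's pseudo-random generator with the same value reproduces exactly the same "random" sequence.

import random

random.seed(42)
first = [random.randint(0, 9) for _ in range(5)]

random.seed(42)
second = [random.randint(0, 9) for _ in range(5)]

print(first)
print(second)
print(first == second)   # True: the sequence is completely determined by the seed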

28. John Von Neumann quotes
From WikiQuote:

In mathematics you don't understand things. You just get used to them.

If people do not believe that mathematics is simple, it is only because they do not realize how complicated life is.

There probably is a God. Many things are easier to explain if there is than if there isn't.

29. About John Von Neumann
From WikiQuote:

He was a really remarkable man. He listened to me talk about this rather obscure subject and in ten minutes he knew more about it than I did. He was extremely quick. (David Blackwell)

30. Random numbers
There are many applications for random numbers. Note that the Raspberry Pi has a hardware random number generator. How well does it work?
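One rough way to explore that question is to read bytes from the hardware generator and estimate their entropy. A minimal sketch, assuming a Linux system (such as a Raspberry Pi) that exposes the hardware generator as /dev/hwrng and that the user has permission to read it:

import math
from collections import Counter

# Read a sample of bytes from the hardware random number generator.
# /dev/hwrng is assumed to be present (as on a Raspberry Pi with the
# hardware RNG driver loaded); reading it may require root privileges.
with open("/dev/hwrng", "rb") as f:
    data = f.read(65536)

counts = Counter(data)
total = len(data)
# Estimate the Shannon entropy per byte; a good generator should be close to 8 bits.
entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
print(entropy)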


31. Genetic entropy
The concept of genetic entropy is the tendency of the coded information in genetic material to accumulate mutations and degenerate over time due to errors in the copying process.


32. End of page
