Information and Entropy (1)

Bits

The Boolean Bit

The Circuit Bit

The Control Bit

The Physical Bit

The Quantum Bit

Three features of quantum mechanics (reversibility, superposition, and entanglement) make qubits, or collections of qubits, different from Boolean bits.

The Classical Bit

Codes

Symbol Space Size

Use of Spare Capacity

There are many strategies to deal with unused code patterns. Here are some:

- Ignore
- Map to other values
- Reserve for future expansion
- Use for control codes
- Use for common abbreviations

Binary Coded Decimal (BCD)

Genetic Code

Telephone Area Codes

IP Addresses

ASCII

Extension of Codes

Fixed-Length and Variable-Length Codes

Morse Code

Detail: ASCII

Detail: Integer Codes

Binary Code

Binary Gray Code

2's Complement

Sign/Magnitude

1's Complement

Detail: The Genetic Code

Detail: IP Addresses

Detail: Morse Code

Problem 2: Universality

A Boolean function \(F(A,B)\) is said to be universal if any arbitrary Boolean function can be constructed by nesting \(F(A,B)\) functions.

  • \(OR\) and \(AND\) are not universal, since both are monotonic: raising either input from \(0\) to \(1\) never lowers the output. Any nesting of monotonic functions is still monotonic, so \(NOT\) (whose output decreases as its input increases) can never be constructed from them.

  • \(XOR\) is not universal. Define a nested \(XOR\) expression to be an expression \(f\) drawn from the set \(EXPR=\{0,1,A,B,XOR(f_{\alpha},f_{\beta})\}\), where \(f_{\alpha}\) and \(f_{\beta}\) are themselves drawn from \(EXPR\), applied only a finite number of times. Consider the property \(E\): a function has property \(E\) if and only if an even number (\(0\), \(2\), or \(4\)) of its four output values (one per input pair \((A,B)\)) is \(1\). Note that \(0,1,A\), and \(B\) all have property \(E\), while \(AND(A,B)\), whose truth table contains a single \(1\), does not. Suppose \(f=AND(A,B)\) were a nested \(XOR\) expression. It cannot be \(0,1,A\), or \(B\), so it must be of the form \(XOR(f_{\alpha},f_{\beta})\). Since \(XOR\) produces a \(1\) only where its inputs disagree (an 'unpaired' \(1\)), the number of \(1\)s output by \(XOR(f_{\alpha},f_{\beta})\) has the same parity as the total number of \(1\)s output by \(f_{\alpha}\) and \(f_{\beta}\) together. For \(f\) to lack property \(E\), exactly one of \(f_{\alpha}\) and \(f_{\beta}\) must output an odd number of \(1\)s, i.e., must itself lack property \(E\); assume it is \(f_{\alpha}\). Then \(f_{\alpha}\) cannot be \(0,1,A\), or \(B\) either, so it is another \(XOR\) expression, and the argument repeats. Eventually we run out of subexpressions, a contradiction; thus \(f\) cannot equal \(AND(A,B)\).

  • \(NAND\) is a universal function. In fact, all \(16\) Boolean functions of two variables can be built from it (both this and the \(XOR\) claim are checked by brute force in the sketch after this list). Writing \(b_i\) for the function whose truth table on the inputs \((A,B)=(0,0),(0,1),(1,0),(1,1)\) is the 4-bit binary expansion of \(i\) (the expressions for the constants \(b_{15}\) and \(b_{0}\) are one possible choice):

    b_15 = NAND(A, NAND(A,1))                constant 1
    b_14 = NAND(A,B)                         NAND
    b_13 = NAND(b_14,A)                      A implies B
    b_12 = NAND(A,1)                         NOT A
    b_11 = NAND(b_12,B)                      B implies A
    b_10 = NAND(B,1)                         NOT B
    b_9  = NAND(NAND(A,B),NAND(b_12,b_10))   XNOR
    b_8  = NAND(NAND(b_10,b_12),1)           NOR
    b_7  = NAND(b_8,1)                       OR
    b_6  = NAND(b_9,1)                       XOR
    b_5  = NAND(b_10,1)                      B
    b_4  = NAND(b_11,1)                      (NOT A) AND B
    b_3  = NAND(b_12,1)                      A
    b_2  = NAND(b_13,1)                      A AND (NOT B)
    b_1  = NAND(b_14,1)                      A AND B
    b_0  = NAND(b_15,1)                      constant 0
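Both claims above can be verified by brute force: start from a set of truth tables and repeatedly apply a gate until nothing new appears. A minimal Python sketch (function and variable names are my own):

```python
def closure(start, gate):
    """All two-input truth tables (4-bit tuples over inputs (A,B) =
    (0,0),(0,1),(1,0),(1,1)) reachable from `start` by nesting `gate`."""
    reached = set(start)
    while True:
        new = {tuple(gate(x, y) for x, y in zip(f, g))
               for f in reached for g in reached} - reached
        if not new:
            return reached
        reached |= new

ZERO, ONE = (0, 0, 0, 0), (1, 1, 1, 1)
A, B = (0, 0, 1, 1), (0, 1, 0, 1)

print(len(closure({ZERO, ONE, A, B}, lambda x, y: x ^ y)))  # 8: only the functions with property E
print(len(closure({A, B}, lambda x, y: 1 - (x & y))))       # 16: NAND reaches everything
```

The \(XOR\) closure contains exactly the \(8\) functions with property \(E\), while \(NAND\) alone generates all \(16\).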

Compression

There are two types of compression:

- Lossless or reversible compression
- Lossy or irreversible compression

Variable-Length Encoding

We put off the discussion of this technique until Chapter 5.

Run Length Encoding

Ex: "a B B B B B a a a B B a a a a" could be encoded as "a 1 B 5 a 3 B 2 a 4".

Static Dictionary

Semi-adaptive Dictionary

Dynamic Dictionary

LZW Compression Technique

The LZW Patent

Irreversible Techniques

Detail: LZW Compression


Learning and Implementing LZW Coding (in Chinese): https://blog.csdn.net/krossford/article/details/49157531?depth_1-utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromBaidu-3&utm_source=qq&utm_medium=social&utm_oi=808074114588893184


LZW builds its dictionary dynamically, so it can encode, transmit, and decode in real time (rather than first scanning the whole input and compressing based on, say, character frequencies). In general the transmission lags the input by one character; equivalently, the encoder must look one character beyond what it has transmitted in order to decide the next dictionary entry. The usual cycle is: input \(\Rightarrow\) (encoder) add to dictionary \(\Rightarrow\) transmit \(\Rightarrow\) (decoder) add to dictionary \(\Rightarrow\) output. In the ASCII setting, the input uses 8-bit characters (0-255) while the transmitted codewords are 9 bits (0-511).

One more special point about LZW: the first time the input contains the same character three times in a row (say rrr), the encoder transmits a code that is not yet in the decoder's dictionary. After the second r is input, the first r is transmitted; inputting the third r transmits nothing (because rr is already in the encoder's dictionary); only when the character after the third r arrives (say i) is the code for rr transmitted. At that moment this code does not yet exist in the decoder's dictionary, so the decoder must treat this case specially. The book's explanation, which I found hard to follow:

> Normally, on receipt of a transmitted codeword, the decoder can look up its string in the dictionary and then output it and use its first character to complete the partially formed last dictionary entry, and then start the next dictionary entry. Thus only one dictionary lookup is needed. However, the algorithm presented above uses two lookups, one for the first character, and a later one for the entire string. Why not use just one lookup for greater efficiency?
>
> There is a special case illustrated by the transmission of code 271 in this example, where the string corresponding to the received code is not complete. The first character can be found but then before the entire string is retrieved, the entry must be completed. This happens when a character or a string appears for the first time three times in a row, and is therefore rare. The algorithm above works correctly, at a cost of an extra lookup that is seldom needed and may slow the algorithm down.
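A compact Python sketch of LZW (8-bit alphabet, unbounded dictionary; function names are my own) that handles the special case just described:

```python
def lzw_encode(text):
    # Dictionary initially holds every single 8-bit character.
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    w, out = "", []
    for c in text:
        if w + c in dictionary:
            w += c                            # keep extending the current match
        else:
            out.append(dictionary[w])         # transmit the longest known string
            dictionary[w + c] = next_code     # encoder-side dictionary entry
            next_code += 1
            w = c
    if w:
        out.append(dictionary[w])
    return out

def lzw_decode(codes):
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    w = dictionary[codes[0]]
    out = [w]
    for k in codes[1:]:
        if k in dictionary:
            entry = dictionary[k]
        elif k == next_code:
            # The special case above (e.g. "rrr"): the received code is the
            # very entry the decoder is still completing, so its string must
            # be w plus a repeat of w's first character.
            entry = w + w[0]
        else:
            raise ValueError("corrupt LZW stream")
        out.append(entry)
        dictionary[next_code] = w + entry[0]  # decoder-side dictionary entry
        next_code += 1
        w = entry
    return "".join(out)

print(lzw_decode(lzw_encode("rrri")))  # 'rrri' -- exercises the special case
```

Running it on "rrri" produces the codes [114, 256, 105]; code 256 arrives at the decoder before that dictionary entry is complete, triggering the special branch.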

Detail: 2-D Discrete Cosine Transformation

Discrete Linear Transformation

Discrete Cosine Transformation

For this part, see Timothy Sauer's Numerical Analysis, pp. 409-449, but beware: that book's treatment of DFT interpolation seems to have serious problems.
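For concreteness, here is a naive 2-D DCT-II written straight from the definition (a sketch; real codecs use fast factored forms, and `dct2` is my own name):

```python
import numpy as np

def dct2(X):
    """Naive orthonormal 2-D DCT-II of an N x N block."""
    N = X.shape[0]
    n = np.arange(N)
    # C[k, m] = sqrt(2/N) * cos(pi * (2m + 1) * k / (2N)), row 0 scaled by 1/sqrt(2)
    C = np.sqrt(2.0 / N) * np.cos(np.pi * np.outer(n, 2 * n + 1) / (2 * N))
    C[0] /= np.sqrt(2.0)
    return C @ X @ C.T      # transform the rows, then the columns

block = np.arange(16.0).reshape(4, 4)
print(np.round(dct2(block), 3))  # energy concentrates in the low-frequency corner
```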

Errors

Extension of System Model

Extend the model for information handling to include "channel coding." The channel encoder adds bits to the message so that if it gets corrupted in some way, the channel decoder can detect this and possibly even repair the damage.

How do Errors Happen?

We will usually assume that different bits get corrupted independently, but in some cases errors in adjacent bits are not independent of each other and instead have a common underlying cause (i.e., the errors happen in bursts).

Detection vs. Correction

There are two approaches:

- Have the channel decoder detect the error and then let the person or system that uses the output know that an error has occurred.
- Have the channel decoder attempt to repair the message by correcting the error.

In both cases, extra bits are added to the messages to make them longer, so the message contains redundancy. The channel, by allowing errors to occur, actually introduces information. The decoder is irreversible in that it discards some of that information, but a well designed decoder keeps the original information.

Hamming Distance

The Hamming distance between two bit strings of the same length is the number of positions at which they differ.
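In Python this is a one-liner (a sketch; `hamming_distance` is my own name):

```python
def hamming_distance(x: str, y: str) -> int:
    """Count the positions at which two equal-length bit strings differ."""
    return sum(a != b for a, b in zip(x, y))

print(hamming_distance("1101", "1001"))  # 1
```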

Single Bits

Multiple Bits

Parity

Detect: change the 8-bit string into 9 bits by appending a parity bit. The added bit is 1 if the number of bits equal to 1 is odd, and 0 otherwise, so the 9-bit string always contains an even number of 1s. Parity is most often used when the likelihood of an error is very small, there is no reason to suppose that errors of adjacent bits occur together, and the receiver is able to request a retransmission of the data.
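A sketch of the scheme just described (helper names are my own):

```python
def add_parity(bits8):
    """Append the parity bit: 1 if the count of 1s is odd, so the
    9-bit result always has an even number of 1s."""
    return bits8 + [sum(bits8) % 2]

def parity_ok(bits9):
    """True when no single-bit error is detected."""
    return sum(bits9) % 2 == 0

word = add_parity([0, 1, 1, 0, 1, 0, 0, 0])
print(parity_ok(word))   # True
word[3] ^= 1             # a single-bit error in transit...
print(parity_ok(word))   # False -- detected (but not locatable)
```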

Rectangular Codes

Rectangular codes can provide single error correction and double error detection simultaneously. Refer to p48 of the notes.
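A sketch of the rectangular construction (even parity assumed; `rect_encode` is my own name): arrange the data bits in a grid and append a parity bit to each row and each column.

```python
def rect_encode(rows):
    """Append a parity bit to each row, then a parity row of the columns
    (the bottom-right corner bit covers everything)."""
    coded = [r + [sum(r) % 2] for r in rows]              # row parities
    coded.append([sum(col) % 2 for col in zip(*coded)])   # column parities + corner
    return coded

for r in rect_encode([[1, 0, 1],
                      [0, 1, 1]]):
    print(r)
```

A single flipped bit shows up as one failing row parity and one failing column parity, which together locate (and hence correct) it; two flipped bits produce an inconsistent pattern that can be detected but not always corrected.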

Hamming Code


Hamming Code: Principle and Implementation (in Chinese): https://www.cnblogs.com/Philip-Tell-Truth/p/6669854.html


Block Codes

If the block length is \(n\) and the number of data bits in the block is \(k\), then the number of parity bits is \(n-k\), and it is customary to call such a code an \((n,k)\) block code. Thus the Hamming code just described is \((7,4)\).

It is also customary to include in the parentheses the minimum Hamming distance \(d\) between any two valid codewords, in the form \((n,k,d)\). We have the following block codes:

| Parity bits | Block size | Payload | Code rate | Block code type |
| --- | --- | --- | --- | --- |
| 2 | 3 | 1 | 0.33 | (3,1,3) |
| 3 | 7 | 4 | 0.57 | (7,4,3) |
| 4 | 15 | 11 | 0.73 | (15,11,3) |
| 5 | 31 | 26 | 0.84 | (31,26,3) |
| 6 | 63 | 57 | 0.90 | (63,57,3) |
| 7 | 127 | 120 | 0.94 | (127,120,3) |
| 8 | 255 | 247 | 0.97 | (255,247,3) |
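A (7,4) Hamming code sketch in Python (bit positions follow the classic 1-indexed layout, parity bits at positions 1, 2, 4; function names are my own):

```python
def hamming74_encode(d1, d2, d3, d4):
    """Encode 4 data bits into the 7-bit codeword (p1, p2, d1, p3, d2, d3, d4)."""
    p1 = d1 ^ d2 ^ d4    # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4    # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4    # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(c):
    """Correct up to one flipped bit, then return the 4 data bits."""
    s = (c[0] ^ c[2] ^ c[4] ^ c[6]) \
        + 2 * (c[1] ^ c[2] ^ c[5] ^ c[6]) \
        + 4 * (c[3] ^ c[4] ^ c[5] ^ c[6])  # syndrome = error position (0 = none)
    if s:
        c[s - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]

word = hamming74_encode(1, 0, 1, 1)
word[4] ^= 1                       # corrupt one bit in transit
print(hamming74_decode(word))      # [1, 0, 1, 1] -- the error is repaired
```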

Advanced Codes

Bose-Chaudhuri-Hocquenghem (BCH) codes.

Reed-Solomon codes were developed by Irving S. Reed and Gustave Solomon of MIT Lincoln Laboratory. The \((256,224,5)\) and \((224,192,5)\) Reed-Solomon codes are used in CD players.

Detail: Check Digits

Credit Cards

ISBN

ISSN

Probability

Events

The outcome is the symbol selected, whether or not it is known to us. An event is a subset of the possible outcomes of an experiment.

When a selection is made, several events happen. One is the outcome itself; this is called a fundamental event.

The special event in which any symbol at all is selected is called the universal event. The special "event" in which no symbol is selected is called the null event. The null event cannot happen, because an outcome is only defined after a selection is made.

A set of events which do not overlap is said to be mutually exclusive. A set of events, one of which is sure to happen, is known as exhaustive. A set of events that are both mutually exclusive and exhaustive is known as a partition. The partition that consists of all the fundamental events will be called the fundamental partition.

A partition consisting of a small number of events, some of which may correspond to many symbols, is known as a coarse-grained partition, whereas a partition with many events is a fine-grained partition.

Known Outcomes

Unknown Outcomes

Joint Events and Conditional Probabilities

\[ p(A,B)=p(B)p(A|B)=p(A)p(B|A) \tag{5.5} \]

Rearranged as \(p(A|B)=p(A)\,p(B|A)/p(B)\), this formula is known as Bayes' Theorem.

Averages

Information

\[ I=\sum_{i} p(A_i)\log _{2}\left( \frac{1}{p(A_i)}\right) \tag{5.14} \]

This quantity is called the entropy of a source.

Properties of Information

If there are two events in the partition with probabilities \(p\) and \((1-p)\), the information per symbol is \[ I=p\log _{2}\left(\frac{1}{p}\right)+(1-p)\log _{2}\left(\frac{1}{1-p}\right) \tag{5.16} \] which attains its largest value (\(1\) bit) at \(p=0.5\).

For partitions with more than two possible events the information per symbol can be higher. If there are \(n\) possible events the information per symbol lies between \(0\) and \(\log _{2}(n)\) bits, the maximum value being achieved when all probabilities are equal.
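These properties are easy to check numerically (a sketch; `entropy` is my own name):

```python
import math

def entropy(probs):
    """Information per symbol, in bits, of a partition with the given probabilities."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))   # 1.0 bit: the maximum for two events
print(entropy([0.9, 0.1]))   # about 0.469 bits
print(entropy([0.25] * 4))   # 2.0 bits = log2(4): all four events equally likely
```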

Efficient Source Coding

If a source has \(n\) possible symbols then a fixed-length code for it would require \(\lceil \log _{2}(n)\rceil\) bits per symbol.

There is a general procedure for constructing codes of this sort that are very efficient (in fact, they require an average of less than \(I+1\) bits per symbol, even if \(I\) is considerably below \(\log _{2}(n)\)). We'll introduce Huffman codes below.

Detail: Efficient Source Code

Huffman codes: see pp. 66-68 of the notes.
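A minimal Huffman code builder in Python (a sketch; `huffman_code` and its internals are my own names): repeatedly merge the two least probable trees, then read codewords off the final tree.

```python
import heapq

def huffman_code(freqs):
    """Build a Huffman code from {symbol: probability}."""
    # Heap entries are (weight, tiebreak, tree); a tree is a symbol or a pair.
    heap = [(w, i, sym) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    i = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)   # two least probable trees...
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, i, (t1, t2)))  # ...merged into one
        i += 1
    code = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            code[tree] = prefix or "0"
    walk(heap[0][2], "")
    return code

print(huffman_code({"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}))
# e.g. {'A': '0', 'B': '10', 'C': '110', 'D': '111'}
# The codeword lengths match log2(1/p), so the average length equals I here.
```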

