Information and Entropy (1)
Bits
The Boolean Bit
The Circuit Bit
The Control Bit
The Physical Bit
The Quantum Bit
There are three features of quantum mechanics that make qubits, or collections of qubits, different from Boolean bits: reversibility, superposition, and entanglement.
The Classical Bit
Codes
Symbol Space Size
Use of Spare Capacity
There are many strategies to deal with unused code patterns. Here are some:
- Ignore
- Map to other values
- Reserve for future expansion
- Use for control codes
- Use for common abbreviations
Binary Coded Decimal (BCD)
Genetic Code
Telephone Area Codes
IP Addresses
ASCII
Extension of Codes
Fixed-Length and Variable-Length Codes
Morse Code
Detail: ASCII
Detail: Integer Codes
Binary Code
Binary Gray Code
2's Complement
Sign/Magnitude
1's Complement
Detail: The Genetic Code
Detail: IP Addresses
Detail: Morse Code
Problem 2: Universality
A Boolean function \(F(A,B)\) is said to be universal if any arbitrary Boolean function can be constructed by using nested \(F(A,B)\) functions.
\(OR\) and \(AND\) are not universal functions: both are monotone increasing, compositions of monotone functions are monotone, and so a non-monotone function such as \(NOT\) can never be constructed from them.
\(XOR\) is not universal. Define a nested \(XOR\) expression to be an expression \(f\) drawn from the set \(EXPR=\{0,1,A,B,XOR(f_{\alpha},f_{\beta})\}\), where \(f_{\alpha}\) and \(f_{\beta}\) are themselves drawn from \(EXPR\), with only finitely many nestings. Consider the property \(E\): a function \(f_{\gamma}\) has property \(E\) if and only if an even number (\(0\), \(2\), or \(4\)) of its four output values, over the inputs \((A,B)\in\{(0,0),(0,1),(1,0),(1,1)\}\), equals \(1\). Each of \(0\), \(1\), \(A\), and \(B\) has property \(E\) (with \(0\), \(4\), \(2\), and \(2\) ones, respectively). Now suppose some \(f\) equals \(AND(A,B)\), which has exactly one output equal to \(1\) and therefore lacks property \(E\). Then \(f\) cannot be \(0\), \(1\), \(A\), or \(B\), so it must have the form \(XOR(f_{\alpha},f_{\beta})\). Since \(XOR\) produces a \(1\) only from an 'unpaired' \(1\), the number of ones in \(XOR(f_{\alpha},f_{\beta})\) has the same parity as the total number of ones in \(f_{\alpha}\) and \(f_{\beta}\); hence exactly one of \(f_{\alpha}\) and \(f_{\beta}\) (say \(f_{\alpha}\)) also lacks property \(E\). By the same reasoning, \(f_{\alpha}\) must itself be an \(XOR\) expression, and so on without end, contradicting the finiteness of the expression. Thus \(f\) cannot equal \(AND(A,B)\).
\(NAND\) is a universal function. In fact, all sixteen Boolean functions of two variables can be built from \(NAND\) alone (together with the constant \(1\)). Writing \(b_i\) for the function whose output column \((f(0,0),f(0,1),f(1,0),f(1,1))\), read as a binary number with \(f(0,0)\) as the most significant bit, equals \(i\):
b_15=NAND(A,NAND(A,1))
b_14=NAND(A,B)
b_13=NAND(b_14,A)
b_12=NAND(A,1)
b_11=NAND(b_12,B)
b_10=NAND(B,1)
b_9=NAND(NAND(A,B),NAND(b_12,b_10))
b_8=NAND(NAND(b_10,b_12),1)
b_7=NAND(b_8,1)
b_6=NAND(b_9,1)
b_5=NAND(b_10,1)
b_4=NAND(b_11,1)
b_3=NAND(b_12,1)
b_2=NAND(b_13,1)
b_1=NAND(b_14,1)
b_0=NAND(b_15,1)
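These constructions can be checked mechanically. A minimal sketch in Python (the helper names and the truth-table indexing convention spelled out above are mine):

```python
def NAND(x, y): return 1 - (x & y)

# Mirror the list above; NAND(x, 1) plays the role of NOT x.
b15 = lambda A, B: NAND(A, NAND(A, 1))
b14 = lambda A, B: NAND(A, B)
b13 = lambda A, B: NAND(b14(A, B), A)
b12 = lambda A, B: NAND(A, 1)
b11 = lambda A, B: NAND(b12(A, B), B)
b10 = lambda A, B: NAND(B, 1)
b9  = lambda A, B: NAND(NAND(A, B), NAND(b12(A, B), b10(A, B)))
b8  = lambda A, B: NAND(NAND(b10(A, B), b12(A, B)), 1)
b7  = lambda A, B: NAND(b8(A, B), 1)
b6  = lambda A, B: NAND(b9(A, B), 1)
b5  = lambda A, B: NAND(b10(A, B), 1)
b4  = lambda A, B: NAND(b11(A, B), 1)
b3  = lambda A, B: NAND(b12(A, B), 1)
b2  = lambda A, B: NAND(b13(A, B), 1)
b1  = lambda A, B: NAND(b14(A, B), 1)
b0  = lambda A, B: NAND(b15(A, B), 1)

for i, f in enumerate([b0, b1, b2, b3, b4, b5, b6, b7,
                       b8, b9, b10, b11, b12, b13, b14, b15]):
    # output column (f(0,0), f(0,1), f(1,0), f(1,1)),
    # read with f(0,0) as the most significant bit
    col = [f(a, b) for a in (0, 1) for b in (0, 1)]
    assert i == col[0] * 8 + col[1] * 4 + col[2] * 2 + col[3]
```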
Compression
There are two types of compression:
- Lossless or reversible compression
- Lossy or irreversible compression
Variable-Length Encoding
We put off the discussion of this technique until Chapter 5.
Run Length Encoding
Ex: "a B B B B B a a a B B a a a a" could be encoded as "a 1 B 5 a 3 B 2 a 4".
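A minimal run-length encoder along these lines (function names are mine; the token format mirrors the example above):

```python
from itertools import groupby

def rle_encode(symbols):
    """Replace each run of identical symbols with (symbol, run length)."""
    return [(s, len(list(run))) for s, run in groupby(symbols)]

def rle_decode(pairs):
    return [s for s, n in pairs for _ in range(n)]

seq = "a B B B B B a a a B B a a a a".split()
print(rle_encode(seq))  # [('a', 1), ('B', 5), ('a', 3), ('B', 2), ('a', 4)]
assert rle_decode(rle_encode(seq)) == seq
```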
Static Dictionary
Semi-adaptive Dictionary
Dynamic Dictionary
LZW compression technique
The LZW Patent
Irreversible Techniques
Detail: LZW Compression
Learning and implementing LZW coding (in Chinese): https://blog.csdn.net/krossford/article/details/49157531?depth_1-utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromBaidu-3&utm_source=qq&utm_medium=social&utm_oi=808074114588893184
LZW builds its dictionary dynamically, so encoding, transmission, and decoding can all proceed in real time (rather than first scanning the whole input and compressing based on symbol frequencies). In general, transmission lags the input by one character, which means the encoder must see one character beyond what it transmits in order to determine the next dictionary entry. The usual cycle is: input \(\Rightarrow\) (encoder) add entry to dictionary \(\Rightarrow\) transmit \(\Rightarrow\) (decoder) add entry to dictionary \(\Rightarrow\) output. In the ASCII setting, the input uses 8-bit characters (0-255) while the transmitted codes use 9 bits (0-511).
One more special point about LZW: the first time the input contains the same character three times in a row (say rrr), the encoder transmits a code that is not yet in the decoder's dictionary. The first r is transmitted only after the second r arrives; the third r triggers no transmission (since rr is already in the encoder's dictionary); only the character after the third r (say i) causes the code for rr to be transmitted, and at that moment this code does not yet exist in the decoder's dictionary, so the decoder must handle the situation as a special case (see the sketch below). I found the book's explanation hard to follow:
> Normally, on receipt of a transmitted codeword, the decoder can look up its string in the dictionary and then output it and use its first character to complete the partially formed last dictionary entry, and then start the next dictionary entry. Thus only one dictionary lookup is needed. However, the algorithm presented above uses two lookups, one for the first character, and a later one for the entire string. Why not use just one lookup for greater efficiency?
>
> There is a special case illustrated by the transmission of code 271 in this example, where the string corresponding to the received code is not complete. The first character can be found but then before the entire string is retrieved, the entry must be completed. This happens when a character or a string appears for the first time three times in a row, and is therefore rare. The algorithm above works correctly, at a cost of an extra lookup that is seldom needed and may slow the algorithm down.
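For concreteness, here is a compact sketch of the textbook LZW algorithm in Python (function names are mine; codes above 255 play the role of the 9-bit transmission codes). The `else` branch in the decoder is exactly the special case discussed above:

```python
def lzw_encode(text):
    """LZW encoder: 8-bit characters in, integer codes out (new codes from 256)."""
    table = {chr(i): i for i in range(256)}
    w, out = "", []
    for c in text:
        if w + c in table:
            w += c                      # extend the current match
        else:
            out.append(table[w])        # transmit the longest known string
            table[w + c] = len(table)   # encoder adds a dictionary entry
            w = c
    if w:
        out.append(table[w])
    return out

def lzw_decode(codes):
    table = {i: chr(i) for i in range(256)}
    w = table[codes[0]]
    out = [w]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        else:
            # Special case: code not yet in the decoder's dictionary
            # (first occurrence of e.g. "rrr"); the missing entry must
            # be w plus the first character of w.
            entry = w + w[0]
        out.append(entry)
        table[len(table)] = w + entry[0]  # decoder adds its entry one step late
        w = entry
    return "".join(out)

s = "irrrirrri"
assert lzw_decode(lzw_encode(s)) == s
```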
Detail: 2-D Discrete Cosine Transformation
Discrete Linear Transformation
Discrete Cosine Transformation
For this part, see Timothy Sauer, Numerical Analysis, pp. 409-449; note, however, that the book's treatment of DFT interpolation seems to have serious problems.
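As a concrete anchor, the 2-D DCT factors into a 1-D DCT along each axis. A minimal numpy sketch, assuming the orthonormal DCT-II convention (all names here are mine):

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix: C[k, m] = c_k * cos(pi*(2m+1)*k / (2n))."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    m = np.arange(n).reshape(1, -1)   # sample index (columns)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)        # DC row gets the smaller scale factor
    return C

def dct2(x):
    """2-D DCT as a 1-D DCT applied along each axis of the block."""
    M, N = x.shape
    return dct_matrix(M) @ x @ dct_matrix(N).T

block = np.arange(64, dtype=float).reshape(8, 8)
coeffs = dct2(block)   # energy concentrates in the low-frequency corner
```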
Errors
Extension of System Model
Extend the model for information handling to include "channel coding." The channel encoder adds bits to the message so that if the message gets corrupted in some way, the channel decoder will know that, and possibly even be able to repair the damage.
How do Errors Happen?
We will usually assume that different bits get corrupted independently, but in some cases errors in adjacent bits are not independent of each other and instead have a common underlying cause (i.e., the errors happen in bursts).
Detection vs. Correction
- Have the channel decoder detect the error and then let the person or system that uses the output know that an error has occurred.
- Have the channel decoder attempt to repair the message by correcting the error.
In both cases, extra bits are added to the message, making it longer and introducing redundancy. The channel, by allowing errors to occur, actually introduces information. The decoder is irreversible in that it discards some of that information, but if well designed it preserves the original message.
Hamming Distance
The Hamming distance between two bit strings of the same length is the number of positions in which they differ.
Single Bits
Multiple Bits
Parity
Detect: Change the 8-bit string into 9 bits by appending a parity bit, which is 1 if the number of bits equal to 1 in the original string is odd, and 0 otherwise (so every valid 9-bit string has an even number of 1s). Parity is most often used when the likelihood of an error is very small, there is no reason to suppose that errors of adjacent bits occur together, and the receiver is able to request a retransmission of the data.
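A minimal sketch of even parity in Python (names are mine):

```python
def add_parity(bits):
    """Append an even-parity bit: the 9-bit result has an even number of 1s."""
    return bits + [sum(bits) % 2]

def check_parity(bits):
    """True if no error (or an even number of flipped bits) is present."""
    return sum(bits) % 2 == 0

word = [1, 0, 1, 1, 0, 0, 1, 0]
sent = add_parity(word)          # parity bit is 0 here (four 1s, already even)
sent[3] ^= 1                     # single-bit error in transit
assert not check_parity(sent)    # detected, but cannot be located
```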
Rectangular Codes
Rectangular codes can provide single error correction and double error detection simultaneously. Refer to p48.
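As an illustration of the idea (the layout and names below are mine, not necessarily the exact construction on p48): arrange the data bits in a rectangle and append a parity bit to every row and every column. A single flipped bit then fails exactly one row check and one column check, which locates it.

```python
import numpy as np

def encode_rect(data, rows, cols):
    """Arrange data bits in a rows x cols block and append even-parity
    bits for every row and every column."""
    D = np.array(data).reshape(rows, cols)
    D = np.hstack([D, D.sum(axis=1, keepdims=True) % 2])   # row parities
    D = np.vstack([D, D.sum(axis=0, keepdims=True) % 2])   # column parities
    return D

def correct_single_error(block):
    """Locate and fix one flipped bit via the failing row and column checks."""
    bad_rows = np.flatnonzero(block.sum(axis=1) % 2)
    bad_cols = np.flatnonzero(block.sum(axis=0) % 2)
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        block[bad_rows[0], bad_cols[0]] ^= 1
    return block

block = encode_rect([1, 0, 1, 1, 0, 0], 2, 3)
block[1, 2] ^= 1                        # single-bit error
fixed = correct_single_error(block.copy())
assert (fixed == encode_rect([1, 0, 1, 1, 0, 0], 2, 3)).all()
```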
Hamming Code
Hamming code principles and implementation (in Chinese): https://www.cnblogs.com/Philip-Tell-Truth/p/6669854.html
Block Codes
If the block length is \(n\) and the number of data bits in the block is \(k\), then the number of parity bits is \(n-k\), and it is customary to call such a code an \((n,k)\) block code. Thus the Hamming code just described is \((7,4)\).
It is also customary to include in the parentheses the minimum Hamming distance \(d\) between any two valid codewords, in the form \((n,k,d)\). We have the following block codes:
Parity bits | Block size | Payload | Code rate | Block code type |
---|---|---|---|---|
2 | 3 | 1 | 0.33 | (3,1,3) |
3 | 7 | 4 | 0.57 | (7,4,3) |
4 | 15 | 11 | 0.73 | (15,11,3) |
5 | 31 | 26 | 0.84 | (31,26,3) |
6 | 63 | 57 | 0.90 | (63,57,3) |
7 | 127 | 120 | 0.94 | (127,120,3) |
8 | 255 | 247 | 0.97 | (255,247,3) |
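As a concrete instance of the \((7,4,3)\) row of the table, here is a sketch of Hamming encoding and single-error correction (the bit layout follows the common convention of parity bits at positions 1, 2, and 4; names are mine):

```python
def hamming74_encode(d):
    """d = [d1, d2, d3, d4] -> 7-bit codeword with parity bits at
    positions 1, 2, 4 (1-indexed); each parity bit covers the
    positions whose index has the corresponding binary bit set."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4          # covers positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers positions 3, 6, 7
    p3 = d2 ^ d3 ^ d4          # covers positions 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_correct(c):
    """Recompute the three checks; reading them as a binary number gives
    the 1-indexed position of a single error (0 means no error)."""
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s3 * 4 + s2 * 2 + s1
    if syndrome:
        c[syndrome - 1] ^= 1
    return c

code = hamming74_encode([1, 0, 1, 1])
code[5] ^= 1                              # flip one bit in transit
assert hamming74_correct(code) == hamming74_encode([1, 0, 1, 1])
```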
Advanced Codes
Bose-Chaudhuri-Hocquenghem (BCH) codes.
Reed-Solomon codes, invented by Irving S. Reed and Gustave Solomon of MIT Lincoln Laboratory. The \((256,224,5)\) and \((224,192,5)\) Reed-Solomon codes are used in CD players.
Detail: Check Digits
Credit Cards
ISBN
ISSN
Probability
Events
An outcome is the symbol selected, whether or not it is known to us. An event is a subset of the possible outcomes of an experiment.
When a selection is made, several events happen. One is the outcome itself; this is called a fundamental event.
The special event in which any symbol at all is selected is called the universal event. The special "event" in which no symbol is selected is called the null event. The null event cannot happen, because an outcome is defined only after a selection is made.
A set of events which do not overlap is said to be mutually exclusive. A set of events, one of which is sure to happen, is known as exhaustive. A set of events that are both mutually exclusive and exhaustive is known as a partition. The partition that consists of all the fundamental events will be called the fundamental partition.
A partition consisting of a small number of events, some of which may correspond to many symbols, is known as a coarse-grained partition whereas a partition with many events is a fine-grained partition.
Known Outcomes
Unknown Outcomes
Joint Events and Conditional Probabilities
\[ p(A,B)=p(B)p(A|B)=p(A)p(B|A) \tag{5.5} \]
This formula is known as Bayes' Theorem.
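Dividing through by \(p(A)\) gives the inversion form \[ p(B|A)=\frac{p(B)\,p(A|B)}{p(A)} \] As a made-up numeric check: if \(p(B)=0.2\), \(p(A|B)=0.5\), and \(p(A)=0.25\), then \(p(A,B)=0.2\times 0.5=0.1\) and \(p(B|A)=0.1/0.25=0.4\).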
Averages
Information
\[ I=\sum_{i}^{} p(A_i)\log _{2}\left( \frac{1}{p(A_i)}\right) \tag{5.14} \]
This quantity is called the entropy of a source.
Properties of Information
If there are two events in the partition with probabilities \(p\) and \((1-p)\), the information per symbol is \[ I=p\log _{2}\left(\frac{1}{p}\right)+(1-p)\log _{2}\left(\frac{1}{1-p}\right) \tag{5.16} \] which attains its largest value (1 bit) at \(p=0.5\).
For partitions with more than two possible events the information per symbol can be higher. If there are \(n\) possible events the information per symbol lies between \(0\) and \(\log _{2}(n)\) bits, the maximum value being achieved when all probabilities are equal.
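Both properties are easy to check numerically; a minimal sketch (names are mine):

```python
import math

def entropy(probs):
    """Information per symbol (bits) for a partition with the given
    probabilities; terms with p = 0 contribute nothing."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))   # 1.0    -- the p = 0.5 maximum of (5.16)
print(entropy([0.9, 0.1]))   # ~0.469 -- less than 1 bit
print(entropy([0.25] * 4))   # 2.0    == log2(4), the n-event maximum
```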
Efficient Source Coding
If a source has \(n\) possible symbols, then a fixed-length code for it would require \(\lceil \log _{2}(n)\rceil\) bits per symbol.
There is a general procedure for constructing codes of this sort which are very efficient (in fact, they require an average of less than \(I+1\) bits per symbol, even if \(I\) is considerably below \(\log _{2}(n)\)). We'll introduce Huffman codes below.
Detail: Efficient Source Code
Huffman code: p66-p68
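A compact sketch of the Huffman construction (the probabilities are made up; names are mine): repeatedly merge the two least likely nodes, prefixing '0' to the codewords on one side and '1' on the other.

```python
import heapq

def huffman_code(probs):
    """Build a Huffman code for {symbol: probability}; returns
    {symbol: bit string}. The integer counter breaks ties so the
    heap never compares the codebook dicts."""
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # two least likely nodes
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

probs = {"A": 0.34, "B": 0.5, "C": 0.08, "D": 0.08}
code = huffman_code(probs)    # codeword lengths: B:1, A:2, C:3, D:3
avg = sum(p * len(code[s]) for s, p in probs.items())
print(avg)                    # 1.66 bits/symbol, within 1 bit of the entropy
```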