HARAMAYA UNIVERSITY
HARAMAYA INSTITUTE OF TECHNOLOGY
SCHOOL OF ELECTRICAL AND COMPUTER ENGINEERING

Course Coordinator: Dr. Mulugeta Atlabachew (Asst. Professor), Guest Lecturer
Introduction to Source Coding
Shannon’s 1st Source Coding Theorem
 Shannon showed that:
“To reliably store the information generated by some random source X, you need, on average, no more and no less than H(X) bits for each outcome.”
 If I toss a die 1,000,000 times and record the value from each trial: 1, 3, 4, 6, 2, 5, 2, 4, 5, 2, 4, 5, 6, 1, …
 In principle, I need 3 bits to store each outcome, since 3 bits can represent up to 8 values. So I need 3,000,000 bits to store the information.
 Using an ASCII representation, a computer needs 8 bits = 1 byte to store each outcome
 The resulting file then has size 8,000,000 bits
 In fact, you only need H(X) = log2 6 ≈ 2.585 bits to store each outcome, since the die has 6 equally likely outcomes.
 So, the file can be compressed to yield size
2.585x1,000,000=2,585,000 bits
 Optimal Compression Ratio is:
2,585,000 / 8,000,000 = 0.3231 = 32.31%
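As a quick sanity check, these numbers can be reproduced with a few lines of Python (H(X) = log2 6 for a fair six-sided die):

import math

n = 1_000_000                     # number of recorded die tosses
H = math.log2(6)                  # entropy of a fair die, about 2.585 bits/outcome

ascii_bits = 8 * n                # 8 bits (1 ASCII byte) per outcome
optimal_bits = H * n              # Shannon's lower bound on total storage

print(f"optimal size = {optimal_bits:,.0f} bits")              # about 2,585,000
print(f"compression ratio = {optimal_bits / ascii_bits:.4f}")  # 0.3231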
Types of Coding
 Source Coding - Code data to more efficiently represent the
information
 Reduces “size” of data
 Analog - Encode analog source data into a binary format
 Digital - Reduce the “size” of digital source data
 Channel Coding - Code data for transmission over a noisy
communication channel
 Increases “size” of data
 Digital - add redundancy to identify and correct errors
 Analog - represent digital values by analog signals
Types of Source Coding
 Lossless coding (entropy coding)
 Data can be decoded to form exactly the same bits
 Used in “zip”
 Can only achieve moderate compression (e.g. 2:1 - 3:1) for natural images
 Can be important in certain applications such as medical imaging
 Lossy source coding
 Decompressed image is visually similar, but has been changed
 Used in “JPEG” and “MPEG”
 Can achieve much greater compression (e.g. 20:1 -40:1) for natural images
 Uses entropy coding
Lossless Coding
 Lossless compression allows the original data to be
perfectly reconstructed from the compressed data.
 By operation of the pigeonhole principle, no lossless
compression algorithm can efficiently compress all possible
data.
 Lossless data compression is used in many applications. For example,
 It is used in the ZIP file format and in the GNU tool gzip.
 It is also used as a component within lossy data compression technologies
(e.g. lossless mid/side joint stereo preprocessing by MP3 encoders and
other lossy audio encoders).
 Typical candidates for lossless compression include executable programs, text documents, and source code.
 Some image file formats, like PNG or GIF, use only lossless compression,
while others like TIFF and MNG may use either lossless or lossy methods.
 Lossless audio formats are most often used for archiving or production
purposes, while smaller lossy audio files are typically used on portable
players and in other cases where storage space is limited or exact
replication of the audio is unnecessary.
 Most lossless compression programs do two things in
sequence:
 the first step generates a statistical model for the input
data, and
 the second step uses this model to map input data to bit
sequences in such a way that "probable" (e.g. frequently
encountered) data will produce shorter output than
"improbable" data.
 The primary encoding algorithms used to produce bit
sequences are Huffman coding (also used by the deflate
algorithm) and arithmetic coding.
 Arithmetic coding achieves compression rates close to the
best possible for a particular statistical model, which is
given by the information entropy, whereas Huffman
compression is simpler and faster but produces poor
results for models that deal with symbol probabilities close
to 1.
Adaptive models
 Adaptive models dynamically update the model as the data is
compressed.
 Both the encoder and decoder begin with a trivial model, yielding poor
compression of initial data, but as they learn more about the data,
performance improves.
 Most popular types of compression used in practice now use adaptive
coders.
Shannon's Source Coding Theorem
 Assume a set of symbols (26 English letters and some additional
symbols such as space, period, etc.) is to be transmitted through the
communication channel.
These symbols can be treated as independent samples of a random
variable X with probability P(X) and entropy H(X) = −Σ P(x) log2 P(x)
 The length of the code for a symbol x with probability P(x) can be its
surprise, log2(1/P(x))
 Let L be the average number of bits to encode the N symbols. Shannon
proved that the minimum L satisfies H(X) ≤ L < H(X) + 1
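A quick numerical check of this bound, using a hypothetical four-symbol distribution and rounding each surprise up to a whole number of bits:

import math

P = {"e": 0.5, "t": 0.25, "a": 0.125, "o": 0.125}       # hypothetical P(X)

H = -sum(p * math.log2(p) for p in P.values())          # entropy H(X)
lengths = {x: math.ceil(math.log2(1 / p)) for x, p in P.items()}   # surprises
L = sum(P[x] * lengths[x] for x in P)                   # average bits/symbol

print(H, L)        # 1.75 1.75 -- the bound H(X) <= L < H(X) + 1 holds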
Huffman Coding
 A Huffman code is a particular type of optimal prefix code that is
commonly used for lossless data compression.
 Optimum prefix code developed by D. Huffman in a class assignment
 The output from Huffman's algorithm can be viewed as a variable-length code table for encoding a source symbol.
 The algorithm derives this table from the estimated probability or
frequency of occurrence (weight) for each possible value of the
source symbol.
 Huffman coding is not always optimal among all compression methods
 It is replaced with arithmetic coding or asymmetric numeral systems when a better compression ratio is required
 Two Requirements for Optimum Prefix Codes
 Symbols that occur more frequently get codewords no longer than those of less frequent symbols
 The two least likely symbols have codewords of the same length that differ only in the last bit
 These two requirements lead to a simple way of building a binary tree
describing an optimum prefix code - THE Huffman Code.
 Build it from bottom up, starting with the two least likely symbols
 The external nodes correspond to the symbols
 The internal nodes correspond to “super symbols” in a “reduced” alphabet
Huffman Code-Design Steps
1. Label each node with one of the source symbol probabilities
2. Merge the nodes labeled by the two smallest probabilities into a parent
node
3. Label the parent node with the sum of the two children’s probabilities
 This parent node is now considered to be a “super symbol” (it replaces its two
children symbols) in a reduced alphabet
4. Among the elements in the reduced alphabet, merge the two with the smallest probabilities.
 If there is more than one such pair, choose the pair that has the “lowest order
super symbol” (this assures the minimum-variance Huffman code)
5. Label the parent node with the sum of the two children probabilities.
6. Repeat steps 4 & 5 until only a single super symbol remains
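A minimal Python sketch of these steps using a heap; equal probabilities are tie-broken by insertion order here, which is one simple stand-in for the “lowest order super symbol” rule:

import heapq
import itertools

def huffman_code(probs):
    # Each heap entry: (probability, tie-breaker, {symbol: partial codeword})
    order = itertools.count()
    heap = [(p, next(order), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, t1 = heapq.heappop(heap)      # the two least likely (super) symbols
        p2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}   # codewords differ in one bit
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (p1 + p2, next(order), merged))  # new "super symbol"
    return heap[0][2]

print(huffman_code({"a": 0.4, "b": 0.2, "c": 0.2, "d": 0.1, "e": 0.1}))
# a, b, c get 2-bit codewords; d and e get 3 bits (average 2.2 bits/symbol)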
Huffman Code-Examples
(figure-only slides: worked Huffman tree constructions)

Minimum Variance-Huffman Code
(figure-only slide)
Extended Huffman Coding
(figure-only slides)

Performance of Extended Huffman Coding
(figure-only slides)

ADAPTIVE HUFFMAN CODING
 Adaptive Huffman coding (also called Dynamic Huffman coding) is an adaptive coding technique based on Huffman coding.
 It permits building the code as the symbols are being transmitted, with no initial knowledge of the source distribution, which allows one-pass encoding and adaptation to changing conditions in the data.
 The benefit of the one-pass procedure is that the source can be encoded in real time, though it becomes more sensitive to transmission errors, since a single loss can ruin the whole code.
 One pass
 During the pass calculate the frequencies
 Update the Huffman tree accordingly
 Coder – new Huffman tree computed after transmitting the
symbol
 Decoder – new Huffman tree computed after receiving the
symbol
 Symbol set and their initial codes must be known ahead of
time.
 Need an NYT (not-yet-transmitted) symbol to indicate that a new leaf is needed in the tree.
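The FGK/Vitter tree-update rules are too long for a slide-sized sketch, but the adapt-as-you-go idea can be illustrated with a toy coder that rebuilds the Huffman table from running counts after every symbol (reusing the huffman_code sketch above); the decoder stays in sync by running the same loop:

def adaptive_encode(data, alphabet):
    counts = {s: 1 for s in alphabet}    # trivial initial model, known to both sides
    bits = []
    for sym in data:
        total = sum(counts.values())
        table = huffman_code({s: c / total for s, c in counts.items()})
        bits.append(table[sym])          # encode with the current table...
        counts[sym] += 1                 # ...then update the model
    return "".join(bits)

print(adaptive_encode("aaabaaacaa", "abc"))

Real adaptive Huffman coders update the tree incrementally instead of rebuilding it, and use the NYT node to add leaves for symbols not yet seen.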
(figure-only slides: step-by-step adaptive Huffman coding example)

Huffman Coding Vs Arithmetic Coding
Huffman Coding
 Replacing an input symbol with a codeword
 Need a probability distribution
 Hard to adapt to changing statistics
 Need to store the codeword table
 Minimum codeword length is 1 bit
Arithmetic Coding
 Replace the entire input with a single floating-point number
 Does not need the probability distribution in advance
 Adaptive coding is very easy
 No need to keep and send codeword table
 Fractional codeword length
Arithmetic Coding
 Recall table look-up decoding of Huffman code
 N: alphabet size
 L: max codeword length
 Divide [0, 2^L] into N intervals
 One interval for one symbol
 Interval size is roughly proportional to symbol probability
 Arithmetic coding applies this idea recursively
 Normalizes the range [0, 2^L] to [0, 1)
 Map an input sequence (multiple symbols) to a unique tag
in [0, 1)
 Disjoint and complete partition of the range [0, 1)
 Each interval corresponds to one symbol
 Interval size is proportional to symbol probability
 The first symbol restricts the tag position to be in one of
the intervals
 The reduced interval is partitioned recursively as more
symbols are processed.
(figure-only slide)

Arithmetic Coding-Example
 Symbol set and prob: a (0.8), b(0.02), c(0.18)
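A minimal floating-point sketch of the recursive interval subdivision for these probabilities; any tag inside the returned interval identifies the whole sequence:

probs = [("a", 0.8), ("b", 0.02), ("c", 0.18)]     # (symbol, probability)

def encode_interval(seq):
    low, high = 0.0, 1.0
    for sym in seq:
        width = high - low
        cum = 0.0
        for s, p in probs:          # locate sym's slice of the current interval
            if s == sym:
                low, high = low + cum * width, low + (cum + p) * width
                break
            cum += p
    return low, high

print(encode_interval("aca"))       # about (0.656, 0.7712)

A practical coder avoids this direct floating-point form, whose precision runs out after a few symbols, by rescaling integer intervals as it goes.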
(figure-only slides: worked encoding example for this symbol set)

Arithmetic Decoding-Example
(figure-only slides, including a simplified floating-point variant)

Binary Arithmetic Decoding
 Arithmetic coding is slow in general:
 To decode a symbol, we need a series of decisions and multiplications.
 The complexity is greatly reduced if we have only two symbols: 0 and 1.
 Only two intervals: [0, x), [x, 1)
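A sketch of that reduced loop, assuming a known tag, a fixed probability x = P(0), and floating-point arithmetic; each symbol costs one multiplication and one comparison:

def decode_binary(tag, n_bits, x=0.6):
    low, high = 0.0, 1.0
    out = []
    for _ in range(n_bits):
        split = low + x * (high - low)   # boundary between the 0- and 1-intervals
        if tag < split:                  # tag in [low, split): the next bit is 0
            out.append(0)
            high = split
        else:                            # tag in [split, high): the next bit is 1
            out.append(1)
            low = split
    return out

print(decode_binary(0.47, 4))            # [0, 1, 0, 1]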
Reading Assignment
 Investigate the latest developments in source coding technologies.