Source coding

SOURCE CODING
By: MOHIT KUMAR
M.Tech 1st year

 CONTENT
 Arithmetic coding
 Lempel-Ziv coding
 Run length encoding
 Rate distortion function
 Entropy rate of a stochastic process
 JPEG standard
1. Lossless compression
2. Lossy compression

 Arithmetic coding
 It takes a stream of input symbols and replaces it with a
single floating-point number in [0,1)
 The longer and more complex the message, the more
bits are needed to represent the output number
 The output of an arithmetic coder is, as usual, a
stream of bits
 However, we can think of the stream as having the prefix "0.",
so that it represents a fractional binary number between
0 and 1
01101010 → 0.01101010
 To explain the algorithm, numbers will be
shown in decimal, but in practice they are always binary.

 Example
 String bccb from the alphabet {a,b,c}
 The zero-frequency problem is solved by initializing all
character counters to 1
 When the first b is to be coded, all symbols have a
probability of 1/3 (about 33%).
 The arithmetic coder maintains two numbers, low
and high, which represent a subinterval [low,high)
of the range [0,1)
 Initially low=0 and high=1

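 Example: narrowing the interval while coding bccb
Step  Symbol  P(a)  P(b)  P(c)  Split points     New [low, high)
1     b       1/3   1/3   1/3   0.3333, 0.6667   [0.3333, 0.6667)
2     c       1/4   2/4   1/4   0.4167, 0.5834   [0.5834, 0.6667)
3     c       1/5   2/5   2/5   0.6001, 0.6334   [0.6334, 0.6667)
4     b       1/6   2/6   3/6   0.6390, 0.6501   [0.6390, 0.6501)
 Final interval: [0.6390, 0.6501); any number inside it, such as 0.64, encodes the message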

 range = previous high - previous low
 low = previous low + range * lower bound of the new symbol (cumulative probability)
 high = previous low + range * upper bound of the new symbol (cumulative probability)
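The update rules above can be traced with a short program. This is a minimal sketch, not a production coder: it assumes the adaptive model from the example (all counters start at 1 and the coded symbol's counter is incremented after each step), uses exact fractions instead of finite-precision binary arithmetic, and the function name arithmetic_interval is chosen here for illustration.

```python
from fractions import Fraction


def arithmetic_interval(message, alphabet):
    """Return the final [low, high) interval for message under an adaptive model."""
    counts = {symbol: 1 for symbol in alphabet}  # counters start at 1 (zero-frequency fix)
    low, high = Fraction(0), Fraction(1)
    for symbol in message:
        total = sum(counts.values())
        rng = high - low
        # Cumulative count of all symbols that come before `symbol` in the alphabet.
        cumulative = 0
        for other in alphabet:
            if other == symbol:
                break
            cumulative += counts[other]
        # Both bounds are computed from the previous low and the previous range.
        new_low = low + rng * Fraction(cumulative, total)
        new_high = low + rng * Fraction(cumulative + counts[symbol], total)
        low, high = new_low, new_high
        counts[symbol] += 1  # adapt the model after coding the symbol
    return low, high


low, high = arithmetic_interval("bccb", "abc")
print(low, high)                # 23/36 13/20
print(float(low), float(high))  # about 0.6389 and 0.65
```

The exact endpoints 23/36 and 13/20 agree with the interval quoted in the example up to rounding in the fourth decimal place, and 0.64 lies inside it.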

 Lempel Ziv coding
 Lempel-Ziv is a lossless data compression
algorithm
 Lossless data compression is a technique that allows
the original information to be reproduced exactly from the compressed data.
 The Lempel-Ziv algorithms belong to yet another
category of lossless compression techniques known as
dictionary coders.
 The dictionary is built in a single pass, while
encoding takes place at the same time.
 It continuously updates the dictionary for a file,
discarding patterns it previously included and adding
new ones when necessary.
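To make the idea of building the dictionary during a single encoding pass concrete, here is a minimal sketch of one member of the family, LZ78. It is an illustration under simplifying assumptions: the dictionary only grows (nothing is discarded), the output is a list of (dictionary index, next character) pairs rather than packed bits, and the function names lz78_encode and lz78_decode are chosen here for illustration.

```python
def lz78_encode(text):
    """Encode text as (index of longest known phrase, next character) pairs."""
    dictionary = {"": 0}  # phrase -> index; index 0 is the empty phrase
    output = []
    phrase = ""
    for ch in text:
        if phrase + ch in dictionary:
            phrase += ch  # keep extending the current match
        else:
            output.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)  # learn the new phrase
            phrase = ""
    if phrase:
        # Input ended in the middle of a known phrase: emit it with no extra character.
        output.append((dictionary[phrase], ""))
    return output


def lz78_decode(pairs):
    """Rebuild the text; the decoder reconstructs the same dictionary on the fly."""
    phrases = [""]
    pieces = []
    for index, ch in pairs:
        phrase = phrases[index] + ch
        phrases.append(phrase)
        pieces.append(phrase)
    return "".join(pieces)


pairs = lz78_encode("abababab")
print(pairs)               # [(0, 'a'), (0, 'b'), (1, 'b'), (3, 'a'), (2, '')]
print(lz78_decode(pairs))  # abababab
```

Note that the dictionary itself is never transmitted: the decoder derives it from the pairs in the same order the encoder created it.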

LZ77 variants: LZR, LZSS, LZH, LZB
Applications: ZIP, GZIP, STACKER
LZ78 variants: LZFG, LZT, LZC, LZJ, LZW, LZMW
Applications: GIF, V.42, COMPRESS

 Run Length Encoding
 Compresses any type of repeating data sequence
 At transmitter:
-identify runs of repeated characters
-if found, replace each run with a three-character code

Original data string      Encoded data string
$******55.72              $Sc*655.72
---------                 Sc-9
Guns         Butter       GunsScb9Butter
(Sc = special character indicating that a compressed run follows; b = blank)
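The three-character code in the table above can be sketched as follows. This is a minimal illustration under stated assumptions: the character "#" stands in for the special character Sc and must not occur in the input, only runs of at least 4 characters are replaced, and the count is a single digit, so longer runs are split into pieces of at most 9. The function names and the min_run and max_run parameters are hypothetical.

```python
from itertools import groupby


def rle_encode(text, marker="#", min_run=4, max_run=9):
    """Replace each long run with marker + repeated character + single-digit count."""
    out = []
    for ch, group in groupby(text):
        n = len(list(group))
        while n >= min_run:
            chunk = min(n, max_run)  # the count must fit in one digit
            out.append(f"{marker}{ch}{chunk}")
            n -= chunk
        out.append(ch * n)  # short leftovers are copied unchanged
    return "".join(out)


def rle_decode(code, marker="#"):
    """Expand every three-character code back into its run."""
    out = []
    i = 0
    while i < len(code):
        if code[i] == marker:
            out.append(code[i + 1] * int(code[i + 2]))
            i += 3
        else:
            out.append(code[i])
            i += 1
    return "".join(out)


print(rle_encode("$******55.72"))               # $#*655.72
print(rle_encode("---------"))                  # #-9
print(rle_encode("Guns" + " " * 9 + "Butter"))  # Guns# 9Butter
```

The run "55" is left alone because replacing a two-character run with a three-character code would make the output longer.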

 Efficiency depends on
-the number of repeated-character occurrences in the data to
be compressed
-the average length of the repeated-character runs.
 Compression ratio = (length of uncompressed data) / (length of compressed data)
 Any compression scheme will have variable
performance as the content of the input varies.
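As a quick check of the ratio defined above, here is a minimal sketch. It assumes lengths are counted in characters and uses "#" as a stand-in for the single special character Sc; the function name compression_ratio is chosen here for illustration.

```python
def compression_ratio(original, compressed):
    """Length of the uncompressed data divided by length of the compressed data."""
    return len(original) / len(compressed)


# 12 characters shrink to 9, so the ratio is 12/9.
print(round(compression_ratio("$******55.72", "$#*655.72"), 2))  # 1.33
# 9 characters shrink to 3.
print(compression_ratio("---------", "#-9"))                     # 3.0
```

The two inputs give different ratios under the same scheme, which illustrates how performance varies with the content of the input.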

 It is used as an element in more complex image
compression techniques.
 For run-length encoding of an image, transmission
of the digital line scan is replaced by transmission of the
length of each successive run of black or
white scanned picture elements.

0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 1 1 1 1 0 0 0
0 0 0 1 0 0 1 0 0 0
0 0 0 1 1 1 1 0 0 0
0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 1 0 0 0
0 0 0 1 1 1 1 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
Length: 100 bits
Run-length code:
23W 4B 6W 1B 2W 1B 6W 4B 9W
1B 9W 1B 6W 4B 23W
or:
23 4 6 1 2 1 6 4 9 1 9 1 6 4 23
Simple run-length encoding: run-length code for a 100-pixel image

 A 100-pixel image (10 × 10) is represented directly by a
100-bit code.
 Each pixel is represented by a single bit indicating black or
white.
 The run-length code consists of the lengths of the alternating black and
white runs.
 The encoded data stream is a string of numbers that indicate
the lengths of the alternating black and white runs.
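The run-length code for the 10 × 10 image can be reproduced with a few lines. This is a minimal sketch that assumes the image is scanned row by row as one continuous line, with 0 meaning white (W) and 1 meaning black (B); the names IMAGE and run_lengths are chosen here for illustration.

```python
from itertools import groupby

# The 10 x 10 image from the slide, one string per row (0 = white, 1 = black).
IMAGE = [
    "0000000000",
    "0000000000",
    "0001111000",
    "0001001000",
    "0001111000",
    "0000001000",
    "0000001000",
    "0001111000",
    "0000000000",
    "0000000000",
]


def run_lengths(rows):
    """Scan the rows as one continuous line and return (colour, length) runs."""
    bits = "".join(rows)
    return [("B" if bit == "1" else "W", len(list(group))) for bit, group in groupby(bits)]


runs = run_lengths(IMAGE)
print(" ".join(f"{length}{colour}" for colour, length in runs))
# 23W 4B 6W 1B 2W 1B 6W 4B 9W 1B 9W 1B 6W 4B 23W
print(sum(length for _, length in runs))  # 100
```

Because the runs alternate in colour, the colour letters can be dropped once the colour of the first run is known, leaving only the 15 run lengths.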

 Rate Distortion
 Rate distortion theory calculates the
minimum
transmission bit-rate for a given
distortion and source.
 The results of rate distortion theory are obtained
without consideration of a specific
coding method.

 Rate Distortion Function
• Definition: R(D*) = min { I(Q) : average distortion of Q ≤ D* }
• For a given maximum average distortion D*, the rate
distortion function R(D*) is the lower bound for the
transmission bit-rate.
• The minimization is conducted over all possible mappings Q
that satisfy the average distortion constraint.
• R(D*) is measured in bits when the logarithm is taken to base 2.
• In rate distortion theory we minimize the mutual information
I(Q) because the source is given, not the channel.
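For one standard special case the minimization has a closed form: a binary source with P(1) = p under Hamming distortion (distortion 1 for a wrong bit, 0 otherwise) has R(D) = H(p) - H(D) for 0 ≤ D ≤ min(p, 1 - p) and R(D) = 0 beyond that, where H is the binary entropy function. The sketch below evaluates this formula; the choice of this particular source and the function names are assumptions made for illustration and are not taken from the slides.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a binary variable with P(1) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def rate_distortion_binary(p, d):
    """R(D) in bits per symbol for a binary source under Hamming distortion."""
    if d < 0:
        raise ValueError("distortion must be non-negative")
    if d >= min(p, 1 - p):
        return 0.0  # this distortion is reachable without sending anything
    return binary_entropy(p) - binary_entropy(d)


for d in (0.0, 0.05, 0.11, 0.25, 0.5):
    print(d, round(rate_distortion_binary(0.5, d), 3))
```

For a fair-coin source the rate falls from 1 bit per symbol at D = 0 to roughly half a bit at D = 0.11 and to zero at D = 0.5, which shows how tolerating distortion lowers the required bit-rate.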

 Entropy rate of a stochastic
process
 Entropy refers to disorder or uncertainty. It quantifies the
amount of uncertainty involved in the value of a random
variable or the outcome of a random process
 How does the entropy of a sequence of n random variables grow with n?
 The entropy grows (asymptotically) linearly with n at a rate
H(χ), which we call the entropy rate of the process
 The entropy rate of a stochastic process {Xi} is defined as
H(χ) = lim (n→∞) (1/n) H(X1, X2, ..., Xn), when the limit exists
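As a concrete case, the entropy rate of a stationary two-state Markov chain has a closed form: if the chain moves from state 0 to state 1 with probability alpha and from state 1 to state 0 with probability beta, the stationary distribution is (beta/(alpha+beta), alpha/(alpha+beta)) and the entropy rate is the stationary average of the per-state transition entropies. The sketch below evaluates it; the two-state chain and the names alpha, beta and markov_entropy_rate are assumptions chosen for illustration.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a binary variable with P(1) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def markov_entropy_rate(alpha, beta):
    """Entropy rate in bits per symbol of a stationary two-state Markov chain.

    alpha = P(next state is 1 | current state is 0)
    beta  = P(next state is 0 | current state is 1)
    Requires alpha + beta > 0.
    """
    mu0 = beta / (alpha + beta)   # stationary probability of state 0
    mu1 = alpha / (alpha + beta)  # stationary probability of state 1
    return mu0 * binary_entropy(alpha) + mu1 * binary_entropy(beta)


print(markov_entropy_rate(0.5, 0.5))            # 1.0 (behaves like fair coin flips)
print(markov_entropy_rate(1.0, 1.0))            # 0.0 (deterministic alternation)
print(round(markov_entropy_rate(0.1, 0.1), 3))  # below 1: the chain has memory
```

For this chain the joint entropy of n symbols grows by the entropy rate with each additional symbol after the first, which is the linear growth described above.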

 JPEG image compression
 Image compression is either lossless or lossy
 Lossless image compression allows the original image to
be perfectly reconstructed from the compressed data.
 Lossy image compression discards a certain amount of
information, so some data are lost and the original image
cannot be reconstructed exactly.

 JPEG, which stands for Joint Photographic
Experts Group (the name of the committee that
created the JPEG standard), is a lossy
compression algorithm for images.
 A lossy compression scheme represents the data in the image
inexactly, so that less memory is used yet the result appears
very similar. This is why JPEG images usually look
almost the same as the originals they
were derived from, unless the
quality is reduced significantly, in which case
there will be visible differences.
 The JPEG algorithm takes advantage of the fact
that human vision is relatively insensitive to high spatial
frequencies, especially in colour. These high-frequency components
are the part of the image data that is reduced or discarded during
compression. JPEG compression also works
best on images with smooth colour transitions.

• Flow of compression: Original image → Encoder → Bitstream (0101100111...) → Decoder → Decoded image
• The encoder converts the image file into a series of binary data,
which is called the bit-stream
• The decoder receives the encoded bit-stream and
decodes it to reconstruct the image
• The total data quantity of the bit-stream is less than
the total data quantity of the original image

 DCT
 The DCT (discrete cosine transform) uses only cosine
functions, so it involves no complex numbers at all.
 The DCT converts the information contained in the
pixels from the spatial domain to the frequency
domain.
 Human vision is insensitive to high-frequency
components, which makes it possible to
treat the data corresponding to high
frequencies as redundant. To separate the
raw image data on the basis of frequency, it
needs to be converted into the frequency
domain, which is the primary function of the DCT.
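The conversion to the frequency domain can be sketched with a direct implementation of the two-dimensional DCT-II with orthonormal scaling, the form commonly applied to 8 × 8 blocks. This is a minimal, unoptimized illustration using only the standard library: it omits the level shift, quantization and entropy coding stages of a real JPEG encoder, and the function name dct_2d is chosen here for illustration.

```python
import math


def dct_2d(block):
    """Two-dimensional DCT-II (orthonormal scaling) of a square block of samples."""
    n = len(block)

    def scale(k):
        # Normalization factor that makes the transform orthonormal.
        return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (
                        block[x][y]
                        * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                        * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                    )
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


# A perfectly flat 8 x 8 block: all of its energy ends up in the DC coefficient.
flat = [[1.0] * 8 for _ in range(8)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))  # 8.0
print(max(abs(coeffs[u][v]) for u in range(8) for v in range(8) if (u, v) != (0, 0)) < 1e-9)  # True
```

Smooth blocks concentrate their energy in a few low-frequency coefficients, which is why the high-frequency ones can be coarsely represented or dropped with little visible effect. Only real arithmetic is used throughout.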

 Basic concept of compression
(Figure: original image compared with compressed image)