Geetha K.S, Pushpa M.K, Uttarakumari M & S.Sethu Selvi
International Journal of Image Processing (IJIP), Volume (5) : Issue (4) : 2011 469
An Efficient Multiplierless Transform Algorithm for Video Coding
Geetha. K. S geethakomandur@gmail.com
Associate Professor, Dept of E&CE
R.V.College of Engineering,
Bangalore, 560 059, India
M K Pushpa pushpachandan@rediffmail.com
Associate Professor, Dept of IT
M.S.Ramaiah Institute of Technology,
Bangalore,560 054, India
Dr.M.Uttarakumari dr.uttarakumari@gmail.com
Professor, Dept of E&C
R.V.College of Engineering,
Bangalore,560 059, India
Dr. S.Sethu Selvi selvi@msrit.edu
Professor & Head, Dept of E&C
M.S.Ramaiah Institute of Technology,
Bangalore,560 054, India
Abstract
This paper presents an efficient algorithm to accelerate software video encoders/decoders by
reducing the number of arithmetic operations for the Discrete Cosine Transform (DCT). A
multiplierless Ramanujan Ordered Number DCT (RDCT) is presented which computes the
coefficients using only shift and addition operations. The reduction in computational complexity
improves the performance of the video codec by almost 58% compared with the commonly
used integer DCT. The results show that a significant reduction in computation can be achieved with
negligible degradation in average peak signal-to-noise ratio (PSNR). The average structural
similarity (SSIM) index also confirms that the degradation due to the approximation is
minimal.
Keywords: Ramanujan Ordered Number DCT, Multiplierless DCT, Video Coding.
1. INTRODUCTION
Digital video applications have become increasingly common in everyday life. Several video
standards, such as H.261 [1], H.263 [2], and MPEG [3][4], have been established for
different applications. All these standards use motion compensated prediction, the Discrete Cosine
Transform (DCT), quantization, zigzag scan, and Variable Length Coding (VLC) as their basic
functional blocks. Among these building blocks, motion estimation (ME) in the motion
compensated (MC) prediction is the most computationally intensive part, followed by the DCT and
the Inverse DCT (IDCT). Many fast algorithms have been developed to speed up motion
estimation. In this paper, an efficient technique is investigated to accelerate software
video encoders by reducing the number of operations for the DCT and quantization, both of
which require many computationally expensive multiplications. A
modification is proposed that replaces the 2-D DCT block in the standard MPEG-2 video codec
with the 2-D Multiplierless Recursive DCT block. The performance is then compared with the
existing DCT algorithms.
The paper is organized as follows. Section 2 explains the different blocks of video coding in the
MPEG coder/decoder. Section 3 explains the multiplierless DCT coefficient computation that
reduces the computation in the video encoder. Section 4 discusses the methodology of the
proposed technique along with the simulation results.
2. MPEG CODER/DECODER
The international standards [5, 6] describe a system, MPEG-2, for encoding and decoding digital
video data. The standard allows for the encoding of video over a wide range of resolutions,
including higher resolutions commonly known as HDTV.
In this system, encoded pictures are made up of pixels. Each 8×8 array of pixels is known as a
block, and a 2×2 array of blocks is termed a macroblock. In this paper, an 8×8 array of pixels
is used as the macroblock. Compression is achieved using the well-known techniques of prediction
(motion estimation in the encoder, motion compensation in the decoder), 2-D DCT, quantization
of DCT coefficients, and Huffman/run-length coding. Pictures called I pictures are
encoded without prediction and maintained as reference frames. Pictures termed P pictures may
be encoded with prediction from previous pictures. B pictures may be encoded using prediction
from both previous and subsequent pictures. A simplified MPEG-2 encoder and decoder are shown
in Figure 1.
FIGURE 1: MPEG-2 Encoder and Decoder
Before the DCT is performed, motion compensated prediction is done for every macro block
(8×8 pixels) on inter-coded frames. The objective of motion estimation is to find the best match
of the current macro block within the search region in the reference frame. The common matching
criterion used for finding the best match in the search region is the Mean Absolute Difference
(MAD):

$$\mathrm{MAD} = \frac{1}{N^2} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left| C_{ij} - R_{ij} \right| \qquad (1)$$

where N is the size of the macro block, and $C_{ij}$ and $R_{ij}$ are the pixels being compared in the current
macro block and the reference macro block, respectively.
In motion compensated predictive coding, before performing the DCT computation, the Three
Step Search algorithm [7, 8] is used to find the motion vectors. The best macroblock is found by
using the MAD as the cost measure. The search starts at the centre of the macroblock, taken as
location (0, 0). The step size is fixed as S = 4 and the search parameter
as 7 for a macroblock of size 8×8. The cost is evaluated at the eight neighbouring locations at
distance S around (0, 0). Out of these 9 locations, the one with the least cost becomes
the new search origin and the step size is halved, S = S/2. The
procedure is repeated until S = 1. The location with the least cost is then the best
match, and the vector that represents it is saved as the motion vector.
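The search procedure above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names `mad` and `three_step_search`, the list-of-lists frame layout, the (dx, dy) ordering of the returned vector, and the rule of skipping candidates that fall outside the reference frame are all assumptions made for the example.

```python
def mad(cur, ref, bx, by, dx, dy, n=8):
    """Mean absolute difference, as in Eq. (1), between the n x n block of
    `cur` at (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for i in range(n):
        for j in range(n):
            total += abs(cur[by + i][bx + j] - ref[by + dy + i][bx + dx + j])
    return total / (n * n)


def three_step_search(cur, ref, bx, by, n=8, step=4):
    """Return ((dx, dy), cost) found by the three step search.

    With step=4 the step sizes are 4, 2, 1, so the displacement stays
    within +/-7 pixels (the search parameter quoted in the text).
    """
    height, width = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = mad(cur, ref, bx, by, 0, 0, n)
    while step >= 1:
        cx, cy = best  # search origin for this step
        for ddy in (-step, 0, step):
            for ddx in (-step, 0, step):
                dx, dy = cx + ddx, cy + ddy
                # Assumption: candidates leaving the frame are ignored.
                inside = (0 <= bx + dx and bx + dx + n <= width and
                          0 <= by + dy and by + dy + n <= height)
                if not inside:
                    continue
                cost = mad(cur, ref, bx, by, dx, dy, n)
                if cost < best_cost:  # strict: ties keep the earlier point
                    best, best_cost = (dx, dy), cost
        step //= 2  # S = S/2
    return best, best_cost
```

The strict comparison keeps the current origin when a neighbour only ties with it, which avoids drifting across flat regions; other tie-breaking rules are equally possible.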
FIGURE 2: Three Step Search Procedure (Motion Vector is (5,7))
Each motion compensated macro block consists of four 8×8 luminance and two
8×8 chrominance prediction error blocks (difference blocks). These 8×8 blocks are transformed
to generate 8×8 DCT coefficients, and these coefficients are quantized for compression.
3. PROPOSED VIDEO CODEC
The DCT and the quantization processes require many multiplications, which are
computationally expensive. The standardized DCT block requires floating-point multipliers, and for
an 8×8 block the evaluation of the coefficients requires 12 floating-point multipliers. The implementation
of such a codec is expensive because the complexity is concentrated in the floating-point
multipliers. This disadvantage is overcome by replacing the floating-point DCT block with a
multiplierless DCT block in which the coefficients are evaluated using Ramanujan ordered
numbers. Computation of the DCT coefficients involves evaluating cosines of multiples of
$2\pi/N$. These are evaluated using a 4th degree polynomial that
approximates the cosine function with an approximation error on the order of $10^{-3}$ [13]. If N is
chosen such that $2\pi/N$ can be represented as $2^{-l} + 2^{-m}$, where l and m are integers, then the
trigonometric functions can be evaluated recursively by simple shift and addition operations. Such
integers are called Ramanujan ordered numbers. The use of Ramanujan ordered numbers for
computing the DCT was outlined by the author in [11,12]. Matrix factorization of the transformation
matrix reduces the complexity to $\frac{N}{2}\log_2 N$ shifts and $\frac{3N}{2}\log_2 N - N + 1$ additions [12], thereby
eliminating the use of multipliers.
3.1 Multiplierless Ramanujan Ordered Number DCT (RDCT) [11,12]
The 2-D Discrete Cosine Transform (DCT) can be defined as follows:

$$C(k_1, k_2) = \frac{4}{N_1 N_2} \sum_{n_1=0}^{N_1-1} \sum_{n_2=0}^{N_2-1} x(n_1, n_2) \cos\left[\frac{\pi (2n_1+1) k_1}{2N_1}\right] \cos\left[\frac{\pi (2n_2+1) k_2}{2N_2}\right] \qquad (2)$$
Neglecting the scaling factors and using the property of separability, the DCT equation can be
written as:

$$C(k_1, k_2) = \sum_{n_2=0}^{N_2-1} \left[ \sum_{n_1=0}^{N_1-1} x(n_1, n_2) \cos\left(\frac{\pi (2n_1+1) k_1}{2N_1}\right) \right] \cos\left(\frac{\pi (2n_2+1) k_2}{2N_2}\right) \qquad (3)$$
Thus, the 2-D $N_1 \times N_2$ DCT can be implemented by computing the row transformation followed by
the column transformation. Hence, a 1-D transformation can be considered as a process of
evaluating sequences of the following form:

$$c_k = \sum_{n=0}^{N-1} x(n) \cos\left[\frac{2\pi}{N}\, 2^{-2}\, (2n+1)\, k\right] \qquad (4)$$
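The row-column decomposition can be checked numerically with a short Python sketch. It uses ordinary floating-point cosines rather than the multiplierless scheme, omits the scaling factor as in the separable form above, and the function names `dct_1d`, `dct_2d_separable` and `dct_2d_direct` are illustrative choices, not names from the paper.

```python
import math


def dct_1d(x):
    """Unscaled 1-D DCT: c_k = sum_n x(n) cos(pi (2n+1) k / (2N))."""
    n_pts = len(x)
    return [
        sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * n_pts))
            for n in range(n_pts))
        for k in range(n_pts)
    ]


def dct_2d_separable(block):
    """2-D DCT computed as 1-D transforms of the rows, then of the columns."""
    rows = [dct_1d(row) for row in block]            # index [n1][k2]
    cols = [dct_1d(list(col)) for col in zip(*rows)]  # index [k2][k1]
    return [list(r) for r in zip(*cols)]              # index [k1][k2]


def dct_2d_direct(block):
    """2-D DCT from the double sum, used only as a reference."""
    n1_pts, n2_pts = len(block), len(block[0])
    out = []
    for k1 in range(n1_pts):
        row = []
        for k2 in range(n2_pts):
            acc = 0.0
            for n1 in range(n1_pts):
                for n2 in range(n2_pts):
                    acc += (block[n1][n2]
                            * math.cos(math.pi * (2 * n1 + 1) * k1 / (2 * n1_pts))
                            * math.cos(math.pi * (2 * n2 + 1) * k2 / (2 * n2_pts)))
            row.append(acc)
        out.append(row)
    return out
```

The separable version needs $N_1 + N_2$ one-dimensional transforms instead of one double sum per coefficient, which is why the rest of the section concentrates on the 1-D case.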
3.1.1 Evaluation of Transform Coefficients Using Chebyshev Recursion
Computation of the DCT coefficients requires evaluation of sequences of the type

$$\left\{ c_n \mid c_n = p \cos\left(\frac{2\pi n}{N}\right),\ n = 0, 1, 2, \ldots, N-1;\ p \in \Re \right\} \qquad (5)$$

where $\Re$ is the set of real numbers. These computations are done via a Chebyshev-type
recursion.
Let us define

$$W(M, p) = \left\{ w_n \mid w_n = p \cos\left(\frac{2\pi n}{M}\right),\ n = 0, 1, \ldots, \Psi;\ p \in \Re \right\}, \qquad \Psi = \frac{M}{4} - 1, \quad M = \beta N \qquad (6)$$

where β is equal to 1 if N is divisible by 4, 2 if N is divisible by 2 but not by 4,
and 4 otherwise (N is not divisible by 2). The use of β facilitates the computation of
W(M, p) by considering cosine values from the first quadrant of the circle only.
Let us then define

$$x = 2^{-2}\, \frac{2\pi}{N}, \qquad \therefore\ w_n = \cos\big( (2n+1)\, x \big) \qquad (7)$$
x is then represented using a Ramanujan ordered number of degree 2 as
$\hat{x} = 2^{-l} + 2^{-m}$, where l and m are non-negative integers.
For example, if N = 8, then

$$x = 2^{-2}\, \frac{2\pi}{8} \cong 2^{-2}\left(2^{-1} + 2^{-2}\right), \qquad \hat{x} = 2^{-3} + 2^{-4}, \qquad \therefore\ w_n = \cos\left(n' \hat{x}\right) \qquad (8)$$
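To illustrate the two-term form $\hat{x} = 2^{-l} + 2^{-m}$, the following Python sketch searches for the pair of non-negative integers (l, m) whose sum of negative powers of two is closest to a given angle. The exhaustive search, the function name `nearest_two_term` and the exponent limit `max_exp` are assumptions made for illustration; the paper does not state how l and m are selected.

```python
import math


def nearest_two_term(x, max_exp=16):
    """Return (l, m, approx) with 0 <= l <= m <= max_exp such that
    approx = 2**-l + 2**-m is as close as possible to x."""
    best = None
    for l in range(max_exp + 1):
        for m in range(l, max_exp + 1):
            approx = 2.0 ** -l + 2.0 ** -m
            err = abs(x - approx)
            if best is None or err < best[0]:
                best = (err, l, m, approx)
    _, l, m, approx = best
    return l, m, approx


# Illustrative targets for N = 8: the angle 2*pi/8 and the same angle
# scaled by 2**-2, i.e. 2*pi/32.
pair_full = nearest_two_term(2 * math.pi / 8)
pair_scaled = nearest_two_term(2 * math.pi / 32)
```

Because both terms are powers of two, multiplying a fixed-point value by the resulting approximation needs only two right shifts and one addition.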
where $n'$ denotes the scaled and shifted time samples and $\hat{x}$ is the Ramanujan ordered number.
These cosine values are evaluated by a cosine approximation using a 2nd
order polynomial. Let the polynomial be defined through

$$\alpha = \frac{\hat{x}^2}{2!}, \qquad \therefore\ t_n(\alpha) \cong \cos\left(n \hat{x}\right) \qquad (9)$$
The $t_n(\alpha)$ are then computed using the recursive equations

$$\begin{aligned}
t_0(\alpha) &= 1 \\
t_1(\alpha) &= 1 - \alpha \\
t_{n+1}(\alpha) &= 2(1 - \alpha)\, t_n(\alpha) - t_{n-1}(\alpha), \qquad n = 1, 2, \ldots, \Psi - 1
\end{aligned} \qquad (10)$$
It is observed that the above recursive equations are closely related to the Chebyshev polynomials of
the first kind. Since the evaluation of the recursive equations involves only powers of
two, the $t_n(\alpha)$ and therefore the $c_n(\alpha)$ can be computed by simple shift and addition operations.
The RDCT kernel needs samples only at $(2n+1)$, and thus not all the samples of $t_n(\alpha)$ need to be
stored.
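A minimal Python sketch of this recursion in fixed-point arithmetic is given below for the case $\hat{x} = 2^{-3} + 2^{-4}$, for which $\alpha = \hat{x}^2/2 = 2^{-6} + 2^{-9}$ and $2\alpha = 2^{-5} + 2^{-8}$. The precision `FRAC_BITS`, the function name `cos_table_shift_add` and the choice of how many terms to generate are assumptions for illustration; the final division only converts the integers to floats for inspection and is not part of the shift-and-add kernel.

```python
import math

FRAC_BITS = 20  # assumed fixed-point precision (fractional bits)


def cos_table_shift_add(n_max):
    """Approximate cos(n * x_hat), n = 0..n_max, for x_hat = 2**-3 + 2**-4
    using only shifts, additions and subtractions on integers."""
    one = 1 << FRAC_BITS
    t_prev = one                               # t_0 = 1
    t_curr = one - (one >> 6) - (one >> 9)     # t_1 = 1 - alpha
    table = [t_prev, t_curr]
    for _ in range(2, n_max + 1):
        # 2(1 - alpha) t_n = 2 t_n - (2**-5 + 2**-8) t_n
        t_next = (t_curr << 1) - (t_curr >> 5) - (t_curr >> 8) - t_prev
        table.append(t_next)
        t_prev, t_curr = t_curr, t_next
    # Conversion to float for inspection only.
    return [t / one for t in table[:n_max + 1]]


x_hat = 2.0 ** -3 + 2.0 ** -4
approx = cos_table_shift_add(8)
worst = max(abs(v - math.cos(n * x_hat)) for n, v in enumerate(approx))
```

Each new sample costs one left shift, two right shifts and three subtractions, and only the two most recent values have to be kept, which matches the observation that the full table of samples need not be stored.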
TABLE I. COMPARISON OF COMPUTATIONAL COMPLEXITY
Operations
Floating-point DCT
N M× [9]
Integer DCT
N M× [10]
RDCT
N M× [11]
Multiplications
( ) 22 logNM M
(Floating-point)
NM
(Integer)
Nil
Additions
( )
( )
23 2 log
2
NM M
NM N M+ − + +
( )22 log 1
2
NM
NM
−
+ +
( ) 23 2 log
2
NM M
NM N M− + +
Lifting Steps Nil
( ) 23 2 log
3 3
N N
N− +
Nil
Shifts Nil Nil ( ) 22 logNM M
Table I compares the computational complexity of the proposed algorithm with that of the
existing algorithms. To compute an N×M DCT, the proposed algorithm takes
$\frac{3NM}{2}\log_2 M - 2NM + N + M$ additions and $\frac{NM}{2}\log_2 M$ shift operations. Thus, for
N = M = 8, the proposed algorithm requires 96 shift operations and
176 addition operations to evaluate an 8×8 block DCT. Because the proposed algorithm is recursive,
storage of the trigonometric values is not required.
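The operation counts quoted above can be reproduced with a few lines of Python. The function name `rdct_op_counts` is hypothetical, and the sketch assumes that M is a power of two so that $\log_2 M$ is an integer.

```python
def rdct_op_counts(n, m):
    """Shift and addition counts for an n x m RDCT, using the expressions
    (NM/2) log2 M and (3NM/2) log2 M - 2NM + N + M from the text."""
    if m <= 0 or m & (m - 1):
        raise ValueError("m is assumed to be a power of two")
    log_m = m.bit_length() - 1          # exact log2 for a power of two
    shifts = (n * m // 2) * log_m
    additions = (3 * n * m // 2) * log_m - 2 * n * m + n + m
    return shifts, additions


shifts_8x8, additions_8x8 = rdct_op_counts(8, 8)   # (96, 176)
```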
4. SIMULATION RESULTS
To demonstrate the efficacy of the proposed algorithm in an MPEG video codec, the results were
compared with those of the existing algorithms in the standard MPEG-2 video codec and are
tabulated. The proposed RDCT is tested by replacing the two-dimensional DCT block in the
MPEG-2 standard algorithm with the 2-D RDCT block. The performance is then compared with
the commonly used multiplierless 2-D integer DCT. The DC coefficient is quantized, coded
separately, and transmitted. The AC coefficients are encoded with very few coefficients, removing
blocks whose coefficients are all zero.
Table II gives the average PSNR between the original and decoded frames, using 60 frames of the
input video sequence (video grabbed at 30 fps), with a GOP (group of pictures) of 10 and the
encoding order I1 P4 B2 B3 P7 B5 B6 I10 B8 B9. The step size is taken as 10 so that all 10
frames are decoded in the display order I1 B2 B3 P4 B5 B6 P7 B8 B9 I10. The simulation was evaluated for
both forward and bidirectional prediction. In both formats the proposed RDCT gives a higher
average PSNR than the integer DCT and stays very close to the floating-point
DCT. From Table II, for a standard test sequence such as Alex.avi, the average PSNR of the
proposed RDCT deviates by about 0.01% from that of the floating-point DCT and by about 0.01%
from that of the integer DCT. For the real time data
sequence, the deviation of the RDCT is 0.005% from the floating-point DCT and 0.08% from the
integer DCT. This shows that the proposed RDCT reconstructs the pictures with a quality
comparable to the floating-point DCT and slightly better than the integer DCT.
TABLE II: AVERAGE PSNR IN dB OF THE DECODED FRAMES

Test Sequence                Frame Format   Floating-point DCT [9]   Integer DCT [10]   Multiplierless RDCT
Real time Data               IBBPBBPBBP     35.7010                  35.6694            35.6991
(frames grabbed at 30 fps)   IPPPPPPPPP     33.6581                  33.6132            33.6525
San_Fran_Traffic             IBBPBBPBBP     34.8076                  34.7776            34.7996
                             IPPPPPPPPP     31.7476                  31.7176            31.7462
Alex                         IBBPBBPBBP     36.0928                  36.0809            36.0876
                             IPPPPPPPPP     31.583                   31.5756            31.579
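For reference, the percentage deviations discussed for Table II can be recomputed as sketched below. The paper does not spell out its PSNR formula or which value is used as the denominator of the deviation, so the standard 8-bit definition (peak value 255) and division by the reference codec's PSNR are assumptions here, as are the function names `psnr` and `percent_deviation`.

```python
import math


def psnr(original, decoded, peak=255.0):
    """PSNR in dB between two equally sized frames (lists of rows)."""
    total = 0.0
    count = 0
    for row_o, row_d in zip(original, decoded):
        for o, d in zip(row_o, row_d):
            total += (o - d) ** 2
            count += 1
    mse = total / count
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / mse)


def percent_deviation(reference_db, test_db):
    """Absolute deviation of test_db from reference_db, in percent."""
    return abs(reference_db - test_db) / reference_db * 100.0


# Real time data, IBBPBBPBBP row of Table II.
dev_vs_float = percent_deviation(35.7010, 35.6991)    # about 0.005 %
dev_vs_integer = percent_deviation(35.6694, 35.6991)  # about 0.08 %
```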
The structural similarity (SSIM) index seeks to separately capture differences in
local image luminance l(x,y), contrast c(x,y) and structure s(x,y) between the original and
compensated images. For two aligned image windows x and y, the SSIM is defined as

$$\mathrm{SSIM}(x, y) = l(x, y) \cdot c(x, y) \cdot s(x, y) = \frac{2\mu_x \mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1} \cdot \frac{2\sigma_x \sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2} \cdot \frac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3} \qquad (11)$$
where $\mu_x$, $\mu_y$, $\sigma_x$, $\sigma_y$ and $\sigma_{xy}$ are the local sample means, standard deviations, and cross-covariance of x
and y. The constants $C_1$, $C_2$, $C_3$ stabilize the SSIM when the means and variances become small.
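Equation (11) can be evaluated for a single pair of windows with the Python sketch below. The values of the stabilizing constants are not given in the paper; the defaults used here, $C_1 = (0.01 \cdot 255)^2$, $C_2 = (0.03 \cdot 255)^2$ and $C_3 = C_2/2$, are commonly used choices and should be read as assumptions, as should the function name `ssim_window` and the use of unbiased (n − 1) sample statistics.

```python
import math


def ssim_window(x, y, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2, c3=None):
    """SSIM of two equally long pixel sequences (one local window each),
    computed as luminance * contrast * structure as in Eq. (11)."""
    if c3 is None:
        c3 = c2 / 2.0          # assumed default
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mu_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)

    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast = (2 * sd_x * sd_y + c2) / (var_x + var_y + c2)
    structure = (cov_xy + c3) / (sd_x * sd_y + c3)
    return luminance * contrast * structure
```

An average SSIM for a frame, as reported in the tables, would be obtained by sliding such a window over the frame and averaging the results; the window size and weighting used by the authors are not stated.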
The SSIM index varies between 0 (worst) and 1 (best). Table III shows the average SSIM between the
decoded frames and the original frames.
TABLE III: AVERAGE SSIM BETWEEN THE DECODED AND ORIGINAL FRAMES

Test Sequence                Frame Format   Floating-point DCT [9]   Integer DCT [10]   Multiplierless RDCT
Real time Data               IBBPBBPBBP     0.9223                   0.9218             0.9223
(frames grabbed at 30 fps)   IPPPPPPPPP     0.921645                 0.921628           0.921635
San_Fran_Traffic             IBBPBBPBBP     0.85028                  0.85020            0.85026
                             IPPPPPPPPP     0.85701                  0.85014            0.85693
Alex                         IBBPBBPBBP     0.8689                   0.8684             0.8690
                             IPPPPPPPPP     0.8678                   0.8668             0.8682
From Table III, it is clear that the quality of decoding with the RDCT is very good and matches
the performance of the floating-point DCT. This is confirmed by taking the difference
frame between the reference frame and the decoded frame, shown in
Figures 3a and 3b. In terms of SSIM, the difference between the RDCT and the floating-point DCT
is 0.01% for a standard test sequence such as Alex.avi, and the difference between the RDCT
and the integer DCT is 0.07% for the same test sequence. For the real time
data, the SSIM difference between the RDCT and the floating-point DCT is 0, and that between the
RDCT and the integer DCT is 0.05%. These values indicate that the
perceptual quality of the frame reconstructed with the proposed RDCT compares well with
that of the frame reconstructed with the integer DCT.
FIGURE 3a: Difference between original and decoded frame (real time sequence)
FIGURE 3b: Difference between original and decoded frame (San_Fran_Traffic.avi)
Table IV compares the computation time for decoding the I reference frame and for
decoding 60 frames with the different algorithms, using a GOP of 10 frames. The computation was
performed on an Intel Core 2 Duo processor at 1.80 GHz.
TABLE IV: DECODING TIME IN SECONDS

Test Sequence                Decoding          Floating-point DCT   Integer DCT   Multiplierless RDCT
Real time Data               Reference frame   0.191                0.313         0.125
(frames grabbed at 30 fps)   10 frames         12.609               14.953        6.562
San_Fran_Traffic             Reference frame   0.296                0.308         0.109
                             10 frames         12.322               14.641        6.335
Alex                         Reference frame   0.245                0.325         0.125
                             10 frames         12.484               14.547        6.593

Table IV shows that, for the real time data sequence, the proposed RDCT reduces the decoding
time for 10 frames by 47.9578% relative to the floating-point DCT and by
56.1158% relative to the commonly used integer DCT. For a standard data sequence such as
Alex.avi, the corresponding reductions are 47.1884% relative to the floating-point DCT and
54.6779% relative to the integer DCT.
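The percentage reductions quoted for Table IV follow directly from the 10-frame decoding times. The short Python sketch below recomputes them; the function name `time_reduction_percent` and the dictionary layout are illustrative choices.

```python
def time_reduction_percent(baseline_s, proposed_s):
    """Reduction in decoding time of the proposed codec relative to a
    baseline codec, in percent."""
    return (baseline_s - proposed_s) / baseline_s * 100.0


# 10-frame decoding times from Table IV: (floating-point, integer, RDCT).
times = {
    "Real time Data": (12.609, 14.953, 6.562),
    "San_Fran_Traffic": (12.322, 14.641, 6.335),
    "Alex": (12.484, 14.547, 6.593),
}

reductions = {
    name: (time_reduction_percent(t_float, t_rdct),
           time_reduction_percent(t_int, t_rdct))
    for name, (t_float, t_int, t_rdct) in times.items()
}
```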
FIGURE 4: Comparison plot for the sequences ‘Real-time sequence’, ‘San_Fran_Traffic’ and ‘Alex’
Figure 4 compares the execution times for decoding 10 frames using the
different algorithms, namely the RDCT, the floating-point DCT and the integer DCT (IntDCT). The plot
clearly shows that the RDCT outperforms both the floating-point DCT and the IntDCT. This
improvement in decoding time is due to the reduced computational complexity of the DCT algorithm.
5. CONCLUSION
A computationally less complex video coding technique using the multiplierless Ramanujan
ordered number DCT is presented in this paper. The method evaluates the cosine function
using only integers that are powers of 2, thereby replacing the complex floating-point
multiplications with shifters and adders. The algorithm takes $\frac{N}{2}\log_2 N$ shifts and
$\frac{3N}{2}\log_2 N - N + 1$ addition operations to evaluate an N-point DCT. The cosine approximation
increases the overhead on the number of adders by 13.6% but completely avoids floating-point
multiplications. The reduction in complexity is reflected in the time required to decode
video frames: there is an improvement of 58% over the existing, commonly used integer DCT
video codec. The average SSIM and average PSNR values indicate that the quality of decoding
using the RDCT is the same as that of the integer DCT. Hence, the proposed algorithm is an efficient
multiplierless transform for video coding that offers lower computational complexity while assuring
the same quality as the existing algorithms.
6. REFERENCES
[1] ITU-T Recommendation H.261, “Video codecs for audiovisual services at p x 64 kb/s,” Mar. 1993.
[2] ITU-T Recommendation H.263, “Video coding for low bitrate communication,” Mar. 1996.
[3] ISO/IEC 11172-2, “Information technology - coding of moving pictures and associated
audio for digital storage media at up to about 1.5 Mbit/s: Part 2 Video,” Aug. 1993.
[4] ITU-T Recommendation H.262 | ISO/IEC 13818-2, “Information technology - generic
coding of moving pictures and associated audio information: video,” 1995.
[5] ISO/IEC 13818-2, “Generic Coding of Moving Pictures and Associated Audio Information:
Video”.
[6] ATSC document A/54, “Guide to the Use of the ATSC Digital Television Standard”.
[7] Renxiang Li, Bing Zeng and Ming L. Liou, “A New Three-Step Search Algorithm for Block
Motion Estimation”, IEEE Trans. Circuits and Systems for Video Technology, Vol. 4, No. 4,
pp. 438-442, Aug 1994.
[8] Aroh Barjatya, “Block Matching Algorithms For Motion Estimation”, DIP 6620 Spring 2004
Final Project Paper.
[9] H.S. Hou, “A Fast Recursive Algorithm for Computing the Discrete Cosine Transform”,
IEEE Trans. Acoust., Speech, Signal Processing, Vol. 35, pp. 1455-1461, Oct 1987.
[10] Yonghong Zeng, Lizhi Cheng, Guoan Bi, and Alex C. Kot, “Integer DCT’s and Fast
Algorithms”, IEEE Signal Proc.141-14 (2000).
[11] Geetha.K.S, V.K.Ananthashayana, “A Novel Recursive Multiplierless Algorithm for 2-D
DCT”, Proc. ICSPCN 2009, Aug 2009.
[12] Geetha.K.S, M.Uttarakumari, “Multiplierless Recursive algorithm using Ramanujan ordered
Numbers,” IETE Journal of Research, vol. 56, Issue 4, Jul-Aug 2010.
[13] Geetha.K.S, M.Uttarakumari, “A Novel Cosine approximation for high-speed evaluation of
DCT”, International Journal of Image Processing, CSC Journals, Volume 4, Issue 6, pp.
539-548, Jan-Feb 2011.
 
Efficient Implementation of Low Power 2-D DCT Architecture
Efficient Implementation of Low Power 2-D DCT Architecture
IJMER
 
PIPELINED ARCHITECTURE OF 2D-DCT, QUANTIZATION AND ZIGZAG PROCESS FOR JPEG IM...
PIPELINED ARCHITECTURE OF 2D-DCT, QUANTIZATION AND ZIGZAG PROCESS FOR JPEG IM...
VLSICS Design
 
11.0003www.iiste.org call for paper_d_discrete cosine transform for image com...
11.0003www.iiste.org call for paper_d_discrete cosine transform for image com...
Alexander Decker
 
Multimedia image compression standards
Multimedia image compression standards
Mazin Alwaaly
 
Ad


An Efficient Multiplierless Transform algorithm for Video Coding

Geetha K.S, Pushpa M.K, Uttarakumari M & S.Sethu Selvi
International Journal of Image Processing (IJIP), Volume (5) : Issue (4) : 2011 469
An Efficient Multiplierless Transform Algorithm for Video Coding
Geetha. K. S geethakomandur@gmail.com
Associate Professor, Dept of E&CE
R.V.College of Engineering,
Bangalore, 560 059, India
M K Pushpa pushpachandan@rediffmail.com
Associate Professor, Dept of IT
M.S.Ramaiah Institute of Technology,
Bangalore, 560 054, India
Dr.M.Uttarakumari dr.uttarakumari@gmail.com
Professor, Dept of E&C
R.V.College of Engineering,
Bangalore, 560 059, India
Dr. S.Sethu Selvi selvi@msrit.edu
Professor & Head, Dept of E&C
M.S.Ramaiah Institute of Technology,
Bangalore, 560 054, India
Abstract
This paper presents an efficient algorithm to accelerate software video encoders/decoders by reducing the number of arithmetic operations for the Discrete Cosine Transform (DCT). A multiplierless Ramanujan Ordered Number DCT (RDCT) is presented which computes the coefficients using shift and addition operations only. The reduction in computational complexity has improved the performance of the video codec by almost 58% compared with the commonly used integer DCT. The results show that a significant reduction in computation can be achieved with negligible degradation in average peak signal-to-noise ratio (PSNR). The average structural similarity (SSIM) index also confirms that the degradation due to the approximation is minimal.
Keywords: Ramanujan Ordered Number DCT, Multiplierless DCT, Video Coding.
1. INTRODUCTION
Digital video applications have become more and more popular in everyday life. Currently, there are several video standards, such as H.261 [1], H.263 [2], and MPEG [3][4], established for different applications. All these standards use motion compensated prediction, Discrete Cosine Transform (DCT), quantization, zigzag scan, and Variable Length Coding (VLC) as their basic functional blocks.
Among these building blocks, motion estimation (ME) in motion compensated (MC) prediction is the most computationally intensive part, followed by the DCT and the Inverse DCT (IDCT). Many fast algorithms have been developed to speed up motion estimation. In this paper, an efficient technique is investigated to accelerate software video encoders by reducing the number of operations for the DCT and quantization. The DCT and quantization processes require many multiplications, which are computationally expensive. A modification is proposed that replaces the 2-D DCT block in the standard MPEG-2 video codec with the 2-D Multiplierless Recursive DCT block. The performance is then compared with that of the existing DCT algorithms.
The organization of the paper is as follows. Section 2 explains the different blocks of video coding as in the MPEG coder/decoder. Section 3 explains the multiplierless DCT coefficient computation that reduces the computation in the video encoder. Section 4 discusses the methodology of the proposed technique together with the simulation results.
2. MPEG CODER/DECODER
The international standards [5, 6] describe a system, MPEG-2, for encoding and decoding digital video data. The standard allows for the encoding of video over a wide range of resolutions, including the higher resolutions commonly known as HDTV. In this system, encoded pictures are made up of pixels. If each 8×8 array of pixels is known as a block, then a 2×2 array of blocks is termed a macroblock. In this paper, an 8×8 array of pixels is used as the macroblock. Compression is achieved using the well-known techniques of prediction (motion estimation in the encoder, motion compensation in the decoder), 2-D DCT, quantization of DCT coefficients, and Huffman/run-length coding. Pictures called I pictures are encoded without prediction and maintained as reference frames. Pictures termed P pictures may be encoded with prediction from previous pictures. B pictures may be encoded using prediction from both previous and subsequent pictures. A simplified MPEG-2 encoder and decoder is shown in Figure 1.
FIGURE 1. MPEG-2 Encoder and Decoder
Before the DCT is performed, motion compensated prediction is done for every macroblock (8×8 pixels) on inter-coded frames. The objective of motion estimation is to find the best match of the current macroblock within the search region in the reference frame. The common matching criterion used for finding the best match in the search region is the Mean Absolute Difference (MAD):
$$MAD = \frac{1}{N^2}\sum_{i=0}^{N-1}\sum_{j=0}^{N-1}\left|C_{ij} - R_{ij}\right| \qquad (1)$$
where $N$ is the size of the macroblock, and $C_{ij}$ and $R_{ij}$ are the pixels being compared in the current macroblock and the reference macroblock respectively.
In motion compensated predictive coding, before performing the DCT computation, the Three Step Search algorithm [7, 8] is used to find the motion vectors. The best macroblock is found by using the MAD as a measure. The search starts at the centre of the macroblock, location (0, 0). The step size is fixed at S = 4 and the search parameter at 7 for a macroblock of size 8×8. The eight neighbouring locations at distance S around (0, 0) are then examined. Of these 9 locations, the one with the least cost becomes the new search origin, and the step size is halved, S = S/2. The procedure is repeated until S = 1. The location with the least cost is then the best match, and the vector that represents it is saved.
FIGURE 2: Three Step Search Procedure. (Motion Vector is (5,7))
Each motion compensated macroblock consists of four 8×8 luminance and two 8×8 chrominance prediction error blocks (difference blocks). These 8×8 blocks are transformed to generate 8×8 DCT coefficients, and these coefficients are quantized for compression.
3. PROPOSED VIDEO CODEC
The DCT and the quantization processes require many multiplications, which are computationally expensive. The standardized DCT block requires floating-point multipliers, and for an 8×8 block the evaluation of the coefficients requires 12 floating-point multipliers. The implementation of such a codec is more expensive because the complexity is concentrated in the floating-point multipliers. This disadvantage is overcome by replacing the floating-point DCT block with a multiplierless DCT block in which the coefficients are evaluated using Ramanujan ordered numbers. Computation of the DCT coefficients involves evaluating cosines of angles that are multiples of 2π/N. These are evaluated using a 4th-degree polynomial that approximates the cosine function with an approximation error of the order of $10^{-3}$ [13]. If N is
chosen such that 2π/N can be represented as $2^{-l} + 2^{-m}$, where l and m are integers, then the trigonometric functions can be evaluated recursively by simple shift and addition operations. Such integers are called Ramanujan ordered numbers. The use of Ramanujan ordered numbers for computing the DCT was outlined by the authors in [11, 12]. Matrix factorization of the transformation matrix reduces the complexity to $\frac{N}{2}\log_2 N$ shifts and $\frac{3N}{2}\log_2 N - N + 1$ additions [12], thereby eliminating the use of multipliers.
3.1 Multiplierless Ramanujan Ordered Number DCT (RDCT) [11, 12]
The 2-D Discrete Cosine Transform (DCT) can be defined as follows:
$$C(k_1,k_2) = \frac{4}{N_1 N_2}\sum_{n_1=0}^{N_1-1}\sum_{n_2=0}^{N_2-1} x(n_1,n_2)\cos\left[\frac{\pi(2n_1+1)k_1}{2N_1}\right]\cos\left[\frac{\pi(2n_2+1)k_2}{2N_2}\right] \qquad (2)$$
Neglecting the scaling factors and using the property of separability, the DCT equation can be written as:
$$C(k_1,k_2) = \sum_{n_1=0}^{N_1-1}\left[\sum_{n_2=0}^{N_2-1} x(n_1,n_2)\cos\left[\frac{\pi(2n_2+1)k_2}{2N_2}\right]\right]\cos\left[\frac{\pi(2n_1+1)k_1}{2N_1}\right] \qquad (3)$$
Thus, a 2-D $N_1 \times N_2$ DCT can be implemented by computing the row transformation followed by the column transformation. Hence, a 1-D transformation can be considered as a process of evaluating sequences of the form:
$$c_k = \sum_{n=0}^{N-1} x(n)\cos\left[2^{-2}\left(\frac{2\pi}{N}\right)(2n+1)k\right] \qquad (4)$$
3.1.1 Evaluation of Transform Coefficients Using Chebyshev Recursion
Computation of the DCT coefficients requires evaluation of sequences of the type
$$\left\{c_n \;\middle|\; c_n = p\cos\left(\frac{2\pi n}{N}\right),\; n = 0, 1, 2, \ldots, N-1,\; p \in \Re\right\} \qquad (5)$$
where $\Re$ is the set of real numbers. These computations are done via a Chebyshev-type recursion. Let us define
$$W(M,p) = \left\{w_n \;\middle|\; w_n = p\cos\left(\frac{2\pi n}{M}\right),\; n = 0, 1, \ldots, \Psi - 1\right\},\quad p \in \Re,\qquad \Psi = \frac{M}{4},\quad M = \beta N \qquad (6)$$
where β is equal to 1 if N is divisible by 4.
It is equal to 2 if N is divisible by 2 but not by 4. Otherwise (N is not divisible by 2), it is equal to 4. The use of β facilitates the computation of W(M, p) by considering cosine values from the first quadrant of the circle. Let us then define
$$x = 2^{-2}\left(\frac{2\pi}{N}\right), \qquad \therefore\; w_n = \cos\big((2n+1)\,x\big) \qquad (7)$$
x is then represented using a Ramanujan ordered number of degree 2 as $\hat{x} = 2^{-l} + 2^{-m}$, where l and m are non-negative integers. For example, if N = 8, then
$$x = 2^{-2}\left(\frac{2\pi}{8}\right) \cong 2^{-2}\left(2^{-1} + 2^{-2}\right), \qquad \hat{x} = 2^{-3} + 2^{-4}, \qquad \therefore\; w_n = \cos(n'\hat{x}) \qquad (8)$$
where $n'$ denotes the scaled and shifted time samples and $\hat{x}$ is the Ramanujan ordered number. These cosine values are evaluated by a cosine approximation using a 2nd-order polynomial. Let the polynomial be defined as
$$\alpha = \frac{\hat{x}^2}{2!}, \qquad \therefore\; t_n(\alpha) \cong \cos(n\hat{x}) \qquad (9)$$
The $t_n(\alpha)$ are then computed using the recursive equations
$$\begin{aligned} t_0(\alpha) &= 1 \\ t_1(\alpha) &= 1 - \alpha \\ t_{n+1}(\alpha) &= 2(1-\alpha)\,t_n(\alpha) - t_{n-1}(\alpha), \qquad n = 1, 2, \ldots, \Psi - 1 \end{aligned} \qquad (10)$$
It is observed that the above recursive equations are closely related to the Chebyshev polynomials of the first kind. Since the evaluation of the recursive equations involves only powers of two, the $t_n(\alpha)$ and therefore the $c_n(\alpha)$ can be computed by simple shift and addition operations. The RDCT kernel needs samples only at $(2n+1)$, and thus not all samples of $t_n(\alpha)$ need to be stored.
TABLE I. COMPARISON OF COMPUTATIONAL COMPLEXITY

| Operations | Floating-point DCT N×M [9] | Integer DCT N×M [10] | RDCT N×M [11] |
|---|---|---|---|
| Multiplications | $\frac{NM}{2}\log_2 M$ (floating-point) | $NM$ (integer) | Nil |
| Additions | ( ) ( ) 23 2 log 2 NM M NM N M+ − + + | ( )22 log 1 2 NM NM − + + | $\frac{3NM}{2}\log_2 M - 2NM + N + M$ |
| Lifting steps | Nil | $\frac{3N}{2}\log_2 N - 3N + 3$ | Nil |
| Shifts | Nil | Nil | $\frac{NM}{2}\log_2 M$ |
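The recursion of Eq. (10) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: it takes the N = 8 value $\hat{x} = 2^{-3} + 2^{-4}$, so that $\alpha = \hat{x}^2/2 = 2^{-6} + 2^{-9}$, and the 16-bit fixed-point format (`FRAC_BITS`) and both function names are hypothetical choices made for the example.

```python
import math

FRAC_BITS = 16  # assumed fixed-point precision; not specified in the paper


def chebyshev_cos_float(alpha, count):
    """Reference recursion: t0 = 1, t1 = 1 - alpha, t[n+1] = 2(1 - alpha) t[n] - t[n-1]."""
    t = [1.0, 1.0 - alpha]
    while len(t) < count:
        t.append(2.0 * (1.0 - alpha) * t[-1] - t[-2])
    return t[:count]


def chebyshev_cos_shift_add(count):
    """Same recursion for alpha = 2**-6 + 2**-9, using only integer shifts and additions."""
    one = 1 << FRAC_BITS
    t_prev = one
    t_curr = one - (one >> 6) - (one >> 9)  # 1 - alpha
    out = [t_prev, t_curr]
    while len(out) < count:
        # 2(1 - alpha) t = 2t - 2*alpha*t = (t << 1) - (t >> 5) - (t >> 8)
        t_next = (t_curr << 1) - (t_curr >> 5) - (t_curr >> 8) - t_prev
        t_prev, t_curr = t_curr, t_next
        out.append(t_next)
    return out[:count]


if __name__ == "__main__":
    alpha = 2 ** -6 + 2 ** -9
    reference = chebyshev_cos_float(alpha, 9)
    fixed = chebyshev_cos_shift_add(9)
    for n, (r, f) in enumerate(zip(reference, fixed)):
        print(n, round(r, 6), round(f / (1 << FRAC_BITS), 6))
```

The floating-point version reproduces $\cos(n\theta)$ exactly for $\theta = \arccos(1-\alpha)$, which is the Chebyshev relation noted above; the integer version shows why no multiplier is needed once $\alpha$ is a sum of powers of two.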
Table I compares the computational complexity of the proposed algorithm with that of the existing algorithms. To compute an N×M DCT, the proposed algorithm takes $\frac{3NM}{2}\log_2 M - 2NM + N + M$ additions and $\frac{NM}{2}\log_2 M$ shift operations. Thus, for N = M = 8, the evaluation of an 8×8 block DCT requires 96 shift operations and 176 addition operations. Because the proposed algorithm is recursive, the trigonometric values need not be stored.
4. SIMULATION RESULTS
To demonstrate the efficacy of the proposed algorithm on an MPEG video codec, the results were compared with those of the existing algorithm of the standard MPEG-2 video codec, and the results are tabulated. The proposed RDCT is tested by replacing the two-dimensional DCT block in the MPEG-2 standard algorithm with the 2-D RDCT block. The performance is then compared with that of the commonly used multiplierless 2-D integer DCT. The DC coefficient is quantized, coded separately, and transmitted. The AC coefficients are encoded with very few coefficients, removing the blocks whose coefficients are all zero. Table II gives the average PSNR between the original and decoded frames, using 60 frames of the input video sequence (video grabbed at 30 fps), with a GOP (group of pictures) of 10 and the encoding format I1P4B2B3P7B5B6I10B8B9. The step size is taken as 10 so that all 10 frames are decoded in the display format I1B2B3P4B5B6P7B8B9I10. The simulation has been evaluated for both forward and bidirectional prediction, and the results show that in both formats the proposed RDCT closely matches the floating-point DCT and gives slightly better results than the integer DCT.
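The relation between the display format and the encoding format of a GOP can be illustrated with a short sketch. It relies on the usual MPEG convention that each anchor frame (I or P) is coded before the B frames that precede it in display order; the function name and the string labels for frames are hypothetical, chosen only for this example.

```python
def display_to_coding_order(display):
    """Reorder frames so each I/P anchor precedes the B frames predicted from it."""
    coding = []
    pending_b = []
    for frame in display:
        if frame.startswith("B"):
            # B frames wait until the next anchor has been emitted.
            pending_b.append(frame)
        else:
            coding.append(frame)
            coding.extend(pending_b)
            pending_b = []
    coding.extend(pending_b)  # trailing B frames with no later anchor
    return coding


if __name__ == "__main__":
    display = ["I1", "B2", "B3", "P4", "B5", "B6", "P7", "B8", "B9", "I10"]
    print("".join(display_to_coding_order(display)))
```

Applied to the display format used in the simulation, the two B frames between P7 and I10 are emitted after I10, because they are predicted from both of those anchors.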
From Table II it is clear that the proposed RDCT offers the same accuracy in average PSNR as the floating-point DCT, with a deviation of 0.01%, and a deviation of 0.01% from the integer DCT, for a standard test sequence such as Alex.avi. For the real-time data sequence, the deviation in PSNR of the RDCT is 0.005% from the floating-point DCT and 0.08% from the integer DCT. This shows that using the RDCT in the video codec provides better reconstructed picture quality than the integer DCT.
TABLE II. AVERAGE PSNR IN dB OF THE DECODED FRAMES

| Test Sequence | Frame Format | Floating-point DCT [9] | Integer DCT [10] | Multiplierless RDCT |
|---|---|---|---|---|
| Real time Data (Frames grabbed at 30 fps) | IBBPBBPBBP | 35.7010 | 35.6694 | 35.6991 |
| Real time Data (Frames grabbed at 30 fps) | IPPPPPPPPP | 33.6581 | 33.6132 | 33.6525 |
| San_Fran_Traffic | IBBPBBPBBP | 34.8076 | 34.7776 | 34.7996 |
| San_Fran_Traffic | IPPPPPPPPP | 31.7476 | 31.7176 | 31.7462 |
| Alex | IBBPBBPBBP | 36.0928 | 36.0809 | 36.0876 |
| Alex | IPPPPPPPPP | 31.583 | 31.5756 | 31.579 |

The Structural Similarity (SSIM) index seeks to separately discover differences in local image luminance l(x,y), contrast c(x,y) and structure s(x,y) between the original and compensated images. Given the pixel points (x,y), the SSIM is defined as
$$SSIM(x,y) = l(x,y)\cdot c(x,y)\cdot s(x,y) = \frac{2\mu_x\mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}\cdot\frac{2\sigma_x\sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}\cdot\frac{\sigma_{xy} + C_3}{\sigma_x\sigma_y + C_3} \qquad (11)$$
where $\mu_x$, $\mu_y$, $\sigma_x$, $\sigma_y$ and $\sigma_{xy}$ are the local sample means, standard deviations, and cross-covariance of x and y. The constants $C_1$, $C_2$, $C_3$ stabilize SSIM when the means and variances become small.
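Eq. (11) can be evaluated for a single window of pixels as in the sketch below. The paper does not give values for the constants, so the defaults here are an assumption: the values commonly used for 8-bit images, $C_1 = (0.01 \cdot 255)^2$, $C_2 = (0.03 \cdot 255)^2$ and $C_3 = C_2/2$. The function name and the sample pixel values are hypothetical, and a full SSIM computation would average this quantity over many local windows.

```python
import math

# Assumed constants for 8-bit images; the paper does not state its values.
C1 = (0.01 * 255) ** 2
C2 = (0.03 * 255) ** 2
C3 = C2 / 2


def ssim_window(x, y, c1=C1, c2=C2, c3=C3):
    """SSIM of two equally sized pixel windows (flat lists), following Eq. (11)."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # Unbiased sample variances and cross-covariance.
    var_x = sum((a - mu_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mu_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    sd_x = math.sqrt(var_x)
    sd_y = math.sqrt(var_y)
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast = (2 * sd_x * sd_y + c2) / (var_x + var_y + c2)
    structure = (cov_xy + c3) / (sd_x * sd_y + c3)
    return luminance * contrast * structure


if __name__ == "__main__":
    original = [52, 55, 61, 59, 79, 61, 76, 61]
    decoded = [55, 53, 65, 54, 80, 58, 78, 57]
    print(ssim_window(original, original))
    print(ssim_window(original, decoded))
```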
The SSIM index varies between 0 (worst) and 1 (best). Table III shows the average SSIM between the decoded frames and the original frames.
TABLE III. AVERAGE SSIM BETWEEN THE DECODED AND ORIGINAL FRAMES

| Test Sequence | Frame Format | Floating-point DCT [9] | Integer DCT [10] | Multiplierless RDCT |
|---|---|---|---|---|
| Real time Data (Frames grabbed at 30 fps) | IBBPBBPBBP | 0.9223 | 0.9218 | 0.9223 |
| Real time Data (Frames grabbed at 30 fps) | IPPPPPPPPP | 0.921645 | 0.921628 | 0.921635 |
| San_Fran_Traffic | IBBPBBPBBP | 0.85028 | 0.85020 | 0.85026 |
| San_Fran_Traffic | IPPPPPPPPP | 0.85701 | 0.85014 | 0.85693 |
| Alex | IBBPBBPBBP | 0.8689 | 0.8684 | 0.8690 |
| Alex | IPPPPPPPPP | 0.8678 | 0.8668 | 0.8682 |

From Table III, it is clear that the quality of decoding is very good with the RDCT and matches the performance of the floating-point DCT. This is confirmed by taking the difference frame between the reference frame and the decoded frame. The difference frames are shown in Figures 3a and 3b. The difference between the RDCT and the floating-point DCT in terms of SSIM is 0.01% for a standard test sequence such as Alex.avi, and the difference between the RDCT and the integer DCT in terms of SSIM is 0.07% for the same test sequence. For the real-time data, the difference in SSIM between the RDCT and the floating-point DCT is 0, and between the RDCT and the integer DCT it is 0.05%. These values indicate that the frame reconstructed with the proposed RDCT is very good in subjective quality when compared with the frame reconstructed with the integer DCT.
FIGURE 3a. Difference between original and decoded frame (real time sequence)
FIGURE 3b. Difference between original and decoded frame (San_Fran_Traffic.avi)
Table IV shows the comparison of the computation time for decoding the I reference frame and decoding 60 frames, with different algorithms and a GOP of 10 frames. The computation was performed on an Intel Core 2 Duo processor at 1.80 GHz.
TABLE IV. DECODING TIME IN SECONDS

| Test Sequence | Decoding frame | Floating-point DCT | Integer DCT | Multiplierless RDCT |
|---|---|---|---|---|
| Real time Data (Frames grabbed at 30 fps) | Reference frame | 0.191 | 0.313 | 0.125 |
| Real time Data (Frames grabbed at 30 fps) | 10 frames | 12.609 | 14.953 | 6.562 |
| San_Fran_Traffic | Reference frame | 0.296 | 0.308 | 0.109 |
| San_Fran_Traffic | 10 frames | 12.322 | 14.641 | 6.335 |
| Alex | Reference frame | 0.245 | 0.325 | 0.125 |
| Alex | 10 frames | 12.484 | 14.547 | 6.593 |

Table IV shows that, for the real-time data sequence, the proposed RDCT reduces the decoding time for 10 frames by 47.9578% compared with the floating-point DCT and by 56.1158% compared with the commonly used integer DCT. For a standard data sequence such as Alex.avi, the reduction is 47.1884% compared with the floating-point DCT and 54.6779% compared with the integer DCT.
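The percentages quoted for Table IV follow from the relative reduction in decoding time, (baseline − RDCT) / baseline. A minimal sketch that recomputes them from the 10-frame rows of Table IV is given below; the dictionary layout and the function name are choices made for this example only.

```python
# Decoding times in seconds for 10 frames, copied from Table IV.
DECODE_TIME_10_FRAMES = {
    "Real time Data": {"float": 12.609, "integer": 14.953, "rdct": 6.562},
    "San_Fran_Traffic": {"float": 12.322, "integer": 14.641, "rdct": 6.335},
    "Alex": {"float": 12.484, "integer": 14.547, "rdct": 6.593},
}


def percent_reduction(baseline, proposed):
    """Relative time saving of `proposed` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - proposed) / baseline


if __name__ == "__main__":
    for name, t in DECODE_TIME_10_FRAMES.items():
        vs_float = percent_reduction(t["float"], t["rdct"])
        vs_integer = percent_reduction(t["integer"], t["rdct"])
        print(f"{name}: {vs_float:.4f}% vs floating-point DCT, "
              f"{vs_integer:.4f}% vs integer DCT")
```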
FIGURE 4. Comparison plot for sequences ‘Real-time sequence’, ‘San_Fran_Traffic’ & ‘Alex’
Figure 4 gives a clearer comparison of the execution times for decoding 10 frames using the different algorithms, namely the RDCT, the floating-point DCT and the integer DCT. The plot shows that the RDCT outperforms both the floating-point DCT and the integer DCT. This improvement in decoding time is due to the reduced computational complexity of the DCT algorithm.
5. CONCLUSION
A computationally less complex video coding technique using the multiplierless Ramanujan ordered number DCT is presented in this paper. The method evaluates the cosine function using only integers that are powers of 2, thereby replacing the complex floating-point multiplications by shifters and adders. The algorithm takes $\frac{N}{2}\log_2 N$ shifts and $\frac{3N}{2}\log_2 N - N + 1$ addition operations to evaluate an N-point DCT. The cosine approximation increases the overhead on the number of adders by 13.6% but totally avoids floating-point multiplications. The reduction in complexity is reflected in the time required for decoding the video frames. There is an improvement of 58% over the existing, commonly used integer DCT video codec. The average SSIM and average PSNR values indicate that the quality of decoding using the RDCT is the same as that of the integer DCT. Hence, the proposed algorithm is an efficient multiplierless transform for video coding that offers lower computational complexity while assuring the same quality as the existing algorithms.
6. REFERENCES
[1] ITU-T Recommendation H.261, “Video codecs for audiovisual services at p x 64 kb/s,” Mar. 1993.
[2] ITU-T Recommendation H.263, “Video coding for low bitrate communication,” Mar. 1996.
[3] ISO/IEC 11172-2, “Information technology - coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbit/s: Part 2 Video,” Aug. 1993.
[4] ITU-T Recommendation H.262 | ISO/IEC 13818-2, “Information technology - generic coding of moving pictures and associated audio information: video,” 1995.
[5] ISO/IEC 13818-2, “Generic Coding of Moving Pictures and Associated Audio Information: Video”.
[6] ATSC document A/54, “Guide to the Use of the ATSC Digital Television Standard”.
[7] Renxiang Li, Bing Zeng and Ming L. Liou, “A New Three-Step Search Algorithm for Block Motion Estimation”, IEEE Trans. Circuits and Systems for Video Technology, Vol. 4, No. 4, pp. 438-442, Aug 1994.
[8] Aroh Barjatya, “Block Matching Algorithms For Motion Estimation”, DIP 6620 Spring 2004 Final Project Paper.
[9] H.S. Hou, “A Fast Recursive Algorithm for Computing the Discrete Cosine Transform”, IEEE Trans. Acoust., Speech, Signal Processing, Vol. 35, pp. 1455-1461, Oct 1987.
[10] Yonghong Zeng, Lizhi Cheng, Guoan Bi, and Alex C. Kot, “Integer DCT’s and Fast Algorithms”, IEEE Signal Proc.141-14 (2000).
[11] Geetha.K.S, V.K.Ananthashayana, “A Novel Recursive Multiplierless Algorithm for 2-D DCT”, Proc. ICSPCN 2009, Aug 2009.
[12] Geetha.K.S, M.Uttarakumari, “Multiplierless Recursive algorithm using Ramanujan ordered Numbers,” IETE Journal of Research, vol. 56, Issue 4, Jul-Aug 2010.
[13] Geetha.K.S, M.Uttarakumari, “A Novel Cosine approximation for high-speed evaluation of DCT”, International Journal of Image Processing, CSC Journals, Volume 4, Issue 6, pp. 539-548, Jan-Feb 2011.