International Journal for Modern Trends in Science and Technology (IJMTST)
Volume: 2 | Issue: 09 | September 2016 | ISSN: 2455-3778
An FPGA Based Floating Point Arithmetic Unit Using Verilog
T. Ramesh¹ | G. Koteshwar Rao²
¹PG Scholar, Vaagdevi College of Engineering, Telangana.
²Assistant Professor, Vaagdevi College of Engineering, Telangana.
ABSTRACT
Floating point (FP) multiplication is widely used in a large set of scientific and signal
processing computations, where it is one of the most common arithmetic operations. A high
speed double precision floating point multiplier is implemented on a Virtex-6 FPGA. The
proposed design is compliant with the IEEE-754 format and handles overflow, underflow,
rounding and various exception conditions. The design achieved an operating frequency of
414.714 MHz with an area of 648 slices.
KEYWORDS: Double precision, Floating point, Multiplier, FPGA, IEEE-754.
Copyright © 2016 International Journal for Modern Trends in Science and Technology
All rights reserved.
I. INTRODUCTION
Real numbers represented in binary format are known as floating point numbers. The
IEEE-754 standard classifies floating point formats into binary and decimal interchange
formats. Floating point multipliers are very important in DSP applications.
This paper focuses on the double precision normalized binary interchange format. Figure 1
shows the IEEE-754 double precision binary format representation. The sign (S) is
represented with one bit, and the exponent (E) and fraction (M, or mantissa) are
represented with eleven and fifty-two bits respectively. For a number to be normalized, it
must have a one in the MSB of the significand, and its exponent must be greater than zero
and smaller than 2047. The real number is represented by equations (1) & (2).
Figure 1. IEEE-754 double precision floating point format
$\text{Value} = (-1)^{S} \times M \times 2^{E}$
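As a rough illustration of the format described above, the following Python sketch unpacks a double into its sign, exponent and fraction fields and rebuilds the value of a normalized number. It is a software model only, not the Verilog of the design. The helper names `unpack_double` and `value_of` are hypothetical, and the sketch assumes that the significand is the fraction with an implicit leading one and that the stored exponent carries a bias of 1023.

```python
import struct

BIAS = 1023  # exponent bias for IEEE-754 double precision

def unpack_double(x):
    """Split a Python float into (sign, exponent field, fraction field)."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF      # 11-bit biased exponent
    fraction = bits & ((1 << 52) - 1)    # 52-bit fraction
    return sign, exponent, fraction

def value_of(sign, exponent, fraction):
    """Rebuild the value of a normalized number (0 < exponent < 2047)."""
    significand = 1 + fraction / 2**52   # implicit leading one: 1.fraction
    return (-1) ** sign * significand * 2.0 ** (exponent - BIAS)
```

For example, -6.5 is -1.625 × 2^2, so it unpacks to sign 1, exponent field 1025 and a fraction whose two top bits encode 0.625.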
Floating point implementation on FPGAs has been of interest to many researchers. In [1],
an IEEE-754 single precision pipelined floating point multiplier is implemented on multiple
FPGAs (4 Actel A1280). Nabeel Shirazi, Walters, and Peter Athanas implemented a custom
16/18 bit three stage pipelined floating point multiplier that doesn't support rounding
modes [2]. L. Louca, T. A. Cook, and W. H. Johnson [3] implemented a single precision
floating point multiplier using a digit-serial multiplier and an Altera FLEX 8000. The
design achieved 2.3 MFlops and doesn't support rounding modes. In [4], a parameterizable
floating point multiplier is implemented using a five stage pipeline, Handel-C software and
a Xilinx XCV1000 FPGA. The design achieved 28 MFlops. The floating point unit in [5] is
implemented using the primitives of a Xilinx Virtex II FPGA. The design achieved an
operating frequency of 100 MHz with a latency of 4 clock cycles. Mohamed Al-Ashrafy,
Ashraf Salem, and Wagdy Anis [6] implemented an efficient IEEE-754 single precision
floating point multiplier targeted for a Xilinx Virtex-5 FPGA. The multiplier handles the
overflow and underflow cases, but rounding is not implemented. The design achieves
301 MFLOPs with a latency of three clock cycles. The multiplier was verified against the
Xilinx floating point multiplier core.
II. FLOATING POINT MULTIPLICATION ALGORITHM
Multiplying two numbers in floating point format is done by:
1. Adding the exponents of the two numbers, then subtracting the bias from the result.
2. Multiplying the significands of the two numbers.
3. Calculating the sign by XORing the signs of the two numbers.
In order to represent the multiplication result as a normalized number, there should be
a 1 in the MSB of the result (leading one).
The following steps are necessary to multiply two floating point numbers:
1. Multiplying the significands, i.e. (1.M1 * 1.M2)
2. Placing the decimal point in the result
3. Adding the exponents, i.e. (E1 + E2 - Bias)
4. Obtaining the sign, i.e. s1 xor s2
5. Normalizing the result, i.e. obtaining a 1 at the MSB of the result's significand
6. Rounding the result to fit in the available bits
7. Checking for underflow/overflow occurrence
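The steps above can be sketched as a small bit-level software model. This is a simplified Python illustration, not the Verilog of the design: it assumes normalized inputs and a normalized result (no zeros, denormals, infinities or NaNs), it always rounds to nearest even, and it normalizes by choosing the shift amount rather than by shifting left. The names `fields` and `fp_mul` are hypothetical.

```python
import struct

BIAS = 1023  # double precision exponent bias

def fields(x):
    """Unpack a float into (sign, biased exponent, 52-bit fraction)."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    return bits >> 63, (bits >> 52) & 0x7FF, bits & ((1 << 52) - 1)

def fp_mul(a, b):
    """Multiply two normalized doubles step by step (simplified model)."""
    s1, e1, m1 = fields(a)
    s2, e2, m2 = fields(b)
    # Step 1: multiply the significands 1.M1 * 1.M2 (53 x 53 -> 106 bits)
    product = ((1 << 52) | m1) * ((1 << 52) | m2)
    # Step 3: add the exponents and subtract the bias once
    exponent = e1 + e2 - BIAS
    # Step 4: the sign is the XOR of the operand signs
    sign = s1 ^ s2
    # Steps 2 and 5: the leading one sits at bit 105 or bit 104; normalize
    if product >> 105:
        shift, exponent = 53, exponent + 1
    else:
        shift = 52
    significand = product >> shift            # 53 bits including the leading one
    remainder = product & ((1 << shift) - 1)  # bits dropped by the shift
    # Step 6: round to nearest, ties to even
    half = 1 << (shift - 1)
    if remainder > half or (remainder == half and significand & 1):
        significand += 1
        if significand >> 53:                 # rounding carried out of 53 bits
            significand >>= 1
            exponent += 1
    # Step 7: this model only reports overflow/underflow, it does not handle them
    if not 0 < exponent < 2047:
        raise ValueError("result is outside the normalized range")
    bits = (sign << 63) | (exponent << 52) | (significand & ((1 << 52) - 1))
    return struct.unpack(">d", struct.pack(">Q", bits))[0]
```

Because the host floating point unit also rounds to nearest even by default, `fp_mul(a, b)` can be compared directly with `a * b` for operands that stay in the normalized range.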
III. IMPLEMENTATION OF DOUBLE PRECISION FLOATING POINT MULTIPLIER
In this paper we implemented a double precision floating point multiplier with exceptions
and rounding. Figure 2 shows the multiplier structure, which includes exponent addition,
significand multiplication, and sign calculation. Figure 3 shows the multiplier, exceptions
and rounding blocks, which are independent and operate in parallel.
Figure 3. Multiplier structure with rounding and exceptions
A. Multiplier
The black box view of the double precision floating point multiplier is shown in Figure 4.
The multiplier receives two 64-bit floating point numbers. First, these numbers are
unpacked by separating them into sign, exponent, and mantissa bits. The sign logic is a
simple XOR. The exponents of the two numbers are added and the bias, i.e. 1023, is
subtracted from the sum. The mantissa multiplier block performs the multiplication. The
output of the mantissa multiplication is then normalized, i.e., if the MSB of the result is
not 1, it is left shifted to make the MSB 1. If the mantissa is shifted, the corresponding
change has to be made in the exponent as well.
The multiplication operation is performed in the module (fpu_mul). The mantissa of
operand A and the leading '1' (for normalized numbers) are stored in the 53-bit register
(mul_a). The mantissa of operand B and the leading '1' (for normalized numbers) are stored
in the 53-bit register (mul_b). Multiplying all 53 bits of mul_a by all 53 bits of mul_b
results in a 106-bit product. 53 bit by 53 bit multipliers are not available in the most
popular Xilinx and Altera FPGAs, so the multiply is broken down into smaller multiplies
whose results are added together to give the final 106-bit product. The module (fpu_mul)
breaks up the multiply into smaller 24-bit by 17-bit multiplies. The Xilinx Virtex-6 device
contains DSP48E1 slices with 25 by 18 two's complement multipliers, which can perform a
24-bit by 17-bit unsigned multiply.
The multiply in module (fpu_mul) is broken up as follows:
product_a = mul_a[23:0] * mul_b[16:0]
product_b = mul_a[23:0] * mul_b[33:17]
product_c = mul_a[23:0] * mul_b[50:34]
product_d = mul_a[23:0] * mul_b[52:51]
product_e = mul_a[40:24] * mul_b[16:0]
product_f = mul_a[40:24] * mul_b[33:17]
product_g = mul_a[40:24] * mul_b[52:34]
product_h = mul_a[52:41] * mul_b[16:0]
product_i = mul_a[52:41] * mul_b[33:17]
product_j = mul_a[52:41] * mul_b[52:34]
Figure 4. Black box view of floating point double precision
multiplier
The products (a-j) are added together, with the
appropriate offsets based on which part of the
mul_a and mul_b arrays they are multiplying.
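The following Python sketch models how the ten partial products can be recombined. It assumes that the offset of each partial product is the sum of the low bit indices of its two slices, which is what makes the sum equal the full 106-bit product. The function name `partial_product_multiply` and the helper `field` are hypothetical, and this is a software check of the decomposition rather than the hardware adder structure.

```python
def partial_product_multiply(mul_a, mul_b):
    """Rebuild the 106-bit product of two 53-bit values from products a to j."""
    def field(value, hi, lo):
        # Software equivalent of the Verilog part-select value[hi:lo]
        return (value >> lo) & ((1 << (hi - lo + 1)) - 1)

    # (a_hi, a_lo, b_hi, b_lo) for product_a through product_j
    slices = [
        (23, 0, 16, 0), (23, 0, 33, 17), (23, 0, 50, 34), (23, 0, 52, 51),
        (40, 24, 16, 0), (40, 24, 33, 17), (40, 24, 52, 34),
        (52, 41, 16, 0), (52, 41, 33, 17), (52, 41, 52, 34),
    ]
    total = 0
    for a_hi, a_lo, b_hi, b_lo in slices:
        partial = field(mul_a, a_hi, a_lo) * field(mul_b, b_hi, b_lo)
        total += partial << (a_lo + b_lo)  # offset is the sum of the low indices
    return total
```

The slices of mul_b paired with each slice of mul_a cover bits 52 down to 0 exactly once, so the result matches a direct multiplication for any pair of 53-bit operands.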
The exponent fields of operands A and B are added together and then the value (1023) is
subtracted from the sum. If the resultant exponent is less than 0, then the (product)
register needs to be right shifted by that amount. This value is stored in register
(exponent_under). The final exponent of the output operand will be 0 in this case, and the
result will be a denormalized number. If exponent_under is greater than 52, then the
mantissa will be shifted out of the product register, the output will be 0, and the
"underflow" signal will be asserted. The mantissa output from the (fpu_mul) module is in
the 56-bit register (product_7). The MSB is a leading '0' to allow for a potential overflow
in the rounding module. This '0' is followed by the leading '1' for normalized numbers, or
'0' for denormalized numbers. Then the 52 bits of the mantissa follow. Two extra bits
follow the mantissa and are used for rounding purposes. The first extra bit is taken from
the next bit after the mantissa in the 106-bit product result of the multiply. The second
extra bit is an OR of the 52 LSBs of the 106-bit product.
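The 56-bit layout described above can be sketched in Python as follows. The sketch assumes that the 106-bit product has already been aligned so that the leading bit is at bit 105; the function name `pack_product_7` is hypothetical and the code only models the bit packing, not the shifting logic of the module.

```python
def pack_product_7(product):
    """Pack an aligned 106-bit product into the 56-bit product_7 layout.

    Layout, MSB first: 0 | leading bit | 52 mantissa bits | guard | sticky.
    """
    # Bits [105:52]: leading bit, 52 mantissa bits, and the first extra bit
    top = (product >> 52) & ((1 << 54) - 1)
    # Second extra bit: OR of the 52 least significant bits
    sticky = 1 if product & ((1 << 52) - 1) else 0
    # Bit 55 stays 0 so that a carry from rounding has somewhere to go
    return (top << 1) | sticky
```

For example, a product of exactly 2^105 packs to 2^54, and setting any of the low 52 bits only changes the final (sticky) bit.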
B. Rounding and Exceptions
The IEEE standard specifies four rounding modes: round to nearest, round to zero, round to
positive infinity, and round to negative infinity. Table 1 shows the rounding modes
selected for various bit combinations of rmode. When rounding changes the mantissa, the
corresponding change has to be made in the exponent as well.
Table 1: Rounding modes selected for various bit combinations of rmode
Bit combination | Rounding mode
00 | round to nearest even
01 | round to zero
10 | round up
11 | round down
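The four modes can be illustrated with a short Python sketch that rounds a 56-bit value laid out as leading 0, leading bit, 52 mantissa bits and two extra rounding bits. The function name `round_significand` and its return convention are hypothetical; the increment conditions are the usual IEEE-754 ones and are an assumption about the rounding module rather than a transcription of it.

```python
def round_significand(sign, product_7, rmode):
    """Round a 56-bit product_7 value according to the 2-bit rmode.

    rmode: 0b00 nearest even, 0b01 toward zero, 0b10 up, 0b11 down.
    Returns (significand, inexact). The significand can reach 2**53 when
    rounding carries out, in which case the caller must shift it right by
    one and increment the exponent.
    """
    significand = product_7 >> 2      # leading 0, leading bit, 52 mantissa bits
    guard = (product_7 >> 1) & 1      # first extra bit
    sticky = product_7 & 1            # second extra bit (OR of discarded bits)
    inexact = bool(guard or sticky)
    if rmode == 0b00:
        increment = guard and (sticky or (significand & 1))
    elif rmode == 0b01:
        increment = False
    elif rmode == 0b10:
        increment = inexact and sign == 0   # toward positive infinity
    elif rmode == 0b11:
        increment = inexact and sign == 1   # toward negative infinity
    else:
        raise ValueError("rmode must be a 2-bit value")
    if increment:
        significand += 1
    return significand, inexact
```

The leading '0' kept above the significand is what absorbs the carry when an all-ones mantissa is rounded up.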
In the exceptions module, all of the special
cases are checked for, and if they are found, the
appropriate output is created, and the individual
output signals of underflow, overflow, inexact,
exception, and invalid will be asserted if the
conditions for each case exist.
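The text does not enumerate the special cases, so the following Python sketch assumes the standard IEEE-754 ones for multiplication: NaN operands, infinity times zero (invalid), infinity times a finite value, and zero times a finite value. The function names `classify` and `special_case` and the string labels are hypothetical, and signaling NaNs are not modeled.

```python
def classify(exponent, fraction):
    """Classify an operand from its 11-bit exponent and 52-bit fraction fields."""
    if exponent == 0x7FF:
        return "nan" if fraction else "inf"
    if exponent == 0:
        return "zero" if fraction == 0 else "denormal"
    return "normal"

def special_case(class_a, class_b):
    """Return (result_kind, invalid) for a multiply, or None if no special case applies."""
    kinds = {class_a, class_b}
    if "nan" in kinds:
        return "nan", False   # a quiet NaN operand propagates to the output
    if kinds == {"inf", "zero"}:
        return "nan", True    # infinity times zero is an invalid operation
    if "inf" in kinds:
        return "inf", False
    if "zero" in kinds:
        return "zero", False
    return None               # ordinary operands: use the multiplier result
```

In this model the sign of an infinite or zero result would still come from the XOR of the operand signs, as in the normal data path.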
IV. RESULTS
The double precision floating point multiplier design was simulated in ModelSim 6.6c and
synthesized using Xilinx ISE 12.2i, targeting a Virtex-6 FPGA. The simulation results of
the 64-bit double precision floating point multiplier are shown in Figure 5. The 'opa' and
'opb' signals are the inputs and 'out' is the output. Table 2 shows the device utilization
for implementing the circuit on the Virtex-6 FPGA. Table 3 shows the timing summary of the
double precision floating point multiplier. Table 4 shows the area and operating frequency
of the double precision floating point multiplier, the single precision floating point
multiplier [6] and the Xilinx core respectively. M. Al-Ashrafy, A. Salem and W. Anis [6]
implemented a single precision floating point multiplier that occupies an area of 604
slices with an operating frequency of 301.114 MHz, whereas the Xilinx core occupies an
area of 266 slices with an operating frequency of 221.484 MHz. The implemented design thus
provides a higher operating frequency (414.714 MHz) together with the greater accuracy of
double precision.
Table 2: Device utilization summary (Virtex-6 6vlx75tff484-3) of double precision floating point multiplier
Logic Utilization | Used
Number of slice registers (flip-flops) | 1,998
Number of slice LUTs | 2,181
Number of occupied slices | 648
Number of bonded IOBs | 203
Table 3: Timing summary of double precision floating point multiplier
Parameter | Value
Minimum period (ns) | 2.411
Maximum frequency (MHz) | 414.714
Figure 5. Simulation results of double precision floating point multiplier
Table 4: Area and operating frequency of double precision floating point multiplier, single precision floating point multiplier [6] and Xilinx core
Device parameters | Present work (double precision) | M. Al-Ashrafy, A. Salem and W. Anis [6] (single precision) | Xilinx core (single precision)
No. of slices | 648 | 604 | 266
V. CONCLUSION
The double precision floating point multiplier supports the IEEE-754 binary interchange
format and is targeted at a Xilinx Virtex-6 xc6vlx75t-3ff484 FPGA. The design achieved an
operating frequency of 414.714 MHz with an area of 648 slices. Compared with the single
precision floating point multiplier [6] and the Xilinx core, the implemented design
provides high speed and supports double precision, which gives more accuracy than single
precision. This design handles overflow, underflow, and the truncation rounding mode.
REFERENCES
[1] B. Fagin and C. Renard, "Field Programmable Gate Arrays and Floating Point Arithmetic," IEEE Transactions on VLSI, vol. 2, no. 3, pp. 365-367, 1994.
[2] N. Shirazi, A. Walters, and P. Athanas, "Quantitative Analysis of Floating Point Arithmetic on FPGA Based Custom Computing Machines," Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines (FCCM'95), pp. 155-162, 1995.
[3] L. Louca, T. A. Cook, and W. H. Johnson, "Implementation of IEEE Single Precision Floating Point Addition and Multiplication on FPGAs," Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines (FCCM'96), pp. 107-116, 1996.
[4] A. Jaenicke and W. Luk, "Parameterized Floating-Point Arithmetic on FPGAs," Proc. of IEEE ICASSP, 2001, vol. 2, pp. 897-900.
[5] B. Lee and N. Burgess, "Parameterisable Floating-point Operations on FPGA," Conference Record of the Thirty-Sixth Asilomar Conference on Signals, Systems, and Computers, 2002.
[6] Mohamed Al-Ashrafy, Ashraf Salem, and Wagdy Anis, "An Efficient Implementation of Floating Point Multiplier," Saudi International Electronics, Communications and Photonics Conference (SIECPC), pp. 1-5, 24-26 April 2011.
An FPGA Based Floating Point Arithmetic Unit Using Verilog

  • 1. 53 International Journal for Modern Trends in Science and Technology Volume: 2 | Issue: 09 | September 2016 | ISSN: 2455-3778IJMTST An FPGA Based Floating Point Arithmetic Unit Using Verilog T. Ramesh1 | G. Koteshwar Rao2 1PG Scholar, Vaagdevi College of Engineering, Telangana. 2Assistant Professor, Vaagdevi College of Engineering, Telangana. Floating Point (FP) multiplication is widely used in large set of scientific and signal processing computation. Multiplication is one of the common arithmetic operations in these computations. A high speed floating point double precision multiplier is implemented on a Virtex-6 FPGA. In addition, the proposed design is compliant with IEEE-754 format and handles over flow, under flow, rounding and various exception conditions. The design achieved the operating frequency of 414.714 MHz with an area of 648 slices. KEYWORDS: Double precision, Floating point, Multiplier, FPGA, IEEE-754. Copyright © 2016 International Journal for Modern Trends in Science and Technology All rights reserved. I. INTRODUCTION The real numbers represented in binary format are known as floating point numbers. Based on TEEE-754 standard, floating point formats are classified into binary and decimal interchange formats. Floating point multipliers are very important in DSP applications. This paper focuses on double precision normalized binary interchange format. Figure I shows the TEEE-754 double precision binary format representation. Sign (S) is represented with one bit, exponent (E) and fraction (M or Mantissa) are represented with eleven and fifty two bits respectively. For a number is said to be a normalized number, it must consist of'one' in the MSB of the significant and exponent is greater than zero and smaller than 1023. The real number is represented by equations (I) & (2). Figurel. TEEE-754 double precision floating point format Value= -1S × M × 2E Floating point implementation on FPGAs has been the interest of many researchers. 
In [I], an TEEE-754 single precision pipelined floating point multiplier is implemented on multiple FPGAs (4 Actel AI280). Nabeel Shirazi, Walters, and Peter Athanas implemented custom 16/18 bit three stage pipelined floating point multiplier, that doesn't support rounding modes [2]. L.Louca, T.A.Cook, W.H. Johnson [3] implemented a single precision floating point multiplier by using a digit-serial multiplier and Altera FLEX 8000. The design achieved 2.3 MFlops and doesn't support rounding modes. In [4], a parameterizable floating point multiplier is implemented using five stages pipeline, Handel-C software and Xilinx XCYIOOO FPGA.The design achieved the operating frequency of 28MFlops. The floating point unit [5] is implemented using the primitives of Xilinx Yirtex IT FPGA. The design achieved the operating frequency of 100 MHz with a latency of 4 clock cycles. Mohamed AI-Ashraf}', Ashraf Salem, and Wagdy Anis [6] implemented an efficient TEEE-754 single precision floating point multiplier and targeted for Xilinx Yirtex-5 FPGA. The multiplier handles the overflow and underflow cases but rounding is not implemented. The design achieves 30 I MFLOPs with latency of three clock cycles. The multiplier was verified against Xilinx floating point multiplier core. ABSTRACT
  • 2. 54 International Journal for Modern Trends in Science and Technology An FPGA Based Floating Point Arithmetic Unit Using Verilog II. FLOATING POINT MULTIPLICATION ALGORITHM Multiplying two numbers in floating point format is done by 1. Adding the exponent of the two numbers then subtracting the bias from their result. 2.Multiplying the significant of the two numbers 3.Calculating the sign by XORing the sign of the two numbers. In order to represent the multiplication result as a normalized number there should be I in the MSB of the result (leading one). The following steps are necessary to multiply two floating point numbers. The following steps are necessary to multiply two floating point numbers. 1.Multiplying the significant i.e. (I.MI * I.M2) 2.Placing the decimal point in the result 3.Adding the exponents i.e. (E I + E2 - Bias) 4. Obtaining the sign i.e. sl xor s2 5.Normalizing the result i.e. obtaining I at the MSB of the results "significand" 6.Rounding the result to fit in the available bits 7.Checking for underflow/overflow occurrence III. IMPLEMENTATION OF DOUBLE PRECISION FLOATING POINT MULTIPLIER In this paper we implemented a double precision floating point multiplier with exceptions and rounding. Figure 2 shows the multiplier structure that includes exponents addition, significand multiplication, and sign calculation. Figure 3 shows the multiplier, exceptions and rounding that are independent and are done in parallel. Figure 3. Multiplier structure with rounding and exceptions A. Multiplier The black box view of the double precision floating point multiplier is shown in figure 4.The Multiplier receives two 64-bit floating point numbers. First these numbers are unpacked by separating the numbers into sign, exponent, and mantissa bits. The sign logic is a simple XOR. The exponents of the two numbers are added and then subtracted with a bias number i.e., 1023. Mantissa multiplier block performs multiplication operation. 
The output of the mantissa multiplication is then normalized, i.e., if the MSB of the result is not 1, the result is left shifted until the MSB is 1. Any such shift requires a corresponding adjustment of the exponent. The multiplication operation is performed in the module (fpu_mul). The mantissa of operand A and the leading '1' (for normalized numbers) are stored in the 53-bit register (mul_a). The mantissa of operand B and the leading '1' (for normalized numbers) are stored in the 53-bit register (mul_b). Multiplying all 53 bits of mul_a by the 53 bits of mul_b results in a 106-bit product. 53-bit by 53-bit multipliers are not available in the most popular Xilinx and Altera FPGAs, so the multiply is broken down into smaller multiplies whose results are added together to give the final 106-bit product. The module (fpu_mul) breaks up the multiply into smaller multiplies, each no larger than 24 bits by 17 bits. The Xilinx Virtex-6 device contains DSP48E1 slices with 25 by 18 two's complement multipliers, which can perform a 24-bit by 17-bit unsigned multiply. The multiply in module (fpu_mul) is broken up as follows:

product_a = mul_a[23:0] * mul_b[16:0]
product_b = mul_a[23:0] * mul_b[33:17]
product_c = mul_a[23:0] * mul_b[50:34]
product_d = mul_a[23:0] * mul_b[52:51]
product_e = mul_a[40:24] * mul_b[16:0]
product_f = mul_a[40:24] * mul_b[33:17]
product_g = mul_a[40:24] * mul_b[52:34]
product_h = mul_a[52:41] * mul_b[16:0]
product_i = mul_a[52:41] * mul_b[33:17]
product_j = mul_a[52:41] * mul_b[52:34]

Figure 4. Black box view of floating point double precision multiplier

The products (a-j) are added together, with the appropriate offsets based on which parts of the mul_a and mul_b arrays they multiply. The exponent fields of operands A and B are added together, and the value 1023 is then subtracted from the sum. If the resultant exponent is less than 0, the (product) register needs to be right shifted by that amount, which is stored in the register (exponent_under). The final exponent of the output operand will be 0 in this case, and the result will be a denormalized number. If exponent_under is greater than 52, the mantissa will be shifted out of the product register, the output will be 0, and the "underflow" signal will be asserted.

The mantissa output from the (fpu_mul) module is held in the 56-bit register (product_7). The MSB is a leading '0' to allow for a potential overflow in the rounding module. This '0' is followed by the leading '1' for normalized numbers, or '0' for denormalized numbers. The 52 bits of the mantissa follow, and then two extra bits used for rounding. The first extra bit is the bit immediately after the mantissa in the 106-bit product of the multiply. The second extra bit is an OR of the 52 LSBs of the 106-bit product.

B. Rounding and Exceptions

The IEEE standard specifies four rounding modes: round to nearest, round to zero, round to positive infinity, and round to negative infinity.
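To make the mantissa data path concrete, the Python sketch below follows a product from the ten partial products (a-j) through the two extra bits to the four rounding modes just listed. It is an illustration under stated assumptions rather than the authors' Verilog: the function names split_multiply and round_mantissa and the mode labels are hypothetical, the product is assumed to be already normalized (bit 105 set) before rounding, and the increment rules are the conventional IEEE-754 ones, which the paper does not spell out.

```python
def split_multiply(mul_a: int, mul_b: int) -> int:
    """Rebuild the 106-bit product from the ten partial products (a-j)."""
    def bits(value: int, hi: int, lo: int) -> int:
        # Extract value[hi:lo], Verilog style.
        return (value >> lo) & ((1 << (hi - lo + 1)) - 1)

    # (a_hi, a_lo, b_hi, b_lo) for product_a .. product_j as listed in the text.
    slices = [
        (23, 0, 16, 0), (23, 0, 33, 17), (23, 0, 50, 34), (23, 0, 52, 51),
        (40, 24, 16, 0), (40, 24, 33, 17), (40, 24, 52, 34),
        (52, 41, 16, 0), (52, 41, 33, 17), (52, 41, 52, 34),
    ]
    total = 0
    for a_hi, a_lo, b_hi, b_lo in slices:
        partial = bits(mul_a, a_hi, a_lo) * bits(mul_b, b_hi, b_lo)
        # The offset of each partial product is the sum of its low bit indices.
        total += partial << (a_lo + b_lo)
    return total


def round_mantissa(product: int, sign: int, rmode: str) -> int:
    """Round a normalized 106-bit product to a 53-bit significand.

    The result may equal 1 << 53; that overflow case needs one more
    right shift and an exponent increment, which is not done here.
    """
    mantissa = product >> 53                           # leading 1 + 52 bits
    guard = (product >> 52) & 1                        # first extra bit
    sticky = int((product & ((1 << 52) - 1)) != 0)     # OR of the 52 LSBs
    inexact = guard | sticky
    if rmode == "nearest_even":
        increment = guard & (sticky | (mantissa & 1))
    elif rmode == "to_zero":
        increment = 0
    elif rmode == "up":                                # toward +infinity
        increment = inexact & (sign ^ 1)
    elif rmode == "down":                              # toward -infinity
        increment = inexact & sign
    else:
        raise ValueError(f"unknown rounding mode: {rmode}")
    return mantissa + increment


# Significands of 1.5 and 1.0 (leading one included).
sig_a = (1 << 52) | (1 << 51)
sig_b = 1 << 52
product = split_multiply(sig_a, sig_b)
assert product == sig_a * sig_b
if not (product >> 105) & 1:
    product <<= 1  # normalize; hardware would adjust the exponent to match
print(hex(round_mantissa(product, 0, "nearest_even")))
```

Because the slices of mul_b partition bits 52:0 for each of the three slices of mul_a, the shifted partial products sum to exactly the full 53-bit by 53-bit product, which is what the assertion checks.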
Table 1 shows the rounding modes selected for the various bit combinations of rmode. When rounding changes the mantissa, the corresponding change has to be made in the exponent as well.

Table 1: Rounding modes selected for various bit combinations of rmode

Bit combination    Rounding mode
00                 round to nearest even
01                 round to zero
10                 round up
11                 round down

In the exceptions module, all of the special cases are checked, and if one is found, the appropriate output is created and the individual output signals underflow, overflow, inexact, exception, and invalid are asserted when the conditions for each exist.

IV. RESULTS

The double precision floating point multiplier design was simulated in Modelsim 6.6c and synthesized using Xilinx ISE 12.2i, mapped onto a Virtex-6 FPGA. The simulation results of the 64-bit double precision floating point multiplier are shown in Figure 5. The 'opa' and 'opb' signals are the inputs and 'out' is the output. Table 2 shows the device utilization for implementing the circuit on the Virtex-6 FPGA. Table 3 shows the timing summary of the double precision floating point multiplier. Table 4 shows the area and operating frequency of the double precision floating point multiplier, the single precision floating point multiplier [6], and the Xilinx core, respectively. M. Al-Ashrafy, A. Salem, and W. Anis [6] implemented a single precision floating point multiplier that occupies 604 slices with an operating frequency of 301.114 MHz, whereas the Xilinx core occupies 266 slices with an operating frequency of 221.484 MHz. The implemented design thus provides a higher operating frequency with more accuracy.
Table 2: Device utilization summary (Virtex-6 6vlx75tff484-3) of double precision floating point multiplier

Logic utilization                          Used
Number of slice registers (flip-flops)     1,998
Number of slice LUTs                       2,181
Number of occupied slices                  648
Number of bonded IOBs                      203

Table 3: Timing summary of double precision floating point multiplier

Parameter                  Value
Minimum period (ns)        2.411
Maximum frequency (MHz)    414.714

Figure 5. Simulation results of double precision floating point multiplier

Table 4: Area and operating frequency of double precision floating point multiplier, single precision floating point multiplier [6] and Xilinx core

Device parameters    Present work (double precision)    M. Al-Ashrafy, A. Salem and W. Anis [6] (single precision)    Xilinx core (single precision)
No. of slices        648                                604                                                           266

V. CONCLUSION

The double precision floating point multiplier supports the IEEE-754 binary interchange format and is targeted at a Xilinx Virtex-6 xc6vlx75t-3ff484 FPGA. The design achieved an operating frequency of 414.714 MHz with an area of 648 slices. The implemented design is compared with the single precision floating point multiplier [6] and the Xilinx core; it provides higher speed and supports double precision, which gives more accuracy than single precision. This design handles the overflow, underflow, and truncation rounding mode.

REFERENCES

[1] B. Fagin and C. Renard, "Field Programmable Gate Arrays and Floating Point Arithmetic," IEEE Transactions on VLSI, vol. 2, no. 3, pp. 365-367, 1994.
[2] N. Shirazi, A. Walters, and P. Athanas, "Quantitative Analysis of Floating Point Arithmetic on FPGA Based Custom Computing Machines," Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines (FCCM'95), pp. 155-162, 1995.
[3] L. Louca, T. A. Cook, and W. H.
Johnson, "Implementation of IEEE Single Precision Floating Point Addition and Multiplication on FPGAs," Proceedings of the IEEE Symposium on FPGAs for Custom Computing Machines (FCCM'96), pp. 107-116, 1996.
[4] A. Jaenicke and W. Luk, "Parameterized Floating-Point Arithmetic on FPGAs," Proc. of IEEE ICASSP, vol. 2, pp. 897-900, 2001.
[5] B. Lee and N. Burgess, "Parameterisable Floating-point Operations on FPGA," Conference Record of the Thirty-Sixth Asilomar Conference on Signals, Systems, and Computers, 2002.
[6] Mohamed Al-Ashrafy, Ashraf Salem, and Wagdy Anis, "An Efficient Implementation of Floating Point Multiplier," Saudi International Electronics, Communications and Photonics Conference (SIECPC), pp. 1-5, 24-26 April 2011.