SYSTEM DESIGN USING HDL (ECE43)
Textbook: Digital System Design Using Verilog, Charles Roth, Lizy Kurian John, Byeong Kil Lee, 1st Edition, 2016, Cengage Learning
# | Textbook sections
1 | 2.1, 2.2, 2.3 - 2.8, 2.11, 2.13 - 2.15
2 | 2.9, 2.10, 2.12, 2.16 - 2.19, 8.1, 8.2
3 | 3.1 - 3.4, 5.1, 5.2.1, 5.3
4 | 4.1 - 4.5, 4.8, 4.6, 4.7, 4.9, 4.11
5 | 6.1 - 6.5, 6.7 - 6.12
INTRODUCTION TO PROGRAMMABLE LOGIC DEVICES
Brief overview of Programmable Logic Devices
Need for programmable logic devices:
• Implementation of a significant amount of functionality in one physical chip.
• Removes the need for multiple off-the-shelf devices.
• Easy reprogramming, and therefore a greater ability to change the design.
• Easier to change the design in case of errors or changes in the design specifications.
Programmable logic
• Factory programmable devices
   • ROM (Read Only Memory)
   • MPGA (Mask Programmable Gate Array)
• Field programmable devices
   • SPLD (Simple Programmable Logic Device)
      • PROM (Programmable Read Only Memory)
      • PLA (Programmable Logic Array)
      • PAL (Programmable Array Logic)
      • GAL (Generic Array Logic)
   • CPLD (Complex Programmable Logic Device)
   • FPGA (Field Programmable Gate Array)
• Factory programmable devices: Generic devices that are programmed at the factory to meet the customer’s requirements. Programming can be done only once. Examples: ROM, MPGA.
• ROM: Primarily meant for memory, but can also be used to implement combinational circuits.
• MPGA: Also called gate arrays, they have been a popular technology for creating ASICs.
• Field programmable devices: Devices that are programmed by the user, rather than at the factory.
Factor | SPLD | CPLD | FPGA
Density | Low (a few hundred gates) | Low to medium (500 to 12,000 gates) | Medium to high (3,000 to 5,000,000 gates)
Timing | Predictable | Predictable | Unpredictable
Cost | Low | Low to medium | Medium to high
Major vendors (with device families) | Lattice (GAL16LV8, GAL22V10), Cypress (PALCE16V8), AMD (22V10) | Xilinx (CoolRunner, XC9500), Altera (MAX) | Xilinx (Kintex, Artix, Virtex, Spartan), Altera (Stratix, Cyclone, Arria), Lattice (Mach, ECP), Microsemi (Axcelerator, Fusion)
• PLA: It consists of a programmable AND array and a programmable OR array.
• PAL: It is a special case of PLA, where the OR array is fixed and only the AND array is programmable. It can also contain flip-flops.
• Early programmable devices were one-time programmable (OTP, PROM); later, ultraviolet and electrically erasable technologies gradually led to re-programmable logic devices.
CMOS Electrically Erasable PLDs:
• They contain macroblocks with arrays of gates, flip-flops, multiplexers, or other standard building blocks.
• PLAs, PALs, GALs & PLDs are collectively referred to as SPLDs.
GAL (Generic Array Logic):
• Lattice Semiconductor created similar devices with easy reprogrammability, and called its line of devices GALs.
ROM (Read Only Memory)
• ROM consists of an array of semiconductor
devices that are interconnected to store an
array of binary data.
• Data stored in ROM can be read out when
required, but cannot be changed under
normal operating conditions.
• Output pattern stored in ROM is called a
word.
• Each input serves as address, which selects
one of the stored words as output.
• Size of ROM is given as 2^n × m, where “n” is the number of input (address) lines and “m” is the number of output lines (the word width).
• A ROM with a 3-bit input and a 4-bit output therefore has a size of 8 words × 4 bits.
• In the following example, when ABC=010, F0F1F2F3=0111.
• ROM consists of a decoder and a memory array. When a pattern of 1’s and 0’s is applied as input to the decoder, exactly one of its outputs becomes 1, which in turn selects the corresponding stored word from the array.
• Types of ROM:
   • Mask programmable ROM
   • PROM (user programmable)
   • EPROM (UV erasure)
   • EEPROM (electrically erasable)
   • Flash memory
• Mask programmable ROM: The data array is permanently stored during manufacture, by selectively including or omitting the switching elements in the cross-point switch matrix. Special masks are used for this purpose, which is an expensive process.
• PROM: One-time, user programmable (fuse / antifuse).
• EPROM: A programmer uses voltage pulses to store electric charge in the memory array locations. UV light is used to erase the complete stored data.
• EEPROM: Uses electrical pulses for erasure of data. It can be reprogrammed only 100 to 1000 times.
• Flash memories: They have built-in programming and erasure capabilities, and data can be written while in-circuit, without needing any separate programmer.
• ROM can implement any combinational circuit, by storing the outputs for all of the input combinations. Hence, this method is also called the look-up table (LUT) method.
Ex-1: Implement a 2 bit adder using ROM:
Solution: Input : two 2-bit numbers.
Output : Sum having 3-bits.
• Can be implemented with 16 X 3 ROM.
Data to be stored in memory:
0, 1, 2, 3, 1, 2, 3, 4, 2, 3, 4, 5, 3, 4, 5, 6
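The stored data in Ex-1 can be generated mechanically: the 4-bit address is the concatenation of the two 2-bit operands, and each stored word is their sum. A minimal Python sketch (the name `adder_rom` is illustrative and not from the slides):

```python
def adder_rom(bits=2):
    """Return the ROM contents for a `bits`-bit adder.

    Address = {a, b} with a in the upper bits; stored word = a + b.
    """
    size = 1 << bits
    return [a + b for a in range(size) for b in range(size)]

rom = adder_rom()
print(len(rom), rom)   # 16 words; the largest sum (6) fits in 3 bits
```

Changing `bits` shows how quickly the ROM grows: every extra operand bit multiplies the number of words by four.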
Ex-2: Compute the size of the ROM to implement an 8:3 priority encoder.
With 8 input lines, there will be 2^8 = 256 entries in the ROM.
Size of the ROM: 2^8 × 4 (256 words × 4 bits)
Ex-3: Implement the following state machine, of BCD to Excess-3 code converter, using ROM.
PS | NS (X=0) | NS (X=1) | Z (X=0) | Z (X=1)
S0 | S1 | S2 | 1 | 0
S1 | S3 | S4 | 1 | 0
S2 | S4 | S4 | 0 | 1
S3 | S5 | S5 | 0 | 1
S4 | S5 | S6 | 1 | 0
S5 | S0 | S0 | 0 | 1
S6 | S0 | - | 1 | -
• Sequential circuit is designed
using ROM and flip-flops.
• ROM is used to realize the output
functions and the next state
equations.
• The state of the circuit is stored in
a register of D flip-flops, and fed
back to the input of the ROM.
• To realize the given Mealy
machine, a ROM and 3 flip-flops
are necessary.
• The ROM will generate
the next state equations
and output Z, from the
present states and input X.
• Q1, Q2, Q3 and X are
connected to the address
lines, with X connected to
the LSB.
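• Contents of ROM (in hex, assuming the straight binary state assignment S0 = 000 through S6 = 110 and the output word D1 D2 D3 Z) are:
3, 4, 7, 8, 8, 9, A, B,
B, C, 0, 1, 1, 0, 0, 0.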
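The ROM words follow directly from the state table of Ex-3, so they can be cross-checked with a few lines of Python. This is only a sketch under stated assumptions: a straight binary state assignment (S0 = 000 through S6 = 110), the address formed as Q1 Q2 Q3 X with X as the LSB, the output word ordered as D1 D2 D3 Z, and don't-care or unused entries stored as 0.

```python
# State table from Ex-3:
# state -> ((next state if X=0, if X=1), (Z if X=0, if X=1)).
# None marks a don't-care entry, stored here as 0 (an assumption).
TABLE = {
    0: ((1, 2), (1, 0)),
    1: ((3, 4), (1, 0)),
    2: ((4, 4), (0, 1)),
    3: ((5, 5), (0, 1)),
    4: ((5, 6), (1, 0)),
    5: ((0, 0), (0, 1)),
    6: ((0, None), (1, None)),
}

def rom_contents():
    """Word at address {Q1 Q2 Q3 X} is {D1 D2 D3 Z} (assumed bit order)."""
    words = []
    for addr in range(16):
        state, x = addr >> 1, addr & 1
        if state not in TABLE or TABLE[state][0][x] is None:
            words.append(0)              # unused state or don't-care
            continue
        nxt, z = TABLE[state][0][x], TABLE[state][1][x]
        words.append((nxt << 1) | z)
    return words

print(" ".join(format(w, "X") for w in rom_contents()))
```

Each printed hex digit is one 4-bit ROM word, listed from address 0 to address 15.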
Programmable Logic Array (PLA)
• A PLA with n input lines and m output lines can realize m functions of n variables.
• Compared to a ROM, an AND array is used instead of the decoder, to realize the product terms.
• The OR array then sums the product terms.
Ex-4: Using PLA, Realize the following functions:
F0 = ⅀m(0,1,4,6) = (A'B'+AC')
F1 = ⅀m(2,3,4,6,7) = (B+AC')
F2 = ⅀m(0,1,2,6) = (A'B'+BC')
F3 = ⅀m(2,3,5,6,7) = (AC+B)
Solution: There are 3 inputs: A, B & C. There are 5 distinct
product terms in the 4 outputs.
Unlike ROM, in a PLA implementation, the product terms
can be shared among the functions.
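The sharing of product terms in Ex-4 can be illustrated with a small behavioural model of the two planes. This is a sketch only; the dictionary names `AND_PLANE` and `OR_PLANE` and the helper functions are illustrative and do not come from the slides.

```python
# AND plane: each product term lists the literals it needs
# (1 = true input, 0 = complemented input).
AND_PLANE = {
    "A'B'": {"A": 0, "B": 0},
    "AC'":  {"A": 1, "C": 0},
    "B":    {"B": 1},
    "BC'":  {"B": 1, "C": 0},
    "AC":   {"A": 1, "C": 1},
}
# OR plane: which product terms feed each output.
OR_PLANE = {
    "F0": ["A'B'", "AC'"],
    "F1": ["B", "AC'"],
    "F2": ["A'B'", "BC'"],
    "F3": ["AC", "B"],
}

def pla(a, b, c):
    """Evaluate all four outputs for one input combination."""
    inputs = {"A": a, "B": b, "C": c}
    fired = {name for name, lits in AND_PLANE.items()
             if all(inputs[v] == val for v, val in lits.items())}
    return {f: int(any(t in fired for t in terms))
            for f, terms in OR_PLANE.items()}

def minterms(output):
    """List the minterm numbers (A is the MSB) for which `output` is 1."""
    return [m for m in range(8)
            if pla((m >> 2) & 1, (m >> 1) & 1, m & 1)[output]]

for f in OR_PLANE:
    print(f, minterms(f))
```

The AND plane has five entries although the four equations contain eight terms in total, because A'B', AC' and B are each used by two outputs.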
• Instead of AND-OR logic, PLA
may use NOR-NOR logic.
• 2-input NOR gate can be built
using nMOS transistors:
• NOR-NOR with inverters at
input and output = AND-OR.
Ex-5: Using PLA, Realize the following functions:
F1 = ⅀m(2,3,5,7,8,9,10,11,13,15)
F2 = ⅀m(2,3,5,6,7,10,11,14,15)
F3 = ⅀m(6,7,8,9,13,14,15)
Solution: After minimization, the simplified functions are :
F1 = ⅀m(2,3,5,7,8,9,10,11,13,15) = bd+b'c+ab'
F2 = ⅀m(2,3,5,6,7,10,11,14,15) = c+a'bd
F3 = ⅀m(6,7,8,9,13,14,15) = bc+ab'c'+abd
Here, the PLA requires 8 different product terms.
• To reduce the number of rows
in PLA, these functions can be
reorganized using K-map.
F1 = a'bd+abd+b'c+ab'c'
F2 = b'c+bc+a'bd
F3 = bc+ab'c'+abd
There are now only 5 different product terms, so the PLA table has only 5 rows.
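The row counts in Ex-5 can be checked by expanding each product term into the minterms it covers. The sketch below assumes that a is the most significant variable; the helper names are illustrative.

```python
def term_minterms(term, variables="abcd"):
    """Minterms covered by a product term such as "ab'c'" (a is the MSB)."""
    lits, i = {}, 0
    while i < len(term):
        neg = i + 1 < len(term) and term[i + 1] == "'"
        lits[term[i]] = 0 if neg else 1
        i += 2 if neg else 1
    n = len(variables)
    return {m for m in range(1 << n)
            if all((m >> (n - 1 - variables.index(v))) & 1 == val
                   for v, val in lits.items())}

def cover(terms):
    """Sorted list of minterms covered by a sum of product terms."""
    return sorted(set().union(*(term_minterms(t) for t in terms)))

def rows(design):
    """Number of distinct product terms, i.e. rows of the PLA table."""
    return len({t for terms in design.values() for t in terms})

minimized = {"F1": ["bd", "b'c", "ab'"],
             "F2": ["c", "a'bd"],
             "F3": ["bc", "ab'c'", "abd"]}
shared = {"F1": ["a'bd", "abd", "b'c", "ab'c'"],
          "F2": ["b'c", "bc", "a'bd"],
          "F3": ["bc", "ab'c'", "abd"]}

for f in minimized:
    assert cover(minimized[f]) == cover(shared[f])   # same functions
print(rows(minimized), rows(shared))
```

The individually minimized equations are shorter, yet they need more rows, because fewer of their terms are common to two outputs.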
• In a PLA, unlike a memory, the number of terms in each equation is not important, as the size of the PLA does not depend on it. What matters is the total number of distinct product terms, since each one occupies a row of the PLA table.
• To reduce the number of rows in a PLA, the Espresso algorithm can be used instead of K-maps. It is a complex logic minimization algorithm that is used in VLSI synthesis.
The PLA implementation has 4
inputs, 5 product terms & 3 outputs.
Programmable Array Logic (PAL)
• It is a special case of PLA, in which the AND array is programmable and the OR array is fixed.
• For this reason, a PAL is less expensive than a PLA, and is easier to program as well.
• The following figure represents a segment of an un-programmed PAL, along with the input buffers, each of which has two outputs (the true and the complemented input).
Ex-6: Implement I1I2'+I1'I2.
Solution:
• As OR gates cannot be
programmed, AND
terms cannot be
shared among two or
more OR gates.
• Typical PALs have 10
to 20 inputs, and 2 to
10 outputs, with 2 to 8
AND gates driving
each OR gate.
Ex-7: Implement a full-adder using PAL.
Solution:
SUM = X'Y'Cin+X'YCin'+XY'Cin'+XYCin
COUT = XY+YCin+XCin
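The two sum-of-products equations of Ex-7 can be checked against ordinary addition for all eight input combinations. This is a minimal sketch; the function name `full_adder_sop` is illustrative.

```python
from itertools import product

def full_adder_sop(x, y, cin):
    """SUM and COUT written exactly as the AND-OR terms of Ex-7."""
    nx, ny, nc = 1 - x, 1 - y, 1 - cin
    s = (nx & ny & cin) | (nx & y & nc) | (x & ny & nc) | (x & y & cin)
    cout = (x & y) | (y & cin) | (x & cin)
    return s, cout

for x, y, cin in product((0, 1), repeat=3):
    s, cout = full_adder_sop(x, y, cin)
    assert 2 * cout + s == x + y + cin   # matches binary addition
```

SUM needs four AND gates and COUT needs three, which is within the 2 to 8 AND gates per OR gate quoted above for typical PALs.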
• Later PALs also contained D flip-flops, and were called “sequential PALs”.
Ex-8: Implement Q+ = D = A'BQ'+AB'Q.
Solution:
PLD/GAL (Programmable Logic Device / Generic Array Logic)
• PALs and PLAs are good for implementing small circuits, but the traditional devices are not re-programmable.
• When they are made erasable/reprogrammable by incorporating Flash memory, such PALs are often referred to as PLDs/GALs.
• An example is the 22CEV10, a CMOS electrically erasable PLD that can realize both combinational and sequential circuits.
• The 22CEV10 contains:
   • 12 dedicated input pins
   • 10 pins that can be programmed as inputs or outputs
   • A programmable AND array (8 to 16 AND gates feeding each OR gate)
   • 10 OR gates, each of which drives an output macrocell
   • 10 D flip-flops, with asynchronous reset and synchronous preset
   • Macrocells that each contain a D flip-flop, a multiplexer, and additional programmability at the output
• 22CEV10 => 22 pins, of which 10 are bidirectional
System design using HDL - Module 3
• Each macrocell has 2 programmable interconnect bits: S1 & S0.
• When a particular bit is programmed, it is connected to 0 V (logic 0).
• Erasing that bit disconnects it from 0 V, and it floats at logic 1.
S1 | S0 | Output
0 | 0 | D flip-flop output
0 | 1 | D flip-flop output inverted
1 | 0 | OR output
1 | 1 | OR output inverted
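Read as a function, the S1/S0 table says that S1 chooses the signal source and S0 decides whether it is inverted. The following sketch simply restates the table as listed above; the function and argument names are illustrative, and the real device's output buffer is not modelled.

```python
def macrocell_output(s1, s0, or_out, q):
    """Output select of one macrocell, following the S1/S0 table.

    s1 selects the source (0: flip-flop output q, 1: OR-gate output);
    s0 = 1 inverts the selected signal.
    """
    source = or_out if s1 else q
    return source ^ s0

for s1 in (0, 1):
    for s0 in (0, 1):
        print(s1, s0, macrocell_output(s1, s0, or_out=1, q=0))
```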
• CAD programs are available for PAL/PLD programming. These programs accept logic equations, truth tables, state graphs or state tables as inputs.
• They automatically generate the required bit patterns, which can be downloaded into a PLD programmer, which in turn creates the necessary connections.
• PALASM (PAL Assembler) from MMI & AMD, and ABEL (Advanced Boolean Expression Language) from Data I/O are two popular languages that are used for programming.
CPLD (Complex Programmable Logic Device)
• This is a programmable IC which is equivalent to several PLDs on the same silicon chip. Typically a CPLD comprises 500 to 10,000 logic gates.
• It consists of a number of PAL-like logic blocks, along with a programmable interconnect. The interconnect matrix is implemented using a crossbar switch. Even though this is expensive, it results in predictable timing.
• CPLDs are electrically erasable and reprogrammable, and hence are sometimes referred to as EPLDs (Erasable Programmable Logic Devices).
• Typically a CPLD contains a number of macrocells that are grouped into function blocks.
• Each macrocell contains a flip-flop and an OR gate, and the macrocell has its inputs connected to an AND gate array.
• The major manufacturers of CPLDs are: Xilinx, Altera, Lattice, Cypress and Atmel.
AN EXAMPLE: XILINX COOLRUNNER (XCR3064XL)
• This CPLD has 4 function blocks, and each block has 16
associated macrocells. A function block is a programmable
AND-OR array, which is configured as a PLA.
• Each macrocell contains a flip-flop and additional
multiplexers, that route the signals from the function blocks
to the I/O blocks or to the interconnect array.
• The interconnect array selects signals from the macrocell
outputs and the I/O blocks, and connects them back to
function blocks. Thus, a signal generated from any function
block can be used as an input to any other function block.
Ex-9: Implement a Mealy sequential machine with 2 inputs and 2 outputs.
• First, the two D-inputs have to be generated for the flip-flops.
• Then, the two outputs (Z1, Z2) have to be generated, using the flip-flop outputs.
• Hence, four macrocells are required to implement the Mealy machine.
Ex-10: Implement a parallel adder with accumulator.
• The accumulator register
needs one FF for each bit.
• But that bit also needs to
generate the sum and
carry bits corresponding
to that particular bit.
• Hence, each bit of an
adder requires two
macrocells, one for the
sum and the accumulator,
and the other for the carry.
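The two-macrocells-per-bit structure of Ex-10 can be mimicked in a short behavioural model: one expression per bit for the sum that is loaded into the accumulator flip-flop, and one for the carry passed to the next bit. This is a sketch with an assumed 4-bit width and illustrative names, not a description of the actual CPLD fitting.

```python
def accumulate(acc, b, width=4, cin=0):
    """One clock of the accumulator: acc <- acc + b, built bit by bit."""
    carry, new_acc = cin, 0
    for i in range(width):
        x, y = (acc >> i) & 1, (b >> i) & 1
        s = x ^ y ^ carry                              # sum macrocell (feeds the flip-flop)
        carry = (x & y) | (y & carry) | (x & carry)    # carry macrocell
        new_acc |= s << i
    return new_acc, carry

def macrocells_needed(width):
    """Two macrocells per bit: sum/accumulator and carry."""
    return 2 * width

print(accumulate(5, 6), macrocells_needed(4))
```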
FPGA (Field Programmable Gate Array)
• FPGAs contain an array of identical logic blocks with programmable interconnections.
• The user can program the function realized by each logic block, and can flexibly program the connections between them.
MPGA versus FPGA
Advantages of FPGA:
• The time-to-market of an FPGA product is much shorter.
• With an FPGA, it is easier to correct mistakes in the design.
• The prototyping cost is much lower with an FPGA.
• At low volumes, FPGAs are cheaper than MPGAs.
Disadvantages of FPGA:
• FPGAs are less dense than MPGAs.
• FPGAs are slower, due to the RC delay in the programmable points.
• Interconnect delays in FPGAs are unpredictable.
• The overhead of programmability is high, because of the chip resources it consumes.
Compared to a CPLD, the major advantage of an FPGA is its highly flexible programmable interconnect; for the same reason, its major disadvantage is its unpredictable interconnect delay.
FPGA typically contains three
programmable elements:
1. Programmable logic blocks
(Configurable Logic Blocks)
2. Programmable routing resources
3. Programmable I/O blocks
• Programmable logic blocks
• These are created by Muxes, LUTs, and AND-OR arrays.
• Programming refers to: a) Changing the contents of LUT,
b) Changing the I/O signals to the Muxes, c) Selecting or not
selecting the particular gates in the AND-OR arrays.
• Programmable interconnect
• For making or breaking the specific connections.
• For connecting various blocks in the chip to each other.
• For connecting specific I/O pins to specific logic blocks.
• Programmable I/O blocks
• I/O pads can be programmed as i/p, o/p or bidirectional.
• They also can be programmed as inverting, non-inverting, tri-
state, slew rate adjustable, passive pull-up etc.
Architectures of FPGA
• Based on the topology in which the logic blocks and the interconnect resources are distributed, four basic FPGA architectures have been on the market since the 1980s:
   • Matrix based architecture
   • Row based architecture
   • Hierarchical PLD architecture
   • Sea-of-gates architecture
• Modern FPGAs also contain special purpose blocks, including microprocessors.
1. Matrix based architecture (e.g., Most Xilinx FPGAs)
• This architecture is also called a “symmetrical array”; it contains 8×8 arrays in smaller chips, and 100×100 or larger arrays in larger chips.
• Routing is called two-dimensional channeled routing, since routing
resources are available in horizontal and vertical directions.
2. Row based architecture (e.g., some Microsemi FPGAs)
• The logic blocks are organized into rows, and hence, there are rows of logic
blocks, and rows of routing resources.
• Routing is called one-dimensional channeled routing, as the routing
resources are channeled between the rows.
3. Hierarchical PLD architecture (e.g., Altera APEX20, APEX II)
• At the lower level, the FPGAs contain clusters of logic blocks with localized
resources for interconnection.
• At the higher level, the global interconnect is used for interconnection
between the clusters of logic blocks.
4. Sea-of-gates architecture (e.g., Microsemi Fusion)
• FPGAs contain a large number of gates, and there is an interconnect
superimposed on the sea-of-gates.
• There are other terminologies such as sea-of-cells or sea-of-tiles, to indicate
the topology with a large number of logic blocks.
FPGA Programming Technologies
• The term “programming technology” denotes the technology by which the programmability in an FPGA is achieved, especially for the programmable interconnect.
• Some of the techniques are:
   • SRAM programming technology
   • EPROM / EEPROM / Flash programming technology
   • Antifuse programming technology
SRAM Programming Technology
• As in a ROM, SRAM cells can store the output values of an LUT; they also store the “configuration bits” that control the interconnections.
• e.g., Sixteen SRAM cells can implement any function of
four variables.
• The programmable interconnect can be achieved by
SRAM, in the following two ways:
• Pass transistor is used for connecting two points
• Routing matrices are implemented by using mux
Disadvantages of SRAM Programming Technology
1. Six transistors are required for every SRAM cell.
• e.g., if FPGA has 1 million programmable points, 6 million transistors are
required for achieving this programmability.
2. Since SRAM is volatile, all the contents are lost when power fails. This is a serious drawback when an FPGA is used in the final product.
• As a solution, an EPROM can be used as a “boot ROM” to store the configuration bits, and its contents can be transferred to the SRAM whenever power is restored.
Advantages of SRAM Programming Technology
1. As SRAM can be rewritten any number of times, new contents can be loaded again and again, thus providing flexibility during prototyping.
2. The fabrication steps for SRAM cells are the same as those for the other logic cells.
EPROM / EEPROM / Flash Programming Technology
• Instead of SRAM, EPROM cells are used to control the programmable interconnections. Each EPROM cell contains a MOSFET which has two gates: a control gate and a floating gate.
• The drain of the transistor can be connected to VDD by means of a pull-up resistor. When a high voltage (10 - 13 V) is applied to the control gate, electrons get injected into the floating gate, and the transistor turns OFF.
• The electrons remain trapped in the floating gate. The trapped negative charge can be removed by exposing the EPROM to UV light.
Disadvantages of EPROM Programming Technology
1. EPROM is slower than SRAM, because of the dual-gate structure.
2. During manufacture, EPROMs require more processing steps than SRAMs.
3. EPROM based switches have a high ON-resistance, and also a high static power consumption.
4. For erasure, the EPROM chip has to be physically removed from the PCB.
• EEPROM is similar to EPROM, but the gate charge can be removed electrically. Hence, for erasure, the chip need not be removed from the PCB.
• The memory cells can be selectively erased and rewritten, and this does not require any additional equipment.
• Flash is a form of EEPROM in which a block of cells can be erased at once, by applying a large voltage at the control gate, which pulls the electrons off the floating gate.
• The current through a Flash cell depends on the number of trapped electrons; by sensing the amount of current, each cell can store multiple bits of information.
• For writing, Flash is faster than EEPROM, but slower than SRAM.
Antifuse Programming Technology
• An antifuse programming element changes from high resistance (open, OFF) to low resistance (closed, ON) when a high voltage is applied.
• Antifuses are built either from dielectric layers between N+ diffusion and polysilicon layers, or from amorphous silicon between metal layers.
Advantages:
• Compared to a MOSFET switch, an antifuse occupies less area.
• Antifuse based connections are faster than those of SRAM / EPROM technologies.
Disadvantages:
• The antifuse connection is one-time programmable (OTP).
• Because of this, the design cannot be changed once the device is programmed.
Comparison of FPGA Programming Technologies
Programming technology | Storage | Programmability | Area overhead | Resistance | Capacitance
SRAM | Volatile | In-circuit reprogrammable | Large | Medium to high | High
EPROM | Non-volatile | Out-of-circuit reprogrammable | Small | High | High
EEPROM / Flash | Non-volatile | In-circuit reprogrammable | Medium to large | High | High
Antifuse | Non-volatile | Not reprogrammable | Small | Low | Low
I. Programmable logic block architectures
• Manufacturers use different names for their logic blocks:
   • Xilinx calls them Configurable Logic Blocks (CLBs).
   • Microsemi calls them VersaTiles.
   • Altera calls them Logic Elements (LEs), and a group of LEs is called a Logic Array Block (LAB).
• Mainly two types of logic blocks are used in FPGAs:
   1. LUT based programmable logic blocks.
   2. Mux based programmable logic blocks.
1. LUT (Look-Up Table) based Programmable logic block
• A look-up table contains memory cells along with a multiplexer.
• The output for each input combination is stored in the memory cells.
• The input combination is applied to the select (control) inputs of the multiplexer.
• For a 2-variable function, 4 memory cells and a 4:1 mux are required.
• For an n-input function, 2^n memory cells and a 2^n:1 mux are required.
• Each block contains two
LUT4 and two flip-flops.
• The LUT4 can generate any
one function of 4 variables.
• The flip-flop has chip enable,
set and reset inputs.
• A multiplexer is used to select
in between the combinational
and the latched version of the
LUT4 output.
• The multiplexer is controlled
by a bit stored in memory.
Ex-11: Implement the function F1 = A'B'C+A'BC'+AB, using LUT.
• Choosing X1 as LSB and X4 as MSB, the X4 input need not be used, as F1 uses only 3 variables.
• To find the contents of the LUT, the truth table of the function has to be constructed.
• From the truth table, the contents of the LUT that implement the function F1 are {0,1,1,0,0,0,1,1}.
• As an LUT4 contains 16 memory cells, the same 8 bits are stored in the other 8 cells as well, so that the output is correct irrespective of the value of X4.
• Thus, the contents of the LUT are {0,1,1,0,0,0,1,1,0,1,1,0,0,0,1,1}.
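The LUT contents of Ex-11 can be generated from the function itself, which also shows why the 8-bit pattern repeats when X4 is unused. The sketch assumes the input assignment C = X1, B = X2, A = X3 (inferred from the contents quoted in the text); the helper names are illustrative.

```python
def lut_contents(func, used_inputs, lut_inputs=4):
    """Fill a LUT; inputs beyond `used_inputs` are ignored, so the pattern repeats."""
    cells = []
    for addr in range(1 << lut_inputs):
        bits = [(addr >> i) & 1 for i in range(used_inputs)]  # bits[0] = X1 (LSB)
        cells.append(func(*bits))
    return cells

def f1(c, b, a):
    """F1 = A'B'C + A'BC' + AB, with arguments in the order X1, X2, X3."""
    return ((1 - a) & (1 - b) & c) | ((1 - a) & b & (1 - c)) | (a & b)

lut = lut_contents(f1, used_inputs=3)

def lut_read(x4, x3, x2, x1):
    """The LUT is a memory: the inputs form the address of the cell to read."""
    return lut[(x4 << 3) | (x3 << 2) | (x2 << 1) | x1]

print(lut)
```

Reading the LUT with X4 = 0 or X4 = 1 gives the same result for any A, B, C, which is the purpose of storing the pattern twice.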
2. Multiplexer based Programmable logic block
• With LUT, it is not necessary to minimize the function, as the number of
terms in the function is not important (all o/p bits need to be stored).
• But LUT requires storage space. To save it, multiplexers along with
basic gates, can be used.
Ex-12: Implement the function F1 = A'B'C+A'BC'+AB, using mux.
• As there are 3 variables, we can choose a 4:1 mux, which has 2 select lines.
• The truth table can be constructed so as to define the output in terms of C.
• The mux select lines can be A & B, and the mux input lines can be connected in accordance with the last column of the truth table.
A | B | C | F1 | Mux i/p in terms of C
0 | 0 | 0 | 0 | C
0 | 0 | 1 | 1 | C
0 | 1 | 0 | 1 | C'
0 | 1 | 1 | 0 | C'
1 | 0 | 0 | 0 | 0
1 | 0 | 1 | 0 | 0
1 | 1 | 0 | 1 | 1
1 | 1 | 1 | 1 | 1
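The mux realization of Ex-12 can be checked against the original expression for all eight input combinations. In this sketch the four data inputs are C, C', 0 and 1, selected by A and B as in the truth table; the function names are illustrative.

```python
from itertools import product

def mux4(sel_a, sel_b, data):
    """4:1 multiplexer; (sel_a, sel_b) picks one of the four data inputs."""
    return data[(sel_a << 1) | sel_b]

def f1_mux(a, b, c):
    """F1 with A, B on the select lines and C, C', 0, 1 on the data inputs."""
    return mux4(a, b, (c, 1 - c, 0, 1))

def f1_sop(a, b, c):
    """F1 = A'B'C + A'BC' + AB, as a reference."""
    return ((1 - a) & (1 - b) & c) | ((1 - a) & b & (1 - c)) | (a & b)

for a, b, c in product((0, 1), repeat=3):
    assert f1_mux(a, b, c) == f1_sop(a, b, c)
```

Compared with the 8 (or 16) memory cells of the LUT version, this form needs only one inverter for C' and the two constants.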
II. Programmable interconnect
1. General purpose interconnect
• The interconnect between the logic blocks should provide flexible connections between the rows and columns (e.g., row-column, row-row, column-column).
• Crosspoint switch matrix: A completely non-blocking switch matrix is very expensive. E.g., in a 4×4 matrix, out of 16 switches, only 4 are utilized at any point of time.
• 6-way switch: To reduce the number of multiple connections for a single route, the crosspoint can be configured as a 6-way switch. But this crosspoint is more complicated than the earlier one.
2. Direct interconnect
• To reduce the delay in the switch matrix, many FPGAs provide direct connections between the logic blocks, by means of dedicated switches.
• Typical options are direct interconnect to 4 neighbors, and special connections to 8 neighbors.
3. Global interconnect lines
• For high fan-out and low-skew clock distribution, FPGAs provide routing lines that span the entire width and height of the device.
• When the clock is distributed to a few million gates in the chip, the delay in the wires causes the clock edges to arrive at different times at different parts of the chip. This is called “clock skew”, and it must be kept small for the circuitry on the chip to function correctly.
Interconnects in row-based FPGAs
• The interconnects discussed so far apply to the matrix-based architecture, which has symmetrical arrays. The row-based architecture is one-dimensional: it has arrays of switches in the routing channels, which are situated between the rows of logic blocks.
• When 3 connections x, y & z (the example nets) are required, they can be routed in 2 ways: i) non-segmented (full length tracks, faster), or ii) segmented (reduced resources, slower).
III. Programmable I/O blocks
• I/O blocks on modern FPGAs allow the
use of a pin as true or inverted, direct or
latched, input or output, and so on.
• The I/O options can be selected by means
of the configuration memory cells,
indicated in the figure as “M”.
• The inversion is performed using an XOR
gate, and one memory-bit.
• The direction of the pin is decided using a
tri-state buffer, and its control can be
selected as active high or active low, using
another memory-bit.
• Similarly, the rate-of-change of output (slew rate), and the pull-up option (open drain, built-
in resistor), can be configured using the memory cells (SRAM, EEPROM / Flash, antifuse).
Dedicated Specialized Components in FPGA
1. Dedicated memory: Embedded RAM can be used to implement the memory needs of the circuit that is being designed.
2. Dedicated arithmetic units: A custom implementation of adders and multipliers inside the FPGA is smaller and faster than its counterpart built from the general-purpose logic blocks of the FPGA.
3. DSP Blocks: To support DSP applications, the
vendors provide the hardware inside the FPGA for
encryption/decryption, FFTs, FIR filters, IIR
filters, compression/decompression, and so forth.
4. Embedded Processors: This is a hybrid solution
where part of the design is in a programmable
processor (high flexibility), and the remaining part
is implemented in hardware (better performance).
5. Content Addressable Memory: This is a special
kind of memory in which the content, and not the
address, is used to search the memory.
Applications of FPGA
1. Rapid Prototyping
• As FPGAs contain 5 million or more gates, many large real-world systems can be prototyped very quickly using a single FPGA.
• If a single FPGA does not suffice, multiple FPGAs can be interconnected to realize larger systems, by plugging the boards into a backplane.
2. Final Products in Medium Speed Systems
• Circuits realized using FPGAs typically operate in the range of 150-200 MHz. If this speed is sufficient, FPGAs can be used in the final product, and not only in the prototype.
• In the final product, if enhancements to the system are required, they can be delivered as software updates, rather than hardware changes.
3. Glue Logic
• This is digital circuitry that works as an interface between two different logic modules.
• With SRAM FPGAs, new interface logic can be implemented on the same FPGA.
4. Hardware Accelerators / Coprocessors
• For a software application, an FPGA can be used as a coprocessor that implements a key kernel, and thus accelerates the application.
• Examples of such applications are pattern matching, computer architecture simulators, emulator boards, hardware testing boards, and so on.
Design Flow for FPGA
1. Create a behavioral, RTL or structural model of the design using HDL.
2. Simulate and debug the design.
3. Synthesize the design, targeting the desired device.
4. Run a mapping of the design, which breaks the logic diagram into pieces that fit into the CLBs.
5. Run the place-and-route program, to place the logic blocks in the FPGA and to route the interconnections.
6. Run a program that generates the bit pattern that is necessary to program the FPGA.
7. Download the bit pattern into the configuration cells, and test the operation of the FPGA.
(In the flow diagram, steps 1 & 2, steps 3 to 5, and steps 6 & 7 are grouped together.)
STATE MACHINE CHARTS
• A “State Machine” is used to control a digital system that carries out a step-by-step procedure or an algorithm.
• A “State Diagram” or “State Graph” is used to specify the operation of such a state machine.
• A “State Machine Chart” (SM chart) is an alternative to the state diagram, and it has the following advantages:
   • It offers an easier understanding of the digital system.
   • It automatically satisfies the conditions of the state graph (exactly one true transition from a state at any time, unique definition of the next state for every input combination).
   • It leads directly to a hardware realization of the system.
• An SM chart contains 3 principal components, as shown: the state box, the decision box and the conditional output box.
• An SM chart is constructed from SM blocks, where each SM
block describes the machine operation during one state.
• Therefore, each SM block contains exactly one state box,
together with decision boxes and conditional output boxes
that are associated with that particular state.
• Thus, an SM block contains exactly one entrance path, and
one or more exit paths.
• A path through an SM block from entrance to exit is called a “link path”.
• In an SM block, when the system
enters that state, the outputs in the
state box become true.
• e.g., when state S1 is entered, Z1 & Z2
become 1. If X1 = 0, then Z3 & Z4 also
become 1. If X2 = 0, then the machine
goes to the next state via exit path 1.
During this condition, Z5 remains at 0.
• If X1 = 1, then Z3 & Z4 remain at 0,
and if X3 = 0, then Z5 becomes 1, and
the machine goes to the next state via
exit path 3.
• A given SM block can be
drawn in different forms, as
shown in the figure.
• Here, Z1 = A + BC. As this
is a combinational circuit,
there is only one state, and
there is no state change.
• The second SM chart
allows for individual testing
of input variables, and the
function is, Z1 = A + A'BC,
which is the same.
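The equivalence of the two forms, Z1 = A + BC and Z1 = A + A'BC, can be confirmed by testing the inputs one at a time, the way the second SM chart does with its decision boxes. A minimal sketch with illustrative function names:

```python
from itertools import product

def z1_single_test(a, b, c):
    """First form: one decision box tests A + BC as a whole."""
    return a | (b & c)

def z1_individual_tests(a, b, c):
    """Second form: A, B and C are tested in separate decision boxes."""
    if a:
        return 1
    if b and c:        # reached only when A = 0, hence the term A'BC
        return 1
    return 0

for a, b, c in product((0, 1), repeat=3):
    assert z1_single_test(a, b, c) == z1_individual_tests(a, b, c)
```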
Rules for constructing an SM block
1. For every valid combination of input variables, exactly
one exit path must be defined.
2. Within an SM block, no internal feedback is allowed.
3. SM block can be drawn either in a serial form or in a
parallel form. Both are equivalent, as all the tests take
place within one clock time.
A given state graph can be converted into an equivalent SM chart, as shown. This
state graph has 3 Moore outputs (Za, Zb, Zc) and 2 Mealy outputs (Z1, Z2). Hence,
the Moore outputs will appear in state boxes and Mealy outputs will appear in
conditional output boxes. Each SM block will have only one decision box, as
there is only one input variable to be tested.
Example: Derivation of SM chart for a Binary multiplier
• Abbreviations: St = Start, Sh = Shift,
Ad = Add, M = current multiplier bit,
K = completion signal.
• If M = 1, the multiplicand is added to the
contents of accumulator, followed by a right
shift. If M = 0, then the addition is skipped,
and only the right shift occurs.
• Conversion of the SM chart into Verilog
code is a straightforward process.
• “case” statement can be used to specify
each state, and “if” statement can be used
for the conditional output boxes.
Verilog code for the
Binary multiplier
Realization of SM charts
Example-1:
• As there are 3 states, the state assignments can be 00, 01 & 11.
• Taking these values as A & B, Za = A'B', Zb = A'B, Zc = AB, Z1 = ABX', Z2 = ABX.
• From the link paths 2 & 3, the next state of A can be written as, A+ = A'BX + ABX
• From the link paths 1, 2 & 3, the next state of B is written as, B+ = A'B'X + A'BX + ABX
Procedure for deriving the next state equation
1. Perform state assignment for all of the states.
2. Write the output equations directly from the SM chart.
3. For each flip-flop Q, identify all the states in which Q = 1.
4. For each of these states, find all the link paths that lead into it.
5. For each link path, find a term that equals 1 when that link path is followed.
6. The expression for Q+ is formed by ORing all these terms.
7. Q+ is realized using a D-FF and a combinational circuit.
Example-2:
• As there are 4 states, the state assignments can be 00, 01, 10 & 11, respectively for S0, S1, S2 & S3.
• Load = A'B'St, Ad = A'BM, Sh = A'BM' + AB'
• A is true in S2 & S3. Hence, A+ = A'BM + A'BM'K + AB'K
• B is true in S1 & S3. Hence, B+ = A'B'St + A'BM'K' + AB'K' + A'BM'K + AB'K
  Or, B+ = A'B'St + A'BM' + AB'
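State transition table for multiplier control:
State | A | B | St | M | K | A+ | B+ | Load | Sh | Ad | Done
S0 | 0 | 0 | 0 | - | - | 0 | 0 | 0 | 0 | 0 | 0
S0 | 0 | 0 | 1 | - | - | 0 | 1 | 1 | 0 | 0 | 0
S1 | 0 | 1 | - | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0
S1 | 0 | 1 | - | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 0
S1 | 0 | 1 | - | 1 | - | 1 | 0 | 0 | 0 | 1 | 0
S2 | 1 | 0 | - | - | 0 | 0 | 1 | 0 | 1 | 0 | 0
S2 | 1 | 0 | - | - | 1 | 1 | 1 | 0 | 1 | 0 | 0
S3 | 1 | 1 | - | - | - | 0 | 0 | 0 | 0 | 0 | 1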
System design using HDL - Module 3

System design using HDL - Module 3

  • 1. SYSTEM DESIGN USING HDL (ECE43) # Digital system design using Verilog, Charles Roth, Lizy Kurian John, Byeong Kil Lee, 1st Edition, 2016, Cengage Learning 1 2.1, 2.2, 2.3 - 2.8, 2.11, 2.13 - 2.15 2 2.9, 2.10, 2.12, 2.16 - 2.19, 8.1, 8.2 3 3.1 - 3.4, 5.1, 5.2.1, 5.3 4 4.1 - 4.5, 4.8, 4.6, 4.7, 4.9, 4.11 5 6.1 - 6.5, 6.7 - 6.12 INTRODUCTION TO PROGRAMMABLE LOGIC DEVICES
  • 3. Need of programmable logic devices: • Implementation of a significant amount of functionality into one physical chip. • Removes the need for multiple off-the-shelf devices. • Easy reprogramming, therefore increased ability to change the design. • Easier to change the design in case of errors or change in the design specifications.
  • 4. Programmable logic Factory programmable devices ROM (Read only memory) MPGA (Mask Programmable Gate Array) Field Programmable Devices SPLD (Simple Programmable Logic Device) CPLD (Complex programmable Logic Device) FPGA (Field Programmable Gate Array) GAL (Generic Array Logic) PAL (Programmable Array Logic) PLA (Programmable Logic Array) PROM (Programmable Read Only Memory)
  • 5. • Factory Programmable Devices: Generic devices that are programmed at the factory to meet the Customer’s requirements. Programming can be done only once. Examples: ROM, MPGA • ROM: Primarily meant for memory, but can be used to implement combinational circuits. • MPGA: Also called as gate arrays, they have been a popular technology for creating ASIC.
  • 6. • Field Programmable Devices: Devices that are programmed by the user, rather than in factory. Factor SPLD CPLD FPGA Density Low (few hundred gates) Low to medium (500 to 12,000 gates) Medium to high (3000 to 5,000,000 gates) Timing Predictable Predictable Unpredictable Cost Low Low to Medium Medium to high Major Vendors (with device families) Lattice (GAL16LV8, GAL22V10), Cypress (PALCE16V8), AMD (22V10) Xilinx (CoolRunner, XC9500), Altera (MAX) Xilinx (Kintex, Artix, Virtex, Spartan), Altera (Stratix, Cyclone, Arria), Lattice (Mach, ECP), Microsemi (Axcelerator, Fusion)
  • 7. • PLA: It consists of programmable AND array & programmable OR array. • PAL: It is a special case of PLA, where OR array is fixed and only AND array is programmable. It can also contain flip-flops. • Earlier programmable devices were only one time programmable (OTP, PROM); later on, the advent of Ultraviolet and electronically erasable technology gradually led to re-programmable logic devices.
  • 8. CMOS Electrically Erasable PLDs: • It contains macroblocks with array of gates, flip-flops, multiplexers, or standard building blocks. • PLAs, PALs, GALs & PLDs are collectively referred as SPLDs. GAL (Generic Array Logic): • Lattice semiconductor created similar devices with easy reprogrammability, and called their line of devices as GALs.
  • 10. • ROM consists of an array of semiconductor devices that are interconnected to store an array of binary data. • Data stored in ROM can be read out when required, but cannot be changed under normal operating conditions. • Output pattern stored in ROM is called a word. • Each input serves as address, which selects one of the stored words as output.
  • 11. • Size of ROM is given as follows: 2n X m, where “n” represents the number of input lines and “m” represents the width of output lines.
  • 12. • A ROM’s size, with 4-bit output line and 3-bit input line can be written as, 8 words X 4 bits. • In the following example, when ABC=010, F0F1F2F3=0111.
  • 13. • ROM consists of a decoder and a memory array. When a pattern of 1’s and 0’s is applied as input to the decoder, any one of its output becomes 1, which in turn selects that particular stored word from the array. • Types of ROM:  Mask programmable ROM  PROM (user programmable)  EPROM (UV erasure)  EEPROM (Electrically erasable)  Flash memory
  • 14. • Mask programmable ROM: Data array is permanently stored during manufacture, by selectively including or omitting the switching elements, in the cross-point switch matrix. Special masks are used for this purpose, which is an expensive process. • PROM: One time, user programmable (fuse / antifuse). • EPROM: Programmer uses voltage pulses to store electronic charges in the memory array location. UV light is used for the erasure of complete data that is stored. • EEPROM: Uses electronic pulses for erasure of data. It can be reprogrammed only 100 to 1000 times. • Flash memories: They have built-in programming and erasure capabilities, and data can be written while in-circuit, without needing any separate programmer.
  • 15. • ROM can implement any combinational circuit, by storing the outputs for all of the input combinations. Hence, this method is also called as LUT method. Ex-1: Implement a 2 bit adder using ROM: Solution: Input : two 2-bit numbers. Output : Sum having 3-bits. • Can be implemented with 16 X 3 ROM.
  • 16. Data to be stored in memory: 0, 1, 2, 3, 1, 2, 3, 4, 2, 3, 4, 5, 3, 4, 5, 6
  • 17. Ex-2: Compute the size of the ROM to implement an 8:3 priority encoder. There will be 256 entries in the ROM. Size of the ROM: 28 X 4
  • 18. Ex-3: Implement the following state machine, of BCD to Excess-3 code converter, using ROM. PS NS Z X=0 X=1 X=0 X=1 S0 S1 S2 1 0 S1 S3 S4 1 0 S2 S4 S4 0 1 S3 S5 S5 0 1 S4 S5 S6 1 0 S5 S0 S0 0 1 S6 S0 - 1 -
  • 19. • Sequential circuit is designed using ROM and flip-flops. • ROM is used to realize the output functions and the next state equations. • The state of the circuit is stored in a register of D flip-flops, and fed back to the input of the ROM. • To realize the given Mealy machine, a ROM and 3 flip-flops are necessary.
  • 20. • The ROM will generate the next state equations and output Z, from the present states and input X. • Q1, Q2, Q3 and X are connected to the address lines, with X connected to the LSB. • Contents of ROM are: 3, 4, 6, 8, 9, 8, A, B, B, C, 0, 1, 1, 0, 0, 0.
  • 22. • PLA with n-input lines and m-output lines, can realize m-functions of n-variables. • When compared to ROM, instead of decoder, AND array is used to realize the product terms. • Later on, OR array is used to sum the product terms.
  • 23. Ex-4: Using PLA, Realize the following functions: F0 = ⅀m(0,1,4,6) = (A'B'+AC') F1 = ⅀m(2,3,4,6,7) = (B+AC') F2 = ⅀m(0,1,2,6) = (A'B'+BC') F3 = ⅀m(2,3,5,6,7) = (AC+B) Solution: There are 3 inputs: A, B & C. There are 5 distinct product terms in the 4 outputs. Unlike ROM, in a PLA implementation, the product terms can be shared among the functions.
  • 24. F0 =⅀m(0,1,4,6)=(A'B'+AC') F1 =⅀m(2,3,4,6,7)=(B+AC') F2 =⅀m(0,1,2,6)=(A'B'+BC') F3 =⅀m(2,3,5,6,7)=(AC+B)
  • 25. • Instead of AND-OR logic, PLA may use NOR-NOR logic. • 2-input NOR gate can be built using nMOS transistors: • NOR-NOR with inverters at input and output = AND-OR. F0 = ⅀m(0,1,4,6) = (A'B'+AC') F1 = ⅀m(2,3,4,6,7) = (B+AC') F2 = ⅀m(0,1,2,6) = (A'B'+BC') F3 = ⅀m(2,3,5,6,7) = (AC+B)
  • 26. Ex-5: Using PLA, Realize the following functions: F1 = ⅀m(2,3,5,7,8,9,10,11,13,15) F2 = ⅀m(2,3,5,6,7,10,11,14,15) F3 = ⅀m(6,7,8,9,13,14,15) Solution: After minimization, the simplified functions are : F1 = ⅀m(2,3,5,7,8,9,10,11,13,15) = bd+b'c+ab' F2 = ⅀m(2,3,5,6,7,10,11,14,15) = c+a'bd F3 = ⅀m(6,7,8,9,13,14,15) = bc+ab'c'+abd Here, the PLA requires 8 different product terms.
  • 27. • To reduce the number of rows in PLA, these functions can be reorganized using K-map. F1 = a'bd+abd+b'c+ab'c' F2 = b'c+bc+a'bd F3 = bc+ab'c'+abd There are only 5 different product terms, and , the PLA table has only 5 rows.
  • 28. • In case of PLA, unlike memory, the number of terms in each equation is not important, as the size of PLA does not depend on the number of terms within an equation. • To reduce the number of rows in PLA, instead of using K- maps, the Espresso algorithm can be used. This is a complex algorithm, which is used as logic minimization algorithm for VLSI synthesis. F1 = a'bd+abd+b'c+ab'c' F2 = b'c+bc+a'bd F3 = bc+ab'c'+abd The PLA implementation has 4 inputs, 5 product terms & 3 outputs.
  • 30. • It is a special case of PLA, in which AND array is programmable and OR array is fixed. • Due to this reason, PAL is less expensive than PLA, and is easier to program as well. • The following figure represents a segment of an un-programmed PAL, along with the input buffers which contain two outputs.
  • 31. Ex-6: Implement I1I2'+I1'I2. Solution: • As OR gates cannot be programmed, AND terms cannot be shared among two or more OR gates. • Typical PALs have 10 to 20 inputs, and 2 to 10 outputs, with 2 to 8 AND gates driving each OR gate.
  • 32. Ex-7: Implement a full-adder using PAL. Solution: SUM = X'Y'Cin+X'YCin'+XY'Cin'+XYCin COUT = XY+YCin+XCin
  • 33. • PALs were made available that contained D flip-flops as well, and were called as “sequential PALs”. Ex-8: Implement Q+ = D = A'BQ'+AB'Q. Solution:
  • 34. PLD/GAL (Programmable Logic Device / Generic Array Logic)
  • 35. • PALs and PLAs are good for implementing small circuitry. But, they are not re-programmable. • When they are made as erasable/reprogrammable, by incorporating Flash memory, such PALs are often referred as PLDs/GALs. • An example is 22CEV10, which is a CMOS electrically erasable PLD, that can realize both combinational as well as sequential circuits.
  • 36. • 22CEV10 contains:  12 declared input pins  10 pins that can be programmed as input / output  Programmable AND array (8 till 16 gates feeding each OR gate)  10 OR gates, each of which drives an output macrocell  10 D Flip-flops, with asynchronous reset and synchronous preset  Each macrocell contains the D Flip-flop, multiplexer, and additional programmability at the output • 22CEV10 => 22 pins out of which 10 are bidirectional
  • 38. • Each macrocell has 2 programmable interconnect bits: S1 & S0. • When the particular bit is programmed, it is connected to 0 V. • Erasing that bit disconnects it from 0 V, and it floats at logic-1. S1 S0 Output 0 0 D Flip-flop output 0 1 D Flip-flop output inverted 1 0 OR output 1 1 OR output inverted
  • 39. • CAD programs are available for PAL/PLD programming. These programs accept logic equations, truth tables, state graphs or state tables as inputs. • They automatically generate the required bit patterns, which can be downloaded into a PLD programmer, which will create the necessary connections. • PALASM (Programmable Array Logic ASsembler for Military) from MMI & AMD, and ABEL (Advanced Boolean Expression Language) from DATA I/O are the two popular languages that are used for programming.
  • 42. • This is a programmable IC which is equivalent to several PLDs in the same silicon chip. Typically a CPLD comprises of 500 to 10,000 logic gates. • It consists of a number of PAL-like logic blocks, along with a programmable interconnect. The interconnect matrix is implemented using crossbar switch. Even though it is expensive, it results in predictable timing. • CPLDs are electronically erasable and reprogrammable, and hence are sometimes referred to as EPLDs (Erasable Programmable Logic Device).
  • 43.  Typically a CPLD contains a number of macrocells, that are grouped into function blocks.  Each macrocell contains a flip- flop and an OR-gate, and the macrocell has its inputs connected to an AND gate array.  The major manufacturers of CPLD are: Xilinx, Altera, Lattice, Cypress and Atmel.
  • 47. • This CPLD has 4 function blocks, and each block has 16 associated macrocells. A function block is a programmable AND-OR array, which is configured as a PLA. • Each macrocell contains a flip-flop and additional multiplexers, that route the signals from the function blocks to the I/O blocks or to the interconnect array. • The interconnect array selects signals from the macrocell outputs and the I/O blocks, and connects them back to function blocks. Thus, a signal generated from any function block can be used as an input to any other function block.
  • 49. • Initially, two D-inputs have to be generated for the Flip-flops. • Later on, two outputs (Z1, Z2) have to be generated, by utilizing the Flip-flop outputs. • Hence, four macrocells are required for the implementation of the required Mealy machine. Ex-9: Implement a Mealy sequential machine with 2 inputs and 2 outputs.
  • 50. Ex-10: Implement a parallel adder with accumulator.
  • 51. • The accumulator register needs one FF for each bit. • But that bit also needs to generate the sum and carry bits corresponding to that particular bit. • Hence, each bit of an adder requires two macrocells, one for the sum and the accumulator, and the other for the carry.
  • 52. FPGA (Field Programmable Gate Array) They contain an array of identical logic blocks with programmable interconnections. User can program the functions realized by each logic block, and can flexibly program the connections between them.
  • 53. ADVANTAGES DISADVANTAGES The time-to-market of FPGA product is much much lesser. FPGAs are less dense than MPGAs. With FPGA, it is easier to correct the mistakes in the design. FPGAs are slower, due to the RC delay in programmable points. The prototyping cost is much reduced, with the usage of FPGA. Interconnect delays in FPGAs are unpredictable. At low volumes, FPGAs are cheaper than MPGAs. Programming overhead is much higher, because of the resources. MPGA versus FPGA
  • 54. When compared to CPLD the major advantage of FPGA is its highly flexible programmable interconnect, and due to this fact itself, the major disadvantage is its unpredictable interconnect delay.
  • 55. FPGA typically contains three programmable elements: 1. Programmable logic blocks (Configurable Logic Blocks) 2. Programmable routing resources 3. Programmable I/O blocks
  • 56. • Programmable logic blocks • These are created by Muxes, LUTs, and AND-OR arrays. • Programming refers to: a) Changing the contents of LUT, b) Changing the I/O signals to the Muxes, c) Selecting or not selecting the particular gates in the AND-OR arrays. • Programmable interconnect • For making or breaking the specific connections. • For connecting various blocks in the chip to each other. • For connecting specific I/O pins to specific logic blocks. • Programmable I/O blocks • I/O pads can be programmed as i/p, o/p or bidirectional. • They also can be programmed as inverting, non-inverting, tri- state, slew rate adjustable, passive pull-up etc.
  • 57. • Based on the topology in which the logic blocks and the interconnect resources are distributed inside, there can be four different basic architectures of FPGAs that are in the market since 1980s: • Matrix based architecture • Row based architecture • Hierarchical PLD architecture • Sea-of-gates architecture • Modern FPGAs that are in the market, contain special purpose blocks including a microprocessor. Architectures of FPGA
  • 58. 1. Matrix based architecture (e.g., Most Xilinx FPGAs) • This architecture is also called as “symmetrical array”, and it contains 8X8 arrays in smaller chips, and 100X100 or larger arrays in larger chips. • Routing is called two-dimensional channeled routing, since routing resources are available in horizontal and vertical directions.
  • 59. 2. Row based architecture (e.g., some Microsemi FPGAs) • The logic blocks are organized into rows, and hence, there are rows of logic blocks, and rows of routing resources. • Routing is called one-dimensional channeled routing, as the routing resources are channeled between the rows.
  • 60. 3. Hierarchical PLD architecture (e.g., Altera APEX20, APEX II) • At the lower level, the FPGAs contain clusters of logic blocks with localized resources for interconnection. • At the higher level, the global interconnect is used for interconnection between the clusters of logic blocks.
  • 61. 4. Sea-of-gates architecture (e.g., Microsemi Fusion) • FPGAs contain a large number of gates, and there is an interconnect superimposed on the sea-of-gates. • There are other terminologies such as sea-of-cells or sea-of-tiles, to indicate the topology with a large number of logic blocks.
  • 62. • The term “Programming technology” is used to denote the technology by which the programmability in an FPGA is achieved, especially for the programmable interconnect. • Some of the techniques are: • SRAM programming technology • EPROM / EEPROM / Flash programming technology • Antifuse programming technology FPGA Programming Technologies
  • 63. SRAM Programming Technology • As in the case of ROM, an SRAM can be used to store the “configuration bits” for interconnection, in an LUT. • e.g., Sixteen SRAM cells can implement any function of four variables. • The programmable interconnect can be achieved by SRAM, in the following two ways: • Pass transistor is used for connecting two points • Routing matrices are implemented by using mux
  • 64. Disadvantages of SRAM Programming Technology 1. Six transistors are required for every SRAM cell. • e.g., if FPGA has 1 million programmable points, 6 million transistors are required for achieving this programmability. 2. Since SRAM is volatile, all the contents are lost during power failure. This is a serious setback when an FPGA is used in the final product. • As a solution, EPROM can be used as “boot ROM”, to store the configuration bits, and its contents can be transferred to SRAM whenever power gets resumed. Advantages of SRAM Programming Technology 1. As SRAM is a volatile memory, new contents can be written again and again, thus providing flexibility during prototyping. 2. Fabrication steps for manufacturing SRAM are same as that for manufacturing other logic cells.
  • 65. EPROM / EEPROM / Flash Programming Technology • Instead of SRAM, EPROM cells are used to control the programmable interconnections. Each EPROM cell contains a MOSFET, which has two gates: Control gate and Floating gate. • The drain of the transistor can be connected to VDD by means of a pull-up resistor. When a high voltage (10 - 13 V) is applied to the control gate, electrons get injected into the floating gate, and the transistor turns OFF. • The electrons remain trapped at the floating gate. The trapped negative charges can be removed, by exposing the EPROM to UV light.
  • 66. Disadvantages of EPROM Programming Technology 1. EPROM is slower than SRAM, because of the dual-gate structure. 2. While manufacturing, EPROMs require more processing steps than SRAMs. 3. EPROM based switches have high ON-resistance, and also have high static- power-consumption. 4. For erasure, the EPROM chip has to be physically removed from the PCB.  EEPROM is similar to EPROM, but removal of the gate charge can be done electrically. Hence, for erasure, the chip need not be removed from PCB.  The memory cells can be selectively erased and can be rewritten, and this does not require any additional equipment.  Flash is a form of EEPROM, in which a block of cells can be erased at once, by applying a large voltage at the control gate, causing the electrons to pull off.  By sensing the amount of current flow, each cell in Flash can store multiple bits of information, which in turn depends on the number of trapped electrons.  While writing bits into, Flash is faster than EEPROM, but slower than SRAM.
  • 67. • Antifuse programming element changes from high resistance (open - OFF) to low resistance (closed - ON), when a high voltage is applied. • Antifuses are built by dielectric layers between N+ diffusion and polysilicon layers, or by amorphous silicon in between metal layers. Antifuse Programming Technology Advantages: • When compared to MOSFETs, the area consumed by the antifuse is smaller. • Antifuse based connections are faster than SRAM / EPROM technologies. Disadvantages: • The antifuse connection is OTP. • Because of this, design change is not possible.
  • 68. Comparison of FPGA Programming Technologies Programming technology Storage Programmability Area overhead Resistance Capacitance SRAM Volatile In-Circuit reprogrammable Large Medium to high High EPROM Non- volatile Out-of-Circuit reprogrammable Small High High EEPROM / Flash Non- volatile In-Circuit reprogrammable Medium to large High High Antifuse Non- volatile Not reprogrammable Small Low Low
  • 69. • Manufacturers use different names to denote their logic blocks: • Xilinx calls them as Configurable Logic Blocks (CLB). • Microsemi calls them as VersaTiles. • Altera calls them as Logic Elements (LE), and a group of LEs is called as Logic Array Blocks(LABs). • Mainly two types of logic blocks are used in FPGAs: 1. LUT based programmable logic blocks. 2. Mux based programmable logic blocks. I. Programmable logic block architectures
  • 70. • Look Up Table contains memory cells along with multiplexers. • The output for each input combination is stored in memory cells. • The input combination is used as control inputs to the multiplexer. • For a 2-variable function, 4 memory cells and a 2:1 mux is required. • For an n-input function, 2n memory cells and 2n :1 mux are required.
  • 71. 1. LUT (Look-Up Table) based Programmable logic block • Each block contains two LUT4 and two flip-flops. • The LUT4 can generate any one function of 4 variables. • The flip-flop has chip enable, set and reset inputs. • A multiplexer is used to select in between the combinational and the latched version of the LUT4 output. • The multiplexer is controlled by a bit stored in memory.
  • 72. • Choosing X1 as LSB and X4 as MSB, X4 input need not be used, as F1 uses only 3 variables. • To store the contents in the LUT, the truth table of the function has to be constructed. • From the truth table, the contents of LUT to implement the function F1 will be {0,1,1,0,0,0,1,1}. • As LUT4 contains 16 memory cells for output, it is better to store the other 8 bits as well, irrespective of the status of X4. • Thus, the contents of LUT are {0,1,1,0,0,0,1,1,0,1,1,0,0,0,1,1}. Ex-11: Implement the function A'B'C+A'BC'+AB, using LUT.
  • 73. 2. Multiplexer based Programmable logic block • With LUT, it is not necessary to minimize the function, as the number of terms in the function is not important (all o/p bits need to be stored). • But LUT requires storage space. To save it, multiplexers along with basic gates, can be used.
  • 74. • As there are 3 variables, we can choose a 4:1 mux, which has 2 select lines. • The truth table can be constructed, so as to define the output in terms of C. • The mux select lines can be A & B, and the mux input lines can be connected in accordance with the last column in the truth table. A B C F1 Mux i/p in terms of C 0 0 0 0 C 0 0 1 1 C 0 1 0 1 C' 0 1 1 0 C' 1 0 0 0 0 1 0 1 0 0 1 1 0 1 1 1 1 1 1 1 Ex-12: Implement the function A'B'C+A'BC'+AB, using mux.
  • 76. II. Programmable interconnect 1. General purpose interconnect The completely non-blocking switch matrix is very expensive. e.g., in a 4X4 matrix, out of 16 switches, only 4 switches are utilized at any point of time. Crosspoint switch matrix 6-way switch To reduce the number of multiple connections for a single route, the crosspoint can be configured as a 6-way switch. But, this crosspoint is more complicated than the earlier one. The interconnect in between the logic blocks should provide flexible interconnection in between the rows and columns (e.g., row- column, row-row, column-column).
  • 77. 2. Direct interconnect Direct interconnect to 4 neighbors Special connections to 8 neighbors To reduce the delay in the switch matrix, many FPGAs provide direct connections between the logic blocks, by means of dedicated switches.
  • 78. 3. Global interconnect lines For high fan-out and low-skew clock distribution, FPGAs provide routing lines that span the entire width & height of the device. When the clock is distributed to a few million gates in the chip, the delay in the wire causes the clock edges to arrive at different times at different parts of the chip. This is called as “clock skew”, which needs to be eradicated, for the faithful functionality of the circuitry on the chip.
  • 79. Interconnects in row-based FPGAs The previous interconnects discussed, are applicable to matrix-based architecture, which has symmetrical arrays. For row-based architecture, as it is one-dimensional, it has arrays of switches in the routing channel, which is situated in between the logic blocks. i) Non-segmented ii) Segmented When the 3 connections required are x, y & z, they can be done in 2 ways: non- segmented (full length track, faster), segmented (reduced resources, slower) Example nets
  • 80. II. Programmable I/O blocks • I/O blocks on modern FPGAs allow the use of a pin as true or inverted, direct or latched, input or output, and so on. • The I/O options can be selected by means of the configuration memory cells, indicated in the figure as “M”. • The inversion is performed using an XOR gate, and one memory-bit. • The direction of the pin is decided using a tri-state buffer, and its control can be selected as active high or active low, using another memory-bit. • Similarly, the rate-of-change of output (slew rate), and the pull-up option (open drain, built- in resistor), can be configured using the memory cells (SRAM, EEPROM / Flash, antifuse).
  • 81. Dedicated Specialized Components in FPGA 1. Dedicated memory: The embedded RAM, can be used to implement the memory needs of the circuit, that is being designed. 2. Dedicated Arithmetic Units: The custom implementation of adders and multipliers inside FPGA, is smaller and faster, than its counterpart that is implemented using FPGA. 3. DSP Blocks: To support DSP applications, the vendors provide the hardware inside the FPGA for encryption/decryption, FFTs, FIR filters, IIR filters, compression/decompression, and so forth. 4. Embedded Processors: This is a hybrid solution where part of the design is in a programmable processor (high flexibility), and the remaining part is implemented in hardware (better performance). 5. Content Addressable Memory: This is a special kind of memory in which the content, and not the address, is used to search the memory.
  • 82. Applications of FPGA 1. Rapid Prototyping • As FPGAs contain 5 million or more gates, many large real-world systems can be prototyped very quickly using a single FPGA. • If a single FPGA does not suffice, multiple FPGAs can be interconnected to realize larger systems by plugging the boards into a backplane. 2. Final Products in Medium Speed Systems • Circuits realized using FPGAs typically operate in the range of 150-200 MHz. If this speed is sufficient, FPGAs can be used in the final product, and not only in the prototype. • In the final product, if enhancements to the system are required, they can be delivered as software updates rather than hardware changes. 3. Glue Logic • This is digital circuitry that works as an interface between two different logic modules. • Using SRAM FPGAs, new interface logic can be implemented on the same FPGA. 4. Hardware Accelerators / Coprocessors • For a software application, an FPGA can be used as a coprocessor that implements a key kernel, thereby accelerating the application. • Examples of such applications are pattern matching, computer architecture simulators, emulator boards, hardware testing boards, and so on.
  • 83. Design Flow for FPGA 1. Create a behavioral, RTL or structural model of the design using HDL. 2. Simulate and debug the design. 3. Synthesize the design, targeting the desired device. 4. Run a mapping of the design that breaks the logic diagram into pieces that fit into the CLBs. 5. Run the place-and-route program to place the logic blocks in the FPGA and route the interconnections. 6. Run a program that generates the bit pattern needed to program the FPGA. 7. Download the bit pattern into the configuration cells and test the operation of the FPGA.
  • 84. STATE MACHINE CHARTS • A “State Machine” is used to control a digital system that carries out a step-by-step procedure or an algorithm. • A “State Diagram” or “State Graph” is used to specify the operation of such a state machine. • A “State Machine Chart” (SM chart) is an alternative to the state diagram, and it has the following advantages: (i) It offers an easier understanding of the digital system. (ii) It automatically satisfies the conditions of the state graph (exactly one true transition from a state at any time, unique definition of the next state for every input combination). (iii) It leads directly to a hardware realization of the system.
  • 85. • An SM chart contains 3 principal components, as shown: the state box, the decision box and the conditional output box. • An SM chart is constructed from SM blocks, where each SM block describes the machine operation during one state. • Therefore, each SM block contains exactly one state box, together with the decision boxes and conditional output boxes that are associated with that particular state. • Thus, an SM block has exactly one entrance path, and one or more exit paths.
  • 86. • A path through an SM block from entrance to exit is called a “link path”. • When the system enters the state of an SM block, the outputs listed in the state box become true. • e.g., when state S1 is entered, Z1 & Z2 become 1. If X1 = 0, then Z3 & Z4 also become 1. If, in addition, X2 = 0, then the machine goes to the next state via exit path 1. In this case, Z5 remains at 0. • If X1 = 1, then Z3 & Z4 remain at 0, and if X3 = 0, then Z5 becomes 1, and the machine goes to the next state via exit path 3.
  • 87. • A given SM block can be drawn in different forms, as shown in the figure. • Here, Z1 = A + BC. As this is a combinational circuit, there is only one state, and there is no state change. • The second SM chart tests the input variables individually, and the function it realizes is Z1 = A + A'BC, which is equivalent to A + BC.
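The equivalence of the two forms can be checked by exhaustive enumeration. The sketch below is a minimal truth-table comparison; the function names are chosen here for illustration only.

```python
from itertools import product


def z1_first(a, b, c):
    # First SM chart: Z1 = A + BC
    return a | (b & c)


def z1_second(a, b, c):
    # Second SM chart, inputs tested one at a time: Z1 = A + A'BC
    return a | ((1 - a) & b & c)


# Compare the two expressions on all 8 input combinations.
equivalent = all(
    z1_first(a, b, c) == z1_second(a, b, c)
    for a, b, c in product((0, 1), repeat=3)
)
print(equivalent)  # True
```

The A' factor is redundant because the term A'BC only matters when A = 0, which is exactly when A alone does not already make Z1 true.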
  • 88. Rules for constructing an SM block 1. For every valid combination of input variables, exactly one exit path must be defined. 2. Within an SM block, no internal feedback is allowed. 3. An SM block can be drawn either in a serial form or in a parallel form. Both are equivalent, as all the tests take place within one clock time.
  • 89. A given state graph can be converted into an equivalent SM chart, as shown. This state graph has 3 Moore outputs (Za, Zb, Zc) and 2 Mealy outputs (Z1, Z2). Hence, the Moore outputs will appear in state boxes and Mealy outputs will appear in conditional output boxes. Each SM block will have only one decision box, as there is only one input variable to be tested.
  • 90. Example: Derivation of SM chart for a Binary multiplier • Abbreviations: St = Start, Sh = Shift, Ad = Add, M = current multiplier bit, K = completion signal. • If M = 1, the multiplicand is added to the contents of the accumulator, followed by a right shift. If M = 0, the addition is skipped, and only the right shift occurs. • Conversion of the SM chart into Verilog code is a straightforward process. • A “case” statement can be used to specify each state, and “if” statements can be used for the conditional output boxes.
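The add-and-shift behavior described above can be sketched as a behavioral model. This is a Python illustration, not the Verilog code the slides refer to. It assumes a 4-bit multiplier, the four states S0 to S3 used later in this deck, and a hypothetical shift counter that raises K when the last shift is due; the function name `multiply` and the variable names are illustrative.

```python
def multiply(multiplicand, multiplier, n_bits=4):
    """Shift-and-add multiply, stepping through assumed states S0..S3."""
    state, st = "S0", 1   # St is asserted once to start the machine
    acc = 0               # accumulator: partial product above, multiplier below
    shifts = 0            # hypothetical counter used to derive K
    while True:
        m = acc & 1                      # M: current multiplier bit
        k = int(shifts == n_bits - 1)    # K: assumed 1 when the last shift is due
        if state == "S0":
            if st:                       # Load the multiplier
                acc, shifts, st, state = multiplier, 0, 0, "S1"
        elif state == "S1":
            if m:                        # Ad: add multiplicand to the upper half
                acc += multiplicand << n_bits
                state = "S2"
            else:                        # Sh: skip the add, shift right
                acc >>= 1
                shifts += 1
                state = "S3" if k else "S1"
        elif state == "S2":              # Sh: the shift that follows an add
            acc >>= 1
            shifts += 1
            state = "S3" if k else "S1"
        else:                            # S3: Done
            return acc


print(multiply(13, 11))  # 143
```

The if/elif chain on `state` plays the role that a “case” statement would play in an HDL description, and the inner `if m` corresponds to a decision box inside one SM block.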
  • 91. Verilog code for the Binary multiplier
  • 92. Realization of SM charts Example-1: • As there are 3 states, the state assignments can be 00, 01 & 11. • Taking these values as A & B, the outputs are Za = A'B', Zb = A'B, Zc = AB, Z1 = ABX', Z2 = ABX. • From link paths 2 & 3, the next state of A can be written as A+ = A'BX + ABX. • From link paths 1, 2 & 3, the next state of B can be written as B+ = A'B'X + A'BX + ABX.
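To see which transitions these equations imply, they can be evaluated for every state and input value. The sketch below is a minimal Python check; the labels S0, S1 and S2 for the codes 00, 01 and 11 are an assumption made here for readability.

```python
def step(a, b, x):
    """Evaluate the Example-1 next-state and output equations."""
    na, nb, nx = 1 - a, 1 - b, 1 - x
    a_next = (na & b & x) | (a & b & x)                   # A+ = A'BX + ABX
    b_next = (na & nb & x) | (na & b & x) | (a & b & x)   # B+ = A'B'X + A'BX + ABX
    outputs = {
        "Za": na & nb, "Zb": na & b, "Zc": a & b,         # Moore outputs
        "Z1": a & b & nx, "Z2": a & b & x,                # Mealy outputs
    }
    return (a_next, b_next), outputs


# Assumed labels for the three assigned codes (AB = 00, 01, 11).
NAMES = {(0, 0): "S0", (0, 1): "S1", (1, 1): "S2"}


def transitions():
    """Map (state label, X) to the next state label."""
    return {
        (NAMES[s], x): NAMES[step(s[0], s[1], x)[0]]
        for s in NAMES
        for x in (0, 1)
    }


for key, nxt in transitions().items():
    print(key, "->", nxt)
```

Every next state produced by the equations is one of the three assigned codes, so the unused code 10 is never entered from a valid state.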
  • 93. Procedure for deriving the next state equation 1. Perform state assignment for all of the states. 2. Write the output equations directly from the SM chart. 3. For the next state of a flip-flop Q, identify all the states in which Q = 1. 4. Find all the link paths that lead into each of those states. 5. For each link path, find a term that equals 1 when that link path is followed. 6. The expression for Q+ is formed by ORing all the terms. 7. Q+ is realized using a D flip-flop and a combinational circuit.
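Steps 3 to 6 can be sketched mechanically once the link paths are listed. In the Python sketch below, the link-path list is an assumption reconstructed from the multiplier control example in this deck (states coded as AB = 00, 01, 10, 11); the helper names are hypothetical.

```python
# Each entry: (present state code "AB", input condition, next state code "AB").
# Assumed link paths of the multiplier control SM chart.
LINK_PATHS = [
    ("00", "St'", "00"),
    ("00", "St", "01"),
    ("01", "M", "10"),
    ("01", "M'K'", "01"),
    ("01", "M'K", "11"),
    ("10", "K'", "01"),
    ("10", "K", "11"),
    ("11", "", "00"),
]


def state_term(code, names="AB"):
    """Product term that is 1 only in the given state, e.g. "01" -> A'B."""
    return "".join(n if bit == "1" else n + "'" for n, bit in zip(names, code))


def next_state_equation(ff_index, names="AB"):
    """OR together one term per link path that leads into a state where Q = 1."""
    terms = [
        state_term(src, names) + cond
        for src, cond, dst in LINK_PATHS
        if dst[ff_index] == "1"
    ]
    return names[ff_index] + "+ = " + " + ".join(terms)


print(next_state_equation(0))  # A+ = A'BM + A'BM'K + AB'K
print(next_state_equation(1))
```

Each term is the product of the present-state code and the input conditions along one link path, so it is 1 exactly when that path is followed.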
  • 94. Example-2: • As there are 4 states, the state assignments can be 00, 01, 10 & 11, respectively for S0, S1, S2 & S3. • Load = A'B'St, Ad = A'BM, Sh = A'BM' + AB' • A is true in S2 & S3. Hence, A+ = A'BM + A'BM'K + AB'K • B is true in S1 & S3. Hence, B+ = A'B'St + A'BM'K' + AB'K' + A'BM'K + AB'K, which simplifies to B+ = A'B'St + A'BM' + AB'
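The simplification of B+ can be confirmed by comparing the five-term and three-term forms on every input combination. This is a minimal Python check with illustrative function names.

```python
from itertools import product


def b_next_full(a, b, st, m, k):
    na, nb, nm, nk = 1 - a, 1 - b, 1 - m, 1 - k
    # B+ = A'B'St + A'BM'K' + AB'K' + A'BM'K + AB'K
    return ((na & nb & st) | (na & b & nm & nk) | (a & nb & nk)
            | (na & b & nm & k) | (a & nb & k))


def b_next_simplified(a, b, st, m, k):
    na, nb, nm = 1 - a, 1 - b, 1 - m
    # B+ = A'B'St + A'BM' + AB'
    return (na & nb & st) | (na & b & nm) | (a & nb)


# K drops out because each K' term has a matching K term.
same = all(
    b_next_full(*bits) == b_next_simplified(*bits)
    for bits in product((0, 1), repeat=5)
)
print(same)  # True
```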
  • 95. State transition table for multiplier control
        A  B  St M  K  | A+ B+ | Load Sh Ad Done
    S0  0  0  0  -  -  | 0  0  | 0    0  0  0
        0  0  1  -  -  | 0  1  | 1    0  0  0
    S1  0  1  -  0  0  | 0  1  | 0    1  0  0
        0  1  -  0  1  | 1  1  | 0    1  0  0
        0  1  -  1  -  | 1  0  | 0    0  1  0
    S2  1  0  -  -  0  | 0  1  | 0    1  0  0
        1  0  -  -  1  | 1  1  | 0    1  0  0
    S3  1  1  -  -  -  | 0  0  | 0    0  0  1
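The table can be cross-checked against the equations for Example-2 by expanding the don't-care entries. The Python sketch below does that; Done = AB is not stated as an equation on the slides and is read off the table here as an assumption, and the helper names are hypothetical.

```python
from itertools import product

# (A B St M K pattern, expected "A+ B+ Load Sh Ad Done"); "-" is a don't care.
ROWS = [
    ("000--", "000000"),
    ("001--", "011000"),
    ("01-00", "010100"),
    ("01-01", "110100"),
    ("01-1-", "100010"),
    ("10--0", "010100"),
    ("10--1", "110100"),
    ("11---", "000001"),
]


def equations(a, b, st, m, k):
    na, nb, nm = 1 - a, 1 - b, 1 - m
    a_next = (na & b & m) | (na & b & nm & k) | (a & nb & k)
    b_next = (na & nb & st) | (na & b & nm) | (a & nb)
    load = na & nb & st
    sh = (na & b & nm) | (a & nb)
    ad = na & b & m
    done = a & b                      # assumed from the table's S3 row
    return "".join(str(v) for v in (a_next, b_next, load, sh, ad, done))


def expand(pattern):
    """Yield every 0/1 assignment that matches a pattern with don't cares."""
    choices = [(0, 1) if c == "-" else (int(c),) for c in pattern]
    return product(*choices)


def mismatches():
    return [
        (pattern, bits)
        for pattern, expected in ROWS
        for bits in expand(pattern)
        if equations(*bits) != expected
    ]


print(mismatches())  # []
```

An empty list means every row of the table, with its don't cares expanded, agrees with the equations.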