Use of Reset
One of the first considerations in coding RTL is
the use of reset in the design. It used to be that one was best off
using the global dedicated reset line in the FPGA to reset all
registers in the device since it conserves routing resources. This is
still an option in many FPGA families, but with availability of
abundant routing resources in today's FPGAs, it isn't as necessary.
Also, it isn't necessary to reset all of the registers in a design;
for example, pipeline registers don't generally need to be reset.
Still, even if one chooses not to use the global reset line, the use
of reset for most registers is required to get the simulation up and
running. Multiple engineers working on the same device have to agree
on the approach since it affects coding style throughout the design.
One easy way to initialize a signal in an HDL, though it is not a
recommended method, is to put an initial value on the declaration.
With few exceptions, a default assignment in the signal
declaration should not appear in RTL, since it may create a
situation where the RTL simulation does not match the back-annotated (timing) simulation. An
example of this is:
- SIGNAL dont_do_this : STD_LOGIC := '1';
An exception to this rule is when there is some small circuit which
needs to be operational during reset and the FPGA guarantees the power-up
state of the registers. An example of this would be a counter which is used to stretch
reset and must run while reset is active.
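The exception above can be sketched as a small reset stretcher. This is only an illustration under stated assumptions: the entity name reset_stretch, the ports RESET_IN and RESET_OUT, the 4-bit counter width, and the active-high polarity are all hypothetical, RESET_IN is assumed to be already synchronous to CLK, and the target FPGA is assumed to honor the declared power-up value.

```vhdl
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;
USE IEEE.NUMERIC_STD.ALL;

ENTITY reset_stretch IS
  PORT (CLK       : IN  STD_LOGIC;
        RESET_IN  : IN  STD_LOGIC;   -- active high, assumed synchronous to CLK
        RESET_OUT : OUT STD_LOGIC);  -- stretched active-high reset
END reset_stretch;

ARCHITECTURE behavior OF reset_stretch IS
  -- The initial value is deliberate: this counter cannot be reset by the
  -- signal it generates, so it relies on the guaranteed power-up state.
  SIGNAL stretch_count : UNSIGNED(3 DOWNTO 0) := (OTHERS => '0');
BEGIN
  stretch: PROCESS (CLK)
  BEGIN
    IF CLK'event AND CLK = '1' THEN
      IF RESET_IN = '1' THEN
        stretch_count <= (OTHERS => '0');    -- restart the stretch interval
      ELSIF stretch_count /= 15 THEN
        stretch_count <= stretch_count + 1;  -- counts while RESET_OUT is active
      END IF;
    END IF;
  END PROCESS stretch;

  -- Reset stays active until the counter saturates at 15.
  RESET_OUT <= '1' WHEN (RESET_IN = '1' OR stretch_count /= 15) ELSE '0';
END behavior;
```

The rest of the design would then use RESET_OUT as its reset and would need no initial values of its own.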
The following two sections continue the
discussion of reset implementation issues.
Reset Polarity
One problem that arises with FPGA design is that some FPGAs have
an active high reset while others have an active low
reset. It would be nice to code in a portable fashion, that is where
the code is independent of the FPGA vendor. A solution is to use a
generic which specifies the active level of reset and which is
passed down to all registers throughout the hierarchy. In the
following example, a generic called "RESET_LEVEL" accomplishes this:
- LIBRARY IEEE;
- USE IEEE.STD_LOGIC_1164.ALL;
- ENTITY blatz IS
- -- active level of reset for registers is defined here
- GENERIC (RESET_LEVEL : STD_LOGIC := '1');
- PORT (CLK : IN STD_LOGIC;
- RESET : IN STD_LOGIC;
- CE : IN STD_LOGIC;
- DIN : IN STD_LOGIC;
- DOUT : OUT STD_LOGIC);
- END blatz;
- ARCHITECTURE behavior OF blatz IS
- BEGIN
- a_register: PROCESS (CLK, RESET)
- BEGIN
- IF RESET = RESET_LEVEL THEN
- DOUT <= '0';
- ELSIF CLK'event AND CLK = '1' THEN
- IF CE = '1' THEN -- clock enable
- DOUT <= DIN;
- END IF;
- END IF;
- END PROCESS a_register;
- END behavior;
RESET_LEVEL must be passed down through the hierarchy through the use of
generic mapping. The following is an example of a generic map in a component
instantiation:
- blatz_c: blatz
- GENERIC MAP (RESET_LEVEL => RESET_LEVEL)
- PORT MAP (CLK => my_clk,
- RESET => my_reset,
- CE => my_ce,
- DIN => my_din,
- DOUT => my_dout);
By implementing reset as a generic which is passed down in this fashion, the value of the RESET_LEVEL
generic can be changed at the top level which will invert the active level of reset for all
registers in the design.
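As a rough sketch of what the top level might look like, the following wrapper declares the generic once and hands it to the blatz component. The entity name top and the choice of '0' as the default are assumptions made for illustration, and the component declaration assumes the ports of blatz are of type STD_LOGIC.

```vhdl
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;

ENTITY top IS
  -- Change this one value to invert the active reset level everywhere below.
  GENERIC (RESET_LEVEL : STD_LOGIC := '0');
  PORT (CLK   : IN  STD_LOGIC;
        RESET : IN  STD_LOGIC;
        CE    : IN  STD_LOGIC;
        DIN   : IN  STD_LOGIC;
        DOUT  : OUT STD_LOGIC);
END top;

ARCHITECTURE structure OF top IS
  COMPONENT blatz
    GENERIC (RESET_LEVEL : STD_LOGIC := '1');
    PORT (CLK   : IN  STD_LOGIC;
          RESET : IN  STD_LOGIC;
          CE    : IN  STD_LOGIC;
          DIN   : IN  STD_LOGIC;
          DOUT  : OUT STD_LOGIC);
  END COMPONENT;
BEGIN
  -- The formal (left) is the generic of blatz; the actual (right) is the
  -- generic of top, so the top-level value overrides the component default.
  blatz_c: blatz
    GENERIC MAP (RESET_LEVEL => RESET_LEVEL)
    PORT MAP (CLK   => CLK,
              RESET => RESET,
              CE    => CE,
              DIN   => DIN,
              DOUT  => DOUT);
END structure;
```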
Use of Asynchronous Resets
Other than the use of asynchronous reset for globally resetting the device,
the use of asynchronous resets is best avoided when possible. Asynchronous resets
create timing paths which are hard for static timing
analyzers to deal with, and can make code less portable.
It's worth noting that some FPGA vendors have started providing a fuse selection for
synchronous vs. asynchronous clear and set inputs on registers. This has the advantage
that it can provide
other synchronous paths for the logic into the register - besides the D input.
For example, consider the case where
two terms A and B will be used to set a register. This would be equivalent to an OR gate
on the D input to the register with A and B as inputs to the OR gate.
But if the asynchronous SET input were actually a synchronous
set, then the OR gate could be eliminated with A driving the D input and B driving the
synchronous set input.
The FPGA vendors have suggested using this, and it makes
sense; however, there is a caveat: the global
chip reset must then be routed into the synchronous logic. Since global reset typically has
one of the largest fanouts in the design, this reset signal quickly becomes the critical timing
path. The way to get around this is to break the timing path for global reset by registering it
as it enters the chip. One must also ensure that this doesn't create any problems: that reset
is active for multiple clock cycles, and that there isn't any logic that transitions immediately out
of reset.
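The A/B example and the registered reset can be sketched together as follows. The entity name sync_set_reg and the signals reset_d1 and reset_d2 are hypothetical, and whether B is actually mapped onto a dedicated synchronous set input depends on the device and the synthesizer, so treat this as an illustration rather than a guaranteed implementation.

```vhdl
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;

ENTITY sync_set_reg IS
  GENERIC (RESET_LEVEL : STD_LOGIC := '1');
  PORT (CLK   : IN  STD_LOGIC;
        RESET : IN  STD_LOGIC;   -- global chip reset as it enters the device
        A     : IN  STD_LOGIC;   -- intended for the D input
        B     : IN  STD_LOGIC;   -- intended for the synchronous set input
        DOUT  : OUT STD_LOGIC);
END sync_set_reg;

ARCHITECTURE behavior OF sync_set_reg IS
  SIGNAL reset_d1 : STD_LOGIC;
  SIGNAL reset_d2 : STD_LOGIC;
BEGIN
  -- Register the reset at the chip boundary to break its timing path.
  -- Reset must therefore be held for several clock cycles.
  reset_pipe: PROCESS (CLK)
  BEGIN
    IF CLK'event AND CLK = '1' THEN
      reset_d1 <= RESET;
      reset_d2 <= reset_d1;
    END IF;
  END PROCESS reset_pipe;

  -- Fully synchronous register: reset has priority, then set, then data.
  a_register: PROCESS (CLK)
  BEGIN
    IF CLK'event AND CLK = '1' THEN
      IF reset_d2 = RESET_LEVEL THEN
        DOUT <= '0';
      ELSIF B = '1' THEN
        DOUT <= '1';             -- synchronous set replaces the OR gate
      ELSE
        DOUT <= A;
      END IF;
    END IF;
  END PROCESS a_register;
END behavior;
```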
Resource Sharing
Resource sharing is the ability of the synthesizer to automatically
reuse a common function, one that is
common to two processes, for example. Consider the following
code fragment in a process:
- IF count = 237 AND request = '1' THEN
- start_count <= '1';
- END IF;
- IF count = 237 AND latch_data = '1' THEN
- data_out <= data_in;
- END IF;
Will the synthesizer create one or two comparators
for "count = 237"? The answer depends upon the quality of the
synthesizer.
The general rule for resource sharing is to perform it
manually, when possible, eliminating the possibility that the synthesizer
will get it wrong. Our example can be rewritten to use a concurrent signal assignment
for the comparator:
- count_eq237 <= '1' WHEN count = 237 ELSE '0';
and the process rewritten as follows:
- IF count_eq237 = '1' AND request = '1' THEN
- start_count <= '1';
- END IF;
- IF count_eq237 = '1' AND latch_data = '1' THEN
- data_out <= data_in;
- END IF;
It is recommended that this practice be used for all large
arithmetic and relational operations
(adders, subtractors, comparators, etc.).
Clocks
Although it is practical, and often necessary, to have multiple
clocks in a design, one should not
generate local clocks within a clock domain in an FPGA design. This practice is often
used in gate array design to write to registers in the device (such as the control register,
status register, etc.).
In FPGAs, which have flip-flops with clock enables built in,
the clock enable should be used instead. The
clock enable is always the outermost IF under the clock edge expression:
- a_register: PROCESS (CLK, RESET)
- BEGIN
- IF RESET = RESET_LEVEL THEN
- DOUT <= '0';
- ELSIF CLK'event AND CLK = '1' THEN
- IF CE = '1' THEN -- clock enable
- DOUT <= DIN;
- END IF;
- END IF;
- END PROCESS a_register;
The use of locally generated clocks should be avoided.
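As an illustration of replacing a divided local clock with a clock enable, the sketch below generates a one-cycle enable pulse every DIVIDE clocks and uses it to update a register on the main clock. The entity name slow_register, the generic DIVIDE, and the signals div_count and slow_ce are hypothetical names chosen for this example.

```vhdl
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;

ENTITY slow_register IS
  GENERIC (RESET_LEVEL : STD_LOGIC := '1';
           DIVIDE      : POSITIVE  := 4);  -- update DOUT every DIVIDE clocks
  PORT (CLK   : IN  STD_LOGIC;
        RESET : IN  STD_LOGIC;
        DIN   : IN  STD_LOGIC;
        DOUT  : OUT STD_LOGIC);
END slow_register;

ARCHITECTURE behavior OF slow_register IS
  SIGNAL div_count : NATURAL RANGE 0 TO DIVIDE - 1;
  SIGNAL slow_ce   : STD_LOGIC;
BEGIN
  -- Generate a single-cycle enable pulse instead of a divided clock.
  ce_gen: PROCESS (CLK, RESET)
  BEGIN
    IF RESET = RESET_LEVEL THEN
      div_count <= 0;
      slow_ce   <= '0';
    ELSIF CLK'event AND CLK = '1' THEN
      IF div_count = DIVIDE - 1 THEN
        div_count <= 0;
        slow_ce   <= '1';
      ELSE
        div_count <= div_count + 1;
        slow_ce   <= '0';
      END IF;
    END IF;
  END PROCESS ce_gen;

  -- The register stays on CLK; slow_ce only gates when it updates.
  a_register: PROCESS (CLK, RESET)
  BEGIN
    IF RESET = RESET_LEVEL THEN
      DOUT <= '0';
    ELSIF CLK'event AND CLK = '1' THEN
      IF slow_ce = '1' THEN  -- clock enable
        DOUT <= DIN;
      END IF;
    END IF;
  END PROCESS a_register;
END behavior;
```

Every flip-flop remains in the single CLK domain, so the timing analyzer sees one clock.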
Organizing Register Blocks
Almost all FPGAs and ASICs have one or more register blocks which are used to
control and monitor the status of the device. The register block can be located centrally, or can be distributed
among the sub-modules in the top level. For example, there might be a block in the receiver, a block in the
transmitter, a block in the DMA controller, or there could be one central block that feeds all of these individual
modules. The advantage of having one register block is that the decode circuitry does not need to be replicated.
With one central block, however, there are many individual registers that must be brought out from
the register block to all of the individual blocks, and this can be a lot of typing to do initially, as well as
cumbersome for changes.
A better way to organize these is as a record type.
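A minimal sketch of the record approach might look like this. The package name regblock_pkg, the type REGBLOCK_TYPE, its field names, the 8-bit widths, and the receiver port RX_ENABLE are all assumptions made for illustration.

```vhdl
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;

PACKAGE regblock_pkg IS
  -- One field per register; adding a register changes only this record.
  TYPE REGBLOCK_TYPE IS RECORD
    rx_control  : STD_LOGIC_VECTOR(7 DOWNTO 0);
    tx_control  : STD_LOGIC_VECTOR(7 DOWNTO 0);
    dma_control : STD_LOGIC_VECTOR(7 DOWNTO 0);
  END RECORD;
END regblock_pkg;

LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;
USE WORK.regblock_pkg.ALL;

ENTITY receiver IS
  PORT (CLK       : IN  STD_LOGIC;
        REGS      : IN  REGBLOCK_TYPE;  -- all registers arrive as one port
        RX_ENABLE : OUT STD_LOGIC);
END receiver;

ARCHITECTURE behavior OF receiver IS
BEGIN
  use_reg: PROCESS (CLK)
  BEGIN
    IF CLK'event AND CLK = '1' THEN
      -- Each module picks out only the fields it needs.
      RX_ENABLE <= REGS.rx_control(0);
    END IF;
  END PROCESS use_reg;
END behavior;
```

With this arrangement, the port lists of the sub-modules do not change when a register is added to the central block.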
Decoding Register Writes
Generally, one must decode lower order address lines and
a qualified strobe to write to internal
registers within the device (control register, status register, etc.). This could be accomplished in
the following fashion, but is not recommended for higher performance designs:
- IF CLK'event AND CLK = '1' THEN
- IF address = CONTROL_REG_ADDR AND REGWRITE = '1' THEN
- CONTROL_REG <= DATA_IN;
- END IF;
- IF address = STATUS_REG_ADDR AND REGWRITE = '1' THEN
- STATUS_REG <= DATA_IN;
- END IF;
- END IF;
In the above example, if there are many registers, a large fanout is created
on the signal REGWRITE and the address
bus which can be difficult to deal with. A better way to accomplish this is with the following:
- ELSIF CLK'event AND CLK = '1' THEN
- -- Default assignments:
- write_controlreg <= '0';
- write_statusreg <= '0';
- IF REGWRITE = '1' THEN
- CASE address IS
- WHEN CONTROL_REG_ADDR =>
- write_controlreg <= '1';
- WHEN STATUS_REG_ADDR =>
- write_statusreg <= '1';
- WHEN OTHERS => NULL;
- END CASE;
- END IF;
- IF write_controlreg = '1' THEN
- control_reg <= DATA_IN;
- END IF;
- IF write_statusreg = '1' THEN
- status_reg <= DATA_IN;
- END IF;
- END IF;
Notice that the IF statements which test the address have been changed to the
more efficient CASE statement, since a priority encoder is not desired on the address.
The signals "write_controlreg" and "write_statusreg"
become clock enables for the data registers "control_reg" and "status_reg", and
can be physically located adjacent to the flip-flops which comprise the register. This allows for a
higher operating frequency and less routing congestion.
Note that an additional pipeline stage has been added to the write path and this must be taken into account;
it may be necessary to pass the input data "DATA_IN" through a pipeline register to compensate.
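The compensation mentioned above could look roughly like the following, shown for the control register only. The entity name reg_decode, the port widths, the address value "0000", the output CONTROL_OUT, and the signal data_in_d are assumptions for this sketch.

```vhdl
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;

ENTITY reg_decode IS
  GENERIC (RESET_LEVEL : STD_LOGIC := '1');
  PORT (CLK         : IN  STD_LOGIC;
        RESET       : IN  STD_LOGIC;
        REGWRITE    : IN  STD_LOGIC;
        ADDRESS     : IN  STD_LOGIC_VECTOR(3 DOWNTO 0);
        DATA_IN     : IN  STD_LOGIC_VECTOR(7 DOWNTO 0);
        CONTROL_OUT : OUT STD_LOGIC_VECTOR(7 DOWNTO 0));
END reg_decode;

ARCHITECTURE behavior OF reg_decode IS
  CONSTANT CONTROL_REG_ADDR : STD_LOGIC_VECTOR(3 DOWNTO 0) := "0000";
  SIGNAL write_controlreg : STD_LOGIC;
  SIGNAL data_in_d        : STD_LOGIC_VECTOR(7 DOWNTO 0);
  SIGNAL control_reg      : STD_LOGIC_VECTOR(7 DOWNTO 0);
BEGIN
  -- Pipeline register for the data; it needs no reset.
  data_pipe: PROCESS (CLK)
  BEGIN
    IF CLK'event AND CLK = '1' THEN
      data_in_d <= DATA_IN;
    END IF;
  END PROCESS data_pipe;

  decode: PROCESS (CLK, RESET)
  BEGIN
    IF RESET = RESET_LEVEL THEN
      write_controlreg <= '0';
      control_reg      <= (OTHERS => '0');
    ELSIF CLK'event AND CLK = '1' THEN
      write_controlreg <= '0';          -- default assignment
      IF REGWRITE = '1' THEN
        CASE ADDRESS IS
          WHEN CONTROL_REG_ADDR => write_controlreg <= '1';
          WHEN OTHERS           => NULL;
        END CASE;
      END IF;
      -- The enable and the delayed data now line up in the same cycle.
      IF write_controlreg = '1' THEN
        control_reg <= data_in_d;
      END IF;
    END IF;
  END PROCESS decode;

  CONTROL_OUT <= control_reg;
END behavior;
```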
State Machines
State machine design in VHDL is more a matter of style preference than rules.
What is presented here is one preferred style and some recommendations based upon personal preference.
Although many synthesizer vendors recommend the "two process" state machine,
this approach
generates non-registered outputs which can glitch. Other than that caution (which can be addressed
by registering the outputs in a clocked process) and the fact that it is a bit more difficult to
read, it is an acceptable method of coding.
Another method is to create a single clocked process containing
all state assignments and
output assignments. Since all signal assignments in a clocked process become flip-flops, the
outputs from the state machine are registered and will not glitch. An example of such a state
machine is shown here:
- ARCHITECTURE behavior OF state_machine IS
- TYPE STATETYPE IS (IDLE, GETBUS, HAVEBUS);
- SIGNAL state: STATETYPE;
- BEGIN
- sm_process: PROCESS (CLK, RESET)
- BEGIN
- IF (RESET = RESET_LEVEL) THEN
- BUSREQ <= '0';
- BUSFREE <= '0';
- state <= IDLE;
- ELSIF CLK'EVENT AND CLK = '1' THEN
- -- define inactive state for all outputs
- BUSREQ <= '0';
- BUSFREE <= '0';
- CASE (state) IS
- WHEN IDLE =>
- IF MEMREQ = '1' THEN
- state <= GETBUS;
- END IF;
- WHEN GETBUS =>
- BUSREQ <= '1';
- IF BUSGNT = '1' THEN
- state <= HAVEBUS;
- END IF;
- WHEN HAVEBUS =>
- IF MEMREQ = '0' THEN
- BUSFREE <= '1';
- state <= IDLE;
- END IF;
- END CASE;
- END IF;
- END PROCESS sm_process;
- END behavior;
In this state machine, the state bits are created as an enumerated type
by the following declarations:
- TYPE STATETYPE IS (IDLE, GETBUS, HAVEBUS);
- SIGNAL state: STATETYPE;
The advantage of using an enumerated type is that the values for STATETYPE
(that is, IDLE, GETBUS, and HAVEBUS)
are displayed in the simulator waveform window. The synthesizers provide a mechanism to
control the assignment of enumerated type to actual values ("one-hot", binary, random, etc.).
One can also use constant declarations to define the state bit assignments manually,
if desired.
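For comparison, the same state machine with manually assigned state bits might be sketched as below. The entity declaration, the architecture name manual_encoding, and the particular binary encoding are assumptions; the state names then no longer appear symbolically in the waveform window.

```vhdl
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;

ENTITY state_machine IS
  GENERIC (RESET_LEVEL : STD_LOGIC := '1');
  PORT (CLK     : IN  STD_LOGIC;
        RESET   : IN  STD_LOGIC;
        MEMREQ  : IN  STD_LOGIC;
        BUSGNT  : IN  STD_LOGIC;
        BUSREQ  : OUT STD_LOGIC;
        BUSFREE : OUT STD_LOGIC);
END state_machine;

ARCHITECTURE manual_encoding OF state_machine IS
  -- State bits assigned by hand with constants (binary encoding assumed).
  SUBTYPE STATETYPE IS STD_LOGIC_VECTOR(1 DOWNTO 0);
  CONSTANT IDLE    : STATETYPE := "00";
  CONSTANT GETBUS  : STATETYPE := "01";
  CONSTANT HAVEBUS : STATETYPE := "10";
  SIGNAL state : STATETYPE;
BEGIN
  sm_process: PROCESS (CLK, RESET)
  BEGIN
    IF RESET = RESET_LEVEL THEN
      BUSREQ  <= '0';
      BUSFREE <= '0';
      state   <= IDLE;
    ELSIF CLK'event AND CLK = '1' THEN
      -- define inactive state for all outputs
      BUSREQ  <= '0';
      BUSFREE <= '0';
      CASE state IS
        WHEN IDLE =>
          IF MEMREQ = '1' THEN
            state <= GETBUS;
          END IF;
        WHEN GETBUS =>
          BUSREQ <= '1';
          IF BUSGNT = '1' THEN
            state <= HAVEBUS;
          END IF;
        WHEN HAVEBUS =>
          IF MEMREQ = '0' THEN
            BUSFREE <= '1';
            state   <= IDLE;
          END IF;
        WHEN OTHERS =>
          state <= IDLE;  -- unused encoding "11" returns to IDLE
      END CASE;
    END IF;
  END PROCESS sm_process;
END manual_encoding;
```

Because the state is now a vector, the CASE statement needs a WHEN OTHERS branch to cover the unused encodings.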
As it turns out, unless all of the outputs from the state
machine (in our example, BUSREQ
and BUSFREE) are assigned to a value in every state of the state machine,
the state machine must generate
"extra" logic to maintain the outputs for those states in which the outputs aren't assigned
- the state machine will not be fully optimized. But assigning all outputs in every state
can make the state machine quite unreadable -
especially in a large state machine.
A solution for this, shown in the example above, is to have
default assignments at the start of the
clocked portion of the process and to override these in specific states. In our example above,
BUSREQ and BUSFREE are normally low and are asserted high when active. They are
assigned to '0' at the beginning of the clocked process, and assigned to '1' in the states where
they are to be active. This allows all of the outputs to be assigned a value in all states, and for
the state machine to be fully optimized, while still being quite readable.
It's worth noting the assignments for the two outputs have different dependencies and will exhibit
different timing. The output "BUSREQ" is only dependent upon being in the state GETBUS. If the
state bits are one-hot encoded, then BUSREQ will only depend upon one flip-flop. The "BUSFREE" output
is dependent upon being in the state HAVEBUS, and is also dependent upon the input MEMREQ.
Project Packages
Although it always seems like extra overhead, software programmers
learned many years ago that it pays to use symbolic definitions for
commonly used constants. For example, addresses for registers, bit
assignments and other constants should
be defined in one project package and then included in all modules. A
separate package can be used for simulation constants. It's useful to
have the simulation package reference the project package when
necessary - all
constants should be defined in only one place. Therefore when a
constant needs to be changed, it is changed in one place and
simulation and synthesis will simply work.
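A project package and a simulation package along these lines might be sketched as follows. The package names project_pkg and sim_pkg, the address values, and the constants CTRL_ENABLE_BIT, CLK_PERIOD, and FIRST_TEST_ADDR are hypothetical.

```vhdl
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;

PACKAGE project_pkg IS
  -- Register addresses, defined once for the whole project
  CONSTANT CONTROL_REG_ADDR : STD_LOGIC_VECTOR(3 DOWNTO 0) := "0000";
  CONSTANT STATUS_REG_ADDR  : STD_LOGIC_VECTOR(3 DOWNTO 0) := "0001";
  -- Bit assignments within the control register
  CONSTANT CTRL_ENABLE_BIT  : NATURAL := 0;
END project_pkg;

LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;
USE WORK.project_pkg.ALL;

PACKAGE sim_pkg IS
  -- Simulation-only constants
  CONSTANT CLK_PERIOD      : TIME := 10 ns;
  -- Refer to the project package rather than repeating the value
  CONSTANT FIRST_TEST_ADDR : STD_LOGIC_VECTOR(3 DOWNTO 0) := CONTROL_REG_ADDR;
END sim_pkg;
```

Each module then brings the constants in with USE WORK.project_pkg.ALL, and a testbench additionally with USE WORK.sim_pkg.ALL.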
Style
It is important to adopt a style that is meaningful to the designer, as well as
to others that might have to read their code. There is no right or wrong in this category;
it is simply a matter of preference. Since VHDL is case-insensitive, one has the option
of using case to increase readability. My personal preference is to use upper case for VHDL keywords,
for constants, and for signals declared in ports. Internal signals are always lower case in
this scheme. By using upper case for ports and lower case for signals, it's always easy to
determine if a signal is an input or output to the module or is an internal signal.
Another style preference is the location for component declarations. When an entity/architecture
pair contains many declared components, they tend to make the module less readable. An option
to alleviate this is to put these declarations into a package, either the project package or
some other package, and then reference the package with a USE clause ahead of the entity.
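As a sketch, a component package for the blatz register shown earlier might look like this. The package name component_pkg is an assumption, and the declaration assumes the ports of blatz are of type STD_LOGIC.

```vhdl
LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;

PACKAGE component_pkg IS
  -- Component declarations are collected here instead of in each architecture.
  COMPONENT blatz
    GENERIC (RESET_LEVEL : STD_LOGIC := '1');
    PORT (CLK   : IN  STD_LOGIC;
          RESET : IN  STD_LOGIC;
          CE    : IN  STD_LOGIC;
          DIN   : IN  STD_LOGIC;
          DOUT  : OUT STD_LOGIC);
  END COMPONENT;
END component_pkg;
```

A module that instantiates blatz then needs only USE WORK.component_pkg.ALL ahead of its entity, and its architecture declarative region stays short.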