Commit f6a63cc7 authored by Marc's avatar Marc
Browse files

added filter test and version v1.1

parent c4ac332e
Prompt for target technology
CONFIG_SYN_INFERRED
Selects the target technology for memory and pads.
The following are available:
- Inferred: Generic FPGA or ASIC targets if your synthesis tool
is capable of inferring RAMs and pads automatically.
- Actel ProAsic/P/3, IGLOO/2, RTG4, PolarFire and Axcelerator FPGAs
- Aeroflex UT25CRH Rad-Hard 0.25 um CMOS
- Altera: Most Altera FPGA families
- Altera-CycloneIII: Altera Cyclone-III/IV FPGA family
- Altera-Stratix: Altera Stratix FPGA family
- Altera-StratixII: Altera Stratix/Cyclone-II FPGA families
- Altera-StratixIII: Altera Stratix-III FPGA family
- Altera-StratixIV: Altera Stratix-IV FPGA family
- Altera-StratixV: Altera Stratix-V FPGA family
- ATC18: Atmel-Nantes 0.18 um rad-hard CMOS
- NanoXploree : Brave-Medium
- Quicklogic : Eclipse/E/II FPGAs
- UMC-0.18 : UMC 0.18 um CMOS with Virtual Silicon libraries
- Xilinx-Spartan/2/3/6: Xilinx Spartan/2/3/6 libraries
- Xilinx-Spartan3E: Xilinx Spartan3E libraries
- Xilinx-Virtex/E: Xilinx Virtex/E libraries
- Xilinx-Virtex2/4/5/6/7: Xilinx Virtex2/4/5/6/7 libraries
Note: Level of technology support depends on type of GRLIB
distribution. A technology may be present in this list while the
tech support files are missing from the GRLIB distribution.
Actel support is only available in commercial and FT distributions.
Additional target technologies are available that are not selectable
via the xconfig tool.
Ram library
CONFIG_MEM_VIRAGE
Select RAM generators for ASIC targets.
Transceiver type
CONFIG_TRANS_GTP0
Select the transceiver type used in your FPGA
Infer ram
CONFIG_SYN_INFER_RAM
Say Y here if you want the synthesis tool to infer your
RAM automatically. Say N to directly instantiate technology-
specific RAM cells for the selected target technology package.
Infer pads
CONFIG_SYN_INFER_PADS
Say Y here if you want the synthesis tool to infer pads.
Say N to directly instantiate technology-specific pads from
the selected target technology package.
No async reset
CONFIG_SYN_NO_ASYNC
Say Y here if you disable asynchronous reset in some of the IP cores.
Might be necessary if the target library does not have cells with
asynchronous set/reset.
Scan support
CONFIG_SYN_SCAN
Say Y here to enable scan support in some cores. This will enable
the scan support generics where available and add logic to make
the design testable using full-scan.
Use Virtex CLKDLL for clock synchronisation
CONFIG_CLK_INFERRED
Certain target technologies include clock generators to scale or
phase-adjust the system and SDRAM clocks. This is currently supported
for Xilinx, Altera and Proasic3 FPGAs. Depending on technology, you
can select to use the Xilinx CKLDLL macro (Virtex, VirtexE, Spartan1/2),
the Xilinx DCM (Virtex-2, Spartan3, Virtex-4), the Xilinx PLLE2 (Virtex-7,
Kintex-7, Artix-4), the Altera ALTDLL (Stratix, Cyclone),the
Proasic3 PLL or the NanoXplore-NX_PLL.
Choose the 'inferred' option to skip a clock generator.
Clock multiplier
CONFIG_CLK_MUL
When using the Xilinx DCM, Xilinx PLLE2 or Altera ALTPLL,
the system clock can be multiplied with a factor of 2 - 32,
and divided by a factor of 1 - 32. This makes it possible to
generate almost any desired processor frequency. When using
the Xilinx CLKDLL generator, the resulting frequency scale f
actor (mul/div) must be one of 1/2, 1 or 2. On Proasic3,
the factor can be 1 - 128.
When using NanoXplore-NXPLL the frequency of the generated clock is
out_freq = (in_freq*mul*2)/ (2^clk_odiv) if in_freq < 100 MHz
out_freq = (in_freq*mul)/ (2^clk_odiv) if in_freq > 100 MHz
mul can be in range 1-15
WARNING: The resulting clock must be within the limits specified
by the target FPGA family.
Clock divider
CONFIG_CLK_DIV
When using the Xilinx DCM, Xilinx PLLE2 or Altera ALTPLL,
the system clock can be multiplied with a factor of 2 - 32,
and divided by a factor of 1 - 32. This makes it possible to
generate almost any desired processor frequency. When using
the Xilinx CLKDLL generator, the resulting frequency scale f
actor (mul/div) must be one of 1/2, 1 or 2. On Proasic3,
the factor can be 1 - 128.
WARNING: The resulting clock must be within the limits specified
by the target FPGA family.
Output clock divider
CONFIG_OCLK_DIV
When using the Proasic3 PLL, the system clock is generated by three
parameters: input clock multiplication, input clock division and
output clock division. Only certain values of these parameters
are allowed, but unfortunately this is not documented by Actel.
To find the correct values, run the Libero Smartgen tool and
insert you desired input and output clock frequencies in the
Static PLL configurator. The mul/div factors can then be read
out from tool.
When using NanoXplore-NXPLL the frequency of the generated clock is
out_freq = (in_freq*mul*2)/ (2^clk_odiv) if in_freq < 100 MHz
out_freq = (in_freq*mul)/ (2^clk_odiv) if in_freq < 100 MHz
Output clock divider, 2nd clock
CONFIG_OCLKB_DIV
See help for 'Ouput division factor'. Set this to 0 to disable the
second clock output.
Output clock divider, 3rd clock
CONFIG_OCLKC_DIV
See help for 'Ouput division factor'. Set this to 0 to disable the
third clock output.
System clock multiplier
CONFIG_CLKDLL_1_2
The Xilinx CLKDLL can scale the input clock with a factor of 0.5, 1.0,
or 2.0. Useful when the target board has an oscillator with a too high
(or low) frequency for your design. The divided clock will be used as the
main clock for the whole processor (except PCI and ethernet clocks).
System clock multiplier
CONFIG_DCM_2_3
The Xilinx DCM and Altera ALTDLL can scale the input clock with a large
range of factors. Useful when the target board has an oscillator with a
too high (or low) frequency for your design. The divided clock will
be used as the main clock for the whole processor (except PCI and
ethernet clocks). NOTE: the resulting frequency must be at least
24 MHz or the DCM and ALTDLL might not work.
Enable CLKDLL for PCI clock
CONFIG_PCI_CLKDLL
Say Y here to re-synchronize the PCI clock using a
Virtex BUFGDLL macro. Will improve PCI clock-to-output
delays on the expense of input-setup requirements.
Use PCI clock system clock
CONFIG_PCI_SYSCLK
Say Y here to the PCI clock to generate the system clock.
The PCI clock can be scaled using the DCM or CLKDLL to
generate a suitable processor clock.
External SDRAM clock feedback
CONFIG_CLK_NOFB
Say Y here to disable the external clock feedback to synchronize the
SDRAM clock. This option is necessary if your board or design does not
have an external clock feedback that is connected to the pllref input
of the clock generator.
Number of processors
CONFIG_PROC_NUM
The number of processor cores. The LEON3MP design can accomodate
up to 4 LEON3 processor cores. Use 1 unless you know what you are
doing ...
Force values from example configuration
CONFIG_LEON3_MIN
If you select any other value than Custom-configuration then the
configuration choices will be forced to settings corresponding to
the selected example configuration. The example configurations are
described in the document LEON/GRLIB Design and Configuration Guide
located at doc/guide.pdf.
Due to limitations in the configuration tool it is not possible to
update all choices without first saving the configuration and
restarting the tool. Perform the steps below to set the processor
configuration to example configuration values:
1. Select example configuration (MIN, GP, HP). This will result
in most processor configuration menus being disabled.
2. Go back to the main menu and select Save and Exit
3. Start the xconfig tool again
4. Go into processor configuration and select Custom-configuration.
All configuration values can now be changed again and have been
initialized to example configuration that was selected earlier.
5. Make any changes and then quit xconfig via Save and Exit
Some options set from the example configuration may need to be changed.
One such setting is the FPU multiplier, if the FPU is enabled, where
timing is improved for several FPGA technologies by selecting the
technology specific multiplier and where the DW multiplier is generally
a good choice for ASIC.
Number of SPARC register windows
CONFIG_IU_NWINDOWS
The SPARC architecture (and LEON) allows 2 - 32 register windows.
However, any number except 8 will require that you modify and
recompile your run-time system or kernel. Unless you know what
you are doing, use 8.
SPARC V8 multiply and divide instruction
CONFIG_IU_V8MULDIV
If you say Y here, the SPARC V8 multiply and divide instructions
will be implemented. The instructions are: UMUL, UMULCC, SMUL,
SMULCC, UDIV, UDIVCC, SDIV, SDIVCC. In code containing frequent
integer multiplications and divisions, significant performance
increase can be achieved. Emulated floating-point operations will
also benefit from this option.
By default, the gcc compiler does not emit multiply or divide
instructions and your code must be compiled with -mv8 to see any
performance increase. On the other hand, code compiled with -mv8
will generate an illegal instruction trap when executed on processors
with this option disabled.
The divider consumes approximately 2 kgates, the multiplier 6 kgates.
Multiplier latency
CONFIG_IU_MUL_LATENCY_2
Implementation options for the integer multiplier.
Type Implementation issue-rate/latency
2-clocks 32x32 pipelined multiplier 1/2
4-clocks 16x16 standard multiplier 4/4
5-clocks 16x16 pipelined multiplier 4/5
MAC operation
CONFIG_IU_MUL_MAC
If you say Y here, the SPARC V8e UMAC/SMAC (multiply-accumulate)
instructions will be enabled. The instructions implement a
single-cycle 16x16->32 bits multiply with a 40-bits accumulator.
The details of these instructions can be found in the LEON manual,
This option is only available when 16x16 multiplier is used.
Multiplier structure
CONFIG_IU_MUL_INFERRED
Structure options for the integer multiplier. The multiplier
can be implemented with the following structures:
* Inferred by the synthesis tool
* Generated using Module Generators from NTNU
* Using technology specific netlists (TechSpec)
* Using Synopsys Designware (DW02_mult and DW_mult_pipe)
Branch prediction
CONFIG_IU_BP
Enabling branch prediction will improve performance with
up to 20%, depending on application. The timing and area
overhead are minor, so it is recommended to always enable
this option.
Single vector trapping
CONFIG_IU_SVT
Single-vector trapping is a SPARC V8e option to reduce code-size
in small applications. If enabled, the processor will jump to
the address of trap 0 (tt = 0x00) for all traps. No trap table
is then needed. The trap type is present in %psr.tt and must
be decoded by the O/S. Saves 4 Kbyte of code, but increases
trap and interrupt overhead. Currently, the only O/S supporting
this option is eCos. To enable SVT, the O/S must also set bit 13
in %asr17.
Load latency
CONFIG_IU_LDELAY
Defines the pipeline load delay (= pipeline cycles before the data
from a load instruction is available for the next instruction).
One cycle gives best performance, but might create a critical path
on targets with slow (data) cache memories. A 2-cycle delay can
improve timing but will reduce performance with about 5%.
Reset address
CONFIG_IU_RSTADDR
By default, a SPARC processor starts execution at address 0.
With this option, any 4-kbyte aligned reset start address can be
choosen. Keep at 0 unless you really know what you are doing.
Power-down
CONFIG_PWD
Say Y here to enable the power-down feature of the processor.
Might reduce the maximum frequency slightly on FPGA targets.
For details on the power-down operation, see the LEON3 manual.
Hardware watchpoints
CONFIG_IU_WATCHPOINTS
The processor can have up to 4 hardware watchpoints, allowing to
create both data and instruction breakpoints at any memory location,
also in PROM. Each watchpoint will use approximately 500 gates.
Use 0 to disable the watchpoint function.
Floating-point enable
CONFIG_FPU_ENABLE
Say Y here to enable the floating-point interface for the GRFPU-lite
or GRFPU. Note that no FPUs are provided with the GPL version
of GRLIB. Both FPUs are commercial cores and must be obtained separately.
FPU selection
CONFIG_FPU_GRFPU
Select between Gaisler Research's GRFPU and GRFPU-lite FPUs. All cores
are fully IEEE-754 compatible and support all SPARC FPU instructions.
GRFPU Multiplier
CONFIG_FPU_GRFPU_INFMUL
On FPGA targets choose inferred multiplier. For ASIC implementations
choose between Synopsys Design Ware (DW) multiplier or Module
Generator (ModGen) multiplier. The DW multiplier gives better results
(smaller area and better timing) but requires a DW license.
The ModGen multiplier is part of GRLIB and does not require a license.
The Technology Specific multiplier option selects a pre-designed
multiplier using technology specific macrocells when available, else
an inferred multiplier is used.
Shared GRFPU
CONFIG_FPU_GRFPU_SH
If enabled multiple CPU cores will share one GRFPU.
GRFPC Configuration
CONFIG_FPU_GRFPC0
Configures the GRFPU-LITE controller.
In simple configuration controller executes FP instructions
in parallel with integer instructions. FP operands are fetched
in the register file stage and the result is written in the write
stage. This option uses least area resources.
Data forwarding configuration gives ~ 10 % higher FP performance than
the simple configuration by adding data forwarding between the pipeline
stages.
Non-blocking controller allows FP load and store instructions to
execute in parallel with FP instructions. The performance increase is
~ 20 % for FP applications. This option uses most logic resources and
is suitable for ASIC implementations.
Floating-point netlist
CONFIG_FPU_NETLIST
Say Y here to use a VHDL netlist of the GRFPU-Lite. This is
only available in certain versions of grlib.
Enable Instruction cache
CONFIG_ICACHE_ENABLE
The instruction cache should always be enabled to allow
maximum performance. Some low-end system might want to
save area and disable the cache, but this will reduce
the performance with a factor of 2 - 3.
Enable Data cache
CONFIG_DCACHE_ENABLE
The data cache should always be enabled to allow
maximum performance. Some low-end system might want to
save area and disable the cache, but this will reduce
the performance with a factor of 2 at least.
Instruction cache associativity
CONFIG_ICACHE_ASSO1
The instruction cache can be implemented as a multi-way cache with
1 - 4 ways. Higher associativity usually increases the cache hit
rate and thereby the performance. The downside is higher power
consumption and increased gate-count for tag comparators.
Note that a 1-way cache is effectively a direct-mapped cache.
Instruction cache way size
CONFIG_ICACHE_SZ1
The size of each way in the instuction cache (kbytes). Valid values
are 1 - 64 in binary steps. Note that the full range is only supported
by the generic and virtex2 targets. Most target packages are limited
to 2 - 16 kbyte. Large way size gives higher performance but might
affect the maximum frequency (on ASIC targets). The total instruction
cache size is the number of way multiplied with the way size.
Instruction cache line size
CONFIG_ICACHE_LZ16
The instruction cache line size. Can be set to either 16 or 32
bytes per line. Instruction caches typically benefit from larger
line sizes, but on small caches it migh be better with 16 bytes/line
to limit eviction miss rate.
Instruction cache replacement algorithm
CONFIG_ICACHE_ALGORND
Cache replacement algorithm for caches with 2 - 4 ways. The 'random'
algorithm selects the way to evict randomly. The least-recently-replaced
(LRR) algorithm evicts the way least recently replaced. The least-
recently-used (LRU) algorithm evicts the way least recently accessed.
The random algorithm uses a simple 1- or 2-bit counter to select
the eviction way and has low area overhead. The LRR scheme uses one
extra bit in the tag ram and has therefore also low area overhead.
However, the LRR scheme can only be used with 2-way caches. The LRU
scheme has typically the best performance but also highest area overhead.
A 2-way LRU uses 1 flip-flop per line, a 3-way LRU uses 3 flip-flops
per line, and a 4-way LRU uses 5 flip-flops per line to store the access
history.
Instruction cache locking
CONFIG_ICACHE_LOCK
Say Y here to enable cache locking in the instruction cache.
Locking can be done on cache-line level, but will increase the
width of the tag ram with one bit. If you don't know what
locking is good for, it is safe to say N.
Data cache associativity
CONFIG_DCACHE_ASSO1
The data cache can be implemented as a multi-way cache with
1 - 4 ways. Higher associativity usually increases the cache hit
rate and thereby the performance. The downside is higher power
consumption and increased gate-count for tag comparators.
Note that a 1-way cache is effectively a direct-mapped cache.
If the MMU is enabled then bus snooping is required to avoid
aliasing effects in multi-way caches.
Data cache way size
CONFIG_DCACHE_SZ1
The size of each way in the data cache (kbytes). Valid values are
1 - 64 in binary steps. Note that the full range is only supported
by the generic and virtex2 targets. Most target packages are limited
to 2 - 16 kbyte. A large cache gives higher performance but the
data cache is timing critical an a too large setting might affect
the maximum frequency (on ASIC targets). The total data cache size
is the number of ways multiplied with the way size.
Note that when the MMU is enabled, bus snooping should also be
enabled and the data cache way size should not exceed the MMU
page size.
Data cache line size
CONFIG_DCACHE_LZ16
The data cache line size. Can be set to either 16 or 32 bytes per
line. A smaller line size gives better associativity and higher
cache hit rate, but requires a larger tag memory.
Data cache replacement algorithm
CONFIG_DCACHE_ALGORND
See the explanation for instruction cache replacement algorithm.
Data cache locking
CONFIG_DCACHE_LOCK
Say Y here to enable cache locking in the data cache.
Locking can be done on cache-line level, but will increase the
width of the tag ram with one bit. If you don't know what
locking is good for, it is safe to say N.
Data cache snooping
CONFIG_DCACHE_SNOOP
Say Y here to enable data cache snooping on the AHB bus.
With snooping, AHB writes by other masters to data that is in the
data cache will be automatically detected and the cache line will
be marked as invalid. This simplifies software design in systems
with DMA units or multiple processors accessing the same memory.
It is a requirement (together with separate physical/snoop tags)
to run Linux on more than one CPU.
Depending on the separate snoop tags option, snooping will be
implemented using either dual-port (common tags) or two-port
RAMs (separate tags). A workaround if dual port RAMs are
not available for the target technology is to use separate
snoop tags even if it is not needed otherwise.
Separate snoop tags
CONFIG_DCACHE_SNOOP_SEPTAG
Enable a separate memory to store the data tags used for snooping.
This is necessary when snooping support is wanted in systems
with MMU, typically for SMP systems. In this case, the snoop
tags will contain the physical tag address while the normal
tags contain the virtual tag address.
When separate snoop tags are enabled, the tag RAMs will be implemented
using two-port RAMs instead of dual-port RAMs. A workaround if
dual port RAMs are not available for the target technology is
therefore to use separate snoop tags even if it is not needed.
Note that it is also possible to implement the tags using single port
RAMs with valid bits in flip-flops.
Single-port RAM for separate tags
CONFIG_DCACHE_SNOOP_SP
Use single-port RAM to implement separate physical tags. This leads
to the cache tag valid bits being implemented in flip-flops.
Fixed cacheability map
CONFIG_CACHE_FIXED
If this variable is 0, the cacheable memory regions are defined
by the AHB plug&play information (default). To overriden the
plug&play settings, this variable can be set to indicate which
areas should be cached. The value is treated as a 16-bit hex value
with each bit defining if a 256 Mbyte segment should be cached or not.
The right-most (LSB) bit defines the cacheability of AHB address
0 - 256 MByte, while the left-most bit (MSB) defines AHB address
3840 - 4096 MByte. If the bit is set, the corresponding area is
cacheable. A value of 00F3 defines address 0 - 0x20000000 and
0x40000000 - 0x80000000 as cacheable.
Local data ram
CONFIG_DCACHE_LRAM
Say Y here to add a local ram to the data cache controller.
Accesses to the ram (load/store) will be performed at 0 waitstates
and store data will never be written back to the AHB bus.
Size of local data ram
CONFIG_DCACHE_LRAM_SZ1
Defines the size of the local data ram in Kbytes. Note that most
technology libraries do not support larger rams than 16 Kbyte.
Start address of local data ram
CONFIG_DCACHE_LRSTART
Defines the 8 MSB bits of start address of the local data ram.
By default set to 8f (start address = 0x8f000000), but any value
(except 0) is possible. Note that the local data ram 'shadows'
a 16 Mbyte block of the address space.
MMU enable
CONFIG_MMU_ENABLE
Say Y here to enable the Memory Management Unit.
Include supervisor bit in tag
CONFIG_MMU_SV
Say Y here to tag information in data and instruction cache
with access permissions. Recommended setting is Y for security.
MMU split icache/dcache table lookaside buffer
CONFIG_MMU_COMBINED
Select "combined" for a combined icache/dcache table lookaside buffer,
"split" for a split icache/dcache table lookaside buffer
MMU tlb replacement scheme
CONFIG_MMU_REPARRAY
Select "LRU" to use the "least recently used" algorithm for TLB
replacement, or "Increment" for a simple incremental replacement
scheme.
Combined i/dcache tlb
CONFIG_MMU_I2
Select the number of entries for the instruction TLB, or the
combined icache/dcache TLB if such is used.
Split tlb, dcache
CONFIG_MMU_D2
Select the number of entries for the dcache TLB.
Fast writebuffer
CONFIG_MMU_FASTWB
Only selectable if split tlb is enabled. In case fast writebuffer is
enabled the tlb hit will be made concurrent to the cache hit. This
leads to higher store performance, but increased power and area.
MMU pagesize
CONFIG_MMU_PAGE_4K
The deafult SPARC V8 SRMMU page size is 4 Kbyte. This limits the
cache way size to 4 Kbyte, and total data cache size to 16 Kbyte,
when the MMU is used. To increase the maximum data cache size,
the MMU pages size can be increased to up 32 Kbyte. This will
give a maximum data cache size of 128 Kbyte.
Note that an MMU page size different than 4 Kbyte will require
a special linux tool-chain if glibc is used. If you don't know
what you are doing, stay with 4 Kbyte.
DSU enable
CONFIG_DSU_ENABLE
The debug support unit (DSU) allows non-intrusive debugging and tracing
of both executed instructions and AHB transfers. If you want to enable
the DSU, say Y here and select the configuration below.
Trace buffer enable
CONFIG_DSU_TRACEBUF
Say Y to enable the trace buffer. The buffer is not necessary for
debugging, only for tracing instructions and data transfers.
Enable instruction tracing
CONFIG_DSU_ITRACE
If you say Y here, an instruction trace buffer will be implemented
in each processor. The trace buffer will trace executed instructions
and their results, and place them in a circular buffer. The buffer
can be read out by any AHB master, and in particular by the debug
communication link.
Size of trace buffer
CONFIG_DSU_ITRACESZ1
Select the buffer size (in kbytes) for the instruction trace buffer.
Each line in the buffer needs 16 bytes. A 128-entry buffer will thus
need 2 kbyte.
Enable two-port instruction trace buffer
CONFIG_DSU_ITRACE_2P
If you say Y here, the instruction trace buffer will be implemented
in each processor with a two-port RAM. It will then be possible
to read the buffer through its second port, while the processor
instructions are being traced.
Enable AHB tracing
CONFIG_DSU_ATRACE
If you say Y here, an AHB trace buffer will be implemented in the
debug support unit processor. The AHB buffer will trace all transfers
on the AHB bus and save them in a circular buffer. The trace buffer
can be read out by any AHB master, and in particular by the debug
communication link.
Size of trace buffer
CONFIG_DSU_ATRACESZ1
Select the buffer size (in kbytes) for the AHB trace buffer.
Each line in the buffer needs 16 bytes. A 128-entry buffer will thus
need 2 kbyte.
AHB trace buffer filters
CONFIG_DSU_AFILT
If you say Y here, the debug support unit will allow filtering of
AHB trace buffer inputs. Filters allow to select which masters,
slaves and types of accesses that should be saved to the buffer.
AHB statistics outputs
CONFIG_DSU_ASTAT
If you say Y here, the debug support unit will enable its statistics
outputs. These outputs can be connected to a L4STAT unit in order to
count events on the AHB bus. This option will also enable AHB trace
buffer filters.
LEON3FT enable
CONFIG_LEON3FT_EN
Say Y here to use the fault-tolerant LEON3FT core instead of the
standard non-FT LEON3.
IU Register file protection
CONFIG_IUFT_NONE