1. About the DIB Intel Stratix 10 FPGA IP User Guide
This user guide provides the features, architecture description, instantiation steps, and design guidelines for the DIB Intel® Stratix® 10 FPGA IP, specifically for the Intel® Stratix® 10 GX 10M (1SG10M) device.
This document is intended for:
Design architects making IP selections during the system-level design planning phase
Hardware designers integrating the IP into their systems
Validation engineers during the system-level simulation and hardware validation phases
The following table lists other reference documents related to the DIB Intel® Stratix® 10 FPGA IP.
Time-division multiplexing (TDM): a method of transmitting multiple data signals over one channel in a series of time slots.
2. About the DIB Intel Stratix 10 FPGA IP
The Direct Interface Bus (DIB) Intel® Stratix® 10 FPGA IP enables direct communication between the two dies in an Intel® Stratix® 10 GX 10M variant.
The Intel® Stratix® 10 GX 10M variant has two dies, and each die is configured separately. The connection that the DIB Intel® Stratix® 10 FPGA IP provides between the two dies is statically set at configuration time.
Each DIB instance must have at least one pin location assigned to allow the remaining pin locations to be assigned automatically.
The two dies in an Intel® Stratix® 10 GX 10M variant are identical; die 1 is rotated 180 degrees relative to die 2.
2.1. Release Information
Intel® FPGA IP versions match the
Quartus® Prime Design Suite software versions until v19.1. Starting in
Quartus® Prime Design Suite software version 19.2,
Intel® FPGA IP has a new versioning scheme.
The Intel® FPGA IP version number (X.Y.Z) can change with each Quartus® Prime software version. A change in:
X indicates a major revision of the IP. If you update the
Quartus® Prime software, you must
regenerate the IP.
Y indicates the IP includes new features. Regenerate your IP
to include these new features.
Z indicates the IP includes minor changes. Regenerate your IP
to include these changes.
Table 4. Direct Interface Bus (DIB) Intel® Stratix® 10 FPGA IP Release Information
2.2. Device Support for DIB Intel Stratix 10 FPGA IP
The IP supports only the Intel® Stratix® 10 GX 10M (1SG10M) variant.
Table 5. Intel Device Support
Stratix® 10 GX (1SG10M)
The following terms define IP core support levels for
Intel® FPGA IP cores:
Advance support—the IP
core is available for simulation and compilation for this device family. Timing
models include initial engineering estimates of delays based on early
post-layout information. The timing models are subject to change as silicon
testing improves the correlation between the actual silicon and the timing
models. You can use this IP core for system architecture and resource
utilization studies, simulation, pinout, system latency assessments, basic
timing assessments (pipeline budgeting), and I/O transfer strategy (data-path
width, burst depth, I/O standards tradeoffs).
Preliminary support—the IP core is verified with preliminary timing models for this device family. The IP core meets all functional requirements, but might still be undergoing timing analysis for the device family. It can be used in production designs with caution.
Final support—the IP
core is verified with final timing models for this device family. The IP core
meets all functional and timing requirements for the device family and can be
used in production designs.
2.3. DIB Intel Stratix 10 FPGA IP Features
The DIB Intel® Stratix® 10 FPGA IP offers three modes of operation: Bypass, Asynchronous, and Synchronous.
The DIB Intel® Stratix® 10 FPGA IP includes the following features:
Time-division multiplexing (TDM) ratios:
Asynchronous mode: 1:1, 2:1, or 4:1
Synchronous mode: 1:1, 2:1, or 4:1
Maximum transfer clock rate of 400 MHz (Asynchronous and Synchronous modes)
Bypass transfer latency of 2.5 ns (through the direct interface bus only)
Three subsystems; each subsystem consists of 24 standard channels and 1 AUX channel
Four banks per channel
Maximum of 22 I/Os per bank in Bypass mode and 20 I/Os in Asynchronous and Synchronous modes
Read and write I/Os set per bank
Table 6. Total I/Os Available
Asynchronous or Synchronous
For TDM modes, the latency for each TDM case
includes the core-to-periphery and periphery-to-core timing closures for both TX and RX
instances. The latency number does not include potential pipe stages needed to achieve a
required maximum frequency.
For Bypass mode, the latency number includes only the latency across the DIB interface (hard wire).
The TDM latency depends on the pointer value of the hardened TDM, in DIB clock cycles; 1 cycle = 2.5 ns, 2 cycles = 5 ns, and so on.
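The cycle-to-nanosecond arithmetic above follows directly from the 400 MHz maximum DIB clock (a 2.5 ns period). As a quick sketch (the function name is illustrative, not part of the IP):

```python
# Sketch of the DIB TDM latency arithmetic described above.
# Assumes a 400 MHz DIB clock, so one DIB clock cycle is 2.5 ns;
# the hardened-TDM pointer value gives the latency in DIB clock cycles.

DIB_CLOCK_PERIOD_NS = 2.5  # 1 / 400 MHz

def tdm_latency_ns(pointer_cycles: int) -> float:
    """Latency in nanoseconds for a given hardened-TDM pointer value."""
    return pointer_cycles * DIB_CLOCK_PERIOD_NS

print(tdm_latency_ns(1))  # 2.5
print(tdm_latency_ns(2))  # 5.0
```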
3. Functional Description
The DIB Intel® Stratix® 10 FPGA IP provides three direct interface bus subsystems per Intel® Stratix® 10 GX 10M die.
Every DIB subsystem contains 24 standard channels, and the Bypass operation mode also provides an extra AUX channel. Each channel has four banks, and each bank contains 22 or 20 usable I/Os, depending on the mode of operation.
The DIB Intel® Stratix® 10 FPGA IP provides three modes of operation for each DIB channel instance: Bypass, Asynchronous, and Synchronous.
Figure 1. DIB Intel® Stratix® 10 FPGA IP Interface
Figure 2. DIB Subsystem (24 Standard Channels and 1 AUX Channel)
Figure 3. Single DIB Channel
3.1. Bypass Mode
In Bypass mode, the DIB acts as a wire connection between the two dies.
The Bypass mode includes the following features:
The propagation delay between dies is the wire delay.
The latency between DIB pins is 2.5 ns.
The directional granularity is at the bank level.
This mode does not require a DIB clock.
Use appropriate input or output delay constraints during
compilation to accommodate the interface (refer to Timing Transfer for Bypass Mode).
Enables both standard and AUX channels.
Figure 4. Bypass Mode Block Diagram
Figure 5. Bypass Mode Timing Diagram
The IP treats the connection between the TX die and the RX die as a wire signal. Ensure that the RX DUT clock rising edge (with respect to the TX DUT clock rising edge) accounts for the time delay from the TX-side flip flop, across the interface, to the receiving flip flop.
3.1.1. AUX Channel Settings
You can enable any combination of settings for the AUX channel, with certain restrictions. Follow these restrictions:
Banks 0 and 3 are 22 bits wide and configurable as TX or RX.
One bank provides 16 bits for TX data only, with 1 bit reserved for the clock.
One bank provides 16 bits for RX data only, with 1 bit reserved for the clock.
Figure 6. AUX Channel Settings
3.2. Asynchronous Mode
In Asynchronous mode, the DIB uses the hard TDM provided by the DIB subsystem.
The Asynchronous mode includes the following features:
Use this mode when 1:1, 2:1, or 4:1 TDM multiplexing is required and no soft TDM logic is needed.
The system clock is not used.
The DIB clock and the DUT clock do not require a synchronous relationship.
You may use your own asynchronous DUT clock, but the dut_clk input to a DIB instance must be connected to a clock of the ratio selected in the parameter editor.
The DIB subsystem TDM block always ensures that all RX ports (four ports
from 4:1 TDM, two ports from 2:1 TDM, or single port from 1:1 TDM) are available to be
sampled in the next system clock (or DUT clock) on the RX side.
The DIB TDM multiplexer halts the multiplexer select at the last data input until
all data are sampled at the RX system clock.
When the data sampling is complete, the multiplexer select rolls back to the
initial location for a new phase of data on the TX side.
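The halt-and-wrap behavior described above can be sketched in software terms. This is an illustrative model only (the function and names are hypothetical), not the hardware implementation:

```python
# Illustrative software model of the Asynchronous-mode hard TDM multiplexer:
# the TX mux serializes `ratio` inputs onto one wire, halts the mux select
# at the last data input until the RX side has sampled all deserialized
# outputs, then rolls back for the next phase of TX data. The net effect is
# that each group of `ratio` words becomes available together on the RX side.

def tdm_transfer(tx_words, ratio=4):
    """Serialize tx_words (length a multiple of `ratio`) and return the
    RX-side view: groups of `ratio` words, each group sampled as a unit."""
    rx_groups = []
    for i in range(0, len(tx_words), ratio):
        phase = tx_words[i:i + ratio]  # one TX phase; mux selects 0..ratio-1
        # the mux holds the last slot until the RX system clock samples
        # the group, so the whole group is visible at once on the RX side
        rx_groups.append(tuple(phase))
    return rx_groups

print(tdm_transfer([1, 2, 3, 4, 5, 6, 7, 8], ratio=4))
# [(1, 2, 3, 4), (5, 6, 7, 8)]
```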
You can connect the DIB clock to the DUT flip flops only in Asynchronous mode.
Figure 7. Asynchronous Mode Setup
Figure 8. Asynchronous Mode Timing Diagram
3.3. Synchronous Mode
Opt for Synchronous mode if you want the DIB to work with your own soft TDM logic.
The Synchronous mode includes the following features:
Use this mode when 1:1, 2:1, or 4:1 TDM multiplexing is required with soft TDM logic.
The transfer from the DIB to the soft TDM logic is synchronous.
The DIB clock and system clock have a synchronous relationship.
The ratio of the DIB clock to the system clock is equal to the TDM ratio.
There is no requirement for a system/DUT clock or DIB/DUT clock relationship.
The DIB subsystem TDM multiplexers continue to sample and deliver without
halting on every dib_clk signal.
The soft TDM logic maintains data coherency across dies to ensure that
all DUT data is successfully transferred to the other die.
You can reduce latency by one clock cycle by turning on the Reduce Sync Mode P2C Latency parameter.
Note: You may face
difficulty in closing timing when you enable this parameter.
The RX die generates the system clock for the soft TDM logic on the rem_clk port. rem_clk is a divided version of dib_clk transmitted from the TX side.
The TDM ratio you select in the parameter editor determines the
division. For example, TDM ratio 2:1 sets rem_clk to
be half of dib_clk.
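The rem_clk division rule above can be sketched as a small calculation (the function name and frequencies are illustrative, not part of the IP):

```python
# Sketch of the rem_clk relationship described above: rem_clk on the RX die
# is dib_clk divided by the TDM ratio selected in the parameter editor.

def rem_clk_mhz(dib_clk_mhz: float, tdm_ratio: int) -> float:
    """Frequency of rem_clk for a given TDM ratio (1, 2, or 4)."""
    if tdm_ratio not in (1, 2, 4):
        raise ValueError("supported TDM ratios are 1:1, 2:1, and 4:1")
    return dib_clk_mhz / tdm_ratio

print(rem_clk_mhz(400.0, 2))  # 200.0 — a 2:1 TDM ratio sets rem_clk to half of dib_clk
```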
Figure 9. Synchronous Mode Setup with 4:1 Soft TDM and 2:1 DIB (Hard) TDM
This setup is based on the settings of 4:1 Soft TDM and 2:1 DIB TDM.
Figure 10. Synchronous Mode with 4:1 Soft TDM and 2:1 DIB (Hard) TDM Timing Diagram
This setup is based on the settings of 4:1 Soft TDM and 2:1 DIB TDM.
Figure 11. Synchronous Mode Setup with 4:1 Soft TDM and 4:1 DIB (Hard) TDM
This setup is based on the settings of 4:1 Soft TDM and 4:1 DIB TDM.
Figure 12. Synchronous Mode with 4:1 Soft TDM and 4:1 DIB (Hard) TDM Timing Diagram
This setup is based on the settings of 4:1 Soft TDM, 4:1 DIB TDM, and 1:1 DIB clock.
4. Creating and Parameterizing the Intel FPGA IP
Use the Intel FPGA IP design flow to get started with the DIB Intel® Stratix® 10 FPGA IP.
The Intel FPGA IP Library is installed as part of the
Quartus® Prime installation process. You can select and parameterize any Intel FPGA IP
from the library. Intel provides an integrated parameter editor that allows you to customize the DIB Intel® Stratix® 10 FPGA IP to support a wide variety of
applications. The parameter editor guides you through the setting of parameter values and
selection of optional ports.
4.1. IP Catalog and Parameter Editor
The IP Catalog displays the IP cores available
for your project, including
Intel® FPGA IP and other IP
that you add to the IP Catalog search path.
Use the following features of
the IP Catalog to locate and customize an IP core:
Filter IP Catalog to
Show IP for active device family or
Show IP for all device families. If
you have no project open, select the Device
Family in IP Catalog.
Type in the search
field to locate any full or partial IP core name in IP Catalog.
Right-click an IP core
name in IP Catalog to display details about supported devices, to open the IP
core's installation folder, and for links to IP documentation.
Click Search for Partner
IP to access partner IP information on the web.
The parameter editor prompts you to specify an IP variation name,
optional ports, and output file generation options. The parameter editor generates a
Quartus® Prime IP file (.ip) for an IP variation in
Quartus® Prime Pro Edition projects.
4.2. Creating a New Intel Quartus Prime Project
You can create a new
Quartus® Prime project with
the New Project Wizard. Creating a new project
allows you to do the following:
Specify the working directory for the project.
Assign the project name.
Designate the name of the top-level design entity.
Launch the Quartus® Prime software.
On the File menu, click New Project Wizard.
On the New Project Wizard: Directory, Name, Top-Level Entity page, specify the working directory, project name, and top-level design entity name. Click Next.
On the New Project Wizard: Add Files page, select the existing design files (if any) you want to include in the project. Click Next.
On the New Project Wizard: Family & Device Settings page, select the device family and specific device you want to target for compilation. Click Next.
On the EDA Tool Settings page, select the EDA tools you want to use with the Quartus® Prime software to develop your project.
Review the summary of your chosen settings in the New Project Wizard window, then click Finish to complete the
Quartus® Prime project creation.
4.3. Parameterizing the DIB Intel Stratix 10 FPGA IP
In the IP Catalog (Tools > IP Catalog > Miscellaneous), locate and double-click the Direct Interface Bus (DIB) Intel® Stratix® 10 FPGA IP.
Specify a top-level name for your custom IP variation. This
name identifies the IP variation files in your project. If prompted, also
specify the target
device family and output file HDL preference. Click OK.
After parameterizing the IP, go to the Example Design tab and click Generate Example Design to create the simulation testbench. Skip to step 5 if you do not want to generate the example design.
Set a name for your
<example_design_directory> and click
OK to generate supporting files and scripts.
The testbench and scripts are located in the <example_design_directory>/simulation folder.
Click Finish or
Generate HDL to generate synthesis
and other optional files matching your IP variation specifications. The
parameter editor generates the top-level
.qip or .qsys IP variation file and HDL files for synthesis and simulation.
The top-level IP variation is added to the current
Quartus® Prime project. Click Project > Add/Remove Files in Project to manually add a .qip
or .qsys file to a project. Make
appropriate pin assignments to connect ports.
Note: Some parameter options are grayed out if they are not supported in the selected configuration or are derived parameters.
4.4. Compiling the DIB Intel Stratix 10 FPGA IP Design
After successfully compiling your design, program the targeted
Intel® device with the
Quartus® Prime Programmer and verify the design in hardware. For instructions on programming the FPGA device, refer to the Intel Quartus Prime Pro Edition User Guide: Programmer.
5. Designing with the DIB Intel Stratix 10 FPGA IP
When designing with the DIB Intel® Stratix® 10 FPGA IP, you must account for certain considerations to ensure a fully functioning design. Follow the design guidelines provided.
5.1. Reset Architecture
The DIB subsystem is either in freeze mode or user mode.
Upon power-up, the DIB subsystem enters freeze mode. All the freeze signals
from the DIB subsystem get asserted when the system asserts the power-on reset signal. During
freeze mode, the DIB subsystem is in a safe state and all interface signals to the core fabric
are driven high.
During freeze mode, the DIB I/Os are tri-stated, and the DIB SSM configures
the entire DIB subsystem.
In the Intel® Stratix® 10 GX 10M variant, the external reset is controlled by user logic or your system design.
You must track all the dib_ready pins from both dies to determine that both dies of the Intel® Stratix® 10 GX 10M variant are ready for data transactions.
You should enable the external reset only after all the dib_ready_n pins are asserted.
Only after enabling the external reset can you enable the cross-die transactions.
5.2. Clocking in Asynchronous and Synchronous Modes
The DIB subsystem requires a fabric clock, sourced from an IOPLL, to clock the DIB subsystem.
The DIB subsystem does not have any PLLs; therefore, the clocks come from the core fabric IOPLLs.
The DIB subsystem sends a source synchronous clock to another DIB subsystem in
the adjacent die (TX or RX). In Synchronous mode, the system clock is synchronous to the DIB clock.
Both the system clock (if applicable) and the DIB clock should be derived from the same IOPLL output and routed to the DCM (core clock multiplexer) nearest to the DIB subsystem.
The DCM has a divider that supports division by 1, 2, or 4.
Sharing the same clock and using the divider within the DCM reduces clock uncertainty.
Each Intel® Stratix® 10 GX 10M die contains 24 IOPLLs, and you can program each IOPLL to produce nine unique clocks (divided from the PLL's VCO).
On the receiving die, the DIB clock and system clock (if applicable) on the DIB RX core are derived from the source synchronous clock in the DIB TX core.
The DIB clock on the RX side runs at the same frequency as the DIB clock
on the TX side.
The DIB clock goes through the clock divider inside the DIB to generate
the rem_clk port on the RX die.
Each DIB channel has its own independent system clock and associated DIB clock.
The smallest granularity for the clock domain is per channel.
Each die of the Intel® Stratix® 10 GX 10M variant has a total of 72 clock domains for the system clock and DIB clock.
Note: To reduce clock uncertainties, Intel recommends that each channel not have its own unique clock source at the die level.
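The 72-clock-domain figure follows from the channel arrangement described earlier. A quick sanity check, assuming three subsystems of 24 standard channels each per die:

```python
# 3 DIB subsystems per die x 24 standard channels per subsystem = 72 channels,
# each with its own system clock and associated DIB clock (one domain pair).

SUBSYSTEMS_PER_DIE = 3
CHANNELS_PER_SUBSYSTEM = 24

clock_domain_pairs = SUBSYSTEMS_PER_DIE * CHANNELS_PER_SUBSYSTEM
print(clock_domain_pairs)  # 72
```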
5.2.1. Clocking Options
Intel recommends two clocking options that you can use as references for clock routing from IOPLLs to DIB subsystems.
Clocking Option 1
All three DIB subsystems are clocked by a single IOPLL.
The clock network for this option is larger because the network spans across
the die height.
Advantages of this clock option:
The divergence point is closer to the DIB subsystem because the clock is from a single PLL.
For timing closure, you need to account only for clock uncertainties from the divergence point to the leaf.
No clock uncertainties arise from using multiple IOPLLs.
Figure 13. DIB Clocking Option 1
Clocking Option 2
Each DIB subsystem is clocked by its own PLL.
This option is more efficient if the logic clocked by one IOPLL is not
required to interact with the logic clocked by another IOPLL. Cross-PLL interactions
incur larger clock uncertainties.
If a DIB subsystem (or even a few channels) is mutually exclusive
with the other sections of logic or another DIB subsystem, you can use multiple
IOPLLs to clock the logic driven by those unique clock domains.
In this case, the clock network span from the divergence point to the leaves is shorter because the network does not need to span the die height. This option also incurs less clock uncertainty.
Figure 14. DIB Clocking Option 2
5.2.2. Clock Synchronization
Intel recommends that you follow the board design guidelines for clock synchronization.
DUT Clock Synchronization between 2
Stratix® 10 GX 10M Dies
To synchronize the DUT reference clock, Intel recommends a maximum skew of 20 ps between the dies.
DIB/System Clock Synchronization between 2
Stratix® 10 GX 10M Dies
The DIB clock and the system clock on the DIB RX side derive from the source
synchronous clock on the TX side.
The DIB clock on the DIB RX channel operates at the same
frequency as the DIB clock on the DIB TX channel.
The system clock on the RX die is a divided clock from the source synchronous DIB clock to match the frequency of the system clock on the TX die. The clock divider supports division by 1 up to 16.
Note: The duty cycle is 60:40 for odd division factors; you need to consider this duty cycle if any negative-edge flops are used.
The TX version and RX version of the system clock on the same die are not synchronous because the clocks are from different IOPLLs on each die.
Figure 15. Clocking Synchronization Using Different IOPLLs
5.3. Timing Closure
You must compile each
Stratix® 10 GX 10M
die instance in the
Quartus® Prime Pro Edition software separately.
Separate compilation means that you must perform timing closure for each die independently.
Especially when data or clocks are transferred from one die to another, you may need to use certain budgeting schemes to enable timing closure in each die independently. Only then is timing closure across the two dies guaranteed.
Consider the following timing transfers to account for the data transfer from
the system clock on one die to the system clock on the other die.
Timing transfer for Bypass mode:
Core to DIB I/O
DIB I/O to core
Timing transfer for TDM Synchronous and Asynchronous modes:
TX die: Core to DIB or Periphery
RX die: DIB or Periphery to Core
Across dies: TDM to TDM
5.3.1. Timing Transfer for Bypass Mode
In the Bypass mode use case, the two dies share a single reference clock, with each die feeding its respective IOPLL.
Each PLL has the same frequency output clock configuration and the respective
counters are synchronous in each die. The same frequency allows the data to be transferred
synchronously from the clock in one die to the clock in the other die. However, the variations
in the dies may affect how well the clocks in each die align with each other.
To analyze each die separately, consider these factors:
The clock uncertainty of one die relative to the clock on the other die
must be computed.
The available data uncertainty for the transfer across the two dies and the link must be computed and budgeted for each die. For example, you may budget 40% of the uncertainty to each die and 20% to the link.
Given these two factors, you may now create the appropriate SDC constraints
for each die so that the
Quartus® Prime Pro Edition software can close
timing for each die independently.
For the TX die, create a virtual clock that has the clock uncertainty of
the RX die, and then set the appropriate set_output_delay
-max and set_output_delay -min constraints
relative to the virtual clock that encompasses the link and data uncertainty in the other die.
For the RX die, set the appropriate set_input_delay -max and set_input_delay -min constraints relative to a virtual clock in the same way.
With these SDC constraints, the
Quartus® Prime Pro Edition software places and routes the DIB-to-core and core-to-DIB
connections to meet the timing requirements.
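The constraint approach above can be sketched in SDC. This is a hypothetical fragment only: the clock names, port names, period, and delay values are placeholders chosen for illustration, not taken from the IP; derive the real numbers from your own uncertainty budget (for example, 40% per die and 20% for the link).

```tcl
# Hypothetical SDC sketch of the Bypass-mode budgeting scheme.
# All names and values below are placeholders, not IP deliverables.

# TX die: virtual clock modeling the RX-die capture clock
create_clock -name sys_clk -period 2.5 [get_ports sys_clk]
create_clock -name rx_virt_clk -period 2.5

# Output delays relative to the virtual clock encompass the link delay
# and the RX-die data uncertainty (example numbers only)
set_output_delay -clock rx_virt_clk -max 1.0 [get_ports dib_tx_data*]
set_output_delay -clock rx_virt_clk -min 0.2 [get_ports dib_tx_data*]

# RX die: input delays model the TX-die launch clock plus link uncertainty
create_clock -name tx_virt_clk -period 2.5
set_input_delay -clock tx_virt_clk -max 1.0 [get_ports dib_rx_data*]
set_input_delay -clock tx_virt_clk -min 0.2 [get_ports dib_rx_data*]
```

With constraints of this shape in each die's SDC file, the Timing Analyzer can close timing on each die independently while the budget guarantees the cross-die transfer.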
Intel provides the information to determine the clock and data uncertainties because the Timing Analyzer in the Quartus® Prime software does not have this information.
Figure 16. Timing Closure for Bypass Mode
5.3.2. Timing Transfer for TDM Modes
There are three types of timing transfer in TDM modes.
Figure 17. Timing Closure for TDM Mode
Table 8. Timing Transfer for TDM Modes
Core-to-periphery transfer (TX side): A synchronous transfer from the system clock with a flop in the core to the DIB clock with a flop in the periphery on the same die. The Timing Analyzer in the Quartus® Prime software analyzes this path as a synchronous transfer.
Across dies (TDM to TDM): Intel determines the timing closure for this path.
Periphery-to-core transfer (RX side): A transfer from the DIB clock on the RX die to the system clock on the RX die. For SYNC mode only: the Timing Analyzer in the Quartus® Prime software analyzes this path as a synchronous transfer.
5.4. Setting Bypass, Asynchronous, and Synchronous Modes in One DIB Instance
You may set each bank, in a channel of four banks, to a different DIB mode.
However, be mindful of the following limitations:
If you set bank 0 to RX in Bypass mode, and any of the other banks to RX
in Asynchronous or Synchronous mode, then pad_0_dib_pad pin cannot be timed, and therefore cannot be used.
If you set bank 3 to TX in Bypass mode, and any of the other banks to TX
in Asynchronous or Synchronous mode, then pad_3_dib_pad pin cannot be timed, and therefore cannot be used.
Note: In these situations, the
Quartus® Prime software displays a warning message in the
parameter editor and flags a critical warning in the Fitter.
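The two restrictions above can be captured as a small validity check. This is a hypothetical helper for illustration only; the mode and direction encodings are made up for this sketch and are not IP signal names (only the pad pin names come from the text above):

```python
# Hypothetical checker for the two bank-mode restrictions described above.
# bank_modes: list of 4 (mode, direction) tuples for banks 0..3, where mode
# is "bypass", "async", or "sync" and direction is "tx" or "rx".

def untimed_pads(bank_modes):
    """Return the pad pins that cannot be timed (and so cannot be used)."""
    pads = []
    modes = dict(enumerate(bank_modes))
    tdm = ("async", "sync")
    # Bank 0 RX in Bypass + any other bank RX in a TDM mode -> pad_0 unusable
    if modes[0] == ("bypass", "rx") and any(
            m in tdm and d == "rx" for i, (m, d) in modes.items() if i != 0):
        pads.append("pad_0_dib_pad")
    # Bank 3 TX in Bypass + any other bank TX in a TDM mode -> pad_3 unusable
    if modes[3] == ("bypass", "tx") and any(
            m in tdm and d == "tx" for i, (m, d) in modes.items() if i != 3):
        pads.append("pad_3_dib_pad")
    return pads

print(untimed_pads([("bypass", "rx"), ("async", "rx"),
                    ("sync", "tx"), ("bypass", "tx")]))
# ['pad_0_dib_pad', 'pad_3_dib_pad']
```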
6. DIB Intel Stratix 10 FPGA IP Interface
All the interfaces for each DIB channel are always present. Unused
signals are not connected.
The DIB Intel® Stratix® 10 FPGA IP has three main interfaces.
The direction of the DIB pad signals is generated dynamically based on your settings in the parameter editor.
All standard and AUX channels have the same top-level signals.