Increasing design densities and shrinking process technology nodes have increased the challenges and risks associated with multi-million dollar chip design projects, but simultaneous advancements on the FPGA front have provided the means of answering those challenges at the prototyping stage. With analog/mixed-signal blocks being an integral part of SoCs, one limiting factor for FPGA-based emulation was the lack of mixed-signal blocks in FPGA architectures; the only way out was to have test chips (very expensive and time consuming!) work in conjunction with FPGA-based digital logic. The evolution of FPGA architecture in recent years now enables the emulation team to replicate the entire SoC design, including transceivers and other PHY building blocks, on FPGA-based platforms.
FPGA architecture evolution over the last two generations has made the latest FPGA devices capable of handling many more of the challenges associated with prototyping multi-million gate SoC designs: increased logic complexity, higher speeds, and mixed-signal blocks such as the high-speed serial interfaces (USB 2.0/3.0, PCIe 1.0/2.0, SAS/SATA at 3/6 Gbps) used for system connectivity in modern designs. Traditionally, at the prototyping stage, high-speed serial PHY/SerDes test chips worked in conjunction with FPGA-based digital logic, which required additional budget and a complex subsystem-level (individual PHY + controller) effort before full-chip emulation could be carried out.
This paper describes how we at LSI exploited the FPGA architecture to emulate a 1.5 Gbps SerDes within a Xilinx Virtex-5 device on the HAPS-51T platform from Synopsys as part of our chip emulation on an FPGA-based platform. The figure below shows how the Synopsys HAPS-51T platform, with its Virtex-5 device, enabled LSI to emulate 1.5G SerDes functionality on the FPGA device without any external test chip for connectivity to the external world.
High-Speed Serial Interfaces on FPGA and their use
The high-speed serial I/Os available on Xilinx devices are used to emulate the behaviour of the high-speed protocols found in current-generation high-end SoCs. These transceivers contain the building blocks of modern high-speed serial communication, such as 8b/10b encoding, elastic buffers, OOB/beacon signaling, comma detection and serialization-deserialization (SerDes), which makes them useful for prototyping modern high-speed serial protocols such as PCI Express and SAS at up to 6 Gbps on serial links.
They are highly configurable and, according to Xilinx, power efficient. Each Virtex-5 device can have up to 24 transceivers, allowing multi-channel operation and scalability.
There are two families of transceivers available on Virtex-5 devices: GTP and GTX.
The device (and eventually the HAPS board) is selected according to the type of transceiver needed. Some useful guidelines for choosing the transceiver type are given below:
- Speed of data transfer - GTP transceivers can support data rates up to 3.125 Gbps, whereas GTX transceivers support data rates up to 6.0 Gbps.
- Datapath width - This is an important factor when deciding which transceiver to use. For example, the maximum datapath width a GTP transceiver supports is 20 bits (with the 8b/10b encoding block disabled). So if your SerDes has 20-bit Tx/Rx lines, it makes sense to use a GTP transceiver; if it has 40-bit Tx/Rx lines, it is advisable to go for GTX transceivers. Of course, GTP transceivers can still serve a 40-bit datapath, since each transceiver tile contains two transceivers and channel bonding can combine them, but you then have to manage two sets of TX and RX serial lines.
- Features - With the integration of the 64b/66b and 64b/67b RX gearbox, GTX simplifies high line-rate designs, saving thousands of logic cells in the design implementation.
- Cost - GTP transceivers score over GTX transceivers where the cost of the device is a concern. GTX, with its higher data-rate support and support for a wider range of standards, naturally comes at a higher cost.
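The speed, width and cost guidelines above can be condensed into a small selection helper. The following Python sketch is illustrative only: the data-rate limits are the figures quoted in this article, the 40-bit GTX width is inferred from the datapath recommendation above, and the `relative_cost` ranking, the `FAMILIES` table and the `pick_family` function are hypothetical names introduced for this example, not part of any Xilinx or Synopsys tool. Channel bonding of two GTP transceivers is deliberately not modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TransceiverFamily:
    name: str
    max_rate_gbps: float  # data-rate limit as quoted in this article
    max_width_bits: int   # widest parallel datapath per transceiver (GTX value is an assumption)
    relative_cost: int    # ordering only: lower means cheaper (assumption)


FAMILIES = (
    TransceiverFamily("GTP", 3.125, 20, 1),
    TransceiverFamily("GTX", 6.0, 40, 2),
)


def pick_family(line_rate_gbps: float, datapath_bits: int) -> Optional[str]:
    """Return the cheapest family that meets both the rate and the width, or None."""
    candidates = [
        f for f in FAMILIES
        if line_rate_gbps <= f.max_rate_gbps and datapath_bits <= f.max_width_bits
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda f: f.relative_cost).name
```

Under these assumptions, `pick_family(1.5, 20)` selects GTP, which matches the 1.5 Gbps, 20-bit case discussed in this article, while a 40-bit datapath or a 6 Gbps line rate selects GTX.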
Design and verification considerations for transceiver based designs
As described above, these transceivers help in emulating high-speed designs that use the latest serial protocols. Using them, however, calls for some careful design and verification considerations. The reference design used here is a SAS/SATA-based design, but many of the considerations are generic and apply to other protocols as well.
Glue logic - Additional glue logic is needed to bridge between the SerDes and the PHY digital logic. In our case, a gearbox had to be built between our digital logic, which has a 40-bit data interface, and the 20-bit GTP interface.
Scaling the logic - With limited FPGA resources at one's disposal, it helps to scale down the design. For example, implementing only 2 channels of an 8- or 4-channel SerDes at the emulation stage still serves the purpose and reduces the effort involved without losing the essence of the exercise.
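A behavioural model of such a gearbox can be handy as a reference when checking the glue logic. The sketch below is a minimal Python model, not the RTL used in the project. It assumes the lower 20 bits of each 40-bit word are transferred first; the real ordering depends on the PHY and transceiver configuration, and the function names are hypothetical.

```python
MASK20 = (1 << 20) - 1


def split_40_to_20(word40: int, low_first: bool = True) -> tuple:
    """TX direction: break one 40-bit word into two 20-bit transfers."""
    if not 0 <= word40 < (1 << 40):
        raise ValueError("word40 must fit in 40 bits")
    low = word40 & MASK20
    high = (word40 >> 20) & MASK20
    return (low, high) if low_first else (high, low)


def join_20_to_40(first: int, second: int, low_first: bool = True) -> int:
    """RX direction: rebuild one 40-bit word from two consecutive 20-bit transfers."""
    for half in (first, second):
        if not 0 <= half <= MASK20:
            raise ValueError("each transfer must fit in 20 bits")
    low, high = (first, second) if low_first else (second, first)
    return (high << 20) | low
```

A round trip such as `join_20_to_40(*split_40_to_20(w))` should return `w` for any 40-bit value, which makes a convenient self-check for a testbench scoreboard.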
Clock distribution and alignment - The best approach is to use the Xilinx PLLs inside the FPGA to generate the clocks and to ensure phase and edge alignment of the different TX/RX clocks for the SerDes and the glue logic. This avoids clock-domain crossing issues such as faulty data latching.
Re-use building blocks from ASIC Design - The FPGA transceiver has some functionality that may be redundant during emulation. For example, a transceiver in the FPGA may have 8b/10b encoders and elastic buffers, while the PHY design block in the ASIC/SoC may also contain encoders, elastic buffers and OOB/beacon signaling. In such a scenario it is advisable to use this functionality from the ASIC design and bypass the corresponding GTP blocks, since the ASIC blocks are the ones that will go into the final chip and therefore need to be tested.
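As a worked example of the clock relationships involved, the parallel clock rate on each interface follows from dividing the line rate by the interface width. The sketch below assumes the 1.5 Gbps line rate and the 20-bit and 40-bit interfaces mentioned in this article, with 8b/10b encoding handled outside the transceiver so that every parallel bit is a line bit. The helper name is hypothetical.

```python
from fractions import Fraction


def parallel_clock_hz(line_rate_bps: int, width_bits: int) -> Fraction:
    """Parallel-side clock frequency needed to sustain the given serial line rate."""
    if width_bits <= 0:
        raise ValueError("width_bits must be positive")
    return Fraction(line_rate_bps, width_bits)


LINE_RATE_BPS = 1_500_000_000                            # 1.5 Gbps line rate from this article
clk_20bit = parallel_clock_hz(LINE_RATE_BPS, 20)         # transceiver-side clock
clk_40bit = parallel_clock_hz(LINE_RATE_BPS, 40)         # PHY digital-logic-side clock
ratio = clk_20bit / clk_40bit                            # an integer ratio suits PLL-derived, edge-aligned clocks

print(f"20-bit side: {float(clk_20bit) / 1e6} MHz")
print(f"40-bit side: {float(clk_40bit) / 1e6} MHz")
print(f"ratio: {ratio}")
```

With these inputs the two clocks come out at 75 MHz and 37.5 MHz, an exact 2:1 ratio, which is the kind of relationship a single PLL can generate with aligned edges.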
Retain key aspects of functionality - Key features of high-speed serial interfaces, such as power-down and power-saving modes, must be exercised in the emulation phase. The GTP block provides a TXPOWERDOWN pin, which should be retained when generating the core with the Xilinx Core Generator utility. This pin can be used in tandem with the TXELECIDLE pin to control the burst time and the idle time on the serial line.
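To illustrate how burst and idle times translate into pin activity, the sketch below builds a per-clock TXELECIDLE pattern for the SAS/SATA OOB signals. It is a rough model under stated assumptions: the burst and idle lengths (in OOB intervals, one OOBI being one bit time at 1.5 Gbps) are quoted from memory of the SAS/SATA OOB definitions and should be verified against the specification; the transmit logic is assumed to run on a clock that advances one 20-bit word per cycle; and TXELECIDLE = 1 is taken to mean electrical idle. TXPOWERDOWN is not modelled, and the function and table names are hypothetical.

```python
BURST_OOBI = 160  # burst length in OOB intervals (assumed, check the spec)
IDLE_OOBI = {     # idle length per OOB signal in OOB intervals (assumed, check the spec)
    "COMWAKE": 160,
    "COMINIT": 480,   # same timing as COMRESET
    "COMSAS": 1440,
}


def txelecidle_pattern(signal: str, bursts: int = 6, word_bits: int = 20) -> list:
    """Return one TXELECIDLE value per parallel clock cycle (0 = burst, 1 = idle)."""
    idle_oobi = IDLE_OOBI[signal]
    if BURST_OOBI % word_bits or idle_oobi % word_bits:
        raise ValueError("burst/idle times are not a whole number of clock cycles")
    burst_cycles = BURST_OOBI // word_bits
    idle_cycles = idle_oobi // word_bits
    pattern = []
    for _ in range(bursts):
        pattern += [0] * burst_cycles + [1] * idle_cycles
    return pattern
```

With a 20-bit word per cycle, a 160-OOBI burst is 8 cycles, and the three idle times are 8, 24 and 72 cycles, so all OOB timings land on whole clock cycles in this model.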
Ensure sufficient Reset Time - The GTP/GTX reset must be asserted for a sufficient time, and we must wait long enough for all channels to come out of reset (in our case this time was 160 microseconds). An insufficient reset time can lead to unpredictable transceiver behavior in simulation.
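A reset timer in the testbench or glue logic needs this duration expressed in clock cycles. The sketch below shows the arithmetic, assuming the counter runs on a 150 MHz clock (an assumption for illustration; use whichever free-running clock the reset logic actually sees). Integer arithmetic is used to avoid floating-point rounding, and the function name is hypothetical.

```python
def reset_cycles(reset_time_ns: int, clk_hz: int) -> int:
    """Number of clock cycles needed to cover reset_time_ns, rounded up."""
    if reset_time_ns < 0 or clk_hz <= 0:
        raise ValueError("reset time must be non-negative and clock positive")
    numerator = reset_time_ns * clk_hz
    return -(-numerator // 1_000_000_000)  # ceiling division, ns * Hz -> cycles


RESET_TIME_NS = 160_000  # 160 microseconds, the value reported in this article
cycles_at_150mhz = reset_cycles(RESET_TIME_NS, 150_000_000)
print(cycles_at_150mhz)
```

At 150 MHz this comes to 24000 cycles, so a 15-bit counter is enough to hold the reset for the full duration.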
Use right GTP clock time-periods - The Xilinx GTP models used in simulation have some limitations that can cause the BFMs or verification components at the other end of the serial line to flag errors, so it can make sense to use a less precise clock period for the GTP clock during simulation. We encountered this in our SAS-based prototyping: when the clock input (CLKIN) period was specified as 6.6667 ns (the better approximation of 150 MHz) in the verification environment, the bit width on the serial line varied. With the period set to 6.66 ns, the bit width was constant and communication and data transfer were possible.
Verify Data Integrity at Clock-domain Crossing - This is as much a verification consideration as a design factor. If there is a CDC (clock-domain crossing) in simulation, then either synchronization must be provided or the clocks must be edge aligned so that data integrity is ensured. As mentioned earlier, using a PLL to generate the clocks is recommended when data is transferred between clock domains.
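One plausible explanation for this behaviour, which the observation above does not confirm, is simulator time resolution. If the model derives the bit period as one tenth of the CLKIN period (150 MHz x 10 = 1.5 Gbps) and edges are rounded to a 1 ps grid, a 6.6667 ns period gives a 666.67 ps bit that cannot be represented exactly, whereas 6.66 ns gives exactly 666 ps. The Python sketch below illustrates that arithmetic only; the multiplier of 10, the 1 ps resolution and the rounding scheme are all assumptions, and the function name is hypothetical.

```python
from fractions import Fraction


def bit_widths_ps(clk_period_ps, n_bits, multiplier=10, resolution_ps=1):
    """Widths of successive serial bits when each edge is rounded to the time grid."""
    bit_period = Fraction(clk_period_ps) / multiplier
    edges = [
        round(bit_period * k / resolution_ps) * resolution_ps
        for k in range(n_bits + 1)
    ]
    return [later - earlier for earlier, later in zip(edges, edges[1:])]


precise = bit_widths_ps("6666.7", 4)    # CLKIN period of 6.6667 ns, in ps
imprecise = bit_widths_ps("6660", 4)    # CLKIN period of 6.66 ns, in ps
print(precise)
print(imprecise)
```

Under these assumptions the more accurate period yields bit widths that alternate between 666 ps and 667 ps, while the 6.66 ns period yields a constant 666 ps, which is consistent with the varying and constant bit widths described above.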
Implementation on HAPS
The HAPS platform consists of a single FPGA board with all the major interfaces needed for system prototyping available either onboard or through plug-in cards. The HAPS boards of particular interest for serial-interface-based designs are the HAPS-51T and the HAPS-51FXT: the HAPS-51T carries a Virtex-5 device with GTP transceivers, while the HAPS-51FXT carries a Virtex-5 device with GTX transceivers.
Based on the transceiver selection guidelines given earlier, we can choose the HAPS platform that suits our design requirements. In addition to 476 I/O signals in the HapsTrak II connectors, the HAPS-51T makes the FPGA's 24 high-speed RocketIO SerDes channels available in HapsTrak MGB connectors, which also hold 30 I/O signals for general-purpose use. The SAS (SAS Core + PHY + SerDes) logic resides inside the FPGA on the HAPS board. Another board carrying a processor-based system (which may consist of a single FPGA or multiple FPGAs) is used for register programming as well as for driving data blocks on the SAS interface. The HDD controller card interfaces the 1.5G/3G hard disk with the entire SAS-based system.
- Developments in FPGA technology have enabled the emulation of mixed-signal design blocks and high-speed protocols by use of the high-speed serial transceivers available inside the FPGA silicon.
- Careful selection of the device (and in effect the right HAPS board) based on design requirements can reduce cost and debug effort, enabling an optimal prototyping solution.
- Due care needs to be taken while emulating mixed-signal designs so that all the major digital blocks inside the PHY, such as the 8b/10b encoding, elastic buffers and OOB signaling blocks, are retained from the original design and only the serializers-deserializers and other minimum logic are used from the GTP/GTX transceivers. This ensures that the blocks that go onto the ASIC test chip are the ones being tested.
- Design and verification of serial interfaces become easier if the guidelines given above are followed.
- The HAPS board facilitates prototyping of the system as a whole using the HAPS MGB connectors and the hard disk drive controller board.