-
Notifications
You must be signed in to change notification settings - Fork 2
3. Hardware Verification
The verification of all functionality required by AGILE is based on two elements, (i) Formal Verification (FV) and (ii) constrained random stimulus.
The (i) Formal Verification targets involve writing design properties using the SystemVerilog Assertions (SVA) paradigm, to be processed by the Cadence JasperGold (JG) tool. JG attempts to generate formal proofs of the validity of these properties or, alternatively, counterexamples which can be used for debugging of functional issues. This flow is mainly used during the design bring-up stage to enable the identification of functional bugs early in the design process (prior to the development of the randomized testbench). Due to the high complexity and bandwidth requirements of the AGILE design, data coherency checks are not performed using FV, and these are left to the randomized testbench.
The (ii) randomized testbench contains a number of tests aimed at reaching high functional and code coverage within all units of AGILE. Stimulus is driven into the design using the AXI-L Verification IP, which is responsible for programming workload into the Nodeslots. The correct behaviour is verified by means of unit checkers. The checker has a handle to the input and output interfaces to each unit and generates expectation items based on the input transactions. The checker also generates observation items from the response transactions on the output interfaces. The expectation/observation scoreboard items are pushed into separate data structures, and any mismatches are evaluated.
From a stimulus perspective, several fields in the Nodeslot programming are randomized, including the requested numerical format, neighbour count and aggregation function. This is done to ensure correctness for all features individually and when taking place simultaneously within the accelerator. For example, sum aggregation should function correctly with or without other functions being computed. Additionally, the stimulus is further constrained along the way to account for corner cases.
Within the test hierarchy, a state machine is kept for each Nodeslot, which is updated according to the received programming through the AXI-L interconnect and the request/response loop between the NSB and all other functional units. Each unit checker generates expectation items based on the testbench view of the Nodeslot states. In the case of a mismatch between the RTL and testbench state machines for a Nodeslot due to an incorrect transition, this should result in errors to assist with the debugging.
In the project, the primary target for Formal Verification (FV) was the library components (see Section [section:library]), given their high rate of use and functional importance in the design, as well as smaller units within each of the primary sub-modules (Prefetcher, Aggregation Engine and Transformation Engine). FV tools were suitable for these use cases, given the low relative complexity and sequential depth. In each case, the main focus was on deadlock and liveness properties. Deadlock refers to the state in which units within the design are unable to proceed because of inter-dependency on each other to release resources or take some action. Liveness relates to the property of guaranteeing that the design reaches a certain desirable state.
-
AXI Read Master: responsible for driving the AXI memory interface according to a request packet defining byte count and start address (see Section [section:lib_axi_read_master]). An expectation of the required number of AXI transactions and beats per transaction is created based on the fetch request. Checks ensure that when a fetch response is issued back, the expectations match the real number of beats and transactions driven through the AXI channels.
-
Prefetcher Fetch Tag: responsible for driving the request interface to the Adjacency and Message Read Masters (see Section [section:prefetcher_fetch_tag]). Checks ensure that (1) the requested number of bytes are consistent with the neighbour count and precision when a normal (adjacency or messages) response is issued, (2) partial responses are only issued when the requested byte count exceeds the queue capacity, (3) message fetch requests should not be issued until after the adjacency NSB response has been sent.
The following checks are in place in the NSB checker.
-
INVALID_REQUEST_STATE: Requests to the PREF, AGE, Transformation Engine etc. occur at the correct state for each Nodeslot. For example, a request to the Transformation Engine should not be issued from the NSB before the Aggregation Engine request has received a response, causing the Nodeslot to transition to the TRANSFORMATION state.
-
<PREF/AGGR/TRANS>_REQ_PAYLOAD_MISMATCH: The neighbour counts, start address and other payloads provided in requests to the Prefetcher, AGE and Transformation Engines are consistent with the Nodeslot programming from the AXI-L register writes and other events in the Nodeslot state machine.
-
INVALID_<TRANS>_REQ: Requests to the Transformation Engine should not be issued until the required population count is reached in the Aggregation buffer. This is configured according to the NSBCONFIG POPULATIONCOUNT parameter in the NSB.
The Prefetcher interacts with all other functional units in the design, as well as the AXI interconnect for Memory requests. The Prefetcher checker maintains checks for the following functionality:
-
CORRECT_AXI_REQ_COUNT: Following from an NSB request, the appropriate number of AXI transactions are issued, according to the neighbour count and start address in the request.
-
NSB_RESP_PAYLOADS_MISMATCH: In a response with STATUS = OK, the fetched_neighbours count is the same as the neighbour count in the request phase. PARTIAL responses only occur for Nodeslots in which the neighbour count is greater than the Fetch Tag capacity. Furthermore, if a PARTIAL response is received, the fetched_neighbours count should be saturated at this capacity, which signals to the NSB that the Fetch Tag is full, and the Prefetcher will continue fetching incoming messages once the AGE begins its computation.
-
INVALID_PARTIAL_UPDATE: in cases when a Nodeslot’s neighbour count is higher than the Fetch Tag capacity, the Prefetcher will issue a secondary update transaction after the initial done response to signal to the NSB that the full neighbour count has been reached. This update transaction should only take place for Nodeslots that received PARTIAL responses in the first place.
The following checks will be in place in the AGE Checker. Most importantly, this checker is responsible for data coherency checks on the aggregation results.
-
INVALID_PREF_REQ_PAYLOADS: payloads in the AGE’s request to the Prefetcher should match those received from the NSB.
-
INVALID_BUFFER_TRANSACTION_PAYLOADS: after completing aggregation, the AGE should store the results in one of the Aggregation Buffers, according to the Nodeslot precision. The chosen precision should match the Nodeslot programming.
-
AGGREGATED_MESSAGES_MISMATCH: the resulting aggregation of incoming messages for a Nodeslot, stored in the Aggregation Buffer, should match the internal expectation of the Testbench.
-
INVALID_BUFFER_PULSE: the FTE should only fetch features from the correct Aggregation Buffer, according to the precision in the NSB request. Additionally, only buffers slots associated with the Nodeslots in the transformation request should be popped.
-
INVALID_WEIGHT_CHANNEL_REQ: a Weight Channel request should only be issued for the precision in the latest NSB request.
-
INVALID_TRANSFORMATION_BUFFER_WRITE: the FTE can optionally write transformation results onto the Transformation Buffers for further processing or directly back into DRAM, according to programming in its register bank. In WRITEBACK-only mode, no buffer transactions should be observed.
-
DATA_CONSISTENCY: the transformation results stored in the buffer or written out to DRAM should match the expectation of the testbench.