
MultiFPGA project

Overview

The MultiFPGA project aims to develop:

1. Design Methodology
2. Low-Latency & High-Bandwidth Interface
3. Neural Network Layer Implementations
4. Communication Patterns

for efficiently deploying neural networks on a multi-FPGA system with minimal performance loss.

We will be working mainly with the PYNQ-Z2 and ZCU102 boards, using the Vitis-HLS and Vivado tools from AMD (Xilinx).

The project is being completed as a course project for ECE8893 at Georgia Tech, directed by Prof. Hao of the SHARC LAB.

Roadmap

  • BRAM-to-BRAM communication interface (see the sketch after this list)
  • Dataflow across FPGAs
  • Serial partitioning and deployment of a NN on a multi-FPGA system
  • NoC and network topology design
  • Model profiling and task-level partitioning of the NN
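
As a rough illustration of the first two roadmap items, the sketch below shows how a BRAM-to-BRAM transfer could look on the PL side in Vitis-HLS: one kernel reads an on-chip buffer and pushes it onto an AXI-Stream (which would feed the Aurora 64B/66B user interface), and a matching kernel on the receiving board drains that stream back into a local buffer. The function names (bram_to_stream, stream_to_bram), the 64-bit beat width, and the fixed transfer size N_WORDS are placeholder assumptions, not the project's finalized design.

```cpp
// Minimal sketch (not the finalized design): 64-bit stream matching the
// Aurora 64B/66B user-interface width; fixed transfer size; placeholder names.
#include "ap_axi_sdata.h"
#include "ap_int.h"
#include "hls_stream.h"

typedef ap_axiu<64, 0, 0, 0> beat_t;    // 64-bit data beat with TKEEP/TLAST
static const int N_WORDS = 1024;        // placeholder transfer size

// Sender side: read an on-chip (BRAM) buffer and push it onto the outgoing
// AXI-Stream that would feed the Aurora core.
void bram_to_stream(const ap_uint<64> src[N_WORDS], hls::stream<beat_t> &tx) {
#pragma HLS INTERFACE bram port=src
#pragma HLS INTERFACE axis port=tx
#pragma HLS INTERFACE s_axilite port=return
    for (int i = 0; i < N_WORDS; i++) {
#pragma HLS PIPELINE II=1
        beat_t beat;
        beat.data = src[i];
        beat.keep = -1;                  // all bytes valid
        beat.last = (i == N_WORDS - 1);  // mark the end of the transfer
        tx.write(beat);
    }
}

// Receiver side: drain the incoming AXI-Stream (from the Aurora core on the
// other board) into a local BRAM buffer.
void stream_to_bram(hls::stream<beat_t> &rx, ap_uint<64> dst[N_WORDS]) {
#pragma HLS INTERFACE axis port=rx
#pragma HLS INTERFACE bram port=dst
#pragma HLS INTERFACE s_axilite port=return
    for (int i = 0; i < N_WORDS; i++) {
#pragma HLS PIPELINE II=1
        dst[i] = rx.read().data;
    }
}
```

On the actual boards, a kernel pair like this would sit behind an Aurora 64B/66B core on each side of the SFP+ link; the Stage One evaluation would then measure the latency and throughput achievable on this path.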

Timeline

  • Stage One: Communication Interface between the PL Sides of the Boards (Impl Due: February 17) (Impl on schedule)

    • Explore the board specifications of the ZCU102 and PYNQ-Z2 [Documented]
    • Instantiate the Aurora 64B/66B IP to enable communication. [Assigned to Zihan Zheng][Feb 18: Impl merged]
      • (Primary) On-board test with the simplest data-transfer task
      • Finalize the above design as a memory kernel module
      • Loopback test if a connector is available
    • (Optional) Instantiate the ZCU102 PL-based 1G/10G Ethernet with an SFP+ module
    • Performance evaluation [Documented]
    • (Optional) Build a PYNQ image for the ZCU102 [skipped: using Ubuntu on the ZCU102 with the PYNQ library]
    • (Optional) On-board test of PS-based Ethernet
  • Stage One Report Composition (Report Due: February 25, updated from February 20)

  • Stage Two: Dataflow across FPGAs (Impl Due: February 28, updated from February 21)

    • Simulation Methods Study: How to simulate/co-simulate the system? [Documented]
    • (Primary) Implement and test consecutive dataflow across boards. [Assigned to Zihan Zheng][Feb 20: Impl merged]
      • Aurora IP for SFP+ interface
      • (Optional) FMC connector
      • (Optional) SMA connector
    • Bottleneck analysis and performance estimation. [Documented]
    • (Primary) Integrate layers implemented in Lab2 with the above dataflow (a rough layer-chaining sketch follows this timeline).
      • Fixed weights and structures
    • Modify the design to support consecutive input data.
      • Fixed weights and structures
    • Performance evaluation [Documented]
    • Performance comparison with other communication methods.
    • (Optional) Weight-loading strategy design.
    • (Optional) On-board performance evaluation module design.
    • (Optional) GitHub CI scripts for sanity checks.
  • Stage Two Report Composition (Report Due: February 27)

  • Project Proposal Presentation (Presentation Due: March 2)

  • Stage Three: Deployment of a Complete Neural Network on a multi-FPGA system (Impl Due: March 21)

    • Simulation Methods Study: How to simulate/co-simulate the system? [Documented]
    • (Primary) Migration Methodologies Study: software-hardware co-design [Documented]
      • Quantization
      • Pruning
      • Frame Dropping
      • etc.
    • (Primary) Implement/migrate a neural network onto our Stage Two design.
    • On-board evaluation.
  • Stage Three Report Composition (Report Due: March 27)

  • Paper Presentation (Presentation Due: March 28)

  • (Optional) Stage Four: NoC Design, Network Topology, and Task-Level Partitioning (Impl Due: TBD)

    • Design an NoC module to handle complex data-transfer and sharing requests.
    • (Primary) Network topology design for advanced resource planning and task scheduling.
    • Explore task-level partitioning. [Documented]
      • One large layer split across multiple boards.
    • Explore model profiling. [Documented]
      • Which stages have the lowest bandwidth requirements?
    • Explore on-the-fly reconfiguration.
      • Can two or three boards mimic effectively unlimited resources, like a VM?
  • Final Project Presentations (Presentation Due: May 2)
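
The Stage Two layer-integration item above references a rough layer-chaining sketch; a minimal, hedged version follows. It illustrates one way a board's slice of the network could be expressed as an HLS dataflow region whose input and output streams are the inter-board link: activations arrive over the link (or from the PS on the first board), pass through this board's layers, and leave over the link toward the next board. The names board_slice and layer_stub, the 32-bit activation width, and the stream depth are illustrative assumptions, not the Lab2 layers or the actual Stage Two design.

```cpp
#include "ap_axi_sdata.h"
#include "ap_int.h"
#include "hls_stream.h"

typedef ap_axiu<32, 0, 0, 0> act_t;      // one 32-bit activation per beat

// Placeholder layer: passes activations straight through. A real Lab2-style
// layer would buffer a tile, compute, and emit outputs in the same streaming
// fashion.
static void layer_stub(hls::stream<act_t> &in, hls::stream<act_t> &out, int n) {
    for (int i = 0; i < n; i++) {
#pragma HLS PIPELINE II=1
        out.write(in.read());
    }
}

// Board-level top: activations arrive on an AXI-Stream (from the previous
// board over the Aurora link, or from the PS on the first board), flow
// through this board's layers, and leave on an AXI-Stream toward the next board.
void board_slice(hls::stream<act_t> &from_link, hls::stream<act_t> &to_link,
                 int n_acts) {
#pragma HLS INTERFACE axis port=from_link
#pragma HLS INTERFACE axis port=to_link
#pragma HLS INTERFACE s_axilite port=n_acts
#pragma HLS INTERFACE s_axilite port=return
#pragma HLS DATAFLOW
    hls::stream<act_t> mid("mid");
#pragma HLS STREAM variable=mid depth=64
    layer_stub(from_link, mid, n_acts);   // layer i   (e.g. convolution)
    layer_stub(mid, to_link, n_acts);     // layer i+1 (e.g. pooling)
}
```

Supporting consecutive input data would then mostly be a matter of keeping the streams fed frame after frame rather than draining them between runs; weight loading, TLAST framing, and the actual layer implementations are left to the Stage Two work itself.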

General Recommendations

  • Create an issue / comment on an existing issue / assign the issue to yourself before attempting a listed task.
  • Create a new branch with a descriptive name for the task.
  • The [Documented] label indicates that the task should be well documented, for quick report composition and for study purposes.
  • Record the completion date when marking a task as done.

All current schedules are tentative and subject to change.
