Home
The MultiFPGA project aims to develop:
1. Design Methodology
2. Low Latency & High Bandwidth Interface
3. Neural Network Layers Implementation
4. Communication Patterns
for efficiently deploying neural networks on a multi-FPGA system with minimal performance loss.
We will mainly work with the PYNQ-Z2 and ZCU102 boards, using the Vitis HLS and Vivado tools from AMD Xilinx.
The project is being completed as a course project for ECE8893 at Georgia Tech, advised by Prof. Hao of SHARC LAB.
Key topics:
- BRAM-to-BRAM communication interface
- Dataflow across FPGAs
- Serial partition and deployment of NN on multi-FPGA system
- NoC and network topology design
- Model profiling and task-level partition of NN
Stage One: Communication Interface between PL sides of boards
(Impl Due: February 17) (Impl on Schedule)
- Explore board specifications of ZCU102 and PYNQ-Z2 [Documented]
- Instantiate Aurora 64B/66B IP to enable communication. [Assigned to Zihan Zheng] [Feb 18: Impl merged]
- (Primary) On-board test with the simplest data transfer task
- Finalize the above design as a memory kernel module
- Loopback test if connector is available
- (Optional) Instantiate ZCU102 PL-based 1G/10G Ethernet with an SFP+ module
- Performance evaluation [Documented]
- (Optional) Build PYNQ image for ZCU102 [skipped: Using Ubuntu on ZCU102 with PYNQ library]
- (Optional) On-board test of PS-Based Ethernet
Stage One Report Composition
(Report Due: February 25; updated from February 20)
Stage Two: Dataflow across FPGAs
(Impl Due: February 28; updated from February 21)
- Simulation Methods Study: How to simulate/co-simulate the system? [Documented]
- (Primary) Implement and test consecutive dataflow across boards. [Assigned to Zihan Zheng] [Feb 20: Impl merged]
- Aurora IP for SFP+ interface
- (Optional) FMC connector
- (Optional) SMA connector
- Bottleneck analysis and performance estimation. [Documented]
- (Primary) Integrate layers implemented in Lab2 with the above dataflow.
- Fixed weights and structures
- Modify the design to support consecutive input data.
- Fixed weights and structures
- Performance evaluation [Documented]
- Performance comparison with other communication methods.
- (Optional) Weight-loading strategy design.
- (Optional) On-board performance evaluation module design.
- (Optional) GitHub CI scripts for sanity checks.
Stage Two Report Composition
(Report Due: February 27)
Project Proposal Presentation
(Presentation Due: March 2)
Stage Three: Deployment of a Complete Neural Network on a multi-FPGA system
(Impl Due: March 21)
- Simulation Methods Study: How to simulate/co-simulate the system? [Documented]
- (Primary) Migrating Methodologies Study: software-hardware co-design [Documented]
- Quantization
- Pruning
- Frame Dropping
- etc...
- (Primary) Implement or migrate a neural network onto our Stage 2 design.
- On-board evaluation.
Stage Three Report Composition
(Report Due: March 27)
Paper Presentation
(Presentation Due: March 28)
(Optional) Stage Four: NoC design, Network Topology and Task-level Partition
(Impl Due: TBD)
- Design an NoC module to handle complicated data-transfer and sharing requests.
- (Primary) Network topology design for advanced resource planning and task scheduling.
- Explore task-level partition. [Documented]
- One large layer on multiple boards.
- Explore model profiling. [Documented]
- Which stages have the least bandwidth requirements?
- Explore on-the-fly reconfiguration.
- Can two or three boards mimic virtually unlimited resources, like a VM?
Final Project Presentations
(Presentation Due: May 2)
Workflow notes:
- Create an issue (or comment on an existing one) and assign it to yourself before attempting a listed task.
- Create a new branch with a descriptive name for the task.
- The [Documented] label means the task should be well documented, both for quick report composition and for later study.
- Record the completion date when marking a task as done.
All current schedules are unvalidated and subject to change.