Open
High priority
- Codegen should be able to generate one DPU binary per upmem.launch #4
- Linkage issue with memrefCopy
- Checking that VA runs (@h4midf)
- Add histogram op (@h4midf)
  - In CINM
  - In CINM->CNM
- Implement fallback strategy to run CINM ops on CPU #7
- Make sure that the input memref to upmem.scatter is legal, that is, all scattered elements must be contiguous in memory #6
- Write a README
- Remove artifact-specific stuff
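For the upmem.scatter legality item (#6), a minimal sketch of what a contiguity check could look like, written in plain C++ rather than against the MLIR API. It assumes "legal" means the memref is densely packed in row-major order, with strides counted in elements; the helper name `isRowMajorContiguous` is hypothetical and not part of the codebase.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns true if a buffer with the given shape and
// strides (both in elements) is densely packed in row-major order.
bool isRowMajorContiguous(const std::vector<int64_t> &shape,
                          const std::vector<int64_t> &strides) {
  assert(shape.size() == strides.size() && "rank mismatch");

  // A buffer with a zero-sized dimension holds no elements, so it is
  // trivially contiguous.
  for (int64_t dim : shape)
    if (dim == 0)
      return true;

  // Walk from the innermost dimension outwards. Each stride must equal the
  // number of elements spanned by all inner dimensions.
  int64_t expected = 1;
  for (int i = static_cast<int>(shape.size()) - 1; i >= 0; --i) {
    // A dimension of size 1 is never stepped over, so its stride is ignored.
    if (shape[i] != 1 && strides[i] != expected)
      return false;
    expected *= shape[i];
  }
  return true;
}
```

For example, shape `{4, 8}` with strides `{8, 1}` passes, while strides `{16, 1}` (a padded row) would be rejected. A real verifier would also have to decide what to do with dynamic strides, which this sketch does not model.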
Cost model (deadline 04.08)
Currently, cinm.compute carries attributes for the workgroup shape and the DPU memory size.
The lowering pass assumes this specification is correct and cannot change it.
These values should instead be obtained from the cost model.
- Implement a simple Samsung dialect
- Implement a pass that annotates Samsung and Upmem kernels with their time estimation
- Implement the upmem cost estimator in C++
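As a starting point for discussion, here is a rough sketch of how a cost estimator could pick a workgroup configuration instead of taking it as given. Everything in it is an assumption: the `Candidate` and `DeviceModel` structs, their fields, and the "transfer time plus perfectly parallel compute time" formula are placeholders, not the model we intend to ship, and the device numbers would have to be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical candidate configuration; field names are illustrative only.
struct Candidate {
  int64_t numDpus;     // number of DPUs in the workgroup
  int64_t bytesPerDpu; // buffer bytes placed on each DPU
};

// Hypothetical device parameters; real values would have to be measured.
struct DeviceModel {
  double hostBandwidthBytesPerSec; // host <-> DPU transfer bandwidth
  double opsPerSecPerDpu;          // sustained throughput of one DPU
  int64_t dpuMemoryBytes;          // memory capacity of one DPU
  int64_t maxDpus;                 // number of DPUs available
};

// Estimated time = transfer time + compute time, assuming the work splits
// evenly across DPUs and transfers are not overlapped with compute.
double estimateSeconds(const Candidate &c, const DeviceModel &d,
                       double totalOps) {
  double transfer = static_cast<double>(c.numDpus) *
                    static_cast<double>(c.bytesPerDpu) /
                    d.hostBandwidthBytesPerSec;
  double compute =
      totalOps / (static_cast<double>(c.numDpus) * d.opsPerSecPerDpu);
  return transfer + compute;
}

// Returns the index of the cheapest candidate that fits the device,
// or -1 if no candidate is legal.
int pickCheapest(const std::vector<Candidate> &candidates,
                 const DeviceModel &d, double totalOps) {
  int best = -1;
  double bestTime = std::numeric_limits<double>::infinity();
  for (std::size_t i = 0; i < candidates.size(); ++i) {
    const Candidate &c = candidates[i];
    // Skip configurations that exceed the DPU count or per-DPU memory.
    if (c.numDpus <= 0 || c.numDpus > d.maxDpus ||
        c.bytesPerDpu > d.dpuMemoryBytes)
      continue;
    double t = estimateSeconds(c, d, totalOps);
    if (t < bestTime) {
      bestTime = t;
      best = static_cast<int>(i);
    }
  }
  return best;
}
```

The annotation pass could then attach the estimate from `estimateSeconds` to each kernel and let a later step compare targets. Keeping the device parameters in a separate struct is meant to make it easy to plug in a second model for the Samsung dialect.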
Lower priority
- Add verifier for shape of scatter map in UPMEM
- Fix the GPU lowering, which was probably broken by recent changes to CNM
Optimization
- Hoist buffer alloc and free outside of loops
- Malloc avoidance
- Avoid tensor reshapes that do a copy (this is a problem especially for VA)
- Unify buffers across loop iterations
- Affine map simplification with dimension sizes
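To illustrate the intent of the buffer hoisting and buffer unification items, here is a small before/after sketch in plain C++ rather than in IR. The function names are made up for the example; the point is only that the scratch buffer is allocated once, sized for the largest iteration, and reused.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

using Rows = std::vector<std::vector<int32_t>>;

// Before: a scratch buffer is allocated and freed in every iteration.
int64_t sumOfSquaresNaive(const Rows &rows) {
  int64_t total = 0;
  for (const auto &row : rows) {
    std::vector<int64_t> scratch(row.size()); // allocation inside the loop
    for (std::size_t i = 0; i < row.size(); ++i)
      scratch[i] = static_cast<int64_t>(row[i]) * row[i];
    total += std::accumulate(scratch.begin(), scratch.end(), int64_t{0});
  }
  return total;
}

// After: one buffer, sized for the largest row, is allocated before the loop
// and shared by all iterations.
int64_t sumOfSquaresHoisted(const Rows &rows) {
  std::size_t maxLen = 0;
  for (const auto &row : rows)
    maxLen = std::max(maxLen, row.size());

  std::vector<int64_t> scratch(maxLen); // single allocation, hoisted
  int64_t total = 0;
  for (const auto &row : rows) {
    assert(row.size() <= scratch.size());
    for (std::size_t i = 0; i < row.size(); ++i)
      scratch[i] = static_cast<int64_t>(row[i]) * row[i];
    // Only the prefix written in this iteration is read back.
    total += std::accumulate(scratch.begin(),
                             scratch.begin() + static_cast<std::ptrdiff_t>(row.size()),
                             int64_t{0});
  }
  return total;
}
```

Both versions compute the same result; the hoisted one trades a possibly larger buffer for fewer allocations. In the compiler the transformation is only valid when the buffer does not escape the loop body and its size can be bounded ahead of the loop.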