Can online_hadamard be integrated into the model weights? #20

@Chunyuan-Li

First of all, thank you for your work. I noticed that the paper reports R_res as contributing the most to the performance improvement, followed by R_down. If we want to keep the model structure unchanged (that is, avoid the online_hadamard operation at inference), and given that R_down.T can already be fused into W_down, can R_down itself be fused into W_gate and W_up? Thank you very much.
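To make the question concrete, here is a minimal numerical sketch, not taken from the repository. It assumes a LLaMA-style gated MLP, `down(silu(gate(x)) * up(x))`, a row-vector convention (`y = x @ W`), and a normalized Hadamard matrix as a stand-in for R_down; the helper names and the tiny dimensions are hypothetical. It compares the unrotated output with (1) an online rotation of the hidden activation paired with R_down.T fused into W_down, and (2) an attempt to fuse R_down into W_gate and W_up instead.

```python
import math
import random

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

def hadamard(n):
    """Normalized Sylvester Hadamard matrix; n must be a power of two."""
    h = [[1.0]]
    while len(h) < n:
        h = [r + r for r in h] + [r + [-v for v in r] for r in h]
    scale = 1.0 / math.sqrt(n)
    return [[v * scale for v in r] for r in h]

def silu(v):
    return v / (1.0 + math.exp(-v))

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def mlp_hidden(x, w_gate, w_up):
    """Input to the down projection: silu(x @ W_gate) * (x @ W_up)."""
    g, u = matmul(x, w_gate), matmul(x, w_up)
    return [[silu(a) * b for a, b in zip(rg, ru)] for rg, ru in zip(g, u)]

rng = random.Random(0)
d_model, d_ff = 4, 8                      # toy sizes, chosen for illustration
x = rand_matrix(2, d_model, rng)          # two tokens
W_gate = rand_matrix(d_model, d_ff, rng)
W_up = rand_matrix(d_model, d_ff, rng)
W_down = rand_matrix(d_ff, d_model, rng)
R = hadamard(d_ff)                        # assumed stand-in for R_down

h = mlp_hidden(x, W_gate, W_up)
reference = matmul(h, W_down)

# (1) Online rotation of h, with R.T fused into W_down:
#     (h @ R) @ (R.T @ W_down) == h @ W_down because R is orthogonal.
W_down_fused = matmul(transpose(R), W_down)
online = matmul(matmul(h, R), W_down_fused)
err_online = max_abs_diff(reference, online)

# (2) Attempted offline fusion: rotate the columns of W_gate and W_up instead.
#     SiLU and the elementwise product sit between these weights and W_down,
#     and neither commutes with a dense rotation.
h_fused = mlp_hidden(x, matmul(W_gate, R), matmul(W_up, R))
fused = matmul(h_fused, W_down_fused)
err_fused = max_abs_diff(reference, fused)

print(f"online rotation error:  {err_online:.2e}")
print(f"weight-fusion error:    {err_fused:.2e}")
```

Under these assumptions, the first error should sit at floating-point noise while the second does not, which suggests that the naive fusion into W_gate and W_up does not reproduce the online rotation. Whether the repository offers another way to absorb it is a question for the authors.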
