aggressive issue? #9

@Zephyrose

        current['module'] = 'skipped'
        if current['layer'] == 27:
            x = cache_dic['cache'][-1]['noise']

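Doesn't "aggressive caching" in the paper mean skipping the computation of the first 26 layers and computing only the last layer? The implementation here appears to skip all of the DiT computation instead. Is that intended?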
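
To make the question concrete, here is a toy sketch of the two behaviours I am comparing. Everything in it is hypothetical and is not the repository's API: `run_step`, `make_layers`, the mode names, and the `pre_last` cache key are made up for illustration, and the layer count of 28 is only inferred from the `layer == 27` check in the snippet above.

```python
# Toy sketch only: hypothetical names, not the repository's actual code.
NUM_LAYERS = 28  # assumption: layer indices 0..27, inferred from `layer == 27`


def make_layers(n=NUM_LAYERS):
    # Each toy "layer" just adds the timestep t to its input.
    return [lambda x, t: x + t for _ in range(n)]


def run_step(x, t, layers, cache, mode):
    """Run one step; return (output, number of layers actually computed)."""
    if mode == "full":
        for layer in layers[:-1]:
            x = layer(x, t)
        cache["pre_last"] = x      # input to the final layer
        x = layers[-1](x, t)
        cache["noise"] = x         # final output of the step
        return x, len(layers)
    if mode == "skip_all":
        # What the quoted snippet seems to do: reuse the cached output,
        # so no layer is computed at this step.
        return cache["noise"], 0
    if mode == "last_layer_only":
        # What I understood from the paper: reuse the cached features
        # and recompute only the final layer.
        return layers[-1](cache["pre_last"], t), 1
    raise ValueError(f"unknown mode: {mode}")


layers = make_layers()
cache = {}
full_out, full_n = run_step(0.0, 1.0, layers, cache, "full")
skip_out, skip_n = run_step(0.0, 0.5, layers, cache, "skip_all")
last_out, last_n = run_step(0.0, 0.5, layers, cache, "last_layer_only")
print(full_n, skip_n, last_n)
```

In the `skip_all` case the output at the new timestep is identical to the cached one, while in the `last_layer_only` case the final layer still sees the new timestep. That difference is what I am asking about.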
