This repository was archived by the owner on Oct 31, 2023. It is now read-only.

Understanding graphs from papers #18

@prajjwal1

Thanks for replying to my previous questions. I have a few questions about Fig. 3 of your paper.

  1. In Average Span vs. Span Limit (the central graph), you show that for the fixed-span model, the average span increases as the span limit increases. As far as I can tell from your code base, spans are monitored through current_val only if adapt_span_enabled is set to True (line). For the fixed-span model that flag is False, so AdaptiveSpan won't monitor anything. How did you measure the span of the fixed-span model?

  2. In FLOPS vs. Span Limit, you show that FLOPS keep increasing for the fixed-span model, while for adaptive span they stay constant (approximately linear). After thorough inspection, I find that FLOPS are constant with adaptive span, but they don't seem to rise with standard attention either. In both cases the FLOPS are the same. Could you please share some insights?
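
For context on the second question, here is a minimal sketch of why I would expect attention FLOPS to grow with the span in a fixed-span model. It is not taken from the repository: the function name and the example sizes are hypothetical, and it counts only the two attention matrix products (query-key scores and the weighted sum of values), with one multiply-add counted as 2 FLOPS.

```python
def attention_flops_per_token(span, hidden_size, n_layers):
    """Rough attention cost per token (hypothetical helper, not from the repo)."""
    # Query-key scores: per attended position, dot products whose sizes sum to
    # hidden_size across all heads, at 2 FLOPS per multiply-add.
    score_flops = 2 * span * hidden_size
    # Weighted sum of values: same cost as computing the scores.
    value_flops = 2 * span * hidden_size
    return n_layers * (score_flops + value_flops)


# Assumed example sizes, chosen only for illustration.
for span_limit in (256, 512, 1024):
    flops = attention_flops_per_token(span_limit, hidden_size=512, n_layers=12)
    print(span_limit, flops)
```

Under this estimate, a fixed-span model that attends over the full span limit pays a cost that grows linearly with the limit, whereas an adaptive-span model would pay for its learned span instead.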

Thanks
