Question on training time with Resnet18 #4

@jackzhan01

I am intrigued by your work demonstrating the effectiveness of ZO optimization for training large-scale models. Your main experiments on CIFAR-10 with ResNet20 show that training takes approximately 60 minutes per epoch (a result I have successfully replicated).
However, your framework, DeepZero, uses CGE (coordinate-wise gradient estimation), so the cost of each gradient estimate, measured in forward passes, increases linearly with the model size. In Appendix D, Table A3, you report results for training ResNet18, whose model size is approximately ten times larger than that of ResNet20. How long did it take to train ResNet18 with the DeepZero framework?
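
To make the scaling concern concrete, here is a minimal sketch of plain forward-difference CGE on a toy loss. It is only an illustration under my own assumptions, not DeepZero's implementation: the names `cge_gradient` and `quadratic` are hypothetical, the toy loss stands in for a network forward pass, and it ignores any sparsity or parallelization the framework applies.

```python
def cge_gradient(loss, params, mu=1e-3):
    """Coordinate-wise gradient estimate using forward differences.

    Returns the estimated gradient and the number of loss evaluations used.
    (Hypothetical helper for illustration only.)
    """
    base = loss(params)              # one query at the unperturbed point
    queries = 1
    grad = []
    for i in range(len(params)):
        perturbed = list(params)
        perturbed[i] += mu           # perturb a single coordinate
        grad.append((loss(perturbed) - base) / mu)
        queries += 1                 # one extra forward pass per coordinate
    return grad, queries


def quadratic(w):
    # Toy loss standing in for a network's forward pass (assumption).
    return sum(x * x for x in w)


if __name__ == "__main__":
    for d in (10, 100, 1000):
        _, n = cge_gradient(quadratic, [1.0] * d)
        print(f"parameters={d}, loss evaluations={n}")
```

In this sketch the number of loss evaluations is always the parameter count plus one, which is the linear growth I am asking about.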
