Hi,
I'm trying to reproduce Table 1 from the paper, but my results vary, even for SGD. My hardware setup is definitely different (I run the code on a single-GPU machine), but I would not expect the SGD results to vary.

| Model | Aug | CIFAR-10 | CIFAR-100 |
| ------------- | ------------- | ------------- | ------------- |
| WRN-28-10 (200 epochs) | Basic | 3.8 ± 0.1 | 19.1 ± 0.1 |
| WRN-28-10 (200 epochs) | Cutout | 2.9 ± 0.1 | 17.7 ± 0.2 |
| WRN-28-10 (200 epochs) | AA | 3.6 ± 0.1 | 19.0 ± 0.4 |
| Shake-Shake (26 2x96d) | Basic | 3.5 ± 0.1 | 19.0 ± 0.1 |
| Shake-Shake (26 2x96d) | Cutout | 2.9 ± 0.1 | 18.2 ± 0.2 |
| Shake-Shake (26 2x96d) | AA | 3.3 ± 0.1 | 17.9 ± 0.2 |

Could you provide the corresponding FLAGS for the Basic / Cutout / AA settings in Table 1?