Hi,
Thank you for your excellent work!
I am currently trying to reproduce ESC-base-no-adv but have encountered some issues.
- According to the README, training a base ESC model on 4×RTX 4090 GPUs takes approximately 12 hours for 250k steps using 3-second speech clips with a batch size of 36. Based on the provided config, the batch size per GPU should be 9, totaling 36 across 4 GPUs.
- 1.a. I am using 2×A6000 (48GB) to train ESC-base-no-adv. To match the total batch size of 36, I set 18 per GPU × 2 GPUs (= 9 per GPU × 4 GPUs). However, training appears to be roughly 11× slower:
```
<<<<Experimental Setup: esc-base-non-adv>>>>
BatchSize_per_Device: Train 18, Test 4    LearningRate: 0.0001
Total_Training_Steps: 5000*50 = 250000
Pre-Training_Steps:   5000*15 = 75000
Optimizer: AdamW    Scheduler: constant
Quantization_Dropout: 0.75
Model #Parameters: 8.74M
TQDM: 23:29<129:24:34
```
I understand that the A6000 and the 4090 differ in performance, but I would not expect that difference alone to account for such a large slowdown.
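For reference, here is the arithmetic behind the ~11× figure (a quick sketch; the only inputs are the README numbers and the tqdm readout above):

```python
# Quick sketch: compare the README's reported throughput with the tqdm
# readout above. Plain arithmetic only; no assumptions beyond those numbers.

TOTAL_STEPS = 250_000

# Sanity check: per-GPU batch size x #GPUs matches between the two setups.
assert 18 * 2 == 9 * 4 == 36

# README: 4x RTX 4090, ~12 hours for 250k steps.
readme_sec_per_step = 12 * 3600 / TOTAL_STEPS            # ~0.173 s/step

# tqdm: 23:29 elapsed < 129:24:34 remaining -> ~129.8 h total on 2x A6000.
tqdm_total_sec = (23 * 60 + 29) + (129 * 3600 + 24 * 60 + 34)
observed_sec_per_step = tqdm_total_sec / TOTAL_STEPS     # ~1.87 s/step

print(f"expected: {readme_sec_per_step:.3f} s/step")
print(f"observed: {observed_sec_per_step:.3f} s/step")
print(f"slowdown: {observed_sec_per_step / readme_sec_per_step:.1f}x")  # ~10.8x
```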
- 1.b. I noticed that the train_data_path in the config differs from the dns_training dataset you provided (it matches the one for ESC-large instead). Did you use a different dataset for ESC-base-no-adv? (The quick check I ran is sketched below.)
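For completeness, this is roughly how I compared the two paths (a minimal sketch; the config filenames and the flat `train_data_path` key are my assumptions, not taken from the repo):

```python
# Sketch: print train_data_path from each config for comparison.
# The filenames below are placeholders; adjust to the actual config files.
import yaml

for name in ("config_base_no_adv.yaml", "config_large.yaml"):
    with open(name) as f:
        cfg = yaml.safe_load(f)
    print(f"{name}: train_data_path = {cfg.get('train_data_path')}")
```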
- I expect to do some research based on your ESC repo and may have follow-up questions. It would be even better if you were willing to share personal contact information (e.g. WeChat). My email is isjiawei.du@gmail.com.
Thank you again for your work, and I look forward to your reply.