How to reduce training loss with a limited dataset? #55

@Dr-xinyu

Description

I am currently training a dual-arm gripper model (two Robotiq 2F-85 grippers) on a relatively small custom dataset: over 400 objects, with 2,000 generated grasp poses per object. I modified PickDataset into a DualArmDataset and adjusted the training shell script from the tutorial to run on two GPUs. The only change to the network itself was the input data dimensions; the rest of the architecture is unchanged.
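Roughly, my dataset modification looks like the sketch below (class and field names here are illustrative assumptions, not the actual repo API):

```python
import numpy as np

class DualArmDataset:
    """Illustrative sketch only: the real PickDataset subclass differs;
    all names and shapes here are assumptions for discussion."""

    def __init__(self, objects, left_poses, right_poses):
        # objects: list of N object identifiers
        # left_poses / right_poses: (N, 7) arrays, e.g. xyz + quaternion
        # per Robotiq 2F-85 gripper
        assert len(objects) == len(left_poses) == len(right_poses)
        self.objects = objects
        self.left_poses = np.asarray(left_poses, dtype=np.float32)
        self.right_poses = np.asarray(right_poses, dtype=np.float32)

    def __len__(self):
        return len(self.objects)

    def __getitem__(self, idx):
        # Concatenating both arms' poses doubles the input dimension
        # (7 -> 14) while leaving the rest of the pipeline unchanged.
        pose = np.concatenate([self.left_poses[idx], self.right_poses[idx]])
        return self.objects[idx], pose
```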
However, I am encountering convergence issues:

Generator loss: stagnates around 2.5 after approximately 1k epochs and does not decrease further, even up to 4k epochs (loss-curve screenshot attached).
Discriminator loss: the total loss across both GPUs stabilizes around 1.3 after 1k epochs and remains flat (loss-curve screenshot attached).
Output quality: the generated grasp poses are completely unusable.

Aside from expanding the dataset (which is time-consuming), are there any other strategies or specific hyperparameters I should adjust to improve convergence?
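To make the question concrete, this is the kind of adjustment I am asking about, e.g. one-sided label smoothing on the discriminator, which is a generic GAN stabilizer for small datasets and not something taken from this repo:

```python
import numpy as np

def d_loss_with_smoothing(real_logits, fake_logits, smooth=0.9):
    """One-sided label smoothing (generic GAN trick, not repo-specific):
    real targets become `smooth` (e.g. 0.9) instead of 1.0, which keeps the
    discriminator from saturating on a small dataset; fake targets stay 0."""
    def bce(logits, target):
        p = 1.0 / (1.0 + np.exp(-logits))  # sigmoid
        return -np.mean(target * np.log(p + 1e-12)
                        + (1.0 - target) * np.log(1.0 - p + 1e-12))
    return bce(real_logits, smooth) + bce(fake_logits, 0.0)
```

With `smooth=1.0` this reduces to the standard binary cross-entropy discriminator loss; `smooth<1.0` penalizes over-confident real predictions.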
My generator and discriminator training shell scripts are attached as screenshots. I would greatly appreciate your help!
