Conghoan branch #1
base: main
Conversation
Why is this being deleted entirely?
The file is too heavy to commit, so I shifted it to LFS. But I didn't change this file, so just reject this change.
    outputs/train/aloha_sim_transfer_cube_human/model_10000.safetensors filter=lfs diff=lfs merge=lfs -text
    resnet18-f37072fd.safetensors filter=lfs diff=lfs merge=lfs -text
    outputs/train/act_aloha_sim_transfer_cube_human/model_10000.safetensors filter=lfs diff=lfs merge=lfs -text
    *.ipynb filter=lfs diff=lfs merge=lfs -text
Why is this being added? Are you saying you want to keep .ipynb files in LFS?
This is good - agreed
Although we need to work on the naming: no spaces in file names. I'd create a directory called tests/ and put test.py under there.
Are you deleting this file entirely, or are you shifting it to LFS? If the latter, where's the LFS upload?
The file is too heavy to commit, so I shifted it to LFS. But I didn't change this file, so just reject this change.
    import tinygrad
    from tinygrad import Tensor, nn, dtypes
    from tinygrad.ops import Variable
    # from tinygrad.ops import Variable
No dangling comments. Why is this commented out? What shifted?
It has been shifted to `from tinygrad import Variable`.
Where's that import? Is it needed now?
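(For reference, the replacement import in a recent tinygrad looks like the sketch below; the symbolic-dimension usage is illustrative, not taken from this PR.)

```python
from tinygrad import Variable

# Variable(name, min, max) declares a symbolic dimension whose concrete value
# is bound at run time, e.g. for variable-length action chunks.
seq_len = Variable("seq_len", 1, 512)
```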
    mean_val = stats[key]["mean"]
    std_val = stats[key]["std"]
    # Convert to numpy if needed (handle both numpy arrays and torch tensors)
    if hasattr(mean_val, 'numpy'):
        mean_val = mean_val.numpy()
    if hasattr(std_val, 'numpy'):
        std_val = std_val.numpy()
    buffer["mean"].assign(mean_val)
This is okay, but it's just verbose. Is there any way to make this into a one- or two-liner?
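(One possible condensation, as a sketch: the `to_np` helper is made up, while `stats`, `key`, and `buffer` are the surrounding code's names.)

```python
def to_np(x):
    # torch and tinygrad tensors expose .numpy(); plain numpy arrays pass through
    return x.numpy() if hasattr(x, "numpy") else x

buffer["mean"].assign(to_np(stats[key]["mean"]))
buffer["std"].assign(to_np(stats[key]["std"]))
```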
    min_val = stats[key]["min"]
    max_val = stats[key]["max"]
    # Convert to numpy if needed (handle both numpy arrays and torch tensors)
    if hasattr(min_val, 'numpy'):
        min_val = min_val.numpy()
    if hasattr(max_val, 'numpy'):
        max_val = max_val.numpy()
    buffer["min"].assign(min_val)
(Read above)
Why is this deleted?
I backed up your result and trained it again to compare results. You can reject the change.
What changed?
I backed up your result and trained it again to compare results. You can reject the change.
    import pathlib
    resnet18_IMAGENET1K = ResNet(Block, [2, 2, 2, 2], num_classes=1000)
    state_dict = nn.state.safe_load("resnet18-f37072fd.safetensors")
    model_path = pathlib.Path(__file__).parent / "resnet18-f37072fd.safetensors"
good
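(For context, the change resolves the weights file relative to the module instead of the current working directory; presumably the load call becomes something like the sketch below — the follow-up line is not shown in the diff.)

```python
import pathlib
from tinygrad.nn.state import safe_load

# Resolve the checkpoint next to this file rather than relying on the CWD.
model_path = pathlib.Path(__file__).parent / "resnet18-f37072fd.safetensors"
state_dict = safe_load(str(model_path))  # assumed follow-up, not shown in the diff
```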
| # parser.add_argument("--env_name", type=str, choices=['AlohaTransferCube-v0', 'AlohaInsertion-v0'], default='AlohaTransferCube-v0') | ||
| parser.add_argument("--env_name", type=str, choices=['AlohaTransferCube-v0', 'AlohaInsertion-v0'], default='AlohaInsertion-v0') | ||
| # parser.add_argument("--model_path", type=str, default='outputs/train/aloha_sim_transfer_cube_human/model_final.safetensors') | ||
| # parser.add_argument("--model_path", type=str, default='outputs/train/aloha_sim_transfer_cube_human/model_final.safetensors') | ||
| parser.add_argument("--model_path", type=str, default='outputs/train/aloha_sim_insertion_human/model_30000_original.safetensors') |
No dangling comments. Also, why would we peg the default model to the 30,000-step checkpoint instead of the final one?
In the original version there was only the 30,000-step model, so I just ran train.py again to get the models and compared the results. Both the 30,000-step and final models are not good now. You can reject the change and help me compare the results.
    fps = env.metadata["render_fps"]

    # Encode all frames into a mp4 video.
    video_path = output_directory / "rollout.mp4"
why rollout3 vs rollout?
It is just for testing, to compare against your earlier result.
    # from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

    from lerobot.datasets.lerobot_dataset import LeRobotDataset
Didn't know it shifted.
Again, no dangling comments
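(If the script has to tolerate both lerobot layouts, a guarded import is one option; a sketch, since exactly which release moved the module is an assumption — check the installed lerobot.)

```python
try:
    # newer lerobot layout, as used in this PR
    from lerobot.datasets.lerobot_dataset import LeRobotDataset
except ImportError:
    # older layout, matching the commented-out line above
    from lerobot.common.datasets.lerobot_dataset import LeRobotDataset
```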
    # Start of training code
    parser = argparse.ArgumentParser(description="Argument Parser for ACT training on simulated environments")
    parser.add_argument("env_name", type=str, choices=['aloha_sim_transfer_cube_human', 'aloha_sim_insertion_human'], default='aloha_sim_insertion_human')
    parser.add_argument("--env_name", type=str, choices=['aloha_sim_transfer_cube_human', 'aloha_sim_insertion_human'], default='aloha_sim_insertion_human')
Yep solid
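(Worth noting why the change matters: for a positional argument, argparse ignores `default` unless `nargs='?'` is given, so the old line made env_name effectively required. A sketch of the difference, with the script name train.py assumed:)

```python
# Old (positional): the default never applies; a bare run exits with
# "the following arguments are required: env_name"
#   $ python train.py
# New (flag with a default): both of these work
#   $ python train.py
#   $ python train.py --env_name aloha_sim_transfer_cube_human
```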
    ########################################################################
    # Handle unused parameters by assigning zero gradients
    optimizers_list = [opt]
    if cfg.train_backbone_separately:
        optimizers_list.append(opt_backbone)
    for optimizer in optimizers_list:
        for param in optimizer.params:
            if param.grad is None:
                # Create zero gradient with same shape and device as parameter
                param.grad = Tensor.zeros(*param.shape, device=param.device, requires_grad=False)
can you explain this a bit?
Some parameters didn't have gradients, so when I ran it I got the error shown in the attachment. I searched, and it said the issue is that opt.step() is being called without first ensuring all parameters have gradients. The problem is that not all parameters are used in the forward pass, so they don't get gradients. So I changed the code as above.
got it - okay!
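(For future readers, a minimal repro of that failure mode, assuming recent tinygrad semantics; the tensor names are made up.)

```python
from tinygrad import Tensor
from tinygrad.nn.optim import SGD

used = Tensor.ones(2, requires_grad=True)
unused = Tensor.ones(2, requires_grad=True)  # never touched in the forward pass

opt = SGD([used, unused], lr=0.1)
with Tensor.train():
    loss = (used * 2).sum()
    opt.zero_grad()
    loss.backward()
    # used.grad is populated, but unused.grad is still None, so opt.step()
    # errors out unless the unused parameter first gets a zero gradient,
    # which is exactly what the diff above does.
    unused.grad = Tensor.zeros(*unused.shape, requires_grad=False)
    opt.step()
```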
    batch_converted = {}
    for k, v in batch.items():
        if isinstance(v, torch.Tensor):
            batch_converted[k] = Tensor(v.detach().cpu().numpy(), requires_grad=False)
        else:
            batch_converted[k] = v  # Keep strings, lists, etc. as-is

    batch = batch_converted
So this makes sense in theory. But why are you mixing Torch and tinygrad tensors? Where's it coming from?
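(For what it's worth, the torch tensors most likely come from the PyTorch DataLoader wrapped around LeRobotDataset; below is a self-contained sketch of that boundary with a dummy dataset standing in — the class name and fields are illustrative.)

```python
import torch
from torch.utils.data import DataLoader, Dataset
from tinygrad import Tensor

class DummyEpisodes(Dataset):
    """Stand-in for LeRobotDataset: items mix torch tensors and plain metadata."""
    def __len__(self):
        return 4
    def __getitem__(self, i):
        return {"observation.state": torch.randn(14), "task": "transfer_cube"}

loader = DataLoader(DummyEpisodes(), batch_size=2)
for batch in loader:
    # Bridge into tinygrad at the batch boundary, as the diff above does,
    # written as a dict comprehension instead of an explicit loop:
    batch = {k: Tensor(v.detach().cpu().numpy(), requires_grad=False)
                if isinstance(v, torch.Tensor) else v
             for k, v in batch.items()}
    print({k: type(v).__name__ for k, v in batch.items()})
```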
cool - @CongHoanCoder, just clean up the pull request (e.g. no commented-out imports / code, that's sloppy; also no randomly deleted files). I can check again and validate on my side when you're ready.

I made a few changes to your code. Please help me review it. Thanks.