Feat/train on apple silicon #164
Conversation
…ry and validating train/validation splits
- Added functionality to remove the __MACOSX directory after extraction.
- Implemented checks to ensure the presence of 'train' and 'validation' directories in the extracted content, raising errors if not found.
…directory validation
- Renamed the function `list_dir` to `list_directories` for clarity.
- Updated the extraction logic to use the new function name.
- Improved validation checks for the 'train' and 'validation' directories by using `Path` objects for better consistency and readability.
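Based on the commit messages above, a minimal sketch of the extraction and split-validation flow could look like the following. The helper `extract_and_validate_dataset` is a hypothetical name (only `list_directories` is named in the commits), and the use of `zipfile` and `shutil` is an assumption rather than the actual focoos implementation.

```python
import shutil
import zipfile
from pathlib import Path


def list_directories(root: Path) -> list[Path]:
    """Return the immediate subdirectories of root (renamed from list_dir)."""
    return [path for path in root.iterdir() if path.is_dir()]


def extract_and_validate_dataset(archive_path: Path, output_dir: Path) -> tuple[Path, Path]:
    """Extract a dataset archive and verify that 'train' and 'validation' splits exist.

    Hypothetical helper sketching the behaviour described in the commits above.
    """
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(output_dir)

    # macOS archives often ship a __MACOSX metadata directory; drop it after extraction.
    macosx_dir = output_dir / "__MACOSX"
    if macosx_dir.is_dir():
        shutil.rmtree(macosx_dir)

    # Validate the expected splits using Path objects.
    train_dir = output_dir / "train"
    validation_dir = output_dir / "validation"
    if not train_dir.is_dir():
        raise FileNotFoundError(f"Missing 'train' directory in {output_dir}")
    if not validation_dir.is_dir():
        raise FileNotFoundError(f"Missing 'validation' directory in {output_dir}")
    return train_dir, validation_dir
```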
- Updated `get_gpus_count` to support MPS alongside CUDA.
- Modified the default value of `amp_enabled` in the training and validation functions to False.
- Improved model loading logic to handle MPS and CUDA availability with appropriate error handling.
…ailability check
- Changed the default value of `amp_enabled` to True in the training and validation functions.
- Added a check to ensure CUDA is available when using AMP, with a warning if it is not.
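As a rough illustration of the AMP guard described in this commit: when `amp_enabled` defaults to True but CUDA is absent, mixed precision is switched off with a warning. `resolve_amp_enabled` is a hypothetical helper; the real training code may inline this check.

```python
import logging

import torch

logger = logging.getLogger(__name__)


def resolve_amp_enabled(amp_enabled: bool = True) -> bool:
    """Disable AMP with a warning when CUDA is not available (hypothetical helper)."""
    if amp_enabled and not torch.cuda.is_available():
        logger.warning("AMP requested but CUDA is not available; disabling mixed precision.")
        return False
    return amp_enabled
```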
- Enhanced device selection to prioritize CUDA, then MPS, and fall back to CPU based on availability.
- Removed the redundant CUDA availability check from the training loop.
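A minimal sketch of the CUDA → MPS → CPU priority described above, assuming the standard PyTorch availability checks; `select_device` is a hypothetical helper name, not necessarily the function used in the PR.

```python
import torch


def select_device() -> torch.device:
    """Pick the training device: CUDA first, then MPS, otherwise CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")
```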
- Expanded the list of supported model families in focoos_hub.py to include IMAGE_CLASSIFIER for enhanced functionality.
- Introduced a new field `mps_available` in the `GPUInfo` class to indicate MPS support.
- Updated the `get_gpu_info` function to populate the `mps_available` field based on the availability of MPS.
- Bumped the focoos version to 0.22.0 in the lock file.
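The actual fields of `GPUInfo` are not shown in this PR, so the following is only an assumed shape illustrating where `mps_available` fits and how `get_gpu_info` could populate it via `torch.backends.mps.is_available()`.

```python
from dataclasses import dataclass

import torch


@dataclass
class GPUInfo:
    # Assumed shape; the real focoos GPUInfo likely carries additional fields.
    gpu_count: int = 0
    mps_available: bool = False  # new field indicating Apple MPS support


def get_gpu_info() -> GPUInfo:
    """Sketch of populating mps_available alongside the existing CUDA info."""
    return GPUInfo(
        gpu_count=torch.cuda.device_count() if torch.cuda.is_available() else 0,
        mps_available=torch.backends.mps.is_available(),
    )
```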
This PR is being reviewed by Cursor Bugbot
```python
try:
    self.model = self.model.to(device="mps")
except Exception:
    logger.warning("Unable to use MPS")
```
Bug: Model Transfer Inefficiency Between CUDA and MPS
The model initialization logic attempts to move the model to CUDA and then immediately to MPS when both backends are available. This sequential device placement causes an inefficient double transfer and implicitly prioritizes MPS, which may not be the intended or optimal behavior.
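One way to avoid the double transfer, sketched here under the assumption that CUDA should take priority over MPS, is to resolve a single target device up front and move the model exactly once; `place_model` is a hypothetical name, not the project's actual API.

```python
import logging

import torch
from torch import nn

logger = logging.getLogger(__name__)


def place_model(model: nn.Module) -> nn.Module:
    """Move the model to one target device instead of chaining CUDA -> MPS transfers."""
    if torch.cuda.is_available():
        return model.to(device="cuda")
    if torch.backends.mps.is_available():
        try:
            return model.to(device="mps")
        except Exception:
            logger.warning("Unable to use MPS; falling back to CPU.")
    return model.to(device="cpu")
```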