Skip to content

Conversation

@bniladridas
Copy link
Member

Summary

Remove the PyTorch dependency and fix an unsafe serialization mechanism in main.py by replacing torch.save() (pickle-based) with a safe NumPy-based approach.

Why

  • PyTorch was only used for torch.save(), which pickles the entire CMA-ES object.
  • Pickle-based serialization is unsafe due to arbitrary code execution risk.
  • This introduced an unnecessary heavyweight dependency and a security concern.

What changed

  • Removed PyTorch as a dependency.
  • Replaced torch.save() with np.save() to persist only the best weights array.
  • Updated loading logic to reconstruct the agent from saved weights.
  • Serialization is now safe, minimal, and deterministic.

Behavior

Before

  • Required PyTorch solely for serialization.
  • Serialized the full CMA-ES object via pickle (unsafe).

After

  • No PyTorch dependency.
  • Only NumPy weights are saved and loaded.
  • Model loading, inference, and evaluation work unchanged.

Backward Compatibility

  • No user-facing API changes.
  • from_pretrained, get_action, and evaluate behave exactly as documented.
  • The README.md remains accurate.

Testing

Manually verified loading and evaluation:

source venv/bin/activate && python -c "
from harpertoken.models.model import CMAESAgent
agent = CMAESAgent._from_pretrained(
    model_id='.',
    revision=None,
    cache_dir=None,
    force_download=False,
    proxies=None,
    resume_download=None,
    local_files_only=True,
    token=None
)
print('Model loaded successfully')
print('Weights shape:', agent.weights.shape)
mean_reward, std_reward = agent.evaluate(num_episodes=5)
print('Evaluation: Mean reward {:.2f} ± {:.2f}'.format(mean_reward, std_reward))
"

Output:

  • Model loaded successfully
  • Weights shape: (8,)
  • Evaluation runs correctly

Risk

Low. Changes are internal, scoped to serialization logic, and covered by manual validation.

@gemini-code-assist
Copy link

Summary of Changes

Hello @bniladridas, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the security and reduces the dependency load of the project by replacing an unsafe pickle-based serialization method with a more secure and explicit NumPy and JSON-based approach. The change removes PyTorch as a dependency and refactors model saving and loading across the codebase, while maintaining backward compatibility for existing models. The core functionality of model training, inference, and evaluation remains unchanged, ensuring a low-risk transition.

Highlights

  • Security Vulnerability Fix: The primary goal of this PR is to address an unsafe serialization mechanism that relied on Python's pickle module, which carries an arbitrary code execution risk. This has been resolved by switching to a safer, NumPy-based approach.
  • PyTorch Dependency Removal: The project no longer requires PyTorch, as its sole use was for the torch.save() function, which has now been replaced. This reduces the overall dependency footprint.
  • Serialization Mechanism Overhaul: The serialization process has been refactored to save model weights as weights.npy (using NumPy) and model metadata as metadata.json. This makes the serialization explicit, minimal, and deterministic.
  • Backward Compatibility: The model loading logic (_from_pretrained) has been updated to support both the new weights.npy/metadata.json format and the older model.npy format, ensuring existing models can still be loaded.
  • Codebase Refinement: Several files have been updated to reflect the new serialization methods, including push_to_hub.py, model.py, main.py, train_model.py, and related test and inspection scripts. This includes adjustments to how model parameters are handled during training and evaluation.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request effectively addresses a critical security vulnerability by replacing unsafe pickle-based serialization with a secure combination of NumPy arrays for weights and JSON for metadata. The removal of the PyTorch dependency is also a great simplification. The changes are well-implemented across the training scripts, model definition, and utility scripts. However, I've found a critical issue where backward compatibility for loading old models re-introduces the same pickle deserialization vulnerability, undermining the primary goal of this PR. My review includes a comment with a suggestion to fully eliminate this risk.

Comment on lines 116 to 140
# old format
old_path = os.path.join(model_id, "model.npy")
loaded = np.load(old_path, allow_pickle=True)
if isinstance(loaded, np.ndarray):
model.weights = loaded
model.env_name = "CartPole-v1"
model.env = gym.make(model.env_name)
model.observation_space = model.env.observation_space.shape[0]
model.action_type = "discrete"
model.num_actions = model.env.action_space.n
model.action_space = model.num_actions
else:
model_data = loaded.item()
model.weights = model_data["weights"]
model.observation_space = model_data["observation_space"]
model.action_type = model_data["action_type"]
model.env_name = model_data.get("env_name", "CartPole-v1")
model.env = gym.make(model.env_name)
if model.action_type == "discrete":
model.num_actions = model_data["num_actions"]
model.action_space = model.num_actions
elif model.action_type == "continuous":
model.action_dim = model_data["action_dim"]
model.action_bounds = model_data["action_bounds"]
model.action_space = model.action_dim

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-critical critical

This backward compatibility block re-introduces the critical pickle deserialization vulnerability that this pull request aims to fix. Using np.load with allow_pickle=True on a user-provided local path (model_id) allows for arbitrary code execution if a malicious model file is loaded. This undermines the security improvements made elsewhere.

To ensure the library is secure, I strongly recommend removing support for the old, insecure model format. Instead, you should raise an error to inform the user that the model format is outdated and unsafe to load.

                raise ValueError(
                    "The model at the specified path is in an old, insecure format. "
                    "Loading of pickle-based models is disabled for security reasons. "
                    "Please convert the model to the new format (separate 'weights.npy' and 'metadata.json' files)."
                )

@bniladridas bniladridas changed the title Fix pickle deserialization security vulnerability fix: remove unsafe pickle serialization Jan 7, 2026
- Replace unsafe pickle-based serialization with safe NumPy arrays and JSON metadata
- Update model loading/saving to avoid arbitrary code execution from untrusted sources
- Remove unnecessary PyTorch dependency from training script
- Fix bugs in main.py training implementation for proper linear policy
- Update all related files: models, training, hub, tests, scripts
@bniladridas bniladridas merged commit e4de953 into main Jan 7, 2026
5 checks passed
@testapp-bot testapp-bot moved this from Todo to Done in Project for rl Jan 7, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants