Conversation

@chapman20j (Collaborator)
Resolves #100
Adds the Gemma3 model to the Bonsai repo. This first commit is a working version; I am still optimizing it.

Reference
Refer to Issue #100

Checklist

  • I have read the Contribution Guidelines and used pre-commit hooks to format this commit.
  • I have added all the necessary unit tests for my change. (run_model.py for model usage, test_outputs.py and/or model_validation_colab.ipynb for quality).
  • (If using an LLM) I have carefully reviewed and removed all superfluous comments or unneeded, commented-out code. Only necessary and functional code remains.
  • I have signed the Contributor License Agreement (CLA).

  • Updated configs
  • Moved embed_tokens to a more natural place
  • Updated run_model to use the sampler and stop at the end_of_turn token
  • Added test_sharding_gemma3
  • Added a batched forward test; more complex behavior and testing are still needed
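For context on the run_model change: stopping at an end_of_turn token usually just means breaking out of the decode loop when the sampler emits that id. A toy sketch of the idea (the token id and the stand-in sampler below are placeholders for illustration, not the repo's actual API):

```python
# Toy sketch of early-stopping generation at an end-of-turn token.
# END_OF_TURN_ID and toy_next_token are placeholders; the real code
# would use the tokenizer's id and the Gemma3 sampler.
END_OF_TURN_ID = 4  # placeholder; the real id comes from the tokenizer


def toy_next_token(tokens):
    # Stand-in for one sampler step: emits 1, 2, 3, then end_of_turn.
    script = [1, 2, 3, END_OF_TURN_ID]
    return script[min(len(tokens), len(script) - 1)]


def generate(max_new_tokens=16):
    out = []
    for _ in range(max_new_tokens):
        nxt = toy_next_token(out)
        if nxt == END_OF_TURN_ID:
            break  # stop decoding at the end-of-turn marker
        out.append(nxt)
    return out
```

The same check also caps generation at max_new_tokens, so decoding terminates even if the end-of-turn id never appears.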
@jenriver (Member)

Also, please make sure your selective tests are passing.

Comment on lines 32 to 60
def make_input(processor, dtype=torch.float32, msg1=True):
    if msg1:
        messages = [
            {"role": "system", "content": [{"type": "text", "text": "You are a helpful assistant."}]},
            {
                "role": "user",
                "content": [
                    {
                        "type": "image",
                        "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg",
                    },
                    {"type": "text", "text": "What is shown in this image?"},
                ],
            },
        ]
    else:
        messages = [
            {"role": "system", "content": [{"type": "text", "text": "You are a helpful assistant."}]},
            {
                "role": "user",
                "content": [
                    {
                        "type": "image",
                        "image": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg",
                    },
                    {"type": "text", "text": "Describe this image in detail."},
                ],
            },
        ]
Member

This is very lengthy and difficult to read. How about having something like this?

def make_input(processor, dtype=torch.float32, msg1=True):
    url = "pipeline-cat-chonk.jpeg" if msg1 else "bee.jpg"
    prompt = "What is shown in this image?" if msg1 else "Describe this image in detail."
    img_key = "url" if msg1 else "image"
    
    messages = [
        {"role": "system", "content": [{"type": "text", "text": "You are a helpful assistant."}]},
        {"role": "user", "content": [
            {"type": "image", img_key: f"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/{url}"},
            {"type": "text", "text": prompt}
        ]}
    ]
    # Add your return statement or processor logic here
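As a quick sanity check of the refactor, the two branches can be exercised without a real processor. The sketch below restates the suggested message builder standalone, with the processor call left out (the helper name and the omitted return logic are illustrative assumptions, not the eventual test code):

```python
# Standalone restatement of the suggested message builder, with the
# processor call stubbed out so both message variants can be inspected.
def build_messages(msg1=True):
    url = "pipeline-cat-chonk.jpeg" if msg1 else "bee.jpg"
    prompt = "What is shown in this image?" if msg1 else "Describe this image in detail."
    img_key = "url" if msg1 else "image"

    return [
        {"role": "system", "content": [{"type": "text", "text": "You are a helpful assistant."}]},
        {"role": "user", "content": [
            {"type": "image", img_key: f"https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/{url}"},
            {"type": "text", "text": prompt},
        ]},
    ]


cat_msgs = build_messages(msg1=True)   # image under the "url" key
bee_msgs = build_messages(msg1=False)  # image under the "image" key
```

Note the two variants deliberately differ in the image key ("url" vs "image"), matching the original code, so both input formats stay covered.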

@chapman20j (Collaborator, Author)

I implemented your suggested change. After ruff formatting, the final version ends up longer than your snippet, but it is still shorter than before.


Development

Successfully merging this pull request may close these issues: Gemma3.

2 participants