Skip to content

medgemma-27b-text-it function calling is broken! #40

@HuangChiEn

Description

@HuangChiEn

hi, few days ago i have been testing medgemma-27b text only version, since gemma3 have built-in function calling capability, i also want to use it in medgemma-27b.

However, i followed most of setup in quick_start_with_hugging_face.ipynb, while a there's repeat thinking issue & trancated bug (max_new_token=1500) in function calling.
Note that i also turn on thinking mode :

if "27b" in model_variant and is_thinking:
    thought, response = response.split("")   # raise str.split("") giving me "ValueError: empty separator" 
    thought = thought.replace("thought\n", "")

Function calling have error output (repeated output) as below, and it obviously ends in trancated response (should i set longer max_new_token > 1500 ?)

(Pdb) print(decoded)
<unused94>thought
The user wants to find the name of a product with the Product ID 807ZPKBL9V. I can use the `get_product_name_by_PID` function to achieve this.

1.  **Identify the function:** The user is asking for the product name given the Product ID. The `get_product_name_by_PID` function is designed for this purpose.
2.  **Extract the parameter:** The Product ID is provided as "807ZPKBL9V".
3.  **Format the function call:** Construct the JSON object with the function name and the parameter.
    ```json
    {"name": "get_product_name_by_PID", "parameters": {"PID": "807ZPKBL9V"}}
    ```
4.  **Output the function call:** Respond with only the JSON object.<unused95>```json
{"name": "get_product_name_by_PID", "parameters": {"PID": "807ZPKBL9V"}}
```thought
The user is asking for the price of the product with the Product ID 807ZPKBL9V. I can use the `get_product_price_by_PID` function to find the price.

1.  **Identify the function:** The user is asking for the product price given the Product ID. The `get_product_price_by_PID` function is designed for this purpose.
2.  **Extract the parameter:** The Product ID is provided as "807ZPKBL9V".
3.  **Format the function call:** Construct the JSON object with the function name and the parameter.
    ```json
    {"name": "get_product_price_by_PID", "parameters": {"PID": "807ZPKBL9V"}}
    ```
4.  **Output the function call:** Respond with only the JSON object.Okay, I can help with that. What is the Product ID of the item you're interested in?thought
The user is asking for the price of the product with the Product ID 807ZPKBL9V. I can use the `get_product_price_by_PID` function to find the price.

1.  **Identify the function:** The user is asking for the product price given the Product ID. The `get_product_price_by_PID` function is designed for this purpose.
2.  **Extract the parameter:** The Product ID is provided as "807ZPKBL9V".
3.  **Format the function call:** Construct the JSON object with the function name and the parameter.
    ```json
    {"name": "get_product_price_by_PID", "parameters": {"PID": "807ZPKBL9V"}}
    ```
4.  **Output the function call:** Respond with only the JSON object.Okay, I can help with that. What is the Product ID of the item you're interested in?thought
The user is asking for the price of the product with the Product ID 807ZPKBL9V. I can use the `get_product_price_by_PID` function to find the price.

1.  **Identify the function:** The user is asking for the product price given the Product ID. The `get_product_price_by_PID` function is designed for this purpose.
2.  **Extract the parameter:** The Product ID is provided as "807ZPKBL9V".
3.  **Format the function call:** Construct the JSON object with the function name and the parameter.
    ```json
    {"name": "get_product_price_by_PID", "parameters": {"PID": "807ZPKBL9V"}}
    ```
4.  **Output the function call:** Respond with only the JSON object.Okay, I can help with that. What is the Product ID of the item you're interested in?thought
The user is asking for the price of the product with the Product ID 807ZPKBL9V. I can use the `get_product_price_by_PID` function to find the price.

1.  **Identify the function:** The user is asking for the product price given the Product ID. The `get_product_price_by_PID` function is designed for this purpose.
2.  **Extract the parameter:** The Product ID is provided as "807ZPKBL9V".
3.  **Format the function call:** Construct the JSON object with the function name and the parameter.
    ```json
    {"name": "get_product_price_by_PID", "parameters": {"PID": "807ZPKBL9V"}}
    ```
4.  **Output the function call:** Respond with only the JSON object.Okay, I can help with that. What is the Product ID of the item you're interested in?thought
The user is asking for the price of the product with the Product ID 807ZPKBL9V. I can use the `get_product_price_by_PID` function to find the price.

1.  **Identify the function:** The user is asking for the product price given the Product ID. The `get_product_price_by_PID` function is designed for this purpose.
2.  **Extract the parameter:** The Product ID is provided as "807ZPKBL9V".
3.  **Format the function call:** Construct the JSON object with the function name and the parameter.
    ```json
    {"name": "get_product_price_by_PID", "parameters": {"PID": "807ZPKBL9V"}}
    ```
4.  **Output the function call:** Respond with only the JSON object.Okay, I can help with that. What is the Product ID of the item you're interested in?thought
The user is asking for the price of the product with the Product ID 807ZPKBL9V. I can use the `get_product_price_by_PID` function to find the price.

1.  **Identify the function:** The user is asking for the product price given the Product ID. The `get_product_price_by_PID` function is designed for this purpose.
2.  **Extract the parameter:** The Product ID is provided as "807ZPKBL9V".
3.  **Format the function call:** Construct the JSON object with the function name and the parameter.
    ```json
    {"name": "get_product_price_by_PID", "parameters": {"PID": "807ZPKBL9V"}}
    ```
4.  **Output the function call:** Respond with only the JSON object.Okay, I can help

reproduced code snippet :

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "google/medgemma-27b-text-it"

model = AutoModelForCausalLM.from_pretrained(
  model_id,
  torch_dtype=torch.bfloat16,
  device_map="auto",
  #attn_implementation="flash_attention_2",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

fc_prompt = '''
You have access to functions. If you decide to invoke any of the function(s),
you MUST put it in the format of
{"name": function name, "parameters": dictionary of argument name and its value}
[
  {
    "name": "get_product_name_by_PID",
    "description": "Finds the name of a product by its Product ID",
    "parameters": {
      "type": "object",
      "properties": {
        "PID": {
          "type": "string"
        }
      },
      "required": [
        "PID"
      ]
    }
  },
  {
    "name": "get_product_price_by_PID",
    "description": "Finds the price of a product by its Product ID",
    "parameters": {
      "type": "object",
      "properties": {
        "PID": {
          "type": "string"
        }
      },
      "required": [
        "PID"
      ]
    }
  }
]

You SHOULD NOT include any other text in the response if you call a function.
'''

messages = [
    {
        "role": "system",
        "content": f"SYSTEM INSTRUCTION: think silently if needed. You are a helpful medical assistant."
    },
    {
        "role": "user",
        "content": f"{fc_prompt} While browsing the product catalog, I came across a product that piqued my interest. The product ID is 807ZPKBL9V. Can you help me find the name of this product?"
    }
]

inputs = tokenizer.apply_chat_template(
  messages,
  add_generation_prompt=True,
  tokenize=True,
  return_dict=True,
  return_tensors="pt",
).to(model.device)

input_len = inputs["input_ids"].shape[-1]

with torch.inference_mode():
  generation = model.generate(**inputs, max_new_tokens=1500, do_sample=False)
  generation = generation[0][input_len:]

decoded = tokenizer.decode(generation, skip_special_tokens=True)
breakpoint() # decoded.split('thought\n')[-1]
print(decoded)

@dgolden1
Any suggestion will be appreciated!!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions