Use tiktoken.el for token counting of openai's models #14

@zkry

Description

Hello!

I noticed that one of the methods for the providers is llm-count-tokens, which currently uses a simple heuristic. I recently wrote a port of tiktoken that could provide exact counts for at least the OpenAI models. The implementation in llm-openai.el would essentially look like the following:

(require 'tiktoken)

(cl-defmethod llm-count-tokens ((provider llm-openai) text)
  ;; Look up the BPE encoding that matches the provider's chat model,
  ;; then count tokens exactly instead of estimating.
  (let ((enc (tiktoken-encoding-for-model (llm-openai-chat-model provider))))
    (tiktoken-count-tokens enc text)))

There would be some design questions, such as whether this should use the chat-model or the embedding-model. For example, it could first try to count with the embedding-model if one is set, otherwise fall back to the chat-model, with some sensible default when neither resolves to a known encoding.
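A minimal sketch of that fallback might look like the following. This assumes an `llm-openai-embedding-model` accessor alongside the chat-model one, and uses `cl100k_base` as an illustrative default encoding; the exact accessor and default are open questions, not settled API:

```elisp
(require 'tiktoken)

(cl-defmethod llm-count-tokens ((provider llm-openai) text)
  ;; Prefer the embedding model's encoding when one is configured,
  ;; otherwise use the chat model; fall back to a default encoding
  ;; if neither maps to a known model name.
  (let* ((model (or (llm-openai-embedding-model provider)
                    (llm-openai-chat-model provider)))
         (enc (if model
                  (tiktoken-encoding-for-model model)
                (tiktoken-get-encoding "cl100k_base"))))
    (tiktoken-count-tokens enc text)))
```

Whichever order is chosen, keeping the lookup in one place means the heuristic path could remain as a fallback for non-OpenAI providers.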

Definitely let me know your thoughts, and I can put up a PR for it along with any other required work.
