Use tiktoken.el for token counting of openai's models #14

@zkry

Description

Hello!

I noticed that one of the methods for the providers is llm-count-tokens, which currently uses a simple heuristic. I recently wrote a port of tiktoken that could provide exact counts for at least the OpenAI models. The implementation in llm-openai.el would essentially look like the following:

(require 'tiktoken)

(cl-defmethod llm-count-tokens ((provider llm-openai) text)
  ;; Look up the BPE encoding that matches the provider's chat model,
  ;; then count tokens exactly instead of estimating.
  (let ((enc (tiktoken-encoding-for-model (llm-openai-chat-model provider))))
    (tiktoken-count-tokens enc text)))

There would be some design questions, such as whether this should use the chat-model or the embedding-model. For example, it could first try to count with the embedding-model if one is set, otherwise fall back to the chat-model, with some sensible default when neither resolves to a known encoding.
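A minimal sketch of that fallback might look like the following. This assumes an `llm-openai-embedding-model` accessor alongside the chat-model one, and uses `cl100k_base` as an illustrative default encoding; the exact accessor and default are open questions, not settled API:

```elisp
(require 'tiktoken)

(cl-defmethod llm-count-tokens ((provider llm-openai) text)
  ;; Prefer the embedding model's encoding when one is configured,
  ;; otherwise use the chat model; fall back to a default encoding
  ;; if neither maps to a known model name.
  (let* ((model (or (llm-openai-embedding-model provider)
                    (llm-openai-chat-model provider)))
         (enc (if model
                  (tiktoken-encoding-for-model model)
                (tiktoken-get-encoding "cl100k_base"))))
    (tiktoken-count-tokens enc text)))
```

Whichever order is chosen, keeping the lookup in one place means the heuristic path could remain as a fallback for non-OpenAI providers.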

Definitely let me know your thoughts, and I can put up a PR for it along with any other required work.
