Evaluate models  #11

@betogaona7

Description

Run the evaluation and get final scores (500 prompts).

  • airoboros-13B-GPTQ
  • bluemoonrp-13b
  • gpt4-x-vicuna-13B-GPTQ
  • GPT4All-13B-snoozy-GPTQ
  • koala-13B-GPTQ-4bit-128g - Very poor performance
  • Llama-2-13B-GPTQ
  • Manticore-13B-GPTQ
  • Metharme-13b-4bit-GPTQ
  • Nous-Hermes-13B-GPTQ
  • stable-vicuna-13B-GPTQ - Very poor performance
  • vicuna-7B-GPTQ-4bit-128g - Corrupted model.
  • open_llama_3b_4bit_128g - Corrupted model.
  • guanaco-33B-GPTQ
  • h2ogpt-oasst1-512-30B-GPTQ
  • tulu-30B-GPTQ - May require more memory; can't process prompts sequentially at the moment
  • WizardLM-30B-Uncensored-GPTQ
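
A minimal sketch of how a run over the list above could be organized, so that a model that fails to load (such as the ones marked corrupted) is recorded and skipped instead of aborting the whole run. This is not the project's actual evaluation code: `evaluate_models`, `load_model`, and `score` are hypothetical names, and the loader and scoring function are assumed to be supplied by the caller.

```python
from statistics import mean
from typing import Callable, Dict, List, Optional


def evaluate_models(
    model_names: List[str],
    prompts: List[str],
    load_model: Callable[[str], Callable[[str], str]],  # hypothetical loader
    score: Callable[[str, str], float],                 # hypothetical scorer
) -> Dict[str, Optional[float]]:
    """Return the mean score per model, or None if the model could not be run."""
    results: Dict[str, Optional[float]] = {}
    for name in model_names:
        try:
            # load_model returns a function mapping a prompt to a response.
            generate = load_model(name)
            scores = [score(prompt, generate(prompt)) for prompt in prompts]
        except Exception as exc:
            # Corrupted weights, out-of-memory errors, etc. end up here.
            print(f"{name}: skipped ({exc})")
            results[name] = None
            continue
        results[name] = mean(scores) if scores else None
    return results
```

Models that come back as `None` can then be annotated in the list (for example as corrupted or out of memory) while the remaining scores are still collected.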

Metadata

Labels

enhancement (New feature or request)
