Releases: Nexesenex/croco.cpp
Croco.Cpp_v1.99003_b6387-75_IKLpr642_RMv1.17.291m
WIP - I'm trying to update Croco.Cpp (the version supporting the second, third, and trellis gen of IQ_K quants).
A lot of LCPP commits were lost in the process, but well, I can't do better than that in a reasonable amount of time.
I had to ditch a lot of Cuda, KV cache, and graph updates, as well as GLM 4.5, GPT-OSS, and Nemotron v2, to move forward. Maybe I'll get GPT-OSS back from IK_Llama. Maybe.
For a more reliable and up-to-date fork of KoboldCpp, very close to Esobold but with some of Croco's perks, use EsoCroK (available in the GitHub Actions), though you lose the aforementioned IK quants.
Full Changelog: v1.97060_b6110_IKLpr642_RMv1.14.9m...CCPP_v1.99003_b6387-75_IKLpr642_RMv1.17.291m
EsoCroK v1.99420_b6636-6_Q6-IQ23456K_RMv1.17.99m
EsoCroK v1.99410_b6609-6_Q6-IQ23456K_RMv1.17.99m
Nothing special here, just an updated version of the previous release.
Cuda 12.9 (Ampere tested, also compiled for Maxwell, Pascal and Turing).
JG's recent work on Cuda FA is not included, to retain compatibility with Q6_0 and the IQ_K quants, until I eventually find the missing bit of code to make everything work as it should.
Linux : https://github.com/Nexesenex/croco.cpp/actions/runs/18084083530/artifacts/4128237663
EsoCroK v1.98035_b6178_RMb1.15.91m
Beyond Concedo's work and the new LCPP commits up to b6178:
- A fix for the SWA algo I borked, I don't know how.
- More FA KV cache submodes formally activated in the Cuda FA files.
- A little clean-up of my added code (in koboldcpp.py and the Makefile).
- PDF features activated because they can be compiled in the workflows (not on my machine).
Download directly from the workflows: https://github.com/Nexesenex/croco.cpp/actions
The Windows build should work.
The Linux ones are untested; I don't run that system.
EsoCroK v1.98020_b6150_RMb1.15.91m
Nothing new except Concedo's and Jaxxks's latest work, and LCPP b6150.
Builds:
Without extra KV cache:
Windows (old PC): https://github.com/Nexesenex/croco.cpp/actions/runs/16952122435/artifacts/3760814737
Linux (old PC): https://github.com/Nexesenex/croco.cpp/actions/runs/16952128325/artifacts/3760665227
As well as the Linux build below.
With extra KV cache (below):
CPU build (includes Vulkan)
Cuda build.
EsoCroK v1.98000_b6123_RMb1.15.9m
Nothing new compared to the last release, beyond Concedo's additional work.
EsoCroK v1.97300_b6123_RMb1.15.9m
Added:
- Support for the IQ_K quants IQ2_K, IQ3_K, IQ4_K, IQ5_K, and IQ6_K on CPU and CUDA, including the Cuda MMQ kernels (a quantization sketch follows below).
The CPU release also contains the CLBlast and Vulkan backends, which are not updated with Q6_0 and the IQ_K quants.
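For reference, here's a minimal sketch of what producing such a quant boils down to through the llama.cpp C API, assuming the fork exposes an ftype named LLAMA_FTYPE_MOSTLY_IQ4_K (that enum name is my assumption, it is not in mainline llama.cpp); in practice you'd just run the bundled quantize tool.

```cpp
// Minimal sketch, assuming the fork defines LLAMA_FTYPE_MOSTLY_IQ4_K
// (not present in mainline llama.cpp). The two API calls below exist in
// llama.h; the bundled quantize tool does the same thing under the hood.
#include "llama.h"
#include <cstdio>

int main() {
    llama_model_quantize_params qparams = llama_model_quantize_default_params();
    qparams.ftype   = LLAMA_FTYPE_MOSTLY_IQ4_K; // fork-specific quant target (assumption)
    qparams.nthread = 8;                        // quantization threads

    // Returns 0 on success, like the command-line quantize tool.
    if (llama_model_quantize("model-f16.gguf", "model-iq4_k.gguf", &qparams) != 0) {
        fprintf(stderr, "quantization failed\n");
        return 1;
    }
    return 0;
}
```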
EsoCroK v1.97200_b6119_RMv1.14.9m
Initial release of EsoCroK.
A rebase of Croco onto the latest Esobold version (up to 20250807), with the basics of Croco:
- IQ4_NL activated for KV Cache.
- IK's Q6_0 integrated, and adapted as best I could for the CUDA backend (I think it works, lol).
- 20 or so KV modes, including those reliant on IQ4_NL and Q6_0, both for the main model and a draft model (see the sketch after this list).
- A few optimisations (mostly IK's).
- A vast range of context steps in the GUI.
- Loosened GGUF restrictions.
- Some half-baked additional chat templates and prompts.
- And most importantly, plain compatibility with GLM 4.5 Air and OpenAI GPT-OSS.
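To make the KV modes concrete, here's a minimal sketch of what one mixed mode boils down to at the llama.cpp C API level, assuming GGML_TYPE_Q6_0 is the enum this fork uses for IK's Q6_0 (mainline ggml doesn't define it), while GGML_TYPE_IQ4_NL is a mainline type; Croco's launcher picks these pairs for you, this only shows the mapping.

```cpp
// Minimal sketch of a mixed KV-cache mode through the llama.cpp C API.
// GGML_TYPE_Q6_0 is assumed to be this fork's enum for IK's Q6_0 quant
// (mainline ggml does not define it); GGML_TYPE_IQ4_NL is a mainline type.
#include "llama.h"

llama_context_params make_kv_params() {
    llama_context_params cparams = llama_context_default_params();
    cparams.n_ctx  = 32768;             // context size
    cparams.type_k = GGML_TYPE_Q6_0;    // K cache quant (fork-specific, assumption)
    cparams.type_v = GGML_TYPE_IQ4_NL;  // V cache quant (mainline type)
    return cparams;                     // pass to llama_init_from_model(model, cparams)
}
```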
Reasons for this alternative Croco:
- Too much of a mess in my previous merges to get back in line and offer compatibility with GLM 4.5 and OpenAI GPT-OSS in a reasonable amount of time.
- Q6_0 is the most important quant missing from mainline llama.cpp, to quantize the ffn_down tensors of Qwen and GLM 4.5 with a high quality/size ratio, and the various KV cache quants are the most interesting feature of my fork after compatibility with the IK quants.
- Too many bugs for a sane user to handle; the code needed a purge.
- To recover the long-lost compatibility with the different backends, including Makefile builds, HIP, and Vulkan.
What's next?
- The first gen of IQ_K quants: their template is similar to the mainline quants, and the main job is to properly factor the CUDA MMQ kernel beyond the shuffling of the files (Croco is still a viable base for that part). Can do, maybe will.
- The second gen of IQ_K quants and the trellis quants have a slightly modified template, and I might need help from a dev familiar with Johannes Gaessler's or IK's work to port ONE 2nd-gen IQ_K quant (preferably IQ4_KS) and ONE trellis quant (preferably IQ2_KT) to the current llama.cpp mainline, so I'm able to reproduce the port on the others.
OR.
- Keep reworking my Croco despite the growing delta with both mainline and IK_Llama, the fate of a hybrid.
I haven't decided yet.
Anyway, enjoy EsoCroK!
Croco.Cpp v1.97060_b6110_IKLpr642_RMv1.14.9m
WIP, as usual.
- GLM 4.5 Air (and probably the 355B) works, at least with the mainline llama.cpp quants. The IK quants do not yet work properly on this model; I'll need to sort that out later.
- No GPT-OSS yet; I need to merge it manually due to the divergences between my fork and mainline llama.cpp in the CPU and CUDA backends. For later, then.
Note: apparently, even the mainline GLM quants are borked on Croco at high context. To be investigated.
Croco.Cpp v1.97020_b6014_IKLpr624_RMv1.14.9m