Mojo struct

Q6_KEncoding

The Q6_K quantization encoding.

Because this encoding holds the quantized data in a special packed format, float values currently cannot be printed at runtime; the tensor is just a bag of bits in uint8 format.

Implemented traits

AnyType, QuantizationEncoding

Methods

`quantize`

static quantize(tensor: Tensor[float32]) -> Tensor[uint8]

Quantizes the full-precision tensor `tensor` to Q6_K.

Args:

tensor (Tensor[float32]): Full-precision tensor to quantize. The innermost dimension of the tensor must be a multiple of 256.

Returns:

Quantized Q6_K tensor. The tensor datatype is uint8 because the result is simply a byte buffer; each scalar is actually stored with 6 bits.
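To make the "byte buffer" description concrete, here is a minimal Python sketch of how large the packed result would be. It assumes the ggml-style Q6_K layout (256-element blocks, each holding 128 bytes of low bits, 64 bytes of high bits, 16 one-byte sub-block scales, and one float16 scale) and assumes that only the innermost dimension changes size. Neither assumption is stated on this page, and `q6_k_packed_shape` is a hypothetical helper, not part of the API.

```python
# Assumed ggml-style Q6_K block layout (not confirmed by this page).
BLOCK_ELEMS = 256                 # scalars per block
QL_BYTES = BLOCK_ELEMS // 2       # low 4 bits of each 6-bit value
QH_BYTES = BLOCK_ELEMS // 4       # high 2 bits of each 6-bit value
SCALE_BYTES = BLOCK_ELEMS // 16   # one int8 scale per 16 scalars
D_BYTES = 2                       # one float16 scale per block
BLOCK_BYTES = QL_BYTES + QH_BYTES + SCALE_BYTES + D_BYTES  # 210 bytes


def q6_k_packed_shape(shape):
    """Return the uint8 shape a float32 tensor of `shape` would pack into.

    Hypothetical helper: assumes every dimension except the innermost
    is preserved and each run of 256 scalars becomes one 210-byte block.
    """
    *outer, inner = shape
    if inner % BLOCK_ELEMS != 0:
        raise ValueError(f"innermost dimension {inner} is not a multiple of 256")
    return (*outer, (inner // BLOCK_ELEMS) * BLOCK_BYTES)


if __name__ == "__main__":
    print(q6_k_packed_shape((4, 512)))       # two blocks per row
    print(BLOCK_BYTES * 8 / BLOCK_ELEMS)     # effective bits per scalar
```

Under these assumptions the 6-bit values account for 192 of the 210 bytes in each block; the remaining bytes are scales, so the effective storage cost is slightly above 6 bits per scalar.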

Raises:

If the last dimension size is not a multiple of 256.
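Because `quantize` raises on an invalid shape, it can help to check shapes before calling it. The following Python sketch only illustrates the divisibility rule described above; `check_q6_k_shape` is a hypothetical helper and does not call the Mojo API.

```python
Q6_K_BLOCK = 256  # block size implied by the constraint on the innermost dimension


def check_q6_k_shape(shape):
    """Raise ValueError unless the innermost dimension is a multiple of 256.

    Hypothetical pre-flight check; the real `quantize` performs its own
    validation and may report the error differently.
    """
    if len(shape) == 0:
        raise ValueError("expected a tensor of rank 1 or higher")
    inner = shape[-1]
    if inner % Q6_K_BLOCK != 0:
        # Round up to the next multiple of 256 to suggest a valid size.
        next_valid = -(-inner // Q6_K_BLOCK) * Q6_K_BLOCK
        raise ValueError(
            f"innermost dimension {inner} is not a multiple of {Q6_K_BLOCK}; "
            f"the next valid size is {next_valid}"
        )


if __name__ == "__main__":
    check_q6_k_shape((8, 1024))      # passes silently: 1024 = 4 * 256
    try:
        check_q6_k_shape((8, 300))   # 300 is not a multiple of 256
    except ValueError as err:
        print(err)
```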

`id`

static id() -> String

Identifier for the Q6_K quantized encoding.
