Introduction#

Soniox 7B is a powerful LLM. Learn more in our blog post. It is released under the Apache 2.0 license and can be used without restrictions.

Deployment#

Soniox 7B can be used and deployed with:

Download#

Soniox-7B-v1.0: Hugging Face or zip
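
For example, with the huggingface_hub package installed, the model weights can be fetched with a short script like the one below; the repository ID soniox/Soniox-7B-v1.0 is an assumption and should be replaced with the actual repository shown on the download page.

# Minimal download sketch using huggingface_hub.
# The repository ID "soniox/Soniox-7B-v1.0" is a placeholder assumption.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="soniox/Soniox-7B-v1.0")
print(f"Model files downloaded to {local_dir}")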

Conversation template#

If you are using the model with vLLM and the Chat Completions API, the correct conversation template will be applied automatically. Otherwise, make sure to use the conversation template described here.
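
As a minimal sketch, assuming vLLM's OpenAI-compatible server is running locally on port 8000 and serving the model (the model ID soniox/Soniox-7B-v1.0 and the server address are assumptions), a Chat Completions request could look like this:

# Sketch of a Chat Completions request against a local vLLM server.
# The model ID and base_url are assumptions; adjust them to your deployment.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # vLLM's OpenAI-compatible endpoint
    api_key="not-needed",                 # any string works unless the server enforces a key
)

response = client.chat.completions.create(
    model="soniox/Soniox-7B-v1.0",
    messages=[{"role": "user", "content": "Summarize the Apache 2.0 license in one sentence."}],
)
print(response.choices[0].message.content)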

The template to build a prompt is defined as follows:

<s>[CLS:soniox] [INST] Instruction [/INST] Model answer</s>[INST] Follow-up instruction [/INST]

Note

  • <s> and </s> are special tokens for beginning of string (BOS) and end of string (EOS), while [CLS:soniox], [INST] and [/INST] are regular strings.
  • This format should be strictly respected; otherwise, the model might generate sub-optimal outputs.
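
As an illustration, a prompt string for a multi-turn conversation could be assembled from this template as follows; the helper build_prompt is ours and not part of any library.

# Sketch of building a prompt string that follows the template above.
# `turns` holds completed (instruction, answer) pairs; `new_instruction`
# is the follow-up the model should answer next.
def build_prompt(turns, new_instruction):
    prompt = "<s>[CLS:soniox]"
    for instruction, answer in turns:
        prompt += f" [INST] {instruction} [/INST] {answer}</s>"
    prompt += f"[INST] {new_instruction} [/INST]"
    return prompt

print(build_prompt(
    turns=[("What is Soniox 7B?", "Soniox 7B is a 7-billion-parameter LLM.")],
    new_instruction="Under which license is it released?",
))

Because <s> and </s> stand for special tokens rather than plain text, the token-level construction below is the authoritative form.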

Here is an example of how to correctly tokenize instructions:

[BOS_TOKEN_ID] + tok("[CLS:soniox]") +
tok("[INST]") + tok(USER_MESSAGE_1) + tok("[/INST]") +
tok(BOT_MESSAGE_1) + [EOS_TOKEN_ID] +
...
tok("[INST]") + tok(USER_MESSAGE_N) + tok("[/INST]") +
tok(BOT_MESSAGE_N) + [EOS_TOKEN_ID]

Here, the function tok should return token IDs without adding BOS or EOS tokens.
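
As a concrete sketch, assuming a Hugging Face tokenizer is available for the model (the repository ID soniox/Soniox-7B-v1.0 is an assumption), tok can be implemented as an encode call with special tokens disabled:

# Sketch of the token-level construction above with a Hugging Face tokenizer.
# The repository ID "soniox/Soniox-7B-v1.0" is a placeholder assumption.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("soniox/Soniox-7B-v1.0")

def tok(text):
    # Token IDs only, without BOS or EOS.
    return tokenizer.encode(text, add_special_tokens=False)

turns = [("Write one sentence about speech recognition.",
          "Speech recognition converts spoken audio into text.")]
new_instruction = "Now translate that sentence into French."

ids = [tokenizer.bos_token_id] + tok("[CLS:soniox]")
for user_message, bot_message in turns:
    ids += tok("[INST]") + tok(user_message) + tok("[/INST]")
    ids += tok(bot_message) + [tokenizer.eos_token_id]
# The instruction awaiting an answer ends after [/INST], as in the prompt template.
ids += tok("[INST]") + tok(new_instruction) + tok("[/INST]")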