OpenAI API-compatible provider
While we provide these recommendations as a starting point, finding the optimal settings for OpenAI-compatible models often requires trial and error. Every use case is unique, and you may need to experiment with different configurations to achieve the best results for your specific needs. If you’re working on a custom implementation or need help fine-tuning these settings, our Support team is here to help — just reach out and we’ll guide you through the process.
For complete data security, it’s advisable to self-host models using your own infrastructure. Although we’d suggest using one of the cloud model providers for cost-effectiveness and ease of use, you may still host your own models and connect them to AI Assistant.
While this guide focuses on connecting inference frameworks (model hosting) to AI Assistant, self-hosting setups require specialized AI/machine learning infrastructure knowledge and experience. Refer to the vLLM, Hugging Face TGI, or Ollama documentation for in-depth guidance on setting up these environments.
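Before pointing AI Assistant at a framework, it can be worth confirming that the framework really does expose an OpenAI API-compatible endpoint. The snippet below is a minimal sketch of such a check, assuming a local Ollama instance on its default port and the official openai Python package; swap in your own base URL and API key.

```python
# Minimal sketch: confirm that an inference framework exposes an OpenAI-compatible
# endpoint by listing its models. Assumes the `openai` Python package and a local
# Ollama instance on its default port; adjust base_url and api_key for your setup.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # your framework's OpenAI-compatible URL
    api_key="unused-for-local-ollama",     # some frameworks require a real key here
)

# If this call succeeds, the endpoint speaks the OpenAI API and can be used
# as the baseUrl in the AI Assistant service configuration below.
for model in client.models.list():
    print(model.id)
```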
Service configuration file
To connect an OpenAI API-compatible provider to AI Assistant, you’ll need to create a service configuration file, as explained in the model-provider configuration guide.
Currently, we suggest using the Llama 3.1 70B Instruct model. If this model isn’t available in your region of operation, you can use Mistral AI’s Mistral Large instead.
For an embedding model, we suggest using the Mixedbread large v1 model or the Nomic v1.5 model:
```yaml
version: '1'
aiServices:
  chat:
    provider:
      name: 'openai-compat'
      baseUrl: 'your-inference-framework-url'
      apiKey: 'your-api-key' # Optional
    model: 'llama3.1:70b'
  textEmbeddings:
    provider:
      name: 'openai-compat'
      baseUrl: 'your-inference-framework-url'
      apiKey: 'your-api-key' # Optional
    model: 'mxbai-embed-large'
```
- `provider`:
  - `name`: The name of the provider. Set this to `openai-compat`.
  - `baseUrl`: The base URL of your inference framework. For example, Ollama hosts its OpenAI API-compatible endpoint at `http://localhost:11434/v1`.
  - `apiKey`: An optional API key if your inference framework requires it. You can set it here or as an environment variable (`OPENAI_COMPAT_API_KEY`).
- `model`: The name of the model you want to use. For example, `llama3.1:70b` for the chat service or `mxbai-embed-large` for the embedding service.
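Once the configuration file is in place, you may want to confirm that the chat and embedding models respond through the same base URL and model names you configured. The following is a rough sketch rather than part of AI Assistant itself; it assumes the openai Python package, a local Ollama endpoint, and the example model names from the configuration above.

```python
# Rough sketch: exercise the chat and embedding models named in the service
# configuration through the same OpenAI-compatible endpoint. The URL and model
# names are the example values from above; replace them with your own.
import os
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",                      # same value as baseUrl in the config
    api_key=os.environ.get("OPENAI_COMPAT_API_KEY", "none"),   # optional, as in the config
)

# Chat model check (aiServices.chat.model in the configuration).
chat = client.chat.completions.create(
    model="llama3.1:70b",
    messages=[{"role": "user", "content": "Reply with the single word: ready"}],
)
print(chat.choices[0].message.content)

# Embedding model check (aiServices.textEmbeddings.model in the configuration).
embedding = client.embeddings.create(
    model="mxbai-embed-large",
    input="A short test sentence.",
)
print(len(embedding.data[0].embedding), "dimensions")
```

If both calls return a response, AI Assistant should be able to reach the provider with the same `baseUrl`, `apiKey`, and `model` values.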