How to connect to a Managed Inference endpoint

Send a chat-completion request to a Managed Inference endpoint from the CLI or an OpenAI-compatible client.

Send a chat-completion request to a Managed Inference endpoint from the CLI or any OpenAI-compatible client.

Prerequisites

You need the following before you start.

A running Managed Inference Job with a serving endpoint. See Create a Managed Inference Job.
Your Managed Inference API key, if the endpoint requires an authorization header.
The CosmicAC CLI installed and configured, for the CLI method. See Install the CLI.

Get your connection details

From the endpoint, note the endpoint ID, the model, and your API key. For curl or the SDK, you also need your deployment's inference URL.

Send a request

Use the CLI or any OpenAI-compatible client.

cosmicac inference chat \
  --endpoint-id <endpoint-id> \
  --model <model> \
  --api-key <api-key> \
  --message "Hello"

Omit --message for an interactive session, or add --stream for streaming output.

The endpoint returns the model's reply.

How to connect to a Managed Inference endpoint

Prerequisites

Get your connection details

Send a request

Next steps

On this page