How to create a Managed Inference Job (CosmicAC web interface)

Create a Managed Inference Job in the CosmicAC web interface, then call your model.

This guide walks through the web UI flow: set the basics, select a model, configure the endpoint and hardware, then launch the job. For a description of every field, see the Job configuration reference.

Prerequisites

You need the following before you start.

A running CosmicAC deployment. See Installation.
Access to the CosmicAC web UI.

Open the new job form

On the Jobs page, click New Job.

Enter the basics and select Managed Inference

In the Basics section, fill in the Job name, Location, and Tags. In the What kind of job? section, select Managed Inference.

Select a model

In the Model to serve section, select a Model. Adjust the serving configuration as needed.

To find a model, browse the Hugging Face model hub or the vLLM supported models list.

Configure the endpoint and choose hardware

In the Endpoint section, set the Endpoint name and Replicas, and keep Require Authorization header enabled so that requests to the endpoint must include your API key. In the Hardware section, select a GPU and the GPU count.

Review and create the job

In the Review & launch section, confirm the job spec is valid, then click Create job.

Open the endpoint

Wait for the job to go live, then click Open endpoint.

Call your model

Copy the endpoint URL, then send a request to it with your API key in the Authorization header. Use the example request shown for the endpoint as a starting point, and replace its placeholder values with your own.
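
As an illustration of the request described above, here is a minimal Python sketch that uses only the standard library. Several details are assumptions and not taken from CosmicAC documentation: the Bearer scheme in the Authorization header, the JSON request body, and the helper names build_request and call_model, which are hypothetical. Take the real URL, header format, and payload from the example request shown for your endpoint.

```python
import json
import urllib.request


def build_request(endpoint_url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build a POST request that carries the API key in the Authorization header.

    Assumption: the endpoint accepts a JSON body and a "Bearer <key>" header.
    Check both against the example request shown for your endpoint.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def call_model(endpoint_url: str, api_key: str, payload: dict, timeout: float = 30.0) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_request(endpoint_url, api_key, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

To try it, call call_model with the endpoint URL you copied, your API key, and a payload that matches the example request. Keeping request construction separate from sending makes it easy to inspect the headers before anything goes over the network.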