Job Configuration Reference

These are the fields you set when you create a job, in the web UI or with `cosmicac jobs create`. The job type determines which parameters apply. For the create flow, see Create a GPU Container Job and Create a Managed Inference Job.

Most fields appear in the web UI create form. Fields marked "CLI only" have no web UI control; set them from the CLI.

Common fields

These fields apply to every job type.

Field	Required	Description
Job name (Basics)	Yes	Name that identifies the job.
Tags (Basics)	Yes	One or more labels for the job. The CLI accepts a comma-separated list.
Location (Basics)	Yes	Location code for where the job runs, for example `us` or `IN`.
Cost limit	No	Maximum cost for the job. CLI only.
Alerts	No	Alert settings for the job. CLI only.
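
The CLI accepts tags as a comma-separated list, while the example configurations in this reference represent them as a JSON array. The following Python sketch shows one way to convert between the two forms. The `parse_tags` helper is hypothetical, and the whitespace trimming and empty-entry handling are assumptions, not documented CLI behavior.

```python
def parse_tags(raw: str) -> list[str]:
    """Split a comma-separated tag string into a list of tags.

    Assumption: whitespace around each tag is ignored and empty
    entries (for example from a trailing comma) are dropped.
    """
    tags = [part.strip() for part in raw.split(",")]
    tags = [tag for tag in tags if tag]
    if not tags:
        # Tags is a required field, so at least one label must remain.
        raise ValueError("at least one tag is required")
    return tags
```

For example, `parse_tags("experiment, nightly")` returns `["experiment", "nightly"]`, which matches the shape of the `tags` array in the example configurations.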

GPU configuration

These fields select the hardware for the job. In the web UI, they appear in the Hardware section.

Field	Required	Description
GPU	Yes	GPU model, for example `GH100_H100_SXM5_80GB`.
GPU count	Yes	Number of GPUs. Must be a positive number.
CUDA / driver	No	GPU driver version, for example `CUDA 12.8`.
CPU cores per GPU	No	CPU cores allocated per GPU. CLI only.
Memory GB per GPU	No	Memory in GB allocated per GPU. CLI only.
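
Because CPU cores and memory are specified per GPU, the totals for a job scale with the GPU count. The Python sketch below works through that arithmetic. The `total_resources` function and its parameter names are hypothetical, the input values are illustrative, and the sketch assumes a simple linear scaling.

```python
def total_resources(gpu_count: int, cpu_cores_per_gpu: int,
                    memory_gb_per_gpu: int) -> dict[str, int]:
    """Return the job-wide CPU and memory totals implied by per-GPU values."""
    if gpu_count <= 0:
        # GPU count must be a positive number.
        raise ValueError("gpu_count must be positive")
    return {
        "cpu_cores": gpu_count * cpu_cores_per_gpu,
        "memory_gb": gpu_count * memory_gb_per_gpu,
    }
```

With 4 GPUs, 8 CPU cores per GPU, and 64 GB of memory per GPU, `total_resources(4, 8, 64)` returns `{"cpu_cores": 32, "memory_gb": 256}`.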

GPU Container parameters

These fields apply to a GPU Container Job. In the web UI, they appear in the Image & access section.

Field	Required	Description
Base OS image	Yes	Base OS image for the container, for example `Ubuntu22.04/CUDA12.8.1`.
Disk (GB)	Yes	Root disk size in GB. Must be a positive number.

Managed Inference parameters

These fields apply to a Managed Inference Job. In the web UI, they appear in the Model to serve, Serving configuration, and Endpoint sections.

Field	Required	Description
Model	Yes	Hugging Face model ID to serve, for example `Qwen/Qwen3-32B`.
Runtime image (CUDA)	Yes	Serving runtime image, for example `vLLM 0.8.5`.
Data type	Yes	Model data type, for example `BF16` or `Auto`.
Quantisation	Yes	Quantization scheme, for example `FP8` or `INT8`.
Tensor parallel	Yes	Tensor parallel size. Must be a positive number.
GPU memory utilisation	Yes	Fraction of GPU memory to use, between `0` and `1`.
Max concurrent sequences	Yes	Maximum number of concurrent sequences. Must be a positive number.
Max model length	Yes	Maximum model context length. Must be a positive number.
Reasoning parser	Yes	Reasoning parser to apply, for example `default`.
Video & image input	Yes	Whether the model accepts multimodal input. `true` or `false`.
Endpoint name	Yes	Name of the inference endpoint.
Replicas	Yes	Number of endpoint replicas. Must be a positive number.
Require Authorization header	Yes	Whether callers must send an authorization header. `true` or `false`.
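
The constraints in this table can be checked on the client side before a job is submitted. The following Python sketch is one possible check, not the validation the service itself performs. It uses the key names from the Managed Inference example configuration in this reference, the `validate_inference_params` function is hypothetical, and treating the `0` and `1` bounds as inclusive is an assumption.

```python
from numbers import Real

# Fields that the table says must be a positive number.
POSITIVE_FIELDS = ("tensor_parallel", "max_concurrent_sequences",
                   "max_model_length", "replicas")
# Fields that take `true` or `false`.
BOOLEAN_FIELDS = ("multimodal", "require_auth_header")


def _is_number(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, Real) and not isinstance(value, bool)


def validate_inference_params(params: dict) -> list[str]:
    """Return a list of constraint violations; an empty list means none found."""
    errors = []
    for key in POSITIVE_FIELDS:
        value = params.get(key)
        if not _is_number(value) or value <= 0:
            errors.append(f"{key} must be a positive number")
    utilization = params.get("gpu_memory_utilization")
    if not _is_number(utilization) or not 0 <= utilization <= 1:
        errors.append("gpu_memory_utilization must be between 0 and 1")
    for key in BOOLEAN_FIELDS:
        if not isinstance(params.get(key), bool):
            errors.append(f"{key} must be true or false")
    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller report every violation at once.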

Example configuration

A complete job configuration has the following structure for each job type.

GPU Container Job

{
  "name": "my-gpu-job",
  "tags": ["experiment"],
  "location": "us",
  "type": "GPU_CONTAINER",
  "gpu": {
    "type": "GH100_H100_SXM5_80GB",
    "count": 1
  },
  "params": {
    "base_image": "Ubuntu22.04/CUDA12.8.1",
    "root_disk_size_gb": 100
  }
}

Managed Inference Job

{
  "name": "my-inference-job",
  "tags": ["inference"],
  "location": "us",
  "type": "MANAGED_INFERENCE",
  "gpu": {
    "type": "GH100_H100_SXM5_80GB",
    "count": 1
  },
  "params": {
    "model": "Qwen/Qwen3-32B",
    "runtime_image": "vLLM:0.6.3 + CUDA:13.0",
    "data_type": "BF16",
    "quantisation": "INT8",
    "tensor_parallel": 1,
    "gpu_memory_utilization": 0.9,
    "max_concurrent_sequences": 512,
    "max_model_length": 8000,
    "reasoning_parser": "default",
    "multimodal": false,
    "endpoint_name": "my-endpoint",
    "replicas": 1,
    "require_auth_header": true
  }
}
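
The two examples share the same top-level keys and differ only in `type` and the contents of `params`. The Python sketch below checks a configuration for missing keys based on its job type. The key names come from the examples above. The `missing_keys` function is hypothetical, and treating every key shown in the examples as mandatory is an assumption.

```python
import json

COMMON_KEYS = {"name", "tags", "location", "type", "gpu", "params"}
GPU_KEYS = {"type", "count"}
# Keys expected under "params" for each job type, taken from the examples.
PARAM_KEYS = {
    "GPU_CONTAINER": {"base_image", "root_disk_size_gb"},
    "MANAGED_INFERENCE": {
        "model", "runtime_image", "data_type", "quantisation",
        "tensor_parallel", "gpu_memory_utilization",
        "max_concurrent_sequences", "max_model_length",
        "reasoning_parser", "multimodal", "endpoint_name",
        "replicas", "require_auth_header",
    },
}


def missing_keys(config_text: str) -> set[str]:
    """Parse a JSON job configuration and return the keys it lacks."""
    config = json.loads(config_text)
    job_type = config.get("type")
    if job_type not in PARAM_KEYS:
        raise ValueError(f"unknown job type: {job_type!r}")
    missing = COMMON_KEYS - set(config)
    # Nested keys are reported with a dotted prefix, for example "gpu.count".
    missing |= {f"gpu.{key}" for key in GPU_KEYS - set(config.get("gpu", {}))}
    missing |= {
        f"params.{key}"
        for key in PARAM_KEYS[job_type] - set(config.get("params", {}))
    }
    return missing
```

An empty set means that every expected key is present. The check covers only the presence of keys, not their values.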