Job Configuration Reference
Fields you set when you create a GPU Container Job or a Managed Inference Job.
These are the fields you set when you create a job, in the web UI or with cosmicac jobs create. The job type determines which parameters apply. For the create flow, see Create a GPU Container Job and Create a Managed Inference Job.
Most fields appear in the web UI create form. Fields marked "CLI only" have no web UI control; set them from the CLI.
Common fields
These fields apply to every job type.
| Field | Required | Description |
|---|---|---|
| Job name (Basics) | Yes | Name of the job. |
| Tags (Basics) | Yes | One or more labels for the job. The CLI accepts a comma-separated list. |
| Location (Basics) | Yes | Location code for where the job runs, for example us or IN. |
| Cost limit | No | Maximum cost for the job. CLI only. |
| Alerts | No | Alert settings for the job. CLI only. |
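As a rough illustration of how the common fields fit together, the sketch below builds the shared part of a job configuration in Python. The helper name `common_fields` is hypothetical; the keys `name`, `tags`, and `location` are taken from the example configurations on this page, and the comma-separated tag string mirrors what the CLI is described as accepting. The CLI only fields (cost limit and alerts) are left out because their key names are not documented here.

```python
def common_fields(name: str, tags: str, location: str) -> dict:
    """Build the common part of a job configuration (hypothetical helper)."""
    # Split a comma-separated tag string, dropping blanks and stray whitespace.
    tag_list = [t.strip() for t in tags.split(",") if t.strip()]
    if not name.strip():
        raise ValueError("Job name is required")
    if not tag_list:
        raise ValueError("At least one tag is required")
    if not location.strip():
        raise ValueError("Location is required")
    return {"name": name.strip(), "tags": tag_list, "location": location.strip()}


config = common_fields("my-gpu-job", "experiment, baseline", "us")
print(config)
```

All three fields are marked as required in the table, so the sketch rejects empty values rather than filling in defaults.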
GPU configuration
These fields select the hardware for the job. In the web UI, they appear in the Hardware section.
| Field | Required | Description |
|---|---|---|
| GPU | Yes | GPU model, for example GH100_H100_SXM5_80GB. |
| GPU count | Yes | Number of GPUs. Must be a positive number. |
| CUDA / driver | No | GPU driver version, for example CUDA 12.8. |
| CPU cores per GPU | No | CPU cores allocated per GPU. CLI only. |
| Memory GB per GPU | No | Memory in GB allocated per GPU. CLI only. |
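The two required hardware fields can be checked locally before a job is submitted. The following is a minimal sketch, not the service's own validation: `validate_gpu` is a hypothetical helper, and the `type` and `count` keys follow the `gpu` object in the example configurations on this page. The optional fields are not checked because their key names are not documented here.

```python
from numbers import Real


def validate_gpu(gpu: dict) -> list[str]:
    """Return a list of problems found in a "gpu" section (hypothetical helper)."""
    problems = []
    gpu_type = gpu.get("type")
    if not isinstance(gpu_type, str) or not gpu_type:
        problems.append("gpu.type is required, for example GH100_H100_SXM5_80GB")
    count = gpu.get("count")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(count, bool) or not isinstance(count, Real) or count <= 0:
        problems.append("gpu.count must be a positive number")
    return problems


print(validate_gpu({"type": "GH100_H100_SXM5_80GB", "count": 1}))
```

The sketch only checks that the count is a number greater than zero, as the table states; whether fractional values are accepted is not specified here.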
GPU Container parameters
These fields apply to a GPU Container Job. In the web UI, they appear in the Image & access section.
| Field | Required | Description |
|---|---|---|
| Base OS image | Yes | Base OS image for the container, for example Ubuntu22.04/CUDA12.8.1. |
| Disk (GB) | Yes | Root disk size in GB. Must be a positive number. |
Managed Inference parameters
These fields apply to a Managed Inference Job. In the web UI, they appear in the Model to serve, Serving configuration, and Endpoint sections.
| Field | Required | Description |
|---|---|---|
| Model | Yes | Hugging Face model ID to serve, for example Qwen/Qwen3-32B. |
| Runtime image (CUDA) | Yes | Serving runtime image, for example vLLM 0.8.5. |
| Data type | Yes | Model data type, for example BF16 or Auto. |
| Quantisation | Yes | Quantization scheme, for example FP8 or INT8. |
| Tensor parallel | Yes | Tensor parallel size. Must be a positive number. |
| GPU memory utilisation | Yes | Fraction of GPU memory to use, between 0 and 1. |
| Max concurrent sequences | Yes | Maximum number of concurrent sequences. Must be a positive number. |
| Max model length | Yes | Maximum model context length. Must be a positive number. |
| Reasoning parser | Yes | Reasoning parser to apply. |
| Video & image input | Yes | Whether the model accepts multimodal (video and image) input: true or false. |
| Endpoint name | Yes | Name of the inference endpoint. |
| Replicas | Yes | Number of endpoint replicas. Must be a positive number. |
| Require Authorization header | Yes | Whether callers must send an authorization header: true or false. |
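The numeric and boolean constraints in this table can be expressed as a small local check. This is a sketch under stated assumptions: `check_inference_params` is a hypothetical helper, the key names come from the Managed Inference example on this page, and the range for GPU memory utilisation is assumed to exclude 0 and include 1, because the table only says "between 0 and 1".

```python
from numbers import Real

POSITIVE_FIELDS = (
    "tensor_parallel",
    "max_concurrent_sequences",
    "max_model_length",
    "replicas",
)
BOOLEAN_FIELDS = ("multimodal", "require_auth_header")


def _is_number(value) -> bool:
    # bool is a subclass of int, so reject it explicitly.
    return isinstance(value, Real) and not isinstance(value, bool)


def check_inference_params(params: dict) -> list[str]:
    """Return constraint violations for Managed Inference params (hypothetical)."""
    errors = []
    for key in POSITIVE_FIELDS:
        value = params.get(key)
        if not _is_number(value) or value <= 0:
            errors.append(f"{key} must be a positive number")
    util = params.get("gpu_memory_utilization")
    # Assumption: 0 is excluded and 1 is allowed.
    if not _is_number(util) or not 0 < util <= 1:
        errors.append("gpu_memory_utilization must be between 0 and 1")
    for key in BOOLEAN_FIELDS:
        if not isinstance(params.get(key), bool):
            errors.append(f"{key} must be true or false")
    return errors


params = {
    "tensor_parallel": 1,
    "gpu_memory_utilization": 0.9,
    "max_concurrent_sequences": 512,
    "max_model_length": 8000,
    "multimodal": False,
    "replicas": 1,
    "require_auth_header": True,
}
print(check_inference_params(params))
```

String fields such as the model ID and endpoint name are not checked here, since the table gives no format rules for them.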
Example configuration
A complete job configuration has the following structure for each job type.
GPU Container Job
{
  "name": "my-gpu-job",
  "tags": ["experiment"],
  "location": "us",
  "type": "GPU_CONTAINER",
  "gpu": {
    "type": "GH100_H100_SXM5_80GB",
    "count": 1
  },
  "params": {
    "base_image": "Ubuntu22.04/CUDA12.8.1",
    "root_disk_size_gb": 100
  }
}
Managed Inference Job
{
  "name": "my-inference-job",
  "tags": ["inference"],
  "location": "us",
  "type": "MANAGED_INFERENCE",
  "gpu": {
    "type": "GH100_H100_SXM5_80GB",
    "count": 1
  },
  "params": {
    "model": "Qwen/Qwen3-32B",
    "runtime_image": "vLLM:0.6.3 + CUDA:13.0",
    "data_type": "BF16",
    "quantisation": "INT8",
    "tensor_parallel": 1,
    "gpu_memory_utilization": 0.9,
    "max_concurrent_sequences": 512,
    "max_model_length": 8000,
    "reasoning_parser": "default",
    "multimodal": false,
    "endpoint_name": "my-endpoint",
    "replicas": 1,
    "require_auth_header": true
  }
}
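Because the `params` object differs by job type, it can help to check a configuration for missing keys before submitting it. The sketch below is one possible approach, not the service's own validation: `missing_keys` and the two lookup tables are hypothetical, and the key names are copied from the examples above.

```python
import json

COMMON_KEYS = {"name", "tags", "location", "type", "gpu", "params"}
REQUIRED_PARAMS = {
    "GPU_CONTAINER": {"base_image", "root_disk_size_gb"},
    "MANAGED_INFERENCE": {
        "model", "runtime_image", "data_type", "quantisation",
        "tensor_parallel", "gpu_memory_utilization",
        "max_concurrent_sequences", "max_model_length", "reasoning_parser",
        "multimodal", "endpoint_name", "replicas", "require_auth_header",
    },
}


def missing_keys(config: dict) -> set[str]:
    """Return required keys absent from a parsed job configuration."""
    missing = COMMON_KEYS - set(config)
    job_type = config.get("type")
    if job_type not in REQUIRED_PARAMS:
        # Unknown or absent type: report it with any missing top-level keys.
        return missing | {"type"}
    params = config.get("params", {})
    missing |= {f"params.{k}" for k in REQUIRED_PARAMS[job_type] - set(params)}
    return missing


raw = """
{
  "name": "my-gpu-job",
  "tags": ["experiment"],
  "location": "us",
  "type": "GPU_CONTAINER",
  "gpu": {"type": "GH100_H100_SXM5_80GB", "count": 1},
  "params": {"base_image": "Ubuntu22.04/CUDA12.8.1", "root_disk_size_gb": 100}
}
"""
print(missing_keys(json.loads(raw)))  # set() when nothing is missing
```

Keying the required parameters by the `type` value keeps the check in step with the rule that the job type determines which parameters apply.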