Deploy CosmicAC

Deploy and operate the CosmicAC stack with Docker Compose. For how CosmicAC connects to your cluster and runs jobs, see Deployment architecture.

Stack

The stack runs six application services, plus Redis and Caddy.

  • cosmicac-wrk-ork
  • cosmicac-app-node (port 3000)
  • cosmicac-proxy-inference-{http,hrpc}
  • cosmicac-wrk-server-k8s-nvidia
  • cosmicac-ui
  • redis
  • caddy (port 5173)

The CosmicAC application services use private images published under ghcr.io/tetherto/ (ghcr.io/tetherto/<repo>:<tag>). Redis and Caddy use their official public images.
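As a rough illustration of that naming pattern, the helper below builds an image reference from a repository name and a tag. The function name image_ref is hypothetical, and the example assumes, without confirmation from this page, that each image repository shares its service's name.

```shell
# Hypothetical helper: build a GHCR image reference in the
# ghcr.io/tetherto/<repo>:<tag> form described above.
image_ref() {
  # $1 = repository name, $2 = image tag
  printf 'ghcr.io/tetherto/%s:%s\n' "$1" "$2"
}

# Example (repo name assumed to match the service name):
#   image_ref cosmicac-ui "$TAG"
```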

Prerequisites

The recommended host is Ubuntu 22.04 or 24.04 LTS on x86_64. On that host, you need the following.

  • Docker Engine and Docker Compose v2.
  • Task.
  • The jq, node, and kubectl command-line tools.
  • The CosmicAC deployment repo, which holds the Compose files and deploy scripts.
  • GitHub Container Registry (GHCR) credentials for private images.
    • Your GitHub username.
    • A GitHub personal access token (classic) with the read:packages scope, able to pull the private ghcr.io/tetherto images. If the tetherto org enforces SSO, authorize the token for the org. See Managing your personal access tokens.
  • A GPU Kubernetes cluster that already meets the Requirements. CosmicAC connects to this cluster but doesn't set it up.
  • A valid, readable kubeconfig file for the GPU Kubernetes cluster. Bootstrap stops immediately if it is missing or invalid.

The following commands install the base packages and Task on Ubuntu.

sudo apt-get update
sudo apt-get install -y ca-certificates curl jq
sh -c "$(curl -sSL https://taskfile.dev/install.sh)" -- -d -b "$HOME/.local/bin"
export PATH="$HOME/.local/bin:$PATH"
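The export above lasts only for the current shell session. One way to make it persistent is to add an idempotent prepend to a shell startup file. The sketch below assumes a POSIX-compatible shell; the function name path_prepend is hypothetical, and which startup file to use (for example ~/.profile) depends on your setup.

```shell
# Prepend a directory to PATH only if it is not already present,
# so sourcing a startup file repeatedly does not grow PATH.
path_prepend() {
  case ":$PATH:" in
    *":$1:"*) ;;                 # already on PATH, nothing to do
    *) PATH="$1:$PATH" ;;
  esac
  export PATH
}

path_prepend "$HOME/.local/bin"
```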

After you install Docker Engine, Compose v2, Node.js, and kubectl from their official repositories, enable Docker and verify the tools.

sudo systemctl enable --now docker
docker compose version
task --version
node --version
kubectl version --client
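The version checks above can be complemented by a quick presence check that reports every missing tool in one pass. This is a minimal sketch, not part of the deployment repo; the function name missing_tools is hypothetical.

```shell
# Print the name of every listed command that is not on PATH.
# Returns non-zero if at least one is missing.
missing_tools() {
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}

# Example, using the tools this page lists as prerequisites:
#   missing_tools docker task jq node kubectl
```

An empty result means every listed tool was found. It does not confirm that the Docker daemon is running or that the versions are recent enough.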

Deploy the stack

Set up the environment

Clone the deployment repo and change into it.

git clone https://github.com/tetherto/cosmicac-selfhosting-scripts.git
cd cosmicac-selfhosting-scripts

Create the .env file from the example.

cp .env.example .env

Set the variables in .env. See Environment variables in the configuration reference.
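Before moving on, it can help to confirm that the variables you rely on are set. The sketch below assumes .env uses simple KEY=value lines; the function name env_missing is hypothetical, and the authoritative list of required variables is in the configuration reference, not here.

```shell
# Print each variable name that is absent from, or has nothing after "=" in,
# a dotenv-style file. Returns non-zero if any are missing.
# Note: a quoted empty value such as KEY="" is not detected as empty.
env_missing() {
  env_file=$1; shift
  status=0
  for name in "$@"; do
    # Match NAME= followed by at least one character.
    if ! grep -Eq "^${name}=.+" "$env_file"; then
      echo "$name"
      status=1
    fi
  done
  return "$status"
}

# Example with variable names that this page says live in .env:
#   env_missing .env TAG KUBECONFIG_SRC
```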

Generate the service config

Run these commands from the deployment directory to generate the per-service config files.

task login        # skip if GITHUB_PAT is set in .env
task config-init

config-init writes a config file for each service under services/<repo>/config/. Open these files and set the secrets your deployment needs. For the list of keys to set, see Service config in the configuration reference.
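To spot a service that config-init skipped, you could list the service directories that have no file under config/. This is a sketch that assumes only the services/<repo>/config/ layout described above; the function name services_without_config is hypothetical, and it makes no assumption about the config file names.

```shell
# List service directories under a services/ root that have no regular
# file in their config/ subdirectory.
services_without_config() {
  root=${1:-services}
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue          # glob matched nothing
    if [ -z "$(find "$dir/config" -type f 2>/dev/null | head -n 1)" ]; then
      basename "$dir"
    fi
  done
}

# Example, run from the deployment directory:
#   services_without_config services
```

No output means every service directory has at least one config file. It does not check that the secrets inside are filled in.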

Run the first-time bootstrap

task bootstrap deploys the whole stack with one command, using TAG from .env.

Set KUBECONFIG_SRC to the absolute host path of your kubeconfig in .env before you bootstrap.

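Verify the kubeconfig source. These commands read KUBECONFIG_SRC from your current shell, so export it there with the same value you set in .env.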

test -s "$KUBECONFIG_SRC"
kubectl --kubeconfig "$KUBECONFIG_SRC" config current-context
kubectl --kubeconfig "$KUBECONFIG_SRC" cluster-info
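The local part of these checks can be wrapped in a small function that also enforces the absolute-path requirement. This is a minimal sketch; the function name check_kubeconfig_src is hypothetical, and it checks only the file itself, not whether the cluster is reachable.

```shell
# Validate a kubeconfig path: absolute, non-empty file, readable.
# Prints a reason to stderr and returns non-zero on the first failure.
check_kubeconfig_src() {
  case "$1" in
    /*) ;;
    *) echo "KUBECONFIG_SRC must be an absolute path: '$1'" >&2; return 1 ;;
  esac
  [ -s "$1" ] || { echo "kubeconfig missing or empty: $1" >&2; return 1; }
  [ -r "$1" ] || { echo "kubeconfig not readable: $1" >&2; return 1; }
}

# Example:
#   check_kubeconfig_src "$KUBECONFIG_SRC" && task bootstrap
```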

Run the bootstrap.

task bootstrap

If KUBECONFIG_SRC points to a missing or invalid file, bootstrap stops with an error.

For the kubeconfig structure and how bootstrap consumes it, see Kubeconfig requirements in the configuration reference.

Verify the deployment

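Run these checks. All services should report Up, and the authenticated API calls should return data. The login call reads BOOTSTRAP_ADMIN_EMAIL and BOOTSTRAP_ADMIN_PASSWORD from your current shell, so export them before you run it.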

task ps                                          # all services Up
curl -s4 -o /dev/null -w '%{http_code}\n' http://127.0.0.1:5173/   # 200
TOKEN=$(curl -s4 -X POST http://127.0.0.1:5173/api/login -H 'content-type: application/json' -d "{\"email\":\"${BOOTSTRAP_ADMIN_EMAIL}\",\"password\":\"${BOOTSTRAP_ADMIN_PASSWORD}\"}" | jq -r .token)
curl -s4 "http://127.0.0.1:5173/api/auth/servers?overwrite_cache=true" -H "ttr-token: $TOKEN"
curl -s4 "http://127.0.0.1:5173/api/auth/pricing?location=IN&type=gpu&hrs=2" -H "ttr-token: $TOKEN"
curl -s4 "http://127.0.0.1:5173/api/auth/jobs?page=1&pageSize=10" -H "ttr-token: $TOKEN"

The deployment is healthy when:

  • servers lists your GPUs.
  • pricing returns a total_cost instead of an error.
  • jobs responds.

Reference responses

Healthy results look like the following.

  • servers: {"data":[{"location":"IN","gpu":{"available":8}}]}
  • pricing: {"total_cost":2}
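One failure mode is easy to miss in the checks above: if the login call fails, jq -r .token prints the literal string null (or nothing at all), and every later request is sent with an unusable token. The sketch below guards against that and adds a small PASS/FAIL reporter. Both function names, token_ok and report, are hypothetical.

```shell
# Succeeds only when the login step produced a usable token.
# jq -r prints the literal string "null" when the .token key is absent.
token_ok() {
  [ -n "$1" ] && [ "$1" != "null" ]
}

# Run a command quietly and print PASS or FAIL with a label.
report() {
  label=$1; shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS $label"
  else
    echo "FAIL $label"
    return 1
  fi
}

# Example, after the login command above has set TOKEN:
#   report "login token" token_ok "$TOKEN"
```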

Stop and start the stack

Stop the containers without deleting them.

cd cosmicac-selfhosting-scripts
task stop

Start them again.

cd cosmicac-selfhosting-scripts
task start
task ps

To remove the containers and the Compose network, run task down, then recreate them with task up. For a new or reset installation, use task bootstrap, not task up.

For how these commands differ, see Stack lifecycle in the configuration reference.

Next steps

With CosmicAC running, install the CLI and create your first job.
