Documentation Index

Fetch the complete documentation index at: https://onecli.sh/docs/llms.txt

Use this file to discover all available pages before exploring further.

Overview

OneCLI connects AI agents to Google Cloud's Vertex AI platform. Agents can call models hosted on Vertex AI, including Claude (offered through Model Garden) and Gemini. The gateway automatically injects your Google Cloud credentials into requests to the Vertex AI API. This is useful for agents that need to call AI models through your own Google Cloud project, using your billing and quotas.

Setup

Step 1: Prepare your credentials

You need one of the following:
  • Service account key: Create a service account in the Google Cloud console with the Vertex AI User role. Download the JSON key file.
  • Authorized user credentials: Run gcloud auth application-default login on your machine and use the generated credentials file.
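If you prefer the command line, the service-account option above can be scripted with the gcloud CLI. A sketch, assuming your project ID is my-project (the service account name onecli-vertex is a placeholder):

```shell
# Create a dedicated service account (name is illustrative)
gcloud iam service-accounts create onecli-vertex \
  --project=my-project \
  --display-name="OneCLI Vertex AI access"

# Grant it the Vertex AI User role on the project
gcloud projects add-iam-policy-binding my-project \
  --member="serviceAccount:onecli-vertex@my-project.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"

# Download the JSON key file to paste into OneCLI
gcloud iam service-accounts keys create key.json \
  --iam-account=onecli-vertex@my-project.iam.gserviceaccount.com
```

A dedicated service account keeps the gateway's access separate from your own user credentials, so you can revoke or rotate the key without affecting anything else.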
Step 2: Connect in OneCLI

Open the OneCLI dashboard, go to Connections > Vertex AI, and provide:
  • Credentials: paste the JSON key or upload the credentials file
  • Project ID: your Google Cloud project ID
  • Region: the Vertex AI region (e.g., us-central1)
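The project ID and region together determine which Vertex AI endpoint the gateway forwards requests to. As a sketch (project, region, and model ID below are placeholders), publisher models such as Claude follow this URL pattern:

```shell
PROJECT_ID="my-project"   # placeholder: your Google Cloud project ID
REGION="us-central1"      # placeholder: your Vertex AI region
MODEL="some-model-id"     # placeholder: a publisher model ID from Model Garden

# Vertex AI publisher-model endpoint (:rawPredict for non-streaming;
# :streamRawPredict for streaming; Gemini models use publishers/google
# and :generateContent instead)
URL="https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${REGION}/publishers/anthropic/models/${MODEL}:rawPredict"
echo "$URL"
```

Note that the region appears twice: once in the hostname and once in the resource path.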

What agents can do

  • Send prompts to Claude models hosted on Vertex AI (Claude Sonnet, Claude Haiku, Claude Opus)
  • Send prompts to Gemini models
  • Use streaming or non-streaming inference
  • Pass structured messages with text and image inputs
  • Call models with tool/function calling enabled
  • Access any model available in your project’s Model Garden
  • Run batch prediction jobs
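For Claude models on Vertex AI, the request body uses Anthropic's Messages format, with an anthropic_version field in place of a model field (the model is selected by the endpoint URL). A minimal non-streaming body, shown here as a shell variable you might pass to curl:

```shell
# Minimal Messages-format body for a Claude model on Vertex AI.
# The model itself is chosen by the endpoint URL, not the body.
BODY='{
  "anthropic_version": "vertex-2023-10-16",
  "max_tokens": 256,
  "messages": [
    {"role": "user", "content": "Hello, Claude"}
  ]
}'
echo "$BODY"
```

Structured content (text plus image blocks) and tool definitions go in the same body, following Anthropic's Messages API schema.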

Controlling access with rules

Use OneCLI’s rules engine to limit what agents can do with Vertex AI. For example, you can restrict agents to specific model endpoints, or rate limit inference calls to control costs. Rules are evaluated before credential injection, so a blocked request never reaches Vertex AI.