Gemini CLI: Comprehensive Reference & Cheat Sheet
This document is a high-density, comprehensive reference for interacting with Google's Gemini models via the command line. It is intended for users who are already familiar with the basics and need a quick way to look up specific commands, API parameters, and advanced configurations.
This guide is structured in two parts:
- **Interactive `gemini-cli` Reference:** Covers the commands and features of the official, open-source `gemini-cli` tool.
- **Vertex AI API Reference:** Details the raw API endpoints, request/response bodies, and parameters for direct access using tools like `curl`.
Part 1: Interactive gemini-cli Reference
This section covers the official gemini-cli tool, which provides a rich, conversational experience in the terminal.
Installation & Execution
The tool is run directly using npx, which requires Node.js v20+.
# Run the latest version of the Gemini CLI
npx https://github.com/google-gemini/gemini-cli
Authentication
Authentication is handled via the gcloud CLI and Application Default Credentials (ADC).
# Log in and set up ADC for your machine
gcloud auth application-default login
# Set your active Google Cloud project
gcloud config set project YOUR_PROJECT_ID
In-Tool Commands
Once the CLI is running, you can use these commands inside the > prompt.
| Command | Description |
|---|---|
| `/help` | Displays a list of available commands and tools. |
| `/history` | Shows the conversation history for the current session. |
| `/clear` | Clears the current terminal screen and conversation history. |
| `/auth` | Restarts the authentication flow to switch Google Cloud projects or authentication methods. |
| `/quit` | Exits the Gemini CLI application. `Ctrl+C` also works. |
Built-in Tools
The interactive CLI comes with powerful tools that give it context about your local environment and the web.
| Tool | Usage | Description |
|---|---|---|
| `@file` | `@file path/to/your/file.js` | Reads the content of a local file and adds it to the context of your prompt. You can reference multiple files. This is essential for asking questions about your code. |
| `@web` | `@web "latest news on AI"` | Performs a web search and adds the results to the context. This allows the model to answer questions about current events or topics not in its training data. |
Example using tools:
> @file src/api.ts @file src/database.ts Based on these files, what could be causing the latency issue?
Configuration
The interactive gemini-cli is designed to be zero-config. Configuration, such as the Google Cloud project and region, is handled through the initial prompts on first run. To change these settings, you can use the /auth command to re-initialize the configuration.
Part 2: Vertex AI Gemini API Reference (curl)
This section provides a detailed reference for interacting directly with the Vertex AI Gemini API endpoint. This method is ideal for scripting, automation, and integration into other applications.
API Endpoint Structure
The generic endpoint for the Gemini API is:
https://{region}-aiplatform.googleapis.com/v1/projects/{project_id}/locations/{region}/publishers/google/models/{model_id}:{method}
- `{region}`: The Google Cloud region for your request (e.g., `us-central1`).
- `{project_id}`: Your Google Cloud Project ID.
- `{model_id}`: The specific model you want to use:
  - `gemini-1.5-pro-preview-0409` (latest Pro model)
  - `gemini-1.5-flash-preview-0514` (fastest, lightweight model)
  - `gemini-1.0-pro-vision` (for multimodal prompts)
  - `gemini-1.0-pro` (general purpose)
- `{method}`: The API method to call:
  - `generateContent`: For single-turn, non-streaming responses.
  - `streamGenerateContent`: For streaming responses.
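Assembling the URL from these components can be sketched in shell; the project ID below is a hypothetical placeholder:

```shell
# Compose the endpoint URL from its components (values are illustrative)
REGION="us-central1"
PROJECT_ID="my-project"   # hypothetical; substitute your own project ID
MODEL_ID="gemini-1.0-pro"
METHOD="generateContent"

ENDPOINT="https://${REGION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${REGION}/publishers/google/models/${MODEL_ID}:${METHOD}"
echo "${ENDPOINT}"
```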
Authentication Token
Use gcloud to print a short-lived access token for the Authorization header.
# Command to generate the bearer token
gcloud auth application-default print-access-token
Master API Request Body
Below is a comprehensive example of a JSON request body, demonstrating most of the available top-level objects.
{
"contents": [
{
"role": "user",
"parts": [
{"text": "What is the weather like in Boston?"}
]
}
],
"tools": [
{
"function_declarations": [
{
"name": "get_current_weather",
"description": "Get the current weather in a given location",
"parameters": {
"type": "OBJECT",
"properties": {
"location": {"type": "STRING", "description": "The city and state, e.g. San Francisco, CA"}
},
"required": ["location"]
}
}
]
}
],
"safetySettings": [
{
"category": "HARM_CATEGORY_DANGEROUS_CONTENT",
"threshold": "BLOCK_LOW_AND_ABOVE"
}
],
"generationConfig": {
"temperature": 0.4,
"topP": 1.0,
"maxOutputTokens": 2048,
"response_mime_type": "application/json"
}
}
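A body like the one above can be saved to a file and sent with `curl`. The sketch below writes a minimal version of the body, validates it locally, and shows the (commented-out) request; the region, project, and model in the URL are illustrative:

```shell
# Save a minimal request body to a file
cat > request.json <<'EOF'
{
  "contents": [
    {"role": "user", "parts": [{"text": "What is the weather like in Boston?"}]}
  ],
  "generationConfig": {"temperature": 0.4, "maxOutputTokens": 2048}
}
EOF

# Check the file is valid JSON before sending it
python3 -m json.tool request.json > /dev/null && echo "request.json is valid JSON"

# Send it (requires network access and a configured project; substitute your own values):
# curl -X POST \
#   -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
#   -H "Content-Type: application/json" \
#   -d @request.json \
#   "https://us-central1-aiplatform.googleapis.com/v1/projects/YOUR_PROJECT_ID/locations/us-central1/publishers/google/models/gemini-1.0-pro:generateContent"
```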
generationConfig Parameters
This object controls the generative output of the model.
| Parameter | Type | Description |
|---|---|---|
| `temperature` | number | Controls randomness. Lower values (e.g., 0.2) are more deterministic; higher values (e.g., 1.0) are more creative. Range: [0.0, 2.0] |
| `topP` | number | Nucleus sampling. The cumulative probability of tokens to consider. Range: [0.0, 1.0] |
| `topK` | integer | Top-k sampling. The number of most likely tokens to consider. |
| `maxOutputTokens` | integer | The maximum number of tokens to generate in the response. |
| `stopSequences` | array of strings | A list of sequences that will cause the model to stop generating, e.g., `["\n\n"]` |
| `response_mime_type` | string | Sets the output format. Use `"application/json"` to force the model to generate a valid JSON object. |
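Putting several of these together, a `generationConfig` tuned for deterministic, machine-readable output might look like this (a sketch; the values are illustrative, not recommendations):

```json
{
  "generationConfig": {
    "temperature": 0.0,
    "topP": 0.95,
    "maxOutputTokens": 1024,
    "stopSequences": ["\n\n"],
    "response_mime_type": "application/json"
  }
}
```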
safetySettings Parameters
This object allows you to adjust the content safety filters.
Categories (category):
- `HARM_CATEGORY_HARASSMENT`
- `HARM_CATEGORY_HATE_SPEECH`
- `HARM_CATEGORY_SEXUALLY_EXPLICIT`
- `HARM_CATEGORY_DANGEROUS_CONTENT`
Thresholds (threshold):
- `BLOCK_NONE`: Blocks nothing (with exceptions for severe harm).
- `BLOCK_ONLY_HIGH`: Blocks content with a high probability of being harmful.
- `BLOCK_MEDIUM_AND_ABOVE`: (Default) Blocks medium- and high-probability content.
- `BLOCK_LOW_AND_ABOVE`: Blocks low-, medium-, and high-probability content.
tools and Function Calling
To enable function calling, provide a tools object containing function_declarations. The model will not execute the function, but will return a functionCall object in its response, which your code can then use to execute the function.
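The round trip can be sketched as follows (the weather values are illustrative, and the exact role name for the result turn varies across API versions): the model returns a `functionCall` part, your code runs the function, and you send the result back as a `functionResponse` part in a follow-up request.

```json
{
  "contents": [
    {"role": "user", "parts": [{"text": "What is the weather like in Boston?"}]},
    {"role": "model", "parts": [
      {"functionCall": {"name": "get_current_weather", "args": {"location": "Boston, MA"}}}
    ]},
    {"role": "function", "parts": [
      {"functionResponse": {"name": "get_current_weather", "response": {"content": "72°F and sunny"}}}
    ]}
  ]
}
```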
API Response Body Structure
A successful response from the generateContent endpoint will look like this:
{
"candidates": [
{
"content": {
"role": "model",
"parts": [
{"text": "The model's response text goes here."}
]
},
"finishReason": "STOP",
"safetyRatings": [
{"category": "HARM_CATEGORY_...", "probability": "NEGLIGIBLE"}
]
}
],
"usageMetadata": {
"promptTokenCount": 15,
"candidatesTokenCount": 25,
"totalTokenCount": 40
}
}
- `candidates`: An array of possible responses. Usually contains one.
- `finishReason`: Why the model stopped. `STOP` is a normal completion; `MAX_TOKENS` means it hit the limit; `SAFETY` means it was blocked.
- `safetyRatings`: A report on the safety assessment of the response.
- `usageMetadata`: The number of tokens used for the prompt and response.
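In a script, the response text can be pulled out of this structure with a few lines of Python; a sketch, using a saved sample response:

```shell
# Save a sample response (the text is illustrative)
cat > response.json <<'EOF'
{"candidates": [{"content": {"role": "model", "parts": [{"text": "Hello from the model."}]}, "finishReason": "STOP"}]}
EOF

# Extract the first candidate's text
python3 -c 'import json; r = json.load(open("response.json")); print(r["candidates"][0]["content"]["parts"][0]["text"])'
```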
Multimodality Request Payloads
To send images or other non-text data, add more objects to the parts array.
Image via Base64 (each entry in `parts` holds exactly one data field, so the text and the image go in separate parts):
[
  {"text": "Describe this image:"},
  {
    "inline_data": {
      "mime_type": "image/jpeg",
      "data": "/9j/4AAQSkZJRgABAQ..."
    }
  }
]
File via Google Cloud Storage (again as two separate parts):
[
  {"text": "Summarize this PDF document:"},
  {
    "file_data": {
      "mime_type": "application/pdf",
      "file_uri": "gs://your-bucket-name/document.pdf"
    }
  }
]
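Building an `inline_data` part from a local file can be sketched in shell; here a small text file stands in for the image bytes:

```shell
# Create a stand-in file and base64-encode it (tr strips line wraps some base64 tools add)
printf 'fake image bytes' > sample.bin
DATA="$(base64 < sample.bin | tr -d '\n')"

# Assemble the JSON part with the encoded payload
python3 - "$DATA" <<'EOF'
import json, sys
part = {"inline_data": {"mime_type": "image/jpeg", "data": sys.argv[1]}}
print(json.dumps(part))
EOF
```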
This reference provides the core details needed for advanced and scripted interactions with the Gemini API. For the most current list of models and parameters, always consult the official Google Cloud Vertex AI documentation.