Gemini CLI: A Comprehensive Quick Start Guide
Welcome to the command-line interface for Google's powerful Gemini family of models. This guide provides a comprehensive, step-by-step walkthrough for developers, researchers, and enthusiasts who want to harness the capabilities of Gemini directly from their terminal. Interacting with Gemini via a CLI is perfect for automation, scripting, quick queries, and integrating AI into your existing development workflows without leaving the keyboard.
This guide is split into two main parts:
- The Interactive Gemini CLI: This is the recommended starting point. We will walk through installing and using the official, open-source gemini-cli tool, which provides a rich, chat-like experience in your terminal.
- Direct API Access with curl: For more advanced users and scripting scenarios, we will cover how to interact with the Gemini API directly using curl and gcloud for authentication. This method is ideal for automation and integration into other applications.
Part 1: The Interactive Gemini CLI (Recommended Method)
The official gemini-cli is an open-source project from Google that offers the most feature-rich and user-friendly way to chat with Gemini models. It supports conversation history, context awareness, and even has its own set of tools, like the ability to read files or search the web.
Prerequisites
- Node.js: You must have Node.js version 20 or higher installed on your system. You can download it from the official Node.js website.
- Google Cloud Account: You will need a Google Cloud account and a project with the Vertex AI API enabled.
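Before going further, it can help to confirm that the installed Node.js meets the version 20 requirement. The following is a minimal sketch; the helper name node_version_ok is hypothetical and only compares the major version reported by node --version.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if a version string such as "v20.11.1"
# has a major version of 20 or higher.
node_version_ok() {
  local version="${1#v}"        # strip the leading "v"
  local major="${version%%.*}"  # keep everything before the first dot
  [ "$major" -ge 20 ] 2>/dev/null
}

if command -v node >/dev/null 2>&1; then
  if node_version_ok "$(node --version)"; then
    echo "Node.js $(node --version) is new enough"
  else
    echo "Node.js $(node --version) is too old; version 20 or higher is needed"
  fi
else
  echo "Node.js is not installed"
fi
```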
Step 1: Enable the Vertex AI API
Before you can use Gemini, you must enable the Vertex AI API for your Google Cloud project.
- Go to the Vertex AI API page in the Google Cloud Console.
- Select the Google Cloud project you want to use.
- Click the Enable button. If you are prompted, you may also need to enable billing for your project.
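If you prefer the terminal to the Cloud Console, the same API can usually be enabled with gcloud services enable once the gcloud CLI is installed. This is a sketch under assumptions: the function name enable_vertex_ai and the DRY_RUN variable are hypothetical, the project ID is a placeholder, and billing still has to be enabled separately.

```shell
#!/bin/bash
# Hypothetical wrapper: enable the Vertex AI API for a project.
# With DRY_RUN=1 (the default in this sketch) the command is only printed.
enable_vertex_ai() {
  local project_id="$1"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "gcloud services enable aiplatform.googleapis.com --project=$project_id"
  else
    gcloud services enable aiplatform.googleapis.com --project="$project_id"
  fi
}

# "your-gcp-project-id" is a placeholder; substitute your own project ID.
enable_vertex_ai "your-gcp-project-id"
```

Set DRY_RUN=0 to run the command for real instead of printing it.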
Step 2: Install and Authenticate the gcloud CLI
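The gemini-cli tool can authenticate with credentials that you create using the gcloud command-line tool. This mechanism is known as Application Default Credentials (ADC): client tools look up the stored credentials automatically instead of asking you for an API key.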
- Install gcloud: Follow the official instructions to install the Google Cloud CLI on your operating system.
- Authenticate with ADC: Run the following command in your terminal. This will open a web browser, ask you to log in to your Google account, and grant permissions to the SDK.
gcloud auth application-default login
This command securely stores your credentials on your local machine, allowing tools like gemini-cli to use them automatically without you having to handle API keys directly.
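To check that the login worked, one option is to ask gcloud for an access token and discard it. This is a minimal sketch; the function name adc_ready is hypothetical, and it relies only on the gcloud auth application-default print-access-token command that appears later in this guide.

```shell
#!/bin/bash
# Hypothetical check: succeeds if gcloud is installed and Application
# Default Credentials can produce an access token.
adc_ready() {
  command -v gcloud >/dev/null 2>&1 &&
    gcloud auth application-default print-access-token >/dev/null 2>&1
}

if adc_ready; then
  echo "ADC is configured"
else
  echo "ADC is not configured; run: gcloud auth application-default login"
fi
```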
Step 3: Run the Gemini CLI
There is no permanent installation step required for the gemini-cli itself. You can run it directly using npx, which is included with Node.js.
npx https://github.com/google-gemini/gemini-cli
This command fetches and runs the latest version of the gemini-cli tool.
On the first launch, the tool will prompt you to choose an authentication method. Select the Vertex AI option. It will automatically detect and use the credentials you set up in the previous step. You will also be asked to select your Google Cloud project and a region.
Step 4: Interactive Usage
Once running, you will see a > prompt. You can now start conversing with Gemini. The CLI maintains conversation history, so you can ask follow-up questions.
Basic Conversation:
> What are the key features of the Gemini 1.5 Pro model?
Code Generation:
> Write a Python script that uses the requests library to download the content of a webpage and save it to a file named "output.html".
Using Built-in Tools:
The gemini-cli has powerful built-in capabilities. It can read local files to add them to the context of the conversation.
- First, create a file named requirements.txt with the following content:
  fastapi
  uvicorn
- Now, in the Gemini CLI, ask a question that references the file:
  > Read the requirements.txt file and write a basic main.py file to create a FastAPI server with a single "/" endpoint that returns {"hello": "world"}.
Because the CLI can read the file, it understands the context and can generate a relevant, working code example.
This interactive tool is the best way to get started with Gemini for most development tasks, debugging sessions, and general exploration.
Part 2: Direct API Access with curl and gcloud
For advanced use cases, such as automation, scripting, or integrating Gemini into a larger application, you may need to call the Gemini API directly. This method gives you full control over the request and response, without the interactive UI.
When to Use This Method
- Scripting: You want to write a shell script that calls the Gemini API.
- Automation: You need to integrate Gemini into a CI/CD pipeline or another automated workflow.
- No Node.js: Your environment does not have Node.js, making the interactive CLI unavailable.
- Full Control: You want to manually specify every parameter of the API request.
The Core Command Structure
The fundamental tool for this method is curl, a standard command-line utility for making HTTP requests. The key is to use gcloud to dynamically generate a temporary authentication token.
Here is the template for a curl command to the Gemini API:
curl \
-X POST \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json" \
https://us-central1-aiplatform.googleapis.com/v1/projects/YOUR_PROJECT_ID/locations/us-central1/publishers/google/models/gemini-1.5-pro-preview-0409:streamGenerateContent -d \
'{
"contents": {
"role": "user",
"parts": [{"text": "Why is the sky blue?"}]
}
}'
Deconstructing the Command:
- curl -X POST: We are sending data, so we use the POST HTTP method.
- -H "Authorization: Bearer $(...)": This is the authentication header. The $(gcloud auth application-default print-access-token) part is a command substitution. Your shell first runs this gcloud command, which securely generates and prints a short-lived access token. curl then uses this token to authenticate your API request.
- -H "Content-Type: application/json": We are telling the API that the data we are sending is in JSON format.
- https://...: This is the API endpoint URL. You must replace YOUR_PROJECT_ID with your actual Google Cloud project ID. The model is specified in the URL (gemini-1.5-pro-preview-0409).
- -d '{...}': This is the data payload. It's a JSON object containing the prompt.
  - contents: The main container for your prompt.
  - parts: An array containing the different parts of your prompt. For simple text, it's an array with a single object.
  - text: The actual prompt text.
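The endpoint URL follows a regular pattern, so it can be assembled from its parts rather than edited by hand. The sketch below mirrors the URL used in the template; the function name vertex_endpoint and its argument order are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: build a Vertex AI endpoint URL.
# Arguments: project ID, location, model ID, method name.
vertex_endpoint() {
  printf 'https://%s-aiplatform.googleapis.com/v1/projects/%s/locations/%s/publishers/google/models/%s:%s\n' \
    "$2" "$1" "$2" "$3" "$4"
}

# Reproduces the URL from the template, with the placeholder project ID.
vertex_endpoint "YOUR_PROJECT_ID" "us-central1" \
  "gemini-1.5-pro-preview-0409" "streamGenerateContent"
```

Keeping the location in one argument avoids a mismatch between the hostname prefix and the locations/ path segment, which both have to name the same region.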
Scripting Example with jq
Let's create a shell script that takes a filename as an argument, sends its content to Gemini to be summarized, and prints only the resulting text.
This script requires jq, a popular command-line JSON processor. You may need to install it (sudo apt-get install jq or brew install jq).
summarize.sh
#!/bin/bash
# Check if a filename was provided
if [ -z "$1" ]; then
echo "Usage: $0 <filename>"
exit 1
fi
# Read the file content; jq --arg takes care of escaping it for JSON
PROMPT_TEXT=$(cat "$1")
JSON_PAYLOAD=$(jq -n --arg text "Summarize the following text: $PROMPT_TEXT" \
'{contents: {role: "user", parts: [{text: $text}]}}')
# Your Google Cloud Project ID
PROJECT_ID="your-gcp-project-id"
# Call the API and parse the response with jq
curl -s \
-X POST \
-H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
-H "Content-Type: application/json" \
"https://us-central1-aiplatform.googleapis.com/v1/projects/$PROJECT_ID/locations/us-central1/publishers/google/models/gemini-1.0-pro:generateContent" -d "$JSON_PAYLOAD" | \
jq -r '.candidates[0].content.parts[0].text'
Make the script executable (chmod +x summarize.sh) and run it: ./summarize.sh my_document.txt.
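As written, the script prints null when the API returns an error instead of a candidate. A possible refinement is sketched below, assuming failures arrive as a JSON object with an error.message field (the usual Google API error shape). The function name extract_text is hypothetical, and both sample responses are hand-written for illustration, not captured from the API.

```shell
#!/bin/bash
# Hypothetical filter: print the generated text, or the API error message
# if the response contains an "error" object. Reads JSON on stdin.
extract_text() {
  jq -r 'if .error then "API error: \(.error.message)"
         else .candidates[0].content.parts[0].text end'
}

# Hand-written sample of a successful response.
echo '{"candidates":[{"content":{"parts":[{"text":"A short summary."}]}}]}' | extract_text

# Hand-written sample of an error response.
echo '{"error":{"code":403,"message":"Permission denied","status":"PERMISSION_DENIED"}}' | extract_text
```

In summarize.sh, the final jq -r line could be swapped for a call to extract_text.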
Advanced Example: Multimodal Prompts
One of Gemini's key strengths is multimodality. You can send both images and text in the same prompt. To do this with curl, you must provide the image data as a base64-encoded string.
Steps:
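- Base64-encode your image. GNU base64 wraps its output across lines by default, and raw newlines are not valid inside a JSON string, so strip them:
  base64 < my_image.jpg | tr -d '\n' > my_image.b64
- Construct the JSON payload: The parts array will now contain two objects: one for the text and one for the image data.
{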
"contents": {
"parts": [
{"text": "What is happening in this image?"},
{
"inline_data": {
"mime_type": "image/jpeg",
"data": "$(cat my_image.b64)"
}
}
]
}
}
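You would then embed this JSON in your curl command as the -d payload. Note that the shell only expands $(cat my_image.b64) when the payload is not wrapped in single quotes, so the single-quoted form used in the earlier template would send the literal text instead of the image data.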
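One way to assemble the multimodal payload without hand-quoting the image data is to let jq build it, as the summarize.sh script does for text. This is a sketch under assumptions: the function name build_image_payload and the payload.json file name are hypothetical, and the curl call is shown only as a comment, with the same placeholder project ID and model as the earlier template.

```shell
#!/bin/bash
# Hypothetical helper: build a multimodal request body.
# Arguments: image file, question text, optional MIME type (default image/jpeg).
build_image_payload() {
  # Newlines are stripped because some base64 implementations wrap their output.
  # jq -Rs reads the whole of stdin as a single JSON string, available as ".".
  base64 < "$1" | tr -d '\n' |
    jq -Rs --arg question "$2" --arg mime "${3:-image/jpeg}" \
      '{contents: {role: "user", parts: [
         {text: $question},
         {inline_data: {mime_type: $mime, data: .}}
       ]}}'
}

# Write the payload to a file if the example image exists.
if [ -f my_image.jpg ]; then
  build_image_payload my_image.jpg "What is happening in this image?" > payload.json
fi

# The file can then be sent with curl's --data-binary option, for example:
# curl -s -X POST \
#   -H "Authorization: Bearer $(gcloud auth application-default print-access-token)" \
#   -H "Content-Type: application/json" \
#   "https://us-central1-aiplatform.googleapis.com/v1/projects/YOUR_PROJECT_ID/locations/us-central1/publishers/google/models/gemini-1.5-pro-preview-0409:generateContent" \
#   --data-binary @payload.json
```

Sending the body from a file also avoids command-line length limits, which large base64-encoded images can otherwise exceed.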
Conclusion
You now have two methods for interacting with Gemini from the command line. For daily development, the interactive gemini-cli offers a rich, user-friendly experience. For automation and deep integration, direct API access with curl gives you full control over every request.
From here, you can explore more advanced topics in the official Google Cloud documentation, such as tuning model parameters (temperature, top-p) and using function calling capabilities.