Gemini Phone AI - Android App

An Android application that integrates Google's Gemini AI to handle phone calls with real-time voice conversation capabilities. The app automatically processes incoming and outgoing calls using AI-powered natural language processing.

Features

AI-Powered Call Handling: Automatically answer and handle phone conversations using Gemini AI
Real-time Voice Processing: Convert speech to text and generate natural voice responses
Incoming Call Management: Auto-answer incoming calls with AI assistance
Outgoing Call Support: Make calls with AI conversation capabilities
Call Screening: Smart call filtering to block spam
Natural Conversation: Gemini AI provides contextual, human-like responses

Prerequisites

Android Studio (latest stable version)
Android SDK with minimum API level 23 (Android 6.0)
Android device or emulator for testing
Gemini API key from Google AI Studio

Setup Instructions

1. Get Gemini API Key

Visit Google AI Studio
Sign in with your Google account
Click "Create API Key"
Copy the generated API key (save it securely)

2. Clone and Open Project

Clone this repository or download the project files
Open Android Studio
Select "Open" and navigate to the GeminiPhoneAI folder
Wait for Gradle sync to complete

3. Configure API Key

Option 1: Build-time Configuration (Recommended for Development)

Edit gradle.properties in the project root:

GEMINI_API_KEY=your_actual_api_key_here

Option 2: Runtime Configuration

The app prompts you for the API key on first launch and stores it in the app's private SharedPreferences.
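
One way to combine the two options is to prefer the build-time key and fall back to the key entered at runtime. The sketch below is only an illustration, not the app's actual code: `resolveApiKey`, `PLACEHOLDER_KEY`, and both parameters are hypothetical names. It assumes the build-time value is exposed to Kotlin in some way (for example through a generated `BuildConfig` field) and that the stored value has already been read from SharedPreferences.

```kotlin
// Placeholder text from the gradle.properties example; treated as "not configured".
val PLACEHOLDER_KEY = "your_actual_api_key_here"

/**
 * Returns the first usable key, preferring the build-time value.
 * Both parameters are hypothetical inputs: buildTimeKey would come from the
 * Gradle property, storedKey from the value saved at first launch.
 * Returns null when neither source holds a real key.
 */
fun resolveApiKey(buildTimeKey: String?, storedKey: String?): String? =
    sequenceOf(buildTimeKey, storedKey)
        .mapNotNull { it?.trim() }
        .firstOrNull { it.isNotEmpty() && it != PLACEHOLDER_KEY }
```

A `null` result would be the signal to show the first-launch prompt.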

4. Build and Run

Connect your Android device or start an emulator
Click "Run" in Android Studio or use:
```
./gradlew assembleDebug
```
Install the resulting debug APK on your device (by default it is written under app/build/outputs/apk/debug/)

Required Permissions

The app requires the following permissions:

Phone Permissions: CALL_PHONE, ANSWER_PHONE_CALLS, READ_PHONE_STATE
Audio Permissions: RECORD_AUDIO, MODIFY_AUDIO_SETTINGS
Network Permissions: INTERNET, ACCESS_NETWORK_STATE
System Permissions: WAKE_LOCK, FOREGROUND_SERVICE
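
Of the permissions above, Android classifies CALL_PHONE, ANSWER_PHONE_CALLS, READ_PHONE_STATE, and RECORD_AUDIO as dangerous, so they must be requested at runtime; the others are normal permissions granted at install time. The sketch below shows one way to find the runtime permissions that are still missing. `missingRuntimePermissions` and the `isGranted` hook are hypothetical names; taking a lambda keeps the logic testable off-device.

```kotlin
// Manifest strings for the listed permissions that need a runtime request.
val RUNTIME_PERMISSIONS = listOf(
    "android.permission.CALL_PHONE",
    "android.permission.ANSWER_PHONE_CALLS",
    "android.permission.READ_PHONE_STATE",
    "android.permission.RECORD_AUDIO"
)

/**
 * Returns the runtime permissions that are not yet granted.
 * isGranted is a hypothetical hook; on a device it could wrap
 * ContextCompat.checkSelfPermission(context, permission).
 */
fun missingRuntimePermissions(isGranted: (String) -> Boolean): List<String> =
    RUNTIME_PERMISSIONS.filterNot(isGranted)
```

An empty result means no permission dialog needs to be shown.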

Usage Guide

Initial Setup

Grant Permissions: When prompted, grant all required permissions
Set as Default Dialer: Make the app your default dialer for full functionality
Configure API Key: Enter your Gemini API key in the settings
Test Connection: Use the "Test Gemini Connection" button to verify setup

Making Calls

Enter a phone number in the main screen
Tap "Call with AI Assistant"
The AI will handle the conversation automatically

Receiving Calls

Incoming calls are automatically answered by the AI
The AI processes the conversation in real-time
You can monitor the call status in the app

Project Structure

GeminiPhoneAI/
├── app/
│   ├── src/
│   │   └── main/
│   │       ├── java/com/gemini/phoneai/
│   │       │   ├── MainActivity.kt           # Main app interface
│   │       │   ├── services/
│   │       │   │   ├── GeminiConnectionService.kt
│   │       │   │   ├── GeminiConnection.kt
│   │       │   │   ├── GeminiInCallService.kt
│   │       │   │   └── GeminiCallScreeningService.kt
│   │       │   ├── gemini/
│   │       │   │   └── GeminiLiveClient.kt   # Gemini API integration
│   │       │   └── audio/
│   │       │       └── CallAudioProcessor.kt # Audio processing
│   │       ├── res/
│   │       │   ├── layout/                   # UI layouts
│   │       │   ├── values/                   # Colors, strings, themes
│   │       │   └── drawable/                 # Icons and graphics
│   │       └── AndroidManifest.xml           # App configuration
│   ├── build.gradle                          # Module dependencies
│   └── proguard-rules.pro                    # ProGuard configuration
├── build.gradle                              # Project configuration
├── settings.gradle                           # Project settings
└── gradle.properties                         # Build properties

Key Components

GeminiConnectionService

Handles the telecommunication connection lifecycle for both incoming and outgoing calls.
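
As a rough picture of that lifecycle, the sketch below models a simplified set of call states and the transitions between them. It is an assumption made for illustration, loosely mirroring the states of `android.telecom.Connection`; `CallState`, `ALLOWED_TRANSITIONS`, and `canTransition` are hypothetical names and do not come from the app's source.

```kotlin
// Simplified call states: incoming calls ring, outgoing calls dial.
enum class CallState { NEW, RINGING, DIALING, ACTIVE, DISCONNECTED }

// Transitions allowed in this simplified model (illustrative assumption).
val ALLOWED_TRANSITIONS: Map<CallState, Set<CallState>> = mapOf(
    CallState.NEW to setOf(CallState.RINGING, CallState.DIALING, CallState.DISCONNECTED),
    CallState.RINGING to setOf(CallState.ACTIVE, CallState.DISCONNECTED),
    CallState.DIALING to setOf(CallState.ACTIVE, CallState.DISCONNECTED),
    CallState.ACTIVE to setOf(CallState.DISCONNECTED),
    CallState.DISCONNECTED to emptySet()
)

/** True if the model permits moving a call from one state to another. */
fun canTransition(from: CallState, to: CallState): Boolean =
    to in ALLOWED_TRANSITIONS.getValue(from)
```

In this model an incoming call goes NEW, RINGING, ACTIVE, DISCONNECTED, and an outgoing call goes NEW, DIALING, ACTIVE, DISCONNECTED.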

GeminiLiveClient

Manages WebSocket connection to Gemini API for real-time audio/text processing.

CallAudioProcessor

Captures call audio, sends it to Gemini, and plays back AI-generated responses.
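
The step of preparing captured audio for sending can be sketched as follows. This is a minimal illustration under two assumptions: the samples are 16-bit PCM packed in little-endian order, and each chunk is Base64-encoded so it can travel inside a JSON text message. `pcm16ToBytes` and `encodeChunk` are hypothetical helper names. The sketch uses `java.util.Base64`, which on Android requires API 26; on older devices `android.util.Base64` would be the equivalent.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder
import java.util.Base64

/** Packs 16-bit samples into raw little-endian PCM bytes (2 bytes per sample). */
fun pcm16ToBytes(samples: ShortArray): ByteArray {
    val buffer = ByteBuffer.allocate(samples.size * 2).order(ByteOrder.LITTLE_ENDIAN)
    samples.forEach { buffer.putShort(it) }
    return buffer.array()
}

/** Base64-encodes a chunk of samples so it can be embedded in a text frame. */
fun encodeChunk(samples: ShortArray): String =
    Base64.getEncoder().encodeToString(pcm16ToBytes(samples))
```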

GeminiInCallService

Manages the in-call UI and call state changes.

Troubleshooting

Common Issues

API Key Not Working
- Verify the key is correct and active
- Check that your Google Cloud project has the Gemini API enabled
- Ensure you have sufficient API quota

Permissions Denied
- Go to Settings > Apps > Gemini Phone AI > Permissions
- Enable all required permissions manually

Not Set as Default Dialer
- Go to Settings > Apps > Default apps > Phone app
- Select Gemini Phone AI

Audio Not Working
- Check device volume settings
- Ensure the microphone is not muted
- Verify audio permissions are granted

Connection Failures
- Check internet connectivity
- Verify firewall/proxy settings
- Ensure WebSocket connections are not blocked

Development Notes

Testing

Test on real devices for best results (emulator telephony can be limited)
Use two devices to test incoming/outgoing calls
Monitor Logcat for debugging information

Security Considerations

Never commit API keys to version control
Use ProGuard for release builds
Implement proper error handling for production
Consider implementing call recording consent

Future Enhancements

Add conversation history storage
Implement custom AI personalities
Add multi-language support
Create call transcription features
Add voice customization options
Implement advanced call screening rules

API Reference

Gemini Live API

The app uses the Gemini Live API for real-time conversation:

WebSocket endpoint: wss://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-exp:bidiGenerateContent
Audio format: PCM 16-bit, 16kHz, mono
Response modalities: Audio and text
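
The audio format above fixes how many bytes a given duration of audio occupies: 16,000 samples per second at 2 bytes per sample on one channel is 32,000 bytes per second, so a 20 ms frame is 640 bytes. The sketch below works this out and splits a buffer into fixed-duration chunks. The function and constant names are hypothetical, and the 20 ms frame length in the usage note is only an example, not a value taken from the app.

```kotlin
val SAMPLE_RATE_HZ = 16_000   // 16 kHz
val BYTES_PER_SAMPLE = 2      // 16-bit PCM
val CHANNELS = 1              // mono

/** Number of PCM bytes needed to hold durationMs of audio in this format. */
fun bytesForDuration(durationMs: Int): Int =
    SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * durationMs / 1000

/** Splits a PCM buffer into chunks of durationMs each; the last chunk may be shorter. */
fun splitIntoChunks(pcm: ByteArray, durationMs: Int): List<ByteArray> {
    val chunkSize = bytesForDuration(durationMs)
    require(chunkSize > 0) { "durationMs must be at least 1" }
    return pcm.asList().chunked(chunkSize).map { it.toByteArray() }
}
```

For example, `splitIntoChunks(ByteArray(1500), 20)` yields chunks of 640, 640, and 220 bytes.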

Contributing

Fork the repository
Create a feature branch
Commit your changes
Push to the branch
Create a Pull Request

License

This project is provided as-is for educational purposes. Ensure you comply with:

Google's Gemini API Terms of Service
Android platform policies
Local telecommunications regulations
Privacy and data protection laws

Support

For issues, questions, or suggestions:

Open an issue in the repository
Check the documentation
Review Google's Gemini API documentation

Acknowledgments

Google Gemini API for AI capabilities
Android Telecom framework
OkHttp for WebSocket communication
Material Design components

Note: This app is a demonstration of AI integration with Android telephony. Always ensure compliance with local laws regarding call recording and automated calling systems.
