An Android application that integrates Google's Gemini AI to handle phone calls with real-time voice conversation capabilities. The app automatically processes incoming and outgoing calls using AI-powered natural language processing.
- AI-Powered Call Handling: Automatically answer and handle phone conversations using Gemini AI
- Real-time Voice Processing: Convert speech to text and generate natural voice responses
- Incoming Call Management: Auto-answer incoming calls with AI assistance
- Outgoing Call Support: Make calls with AI conversation capabilities
- Call Screening: Smart call filtering to block spam
- Natural Conversation: Gemini AI provides contextual, human-like responses
- Android Studio (latest stable version)
- Android SDK with minimum API level 23 (Android 6.0)
- Android device or emulator for testing
- Gemini API key from Google AI Studio
- Visit Google AI Studio
- Sign in with your Google account
- Click "Create API Key"
- Copy the generated API key (save it securely)
- Clone this repository or download the project files
- Open Android Studio
- Select "Open" and navigate to the GeminiPhoneAI folder
- Wait for Gradle sync to complete
Edit gradle.properties in the project root:
GEMINI_API_KEY=your_actual_api_key_here
Alternatively, the app will prompt you to enter the API key on first launch. The key is stored securely in SharedPreferences.
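With two possible key sources (a build-time value in gradle.properties and a value entered in the app), some precedence rule is needed. The following is a minimal sketch, assuming the user-entered key wins and the placeholder value counts as unset. `resolveApiKey`, its parameters, and `PLACEHOLDER_KEY` are hypothetical names for illustration, not taken from the project.

```kotlin
// Hypothetical helper: picks which Gemini API key to use.
// Assumption: a key entered in the app overrides the build-time key.
val PLACEHOLDER_KEY = "your_actual_api_key_here"

fun resolveApiKey(buildTimeKey: String?, userEnteredKey: String?): String? {
    // Treat blank values and the README placeholder as "not configured".
    fun usable(key: String?): String? =
        key?.trim()?.takeIf { it.isNotEmpty() && it != PLACEHOLDER_KEY }

    // Returns null when neither source holds a usable key.
    return usable(userEnteredKey) ?: usable(buildTimeKey)
}
```

A `null` result would be the cue to show the key prompt again.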
- Connect your Android device or start an emulator
- Click "Run" in Android Studio or use:
./gradlew assembleDebug
- Install the APK on your device
The app requires the following permissions:
- Phone Permissions: CALL_PHONE, ANSWER_PHONE_CALLS (available from API level 26), READ_PHONE_STATE
- Audio Permissions: RECORD_AUDIO, MODIFY_AUDIO_SETTINGS
- Network Permissions: INTERNET, ACCESS_NETWORK_STATE
- System Permissions: WAKE_LOCK, FOREGROUND_SERVICE
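The permissions above fall into two groups on Android: runtime ("dangerous") permissions that the user must grant, and install-time permissions that only need a manifest entry. The sketch below lists them by their full manifest names and shows one way to work out what is still missing. `missingRuntimePermissions` is a hypothetical helper, and the `granted` set stands in for the platform's permission checks.

```kotlin
// Manifest permission strings from the list above, split by protection level.
// Runtime ("dangerous") permissions must be requested from the user on
// Android 6.0+; the others are granted at install time once declared.
val runtimePermissions = listOf(
    "android.permission.CALL_PHONE",
    "android.permission.ANSWER_PHONE_CALLS", // added in API level 26
    "android.permission.READ_PHONE_STATE",
    "android.permission.RECORD_AUDIO",
)

val installTimePermissions = listOf(
    "android.permission.MODIFY_AUDIO_SETTINGS",
    "android.permission.INTERNET",
    "android.permission.ACCESS_NETWORK_STATE",
    "android.permission.WAKE_LOCK",
    "android.permission.FOREGROUND_SERVICE", // added in API level 28
)

// Hypothetical helper: returns the runtime permissions still to be requested.
// `granted` stands in for the result of the platform permission checks.
fun missingRuntimePermissions(granted: Set<String>): List<String> =
    runtimePermissions.filterNot { it in granted }
```

On a device, the returned list is what would be passed to the runtime permission request.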
- Grant Permissions: When prompted, grant all required permissions
- Set as Default Dialer: Make the app your default dialer for full functionality
- Configure API Key: Enter your Gemini API key in the settings
- Test Connection: Use the "Test Gemini Connection" button to verify setup
- Enter a phone number in the main screen
- Tap "Call with AI Assistant"
- The AI will handle the conversation automatically
- Incoming calls are automatically answered by the AI
- The AI processes the conversation in real-time
- You can monitor the call status in the app
GeminiPhoneAI/
├── app/
│   ├── src/
│   │   └── main/
│   │       ├── java/com/gemini/phoneai/
│   │       │   ├── MainActivity.kt                  # Main app interface
│   │       │   ├── services/
│   │       │   │   ├── GeminiConnectionService.kt
│   │       │   │   ├── GeminiConnection.kt
│   │       │   │   ├── GeminiInCallService.kt
│   │       │   │   └── GeminiCallScreeningService.kt
│   │       │   ├── gemini/
│   │       │   │   └── GeminiLiveClient.kt          # Gemini API integration
│   │       │   └── audio/
│   │       │       └── CallAudioProcessor.kt        # Audio processing
│   │       ├── res/
│   │       │   ├── layout/                          # UI layouts
│   │       │   ├── values/                          # Colors, strings, themes
│   │       │   └── drawable/                        # Icons and graphics
│   │       └── AndroidManifest.xml                  # App configuration
│   ├── build.gradle                                 # Module dependencies
│   └── proguard-rules.pro                           # ProGuard configuration
├── build.gradle                                     # Project configuration
├── settings.gradle                                  # Project settings
└── gradle.properties                                # Build properties
- GeminiConnectionService: Handles the telecommunication connection lifecycle for both incoming and outgoing calls.
- GeminiLiveClient: Manages the WebSocket connection to the Gemini API for real-time audio/text processing.
- CallAudioProcessor: Captures call audio, sends it to Gemini, and plays back AI-generated responses.
- GeminiInCallService: Manages the in-call UI and call state changes.
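The connection lifecycle and call state changes described above can be pictured as a small state machine. The sketch below is illustrative only: `CallState`, `allowedTransitions`, `canTransition`, and `shouldStreamAudio` are hypothetical names that loosely mirror the `STATE_*` constants on `android.telecom.Connection`, and the transition table is an assumption rather than the project's actual logic.

```kotlin
// Hypothetical model of the call lifecycle; names loosely mirror the
// STATE_* constants on android.telecom.Connection but are not the real API.
enum class CallState { NEW, RINGING, DIALING, ACTIVE, HOLDING, DISCONNECTED }

// Allowed transitions in this sketch. Incoming calls move to RINGING,
// outgoing calls to DIALING, and any live call can be disconnected.
val allowedTransitions: Map<CallState, Set<CallState>> = mapOf(
    CallState.NEW to setOf(CallState.RINGING, CallState.DIALING, CallState.DISCONNECTED),
    CallState.RINGING to setOf(CallState.ACTIVE, CallState.DISCONNECTED),
    CallState.DIALING to setOf(CallState.ACTIVE, CallState.DISCONNECTED),
    CallState.ACTIVE to setOf(CallState.HOLDING, CallState.DISCONNECTED),
    CallState.HOLDING to setOf(CallState.ACTIVE, CallState.DISCONNECTED),
    CallState.DISCONNECTED to emptySet(),
)

fun canTransition(from: CallState, to: CallState): Boolean =
    to in allowedTransitions.getValue(from)

// Assumption: the AI audio session only runs while the call is ACTIVE.
fun shouldStreamAudio(state: CallState): Boolean = state == CallState.ACTIVE
```

Keeping the table in one place makes it straightforward to log or reject unexpected state changes while debugging.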
- API Key Not Working
  - Verify the key is correct and active
  - Check your Google Cloud project has the Gemini API enabled
  - Ensure you have sufficient API quota
- Permissions Denied
  - Go to Settings > Apps > Gemini Phone AI > Permissions
  - Enable all required permissions manually
- Not Set as Default Dialer
  - Go to Settings > Apps > Default apps > Phone app
  - Select Gemini Phone AI
- Audio Not Working
  - Check device volume settings
  - Ensure microphone is not muted
  - Verify audio permissions are granted
- Connection Failures
  - Check internet connectivity
  - Verify firewall/proxy settings
  - Ensure WebSocket connections are not blocked
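Some connection failures are transient (a brief network drop, for example), and a client can recover from them by reconnecting with increasing delays. The sketch below shows capped exponential backoff. Whether the app already retries is not stated here, and `reconnectDelayMs` with its default values is a hypothetical helper.

```kotlin
// Hypothetical reconnect schedule: exponential backoff with a cap.
// Returns the delay in milliseconds before retry number `attempt` (0-based).
fun reconnectDelayMs(attempt: Int, baseMs: Long = 500, maxMs: Long = 30_000): Long {
    require(attempt >= 0) { "attempt must be non-negative" }
    // Clamp the shift so large attempt counts do not overflow with the default base.
    val factor = 1L shl minOf(attempt, 20)
    return minOf(baseMs * factor, maxMs)
}
```

With the defaults, the delays grow as 500 ms, 1 s, 2 s, 4 s, and so on, and stay at 30 s from the seventh attempt onward.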
- Test on real devices for best results (emulator telephony can be limited)
- Use two devices to test incoming/outgoing calls
- Monitor Logcat for debugging information
- Never commit API keys to version control
- Use ProGuard for release builds
- Implement proper error handling for production
- Consider implementing call recording consent
- Add conversation history storage
- Implement custom AI personalities
- Add multi-language support
- Create call transcription features
- Add voice customization options
- Implement advanced call screening rules
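Advanced call screening rules, listed above as a possible enhancement, could start from a small rule model like the one sketched here. Every name in it (`ScreeningRules`, `ScreeningDecision`, `normalizeNumber`, `screen`) is hypothetical and comes neither from the project nor from `android.telecom`. The sketch assumes blocked numbers are stored in normalized form.

```kotlin
// Hypothetical rule model for call screening; none of these names come
// from the project or from android.telecom.
enum class ScreeningDecision { ALLOW, BLOCK }

data class ScreeningRules(
    val blockedNumbers: Set<String> = emptySet(),   // stored already normalized
    val blockedPrefixes: List<String> = emptyList(),
    val blockHiddenNumbers: Boolean = false,
)

// Keeps only digits and '+', so "+1 (555) 010-0000" and
// "+15550100000" compare equal.
fun normalizeNumber(raw: String): String = raw.filter { it.isDigit() || it == '+' }

fun screen(rawNumber: String?, rules: ScreeningRules): ScreeningDecision {
    // A null or blank number is treated as a withheld caller ID.
    if (rawNumber.isNullOrBlank()) {
        return if (rules.blockHiddenNumbers) ScreeningDecision.BLOCK else ScreeningDecision.ALLOW
    }
    val number = normalizeNumber(rawNumber)
    val blocked = number in rules.blockedNumbers ||
        rules.blockedPrefixes.any { number.startsWith(it) }
    return if (blocked) ScreeningDecision.BLOCK else ScreeningDecision.ALLOW
}
```

A pure function like `screen` can be unit-tested without a device and then called from the screening service.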
The app uses the Gemini Live API for real-time conversation:
- WebSocket endpoint: wss://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-exp:bidiGenerateContent
- Audio format: PCM 16-bit, 16 kHz, mono
- Response modalities: Audio and text
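To make the audio format concrete, the sketch below computes chunk sizes for 16-bit, 16 kHz mono PCM, converts samples to little-endian bytes, and wraps a chunk in a JSON message. The message shape (`realtimeInput` / `mediaChunks` with a Base64 `data` field) is an assumption about the Live API wire format and should be checked against the current API reference. All function names are hypothetical.

```kotlin
// Audio format stated above: 16-bit PCM, 16 kHz, mono.
val SAMPLE_RATE_HZ = 16_000
val BYTES_PER_SAMPLE = 2

// Number of bytes in a chunk of the given duration.
fun chunkSizeBytes(durationMs: Int): Int =
    SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * durationMs / 1000

// Converts 16-bit samples to little-endian bytes, the usual raw PCM layout.
fun toLittleEndianBytes(samples: ShortArray): ByteArray {
    val out = ByteArray(samples.size * BYTES_PER_SAMPLE)
    for ((i, s) in samples.withIndex()) {
        out[2 * i] = (s.toInt() and 0xFF).toByte()              // low byte first
        out[2 * i + 1] = ((s.toInt() shr 8) and 0xFF).toByte()  // then high byte
    }
    return out
}

// Assumed message shape for one audio chunk; verify against the current
// Gemini Live API reference before relying on the field names.
fun realtimeInputJson(pcm: ByteArray): String {
    val data = java.util.Base64.getEncoder().encodeToString(pcm)
    return """{"realtimeInput":{"mediaChunks":[{"mimeType":"audio/pcm;rate=16000","data":"$data"}]}}"""
}
```

At this format, 100 ms of audio is 3,200 bytes, so short chunks keep each WebSocket frame small.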
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
This project is provided as-is for educational purposes. Ensure you comply with:
- Google's Gemini API Terms of Service
- Android platform policies
- Local telecommunications regulations
- Privacy and data protection laws
For issues, questions, or suggestions:
- Open an issue in the repository
- Check the documentation
- Review Google's Gemini API documentation
- Google Gemini API for AI capabilities
- Android Telecom framework
- OkHttp for WebSocket communication
- Material Design components
Note: This app is a demonstration of AI integration with Android telephony. Always ensure compliance with local laws regarding call recording and automated calling systems.