AI Voice Assistant

Description
Installation
1. The backend service
2. The frontend service
Screenshots

Description

An easy-to-use framework for testing large language models and voice recognition systems. It provides a complete system for parsing, analyzing and responding to user voice queries in natural language.

Features

Cross-platform: Supports both web and mobile platforms
Fast communication: Streams the response so that it reaches the user as quickly as possible
Silence removal: Removes silence from user voice recordings before transcription
User-friendly interface: Presents the system's responses in a form that is easy to understand
Easy to customize: Support for new large language models can be added
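
The silence-removal feature can be illustrated with a minimal sketch that uses only the Python standard library. It is not the project's implementation (the Data Flow section names FFmpeg and pydub for this step); the function name `strip_silence`, the frame size, and the RMS threshold are assumptions chosen for illustration.

```python
import math


def strip_silence(
    samples: list[int],
    frame_size: int = 160,
    threshold: float = 500.0,
) -> list[int]:
    """Drop fixed-size frames whose RMS amplitude falls below `threshold`.

    `samples` are signed PCM values. `frame_size` and `threshold` are
    illustrative defaults, not values taken from the project.
    """
    if frame_size <= 0:
        raise ValueError("frame_size must be positive")

    kept: list[int] = []
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        # Root mean square: a simple measure of how loud the frame is.
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        if rms >= threshold:
            kept.extend(frame)
    return kept
```

Dropping whole frames keeps the logic short. A production pipeline would usually keep a little padding around speech so that words are not clipped.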

Architecture

The system is composed of two primary components: a backend service responsible for processing the user's voice input and a frontend service that provides a cross-platform and user-friendly interface for interacting with the system.

Data Flow

The user's voice is recorded on-device by the frontend application and sent to the backend service for further analysis. The backend service removes silence from the recording with FFmpeg and the pydub Python library, transcribes the user's query, and generates a response. The response is then streamed back over WebSockets to the frontend service, which presents it to the user both visually and audibly. The expo-av library is used for text-to-speech playback. The system prioritizes fast communication and a seamless user experience.
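
The streaming step described above can be sketched without any web framework. This is a hedged illustration, not the project's code: `fake_llm_tokens`, `stream_response`, and `demo` are hypothetical names, and the `send` callback stands in for whatever the real WebSocket connection offers for sending a text frame.

```python
import asyncio
from collections.abc import AsyncIterator, Awaitable, Callable


async def fake_llm_tokens(text: str) -> AsyncIterator[str]:
    """Stand-in for a model that yields its answer piece by piece."""
    for word in text.split():
        await asyncio.sleep(0)  # hand control back to the event loop, as real I/O would
        yield word


async def stream_response(
    tokens: AsyncIterator[str],
    send: Callable[[str], Awaitable[None]],
) -> int:
    """Forward each token as soon as it arrives; return how many were sent."""
    count = 0
    async for token in tokens:
        await send(token)
        count += 1
    return count


async def demo() -> list[str]:
    """Collect the streamed chunks in a list instead of a real socket."""
    received: list[str] = []

    async def send(chunk: str) -> None:
        received.append(chunk)

    await stream_response(fake_llm_tokens("hello from the assistant"), send)
    return received
```

Forwarding each chunk as it arrives, instead of waiting for the full answer, lets the frontend start displaying the response early.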

Technologies

External Services

As of June 2024, the system relies on Clarin, a pan-European scientific infrastructure, for transcribing the user's voice queries and generating a response. Voice transcription is performed with Whisper, OpenAI's open-source speech recognition model, while the response is generated with OpenChat, an open-source LLM.

Communication with the Clarin API is based on the OpenAPI 3.0 specification.

Installation

As established above, the project is composed of two primary components: a cross-platform frontend service and a backend service.

To run both services, execute the following command in the project directory:

docker-compose up --build

Please keep in mind that a running instance of Docker Engine 20.10+ is required.

After the services are up and running, you can access the frontend service by navigating to localhost:8081 in your web browser of choice. Additionally, it can be accessed from a mobile device by navigating to {ip}:8081, where ip stands for the local IP address of the machine running the services.
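
To find the address to enter on a mobile device, a small helper can look up the machine's local IP. This is a sketch under assumptions: `local_ip` and `frontend_url` are hypothetical helper names that are not part of the project, and the result may point at the wrong interface on machines with VPNs or several network adapters.

```python
import socket


def local_ip(fallback: str = "127.0.0.1") -> str:
    """Best-effort guess of this machine's LAN address."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            # connect() on a UDP socket sends nothing; it only selects a route.
            sock.connect(("192.0.2.1", 80))  # documentation-only address, never contacted
            return sock.getsockname()[0]
        except OSError:
            # No usable route (for example, no network): fall back to loopback.
            return fallback


def frontend_url(ip: str, port: int = 8081) -> str:
    """Build the frontend URL; 8081 is the port given in this README."""
    return f"http://{ip}:{port}"


if __name__ == "__main__":
    print(frontend_url(local_ip()))
```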

The backend service

Note

To start the backend service, you need to set the CLARIN_API_KEY environment variable to your Clarin key. Please refer to the provided .env.example file for more details. A valid environment file should be named .env and placed in the root directory of the project.

Built with FastAPI and Python 3.12
The service is available on port 8000
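
A service that depends on an API key usually benefits from failing fast when the key is missing. The sketch below is an assumption about how such a check could look, not the project's actual startup code: `read_env_file` and `require_clarin_key` are hypothetical helpers, and the parser handles only plain `KEY=value` lines.

```python
import os
from collections.abc import Mapping


def read_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Remove optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_clarin_key(env: Mapping[str, str] = os.environ) -> str:
    """Return CLARIN_API_KEY or raise a clear error if it is unset or empty."""
    key = env.get("CLARIN_API_KEY", "").strip()
    if not key:
        raise RuntimeError("CLARIN_API_KEY is not set; see .env.example")
    return key
```

Calling `require_clarin_key()` once at startup surfaces a missing key immediately, rather than on the first transcription request.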

The frontend service

Note

To start the frontend service, you need to set the EXPO_PUBLIC_API_URL environment variable to the address of your running backend service instance. Please refer to the provided .env.example file for more details. A valid environment file should be named .env and placed in the root directory of the project.

Warning

Currently, the web browser experience is only supported on Chrome version 125 and above.

Built with Expo and TypeScript
The service is available on port 8081

Screenshots

The screenshots show an example question and response, response parsing, and choosing a large language model.
