Multimodal Data Analysis Using OpenAI

Overview

The "Multimodal Data Analysis Using OpenAI" project provides a comprehensive framework for analyzing and processing text, audio, and image data via OpenAI's advanced models. It serves as a versatile toolset for executing complex data processing tasks including text analysis, audio transcription, and image interpretation with streamlined database interaction. This project enables users to leverage the power of OpenAI models for advanced data analysis, making it an ideal solution for a wide range of applications, from research and development to business intelligence and beyond.

Key Features

Multimodal Data Handling: Comprehensive support for text, audio, and image data modalities.
OpenAI Integration: Harnesses the power of OpenAI models for advanced data analysis.
Database Management: Efficient data management and retrieval powered by robust database utilities.
Text Analysis: Utilize openai_text.py for sentiment analysis of text or CSV files.
Audio Processing: Leverage openai_audio.py to transcribe, translate, or analyze sentiment of audio files.
Image Analysis: Use openai_image.py for captioning or answering questions about images.
Database Utilities: Query a SQLite database using natural language with openai_database.py.

Architecture / How It Works

The project architecture is based on a microservices approach, where each modality (text, audio, image) is handled by a separate module. The main.py script serves as the entry point, orchestrating the different modules and providing a unified interface for the user.

+---------------+
|  main.py     |
+---------------+
         |
         |
         v
+---------------+
|  openai_text  |
|  openai_audio|
|  openai_image|
+---------------+
         |
         |
         v
+---------------+
|  openai_database|
+---------------+

Project Structure

./
├── .env
├── .python-version
├── README.md
├── db_utils.py
├── main.py
├── openai_audio.py
├── openai_database.py
├── openai_image.py
├── openai_text.py
├── pyproject.toml
├── sync_hooks.sh
├── data/
│   ├── README.md
│   ├── data.db
│   ├── harvard.wav
├── agents/
│   ├── README.md
│   ├── agents/readme_agents/
│   │   ├── README.md
│   │   ├── generate_readme.py
│   │   ├── pre_commit_readme.py

Prerequisites

Python 3.14+
OpenAI API key
pip package manager
uv package (for MCP server)

Installation & Setup

# Clone the repository
git clone https://github.com/RampageousRJ/Multimodal-Data-Analysis-Using-OpenAI.git

# Navigate to the project directory
cd Multimodal-Data-Analysis-Using-OpenAI

# Create a virtual environment
python -m venv .venv

# Activate the virtual environment
source .venv/bin/activate

# Install the required dependencies
pip install -r requirements.txt

# Run the sync_hooks.sh script to set up the MCP server
./sync_hooks.sh

Configuration

Environment Variable	Description
`OPENAI_API_KEY`	OpenAI API key
`MCP_SERVER_PORT`	MCP server port
`MCP_SERVER_HOST`	MCP server host

Usage

Text Analysis

# Analyze single text
python openai_text.py -text "This is a great product!"

# Analyze CSV file (must have 'text' column)
python openai_text.py -path data/reviews.csv

Audio Processing

# Transcribe audio
python openai_audio.py -path data/audio.wav

# Translate audio to target language
python openai_audio.py -path data/audio.wav -translate Spanish

# Analyze sentiment of audio content
python openai_audio.py -path data/audio.wav -sentiment

Image Analysis

# Caption a local image
python openai_image.py -image_path data/image.jpg

# Answer a question about a local image
python openai_image.py -image_path data/image.jpg -question "What color is the car?"

# Analyze an image from URL
python openai_image.py -image_url https://example.com/image.jpg

Database Utilities

python openai_database.py -path data/data.db -question "Show me all users from New York"

Contributing

Contributions are welcome! Please submit a pull request with your changes and a brief description of what you've added or fixed.

License

This project is licensed under the MIT License.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Multimodal Data Analysis Using OpenAI

Overview

Key Features

Architecture / How It Works

Project Structure

Prerequisites

Installation & Setup

Configuration

Usage

Text Analysis

Audio Processing

Image Analysis

Database Utilities

Contributing

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
agents		agents
data		data
.gitignore		.gitignore
README.md		README.md
db_utils.py		db_utils.py
main.py		main.py
openai_audio.py		openai_audio.py
openai_database.py		openai_database.py
openai_image.py		openai_image.py
openai_text.py		openai_text.py
pyproject.toml		pyproject.toml
sync_hooks.sh		sync_hooks.sh
uv.lock		uv.lock

Folders and files

Latest commit

History

Repository files navigation

Multimodal Data Analysis Using OpenAI

Overview

Key Features

Architecture / How It Works

Project Structure

Prerequisites

Installation & Setup

Configuration

Usage

Text Analysis

Audio Processing

Image Analysis

Database Utilities

Contributing

License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages