A Python tool for processing CSV files and Google Sheets data with the Perplexity and OpenAI APIs. It runs either as a command-line tool (CLI mode) or as an HTTP server (API mode), so it can be used interactively or called from other programs.
- Process CSV files and Google Sheets
- Support for both Perplexity and OpenAI APIs
- Interactive CLI mode
- RESTful API mode
- Asynchronous processing for better performance
- Column selection and custom prompts
- Automatic output column naming
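The asynchronous processing and automatic output column naming listed above can be sketched as follows. This is a minimal illustration, not the tool's actual implementation: `fake_llm_call`, `process_rows`, `default_output_column`, and the `max_concurrency` parameter are all hypothetical names, and the naming rule (first three words of the prompt, with a numeric suffix on clashes) is an assumption.

```python
import asyncio
import re
from typing import Dict, List


def default_output_column(prompt: str, existing: List[str]) -> str:
    """Derive a column name from the prompt (assumed rule) and avoid clashes."""
    words = re.findall(r"[a-z0-9]+", prompt.lower())[:3]
    base = "_".join(words) or "result"
    name, suffix = base, 2
    while name in existing:
        name = f"{base}_{suffix}"
        suffix += 1
    return name


async def fake_llm_call(text: str) -> str:
    """Hypothetical stand-in for a Perplexity or OpenAI request."""
    await asyncio.sleep(0.01)  # simulate network latency
    return text.upper()


async def process_rows(
    rows: List[Dict[str, str]],
    columns: List[str],
    prompt: str,
    max_concurrency: int = 5,
) -> List[str]:
    """Process all rows concurrently, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def process_one(row: Dict[str, str]) -> str:
        # Only the selected columns are passed along with the prompt.
        context = ", ".join(f"{c}: {row[c]}" for c in columns)
        async with semaphore:
            return await fake_llm_call(f"{prompt} | {context}")

    # gather() returns results in the same order as the input rows.
    return await asyncio.gather(*(process_one(r) for r in rows))


if __name__ == "__main__":
    rows = [{"name": "Ada", "city": "London"}, {"name": "Linus", "city": "Helsinki"}]
    print(asyncio.run(process_rows(rows, ["name"], "Summarize")))
    print(default_output_column("Summarize the company", ["name", "city"]))
```

The semaphore caps the number of in-flight requests, which is one common way to stay under an API rate limit while still overlapping network waits.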
- Clone the repository:
git clone <repository-url>
cd <repository-name>
- Create and activate a virtual environment:
python -m venv env
source env/bin/activate # On Windows: env\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
Create a .env file with your API keys and Google Sheets credentials:
PERPLEXITY_API_KEY=your_perplexity_api_key
OPENAI_API_KEY=your_openai_api_key
GOOGLE_SHEETS_CREDENTIALS_PATH=path/to/your/credentials.json
- Interactive mode:
python -m src
- Command line mode:
python -m src -i "https://docs.google.com/spreadsheets/d/YOUR_SHEET_ID" -c "col1,col2" -p "Your prompt"
Options:
- -i, --input: Input CSV file or Google Sheets URL
- -c, --columns: Comma-separated list of columns to use
- -p, --prompt: Prompt to process
- --credentials: Path to Google Sheets credentials JSON file
- --openai: Use OpenAI instead of Perplexity
- --output-column-name: Name for the output column
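For readers who want to see how these options could map onto code, here is a minimal `argparse` sketch. The flag names and help texts come from the list above; everything else is an assumption for illustration, including the helper names `build_parser`, `split_columns`, and `sheet_id_from_input`, the fallback from `--credentials` to the `GOOGLE_SHEETS_CREDENTIALS_PATH` environment variable, and the URL pattern used to tell a Google Sheets URL from a CSV path.

```python
import argparse
import os
import re
from typing import List, Optional

# Assumed pattern: the sheet ID is the path segment after /spreadsheets/d/.
SHEETS_URL = re.compile(r"^https://docs\.google\.com/spreadsheets/d/([A-Za-z0-9_-]+)")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the documented options (sketch only)."""
    parser = argparse.ArgumentParser(prog="python -m src")
    parser.add_argument("-i", "--input", help="Input CSV file or Google Sheets URL")
    parser.add_argument("-c", "--columns", help="Comma-separated list of columns to use")
    parser.add_argument("-p", "--prompt", help="Prompt to process")
    parser.add_argument(
        "--credentials",
        # Assumption: fall back to the value configured in .env / the environment.
        default=os.environ.get("GOOGLE_SHEETS_CREDENTIALS_PATH"),
        help="Path to Google Sheets credentials JSON file",
    )
    parser.add_argument("--openai", action="store_true",
                        help="Use OpenAI instead of Perplexity")
    parser.add_argument("--output-column-name", help="Name for the output column")
    return parser


def split_columns(raw: str) -> List[str]:
    """Turn 'col1, col2' into ['col1', 'col2'], ignoring empty entries."""
    return [c.strip() for c in raw.split(",") if c.strip()]


def sheet_id_from_input(source: str) -> Optional[str]:
    """Return the sheet ID for a Google Sheets URL, or None for anything else."""
    match = SHEETS_URL.match(source)
    return match.group(1) if match else None


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

A `None` result from `sheet_id_from_input` would then mean the input should be treated as a local CSV path.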
- Start the server:
python -m src api
- API Endpoints:
  - POST /initialize: Initialize sheet connection
  - GET /columns: Get available columns
  - POST /process: Process data with configuration
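The JSON bodies accepted by the two POST endpoints can be modeled on the client side with small dataclasses. This is a hedged sketch: the field names are taken from the request bodies in this README's usage example, while the class names `InitializeRequest` and `ProcessRequest`, the `to_payload` helper, and the default values are assumptions and may not match the server's own models.

```python
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class InitializeRequest:
    """Body for POST /initialize (hypothetical client-side model)."""
    input_source: str
    credentials_path: Optional[str] = None

    def to_payload(self) -> dict:
        # Omit unset optional fields so the server can apply its own defaults.
        return {k: v for k, v in asdict(self).items() if v is not None}


@dataclass
class ProcessRequest:
    """Body for POST /process (hypothetical client-side model)."""
    columns: List[str]
    prompt: str
    use_openai: bool = False  # assumed default: Perplexity
    output_column_name: Optional[str] = None

    def to_payload(self) -> dict:
        return {k: v for k, v in asdict(self).items() if v is not None}


if __name__ == "__main__":
    request = ProcessRequest(columns=["col1", "col2"], prompt="Your prompt")
    print(request.to_payload())
```

The resulting dictionaries can be passed as the `json=` argument of `requests.post`.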
Example API usage:
import requests
# Initialize sheet
response = requests.post("http://localhost:8000/initialize", json={
"input_source": "https://docs.google.com/spreadsheets/d/YOUR_SHEET_ID",
"credentials_path": "path/to/credentials.json"
})
# Get columns
columns = requests.get("http://localhost:8000/columns").json()
# Process data
response = requests.post("http://localhost:8000/process", json={
"columns": ["col1", "col2"],
"prompt": "Your prompt",
"use_openai": False,
"output_column_name": "result"
})
The project structure is organized as follows:
src/
├── __init__.py # Main entry point
├── clients/ # API clients
├── models/ # Data models
├── services/ # Business logic
├── utils/ # Utilities
├── cli/ # CLI interface
└── api/ # API interface
MIT License