A Model Context Protocol (MCP) server for loading, analyzing, and visualizing CSV data. This server enables loading CSV files into memory, performing various types of data analysis through natural language queries, and creating data visualizations.
- Load CSV files with configurable options
- Automatic detection of column types
- Support for date column parsing
- Metadata storage for loaded datasets
- Mean/average calculations
- Median calculations
- Standard deviation
- Quantile/percentile analysis
- Distribution analysis
- Time series analysis
- Trend detection
- Monthly averages for date columns
- Correlation matrices for numerical columns
- Relationship detection between variables
- Value counts and distributions
- Category frequency analysis
- Missing value detection
- Outlier detection using IQR method
- Data completeness reporting
- Bar charts
- Line charts
- Scatter plots
- Pie charts
- Heatmaps
- Box plots
- Histograms
- Group-by functionality for aggregated visualizations
- Multiple column support
- Customizable options (bins, titles, etc.)
- JSON-based chart data format for easy integration
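The IQR method named in the feature list flags values that lie more than 1.5 times the interquartile range below the first quartile or above the third. The sketch below is only an illustration, not the server's code: it uses the standard library instead of pandas, and the function name `iqr_outliers`, the 1.5 multiplier, and the linear-interpolation quartiles are all assumptions.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # method="inclusive" interpolates linearly between data points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


prices = [10, 12, 11, 13, 12, 11, 95]
outliers = iqr_outliers(prices)
print(outliers)  # [95]
```

Here Q1 is 11 and Q3 is 12.5, so the fences sit at 8.75 and 14.75 and only 95 falls outside them.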
- Clone the repository:
git clone <repository-url>
cd data-parser
- Install dependencies using Poetry:
poetry install
Add the server to your Claude Desktop configuration:
{
  "mcpServers": {
    "mcp-server-data-parser": {
      "command": "/Path/to/.local/bin/uv",
      "args": [
        "--directory",
        "path/to/directory",
        "run",
        "mcp-server-data-parser"
      ]
    }
  }
}
load-csv: Loads a CSV file into memory for analysis.
Parameters:
- file_path: Path to the CSV file
- dataset_name: Name to reference this dataset
- options: (Optional)
  - separator: CSV separator (default: ",")
  - encoding: File encoding (default: "utf-8")
  - skiprows: Number of rows to skip (default: 0)
  - date_columns: List of column names to parse as dates
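To make the options concrete, here is a rough sketch of how they could shape parsing. It is not the server's implementation: the server depends on pandas, whereas this uses only the standard library. The helper `load_rows` is hypothetical, and it assumes dates arrive in ISO 8601 form.

```python
import csv
import io
from datetime import datetime


def load_rows(raw, separator=",", encoding="utf-8", skiprows=0, date_columns=()):
    """Decode raw CSV bytes and return a list of dict rows."""
    stream = io.StringIO(raw.decode(encoding), newline="")
    # Drop leading lines (comments, banners) before the header row.
    for _ in range(skiprows):
        next(stream, None)
    rows = []
    for row in csv.DictReader(stream, delimiter=separator):
        for col in date_columns:
            # Assumption: ISO 8601 dates such as 2024-01-05.
            row[col] = datetime.fromisoformat(row[col])
        rows.append(row)
    return rows


raw = b"# exported report\ntransaction_date;price\n2024-01-05;3.5\n"
rows = load_rows(raw, separator=";", skiprows=1, date_columns=["transaction_date"])
```

Columns not listed in `date_columns` stay as strings in this sketch; automatic type detection would be a separate step.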
Example:
await call_tool("load-csv", {
    "file_path": "data.csv",
    "dataset_name": "my_data",
    "options": {
        "date_columns": ["transaction_date"]
    }
})
analyze-data: Analyzes a loaded dataset based on natural language questions.
Parameters:
- dataset_name: Name of the dataset to analyze
- question: Question about the data to analyze
- columns: (Optional) Specific columns to analyze
- group_by: (Optional) Column to group by for analysis
Example:
await call_tool("analyze-data", {
    "dataset_name": "my_data",
    "question": "What is the correlation between numeric columns?",
    "columns": ["price", "quantity", "total"],
    "group_by": "category"
})
visualize-data: Creates various types of data visualizations.
Parameters:
- dataset_name: Name of the dataset to visualize
- visualization_type: Type of visualization to create
  - Options: "bar", "line", "scatter", "pie", "heatmap", "boxplot", "histogram"
- columns: List of columns to include in visualization
- group_by: (Optional) Column to group by
- options: (Optional)
  - bins: Number of bins for histogram
  - title: Chart title
Examples:
# Bar chart
await call_tool("visualize-data", {
    "dataset_name": "sales_data",
    "visualization_type": "bar",
    "columns": ["revenue"],
    "group_by": "region"
})
# Scatter plot
await call_tool("visualize-data", {
    "dataset_name": "sales_data",
    "visualization_type": "scatter",
    "columns": ["price", "quantity"]
})
# Histogram
await call_tool("visualize-data", {
    "dataset_name": "sales_data",
    "visualization_type": "histogram",
    "columns": ["revenue"],
    "options": {
        "bins": 30,
        "title": "Revenue Distribution"
    }
})
- Bar Charts
  - Compare values across categories
  - Show distributions of categorical data
  - Display aggregated values by group
- Line Charts
  - Show trends over time
  - Track changes in metrics
  - Compare multiple series
- Scatter Plots
  - Identify correlations between variables
  - Spot patterns and clusters
  - Detect outliers
- Pie Charts
  - Show composition of a whole
  - Display percentage distributions
  - Compare parts of a total
- Heatmaps
  - Visualize correlation matrices
  - Show patterns in dense data
  - Display cross-tabulations
- Box Plots
  - Show distribution characteristics
  - Identify outliers
  - Compare distributions across groups
- Histograms
  - View data distributions
  - Identify patterns and skewness
  - Check for normality
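As an illustration of a grouped bar chart returned as JSON, the sketch below aggregates one column by a grouping column and serializes the result. Everything specific in it is an assumption: the helper `bar_chart_data` is hypothetical, each bar is taken to be the group mean, and the payload keys (`type`, `title`, `labels`, `values`) are invented for the example and may not match the server's actual chart format.

```python
import json
from collections import defaultdict


def bar_chart_data(rows, column, group_by, title=""):
    """Aggregate `column` per `group_by` value and return a JSON string."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_by]].append(float(row[column]))
    labels = sorted(groups)
    # Assumption: each bar shows the mean of its group.
    values = [sum(groups[key]) / len(groups[key]) for key in labels]
    # Hypothetical payload shape, chosen only for this example.
    payload = {"type": "bar", "title": title, "labels": labels, "values": values}
    return json.dumps(payload)


rows = [
    {"region": "north", "revenue": 100},
    {"region": "south", "revenue": 40},
    {"region": "north", "revenue": 300},
]
chart = json.loads(bar_chart_data(rows, "revenue", "region", title="Revenue by region"))
```

A JSON payload like this leaves rendering to the client, which can pass the labels and values to whatever charting library it prefers.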
The server understands various types of analysis questions including:
- Statistical Analysis
  - "What is the mean of the numeric columns?"
  - "Show me the median values"
  - "Calculate the standard deviation"
- Distribution Analysis
  - "Show me the distribution of values"
  - "What are the quantiles?"
  - "Show me the percentiles"
- Temporal Analysis
  - "Show me the trend over time"
  - "What is the time series pattern?"
  - "How does it change over time?"
- Correlation Analysis
  - "What is the correlation between columns?"
  - "Show me how variables correlate"
- Category Analysis
  - "Show me category distributions"
  - "What are the categorical counts?"
- Data Quality
  - "Show me missing values"
  - "Find outliers in the data"
  - "How many null values are there?"
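One simple way to route questions like these to an analysis type is keyword matching. The sketch below is an assumption about how such routing could work, not a description of the server's logic; the `RULES` table and `classify_question` are hypothetical names, and the keywords are taken from the example questions above.

```python
from typing import Optional

# Ordered rules: the first category with a matching keyword wins, so
# "category distributions" maps to "category", not "distribution".
RULES = [
    ("data_quality", ("missing", "null", "outlier")),
    ("correlation", ("correlat",)),
    ("temporal", ("trend", "time series", "over time")),
    ("category", ("categor",)),
    ("distribution", ("distribution", "quantile", "percentile")),
    ("statistical", ("mean", "average", "median", "standard deviation")),
]


def classify_question(question: str) -> Optional[str]:
    """Return the first matching analysis category, or None."""
    text = question.lower()
    for category, keywords in RULES:
        if any(keyword in text for keyword in keywords):
            return category
    return None


kind = classify_question("How does it change over time?")
```

Substring matching is deliberately crude: it handles the listed phrasings, but a question that matches no keyword returns `None` and needs a fallback.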
- Python 3.9+
- pandas
- numpy
- matplotlib
- seaborn
- MCP SDK
To contribute or modify the server:
- Fork the repository
- Create a feature branch
- Add tests for new features
- Submit a pull request
[Your chosen license]
[Your contact information]