This Python application creates an interactive web-based visualization tool for exploring atmospheric aerosol properties from PACE (Plankton, Aerosol, Cloud, ocean Ecosystem). The PACE instruments are HARP2 (polarimeter), SPEXone (polarimeter), and OCI; the aerosol properties come from the Microphysical Aerosol Properties from Polarimetry (MAPP) retrieval framework (Stamnes et al., 2023).
The tool displays three interconnected plots that help users analyze the spatial distribution of aerosol properties and the "quality" of the retrieval, that is, how well the retrieval models the actual measurements of Intensity and Degree of Linear Polarization (DoLP).
The application creates a dashboard with three main components:
- Geographic Map (left): Shows aerosol properties overlaid on a map
- Intensity vs. Viewing Angle Plot (middle): Displays how light intensity varies with viewing angle
- Degree of Linear Polarization (DoLP) vs. Viewing Angle Plot (right): Shows polarization measurements
When you click on a point on the map, the other two plots update to show the measured/modeled Intensity/DoLP for that specific location.
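For readers new to polarimetry, DoLP is conventionally defined from the Stokes parameters I, Q, and U as sqrt(Q^2 + U^2) / I. The sketch below only illustrates that definition; the function name is hypothetical, and how the retrieval files actually store these quantities is not stated here.

```python
import numpy as np

def degree_of_linear_polarization(i, q, u):
    """Return DoLP = sqrt(Q^2 + U^2) / I, with NaN where I is not positive."""
    i = np.asarray(i, dtype=float)
    q = np.asarray(q, dtype=float)
    u = np.asarray(u, dtype=float)
    dolp = np.full(i.shape, np.nan)
    valid = i > 0  # avoid dividing by zero or negative intensity
    dolp[valid] = np.sqrt(q[valid] ** 2 + u[valid] ** 2) / i[valid]
    return dolp

# Three made-up samples; the last has zero intensity and yields NaN.
example = degree_of_linear_polarization([1.0, 2.0, 0.0],
                                        [0.3, 0.0, 0.1],
                                        [0.4, 1.0, 0.1])
print(example)
```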
- Dash: A Python web framework for building interactive web applications
- Plotly: Creates interactive plots and visualizations
- h5py: Reads HDF5 scientific data files
- NumPy: Handles numerical computations and array operations
Before running this code, create a virtual environment:
python3 -m venv myenv
Activate the environment:
source myenv/bin/activate
Install the required Python packages from requirements.txt:
pip install -r requirements.txt
Run the application from the command line, passing the directory that contains your data files:
python plotPACEMAPP_plotly.py --directory '/path/to/your/data/files'
Example: running with the provided testData:
python plotPACEMAPP_plotly.py --directory './testData'
The directory should contain .h5 or .nc files with PACE-MAPP retrieval results.
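To check in advance which files in a directory would qualify, a short standalone script can list them. This is a minimal sketch, assuming files are matched by extension only; list_retrieval_files is a hypothetical helper, not the tool's own scan_directory_for_files().

```python
from pathlib import Path

def list_retrieval_files(directory, extensions=(".h5", ".nc")):
    """Return sorted paths of files in `directory` with a matching suffix."""
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(f"Not a directory: {directory}")
    # Only the top level is scanned; subdirectories are ignored in this sketch.
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )
```

Calling list_retrieval_files('./testData') returns a list of Path objects, which is empty when no matching file is present.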
The code is organized into several key sections:
import numpy as np
import h5py
import plotly.graph_objects as go
from dash import Dash, dcc, html, Input, Output, State
These import the necessary libraries for data handling, plotting, and web interface creation.
- scan_directory_for_files(): Finds all .nc or .h5 data files in a directory
- find_nearest_point(): Locates the closest data point to clicked coordinates
- determine_retrieval_scenario(): Identifies which instruments were used in the retrieval
- read_hdf5_variables(): Loads data from HDF5 files into Python dictionaries
- filter_by_cost(): Removes poor-quality data points based on cost function values
- get_channel_intensity_dolp_vza(): Extracts optical measurements for specific locations
- create_scatter_plot_only(): Creates the geographic map with aerosol properties
- create_combined_intensity_dolp_plots(): Creates the intensity and DoLP plots
- create_export_figure(): Prepares a plot for .png export
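As an illustration of the nearest-point lookup, the sketch below finds the grid cell closest to a clicked latitude/longitude. It is an assumption about how such a search could work, not the body of find_nearest_point(); nearest_point_index and its arguments are hypothetical names.

```python
import numpy as np

def nearest_point_index(lat2d, lon2d, click_lat, click_lon):
    """Return (row, col) of the grid cell closest to the clicked location."""
    # Squared distance in degrees: adequate for picking a neighbour inside
    # one scene, but it is not a great-circle distance.
    dist2 = (lat2d - click_lat) ** 2 + (lon2d - click_lon) ** 2
    # nanargmin skips NaN cells; it raises ValueError if every cell is NaN.
    flat = np.nanargmin(dist2)
    return np.unravel_index(flat, dist2.shape)

# Made-up 3 x 2 grid: latitude varies along rows, longitude along columns.
lat2d, lon2d = np.meshgrid(np.array([10.0, 11.0, 12.0]),
                           np.array([-50.0, -49.0]), indexing="ij")
idx = nearest_point_index(lat2d, lon2d, 11.2, -49.1)
print(idx)
```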
Defines the HTML structure and styling for the web interface, including:
- Dropdown menus for selecting retrieval file and aerosol properties
- Three-column layout for the plots
- Controls for cost filtering data
These functions handle user interactions:
- When you select a different aerosol property, the map updates
- When you click a point on the map, the Intensity/DoLP plots populate
- When you change the cost filter, pixels that do not meet the criterion are removed
run_app(): Sets up and launches the web application
- Command-line argument parsing for specifying the data directory
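The command-line parsing can be pictured with the standard argparse module. This is a sketch only: the --directory flag comes from the usage shown earlier, while making it required, the help text, and the parse_args name are assumptions.

```python
import argparse

def parse_args(argv=None):
    """Parse the data directory from the command line (or from `argv`)."""
    parser = argparse.ArgumentParser(
        description="Explore PACE-MAPP retrievals in the browser")
    # Whether the real script marks this flag as required is an assumption.
    parser.add_argument("--directory", required=True,
                        help="Directory containing .h5 or .nc retrieval files")
    return parser.parse_args(argv)

# Passing a list makes the parser testable without touching sys.argv.
args = parse_args(["--directory", "./testData"])
print(args.directory)
```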
Aerosols are tiny particles suspended in the atmosphere (like dust, smoke, or sea salt). We are interested in their properties because they:
- Affect climate by scattering and absorbing sunlight
- Impact air quality and human health
- Interact with clouds
PACE: The instruments measure the intensity and polarization of light at the top of the atmosphere at multiple wavelengths and viewing angles.
PACE-MAPP: The retrieval algorithm that infers aerosol properties from the above measurements.
The tool displays various aerosol properties:
- Aerosol Optical Depth (AOD): The extinction of light due to aerosols (higher AOD = more extinction)
- Single Scattering Albedo: How much light aerosols scatter vs. absorb
- Asymmetry Parameter: Describes the directional scattering pattern
- Refractive Index: Optical properties of the aerosol material
- Size Distribution: Range of particle sizes present
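Two of these quantities have simple textbook definitions that a few lines of Python can make concrete: single scattering albedo is scattering divided by total extinction (scattering plus absorption), and the direct-beam transmittance through a plane-parallel layer of optical depth AOD is exp(-AOD / cos(zenith angle)). The function names below are hypothetical and the numbers are made up; this is not how the retrieval computes its outputs.

```python
import math

def single_scattering_albedo(scattering, absorption):
    """Fraction of extinction that is due to scattering (0 to 1)."""
    extinction = scattering + absorption
    if extinction <= 0:
        raise ValueError("extinction must be positive")
    return scattering / extinction

def direct_transmittance(aod, solar_zenith_deg=0.0):
    """Beer-Lambert direct-beam transmittance for a plane-parallel layer."""
    return math.exp(-aod / math.cos(math.radians(solar_zenith_deg)))

print(single_scattering_albedo(0.9, 0.1))
print(direct_transmittance(0.2, solar_zenith_deg=30.0))
```

A higher AOD or a more oblique path both reduce the directly transmitted light, which matches the "higher AOD = more extinction" reading above.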
The cost function indicates how well the retrieval algorithm performed:
- Lower values = higher confidence in the results
- Higher values = more uncertainty
- The tool allows filtering out high-cost (low-quality) retrievals
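A cost filter of this kind can be sketched with a boolean mask. The example assumes that pixels above the threshold are set to NaN so they disappear from the plot; mask_by_cost is a hypothetical name, and the tool's filter_by_cost() may behave differently.

```python
import numpy as np

def mask_by_cost(values, cost, max_cost):
    """Copy of `values` with NaN where cost is above max_cost or not finite."""
    values = np.asarray(values, dtype=float)
    cost = np.asarray(cost, dtype=float)
    keep = np.isfinite(cost) & (cost <= max_cost)
    return np.where(keep, values, np.nan)

# Made-up 2 x 2 scene: one high-cost pixel and one pixel with missing cost.
aod = np.array([[0.1, 0.2], [0.3, 0.4]])
cost = np.array([[0.5, 3.0], [np.nan, 1.0]])
filtered = mask_by_cost(aod, cost, max_cost=1.0)
print(filtered)
```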
- Left Panel (25% width): Controls and information
  - File selector dropdown
  - Aerosol property selector
  - Cost function filter slider
  - Clicked point information
  - Properties table
- Middle Panel (31% width): Geographic scatter map
  - Each point represents a measurement location
  - Colors indicate aerosol property values
  - Click points to see detailed measurements
- Right Panel (42% width): Intensity and DoLP plots
  - Shows how measurements vary with viewing angle
  - Updates when you click on map points
  - Helps validate retrieval quality
- Click on map points: Updates the intensity/DoLP plots for that location
- Change aerosol property: Updates the map colors and data display
- Adjust cost filter: Hides low-quality data points
- Export functionality: Save plots as images
Callbacks are functions that automatically run when users interact with the interface:
@app.callback(
    Output('aerosol-plot', 'figure'),    # What gets updated
    Input('property-selector', 'value')  # What triggers the update
)
def update_plot(selected_property):
    # Function that runs when property changes
    return new_figure
The code uses several important patterns:
- Dictionary-based data storage: All arrays stored in data_dict
- 2D array indexing: Geographic data stored as [latitude_index, longitude_index]
- Masking for quality control: Using boolean arrays to filter data
- Flattening for plotting: Converting 2D geographic grids to 1D arrays
The code includes extensive error checking:
- File existence validation
- Data shape verification
- Graceful handling of missing data
- Debug print statements for troubleshooting
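A shape check of the kind listed above might look like the following. It is a sketch under the assumption that the geolocation and property arrays must share one grid shape; check_same_shape and the key names are hypothetical.

```python
import numpy as np

def check_same_shape(data_dict, keys):
    """Raise if a key is missing or shapes differ; return the common shape."""
    missing = [k for k in keys if k not in data_dict]
    if missing:
        raise KeyError(f"Missing variables: {missing}")
    shapes = {k: np.shape(data_dict[k]) for k in keys}
    if len(set(shapes.values())) != 1:
        raise ValueError(f"Shape mismatch: {shapes}")
    return next(iter(shapes.values()))

example = {name: np.zeros((2, 3)) for name in ("latitude", "longitude", "aod")}
print(check_same_shape(example, ["latitude", "longitude", "aod"]))
```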
To display additional variables from the data files:
- Find the variable name in the HDF5 file structure
- Add it to the reading function in read_hdf5_variables()
- Update the dropdown options in create_dropdown_options()
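Dash dropdowns accept their options as a list of dictionaries with label and value keys, so adding a variable usually means adding one more entry. The helper below is a hypothetical stand-in that shows the format; the real create_dropdown_options() may take different arguments.

```python
def build_dropdown_options(variable_labels):
    """Turn {variable_name: display_label} into Dash dropdown options."""
    return [{"label": label, "value": name}
            for name, label in variable_labels.items()]

# The variable names here are made up for illustration.
options = build_dropdown_options({
    "aod": "Aerosol Optical Depth",
    "ssa": "Single Scattering Albedo",
})
print(options)
```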
To modify how the geographic data appears:
- Edit the create_scatter_plot_only() function
- Change color scales by modifying the colorscale parameter
- Adjust point sizes by changing the size parameter
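In Plotly, both settings live in the marker dictionary of a scatter trace. The sketch below builds such a dictionary in plain Python; make_marker is a hypothetical helper, and which trace type create_scatter_plot_only() actually uses is not assumed here.

```python
def make_marker(values, colorscale="Viridis", size=6, title=""):
    """Build a Plotly marker dict that colors points by `values`."""
    return {
        "color": list(values),       # one color value per point
        "colorscale": colorscale,    # a named Plotly scale, e.g. "Viridis"
        "size": size,                # marker size in pixels
        "showscale": True,           # draw the colorbar
        "colorbar": {"title": {"text": title}},
    }

marker = make_marker([0.1, 0.3, 0.4], colorscale="Plasma", size=8, title="AOD")
print(marker)
```

The resulting dictionary can be passed as the marker argument of a Plotly scatter trace.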
To create additional visualization panels:
- Create a new plotting function following the pattern of existing ones
- Add the plot to the layout in the app layout section
- Create callbacks to handle user interactions
To change how data quality filtering works:
- Edit the filter_by_cost() function
- Adjust threshold values for what constitutes "good" data
- Add new filtering criteria beyond just the cost function
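One way to add criteria is to express each one as a boolean mask and AND them together. This is a sketch, assuming the extra criterion is a plausible-range check on AOD; combine_masks and the thresholds are illustrative, not part of the tool.

```python
import numpy as np

def combine_masks(*masks):
    """AND together boolean masks of identical shape."""
    return np.logical_and.reduce([np.asarray(m, dtype=bool) for m in masks])

cost = np.array([0.5, 3.0, 0.8, 0.2])
aod = np.array([0.1, 0.2, 7.0, 0.4])

# Keep pixels with low cost AND an AOD inside an assumed plausible range.
keep = combine_masks(cost <= 1.0, (aod >= 0.0) & (aod <= 5.0))
print(keep)
```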
- "No module named" errors: Install missing packages with pip
- File not found: Check that your data directory path is correct
- Empty plots: Verify your data files contain the expected variables
- Slow performance: Reduce the number of data points or increase filtering
The code includes a debug variable at the top. Set debug = 2 for verbose output that shows:
- Data loading progress
- Array shapes and sizes
- Processing steps
- Error details
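A level-gated print helper is one common way to implement this kind of switch. The sketch below assumes that higher numbers mean more output; debug_print is a hypothetical helper, and the real code may simply test the debug variable inline.

```python
debug = 2  # assumed to mirror the module-level variable described above

def debug_print(level, *args):
    """Print only when the global debug level is at least `level`."""
    if debug >= level:
        print(*args)

debug_print(1, "Loading file")        # shown at debug >= 1
debug_print(2, "Array shape:", (2, 3))  # shown at debug >= 2
debug_print(3, "Not shown at level 2")
```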
Add print statements to understand data flow:
print(f"Data shape: {data_array.shape}")
print(f"Min/max values: {np.min(data_array):.3f} / {np.max(data_array):.3f}")
print(f"Number of valid points: {np.sum(np.isfinite(data_array))}")
- Add statistical analysis: Calculate correlations between different aerosol properties
- Implement data export: Allow downloading filtered datasets as CSV files
- Create time series plots: If you have multiple files, show how properties change over time
- Add machine learning: Use aerosol properties to classify different aerosol types
- Improve visualization: Add 3D plots, animation, or additional map layers
- Performance optimization: Implement data caching or more efficient plotting
- Test with small datasets first: Use a subset of data while developing
- Comment your code extensively: Explain what each section does
- Use version control: Keep track of your changes with git
- Validate your results: Compare outputs with known good data
- Handle edge cases: What happens with missing data or unusual values?
The main file plotPACEMAPP_plotly.py contains:
- Lines 1-50: Imports, constants, and utility functions
- Lines 50-500: Data reading and processing functions
- Lines 500-800: Plotting and visualization functions
- Lines 800-1000: Dash app layout definition
- Lines 1000-1200: Interactive callback functions
- Lines 1200+: Main application setup and command-line interface
Understanding this structure will help you navigate the code and make targeted modifications for your specific needs.
If you encounter issues:
- Check the debug output: Set debug = 2 and look for error messages
- Verify your data files: Make sure they contain the expected variables
- Test with different datasets: Some files might have different structures
- Read the error messages carefully: They often indicate exactly what's wrong
- Use online resources: Dash and Plotly have excellent documentation