A Python library for automatic calculation of quantum chemical descriptors for QSAR analysis using ORCA quantum chemistry software.
pip install orca-descriptorsNote: After installation, the orca_descriptors command-line tool will be available in your PATH. If you installed with pip install --user, you may need to add ~/.local/bin to your PATH:
# For bash/zsh (add to ~/.bashrc or ~/.zshrc)
export PATH="$HOME/.local/bin:$PATH"
# For fish shell (add to ~/.config/fish/config.fish)
set -gx PATH $HOME/.local/bin $PATHAfter adding to PATH, restart your terminal or run source ~/.bashrc (or source ~/.zshrc).
poetry installfrom orca_descriptors import Orca
from rdkit.Chem import MolFromSmiles, AddHs
# Initialize ORCA calculator
orca = Orca(
script_path="orca",
functional="PBE0",
basis_set="def2-SVP",
method_type="Opt",
dispersion_correction="D3BJ",
solvation_model="COSMO(Water)",
n_processors=8,
)
# Create molecule from SMILES using RDKit
mol = AddHs(MolFromSmiles("C1=CC=CC=C1"))
# Calculate descriptors
homo = orca.homo_energy(mol)
lumo = orca.lumo_energy(mol)
gap = orca.gap_energy(mol)
# Additional descriptors
homo_minus_1 = orca.mo_energy(mol, index=-2) # HOMO-1 energy
min_h_charge = orca.get_min_h_charge(mol) # Minimum H charge
xy_shadow = orca.xy_shadow(mol) # XY projection area
meric = orca.meric(mol) # Electrophilicity index
logp = orca.m_log_p(mol) # Log P coefficient
nrot = orca.num_rotatable_bonds(mol) # Rotatable bonds
wiener = orca.wiener_index(mol) # Wiener index
sasa = orca.solvent_accessible_surface_area(mol) # SASAAfter installation, you can use orca_descriptors as a command-line tool:
Calibrate time estimation by running a benchmark calculation. The benchmark uses benzene (C1=CC=CC=C1) as a standard test molecule for machine calibration:
orca_descriptors run_benchmarkEstimate calculation time for a molecule without running the actual calculation:
orca_descriptors approximate_time --molecule C1=CC=CC=C1Automatic Parameter Scaling: The time estimation automatically scales benchmark data for different parameters (number of processors, functional, basis set). You don't need to re-run the benchmark if you change these parameters - the system will automatically recalculate the estimated time based on the existing benchmark data.
For example:
- If benchmark was run with 1 processor, estimation for 4 processors will automatically account for parallel efficiency
- If benchmark used
def2-SVP, estimation fordef2-TZVPwill scale based on basis set size (O(N^3.5) scaling) - Different functionals are scaled based on their relative computational costs
All parameters from the Orca class are available as command-line arguments:
--script_path: Path to ORCA executable (default: 'orca')--working_dir: Working directory for calculations (default: current directory)--output_dir: Directory for output files (default: current directory)--functional: DFT functional (default: PBE0)--basis_set: Basis set (default: def2-SVP)--method_type: Calculation type: Opt, SP, or Freq (default: Opt)--dispersion_correction: Dispersion correction, e.g., D3BJ (default: D3BJ). Use 'None' to disable.--solvation_model: Solvation model, e.g., 'COSMO(Water)' (default: None). Use 'None' to disable.--n_processors: Number of processors (default: 1)--max_scf_cycles: Maximum SCF cycles (default: 100)--scf_convergence: SCF convergence threshold (default: 1e-6)--charge: Molecular charge (default: 0)--multiplicity: Spin multiplicity (default: 1)--cache_dir: Directory for caching results (default: output_dir/.orca_cache)--log_level: Logging level: DEBUG, INFO, WARNING, ERROR (default: INFO)--max_wait: Maximum time to wait for output file creation in seconds (default: 300)
# Run benchmark with custom parameters (uses benzene as standard test molecule)
orca_descriptors run_benchmark \
--functional PBE0 \
--basis_set def2-SVP \
--n_processors 4 \
--working_dir ./calculations
# Estimate time for optimization calculation
orca_descriptors approximate_time \
--molecule CCO \
--method_type Opt \
--n_opt_steps 20 \
--functional PBE0 \
--basis_set def2-TZVP \
--n_processors 8- Python >= 3.10
- ORCA 6.0.1 installed and available in PATH
- RDKit >= 2023.0.0
See LICENSE.md
