All notable changes to this project will be documented in this file.
- New validation error `AnnDataMultipleOntologyIDs`, which requires each ontology term ID to be a single term.
- New validation error `AnnDataInvalidDiseaseOntologyForHuman`, which requires the use of `MONDO:` and `PATO:` disease ontology terms for Homo sapiens datasets.
- Minor changes in error message formatting
- More dynamic error messages added.
- New genes were added to the Gene Map file for Homo sapiens
- New error message was added for CSC matrices in `adata.X` or `adata.raw.X`
- New genes were added to the Gene Map file for Homo sapiens
- New `find_missing_genes() -> pd.DataFrame` function was added to `UploadValidator` to help users identify missing genes in their AnnData files.
- More gene IDs were added to the Homo sapiens Gene Map to cover missing genes reported by users from outdated Ensembl releases.
- Gene Map file for Homo sapiens was updated to support some outdated Ensembl IDs from the Issue
- Gene Map file for Homo sapiens was updated from GENCODE Release 49
- Gene Map file for Mus musculus was updated from GENCODE Release M38
- Added gene validation for cases when organisms are specified using ontology term IDs instead of names.
- New exception `AnnDataNoneInGeneralMetadata` to handle cases where required metadata fields contain None or empty values.
- General metadata check in `UploadValidator` now checks whether each general metadata field or its ontology term ID exists in the `obs` dataframe. For example, the validator will pass if either `tissue` or `tissue_ontology_term_id` exists and is non-empty. Otherwise, the validator will raise `AnnDataMissingObsColumns`.
- Strict requirement to have a dense matrix with embeddings in `obsm`. DataFrames and sparse matrices will be ignored.
- CLI interface
- `UploadValidator` class to validate AnnData files
- Gene mapping support for Homo sapiens and Mus musculus
- Custom error handling with `CapMultiException`
- Basic unit tests for core functionality
- This is the first public version of the package.
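
The general metadata rule described in the entries above (a field passes if either the plain column or its `_ontology_term_id` column exists and is non-empty) can be illustrated with plain pandas. This is only a sketch, not the package's implementation: `missing_general_metadata` and `_is_filled` are hypothetical helpers, and treating "non-empty" as "no None, NaN, or blank values in the column" is an assumption.

```python
import pandas as pd


def _is_filled(obs: pd.DataFrame, column: str) -> bool:
    """Hypothetical helper: True if `column` exists and has no empty values."""
    if column not in obs.columns:
        return False
    values = obs[column]
    # Assumption: None/NaN and blank strings both count as empty.
    empty = values.isna() | (values.astype(str).str.strip() == "")
    return not empty.any()


def missing_general_metadata(obs: pd.DataFrame, fields: list[str]) -> list[str]:
    """Hypothetical helper: fields with neither a usable name column
    nor a usable `<field>_ontology_term_id` column."""
    missing = []
    for field in fields:
        candidates = [field, f"{field}_ontology_term_id"]
        if not any(_is_filled(obs, col) for col in candidates):
            missing.append(field)
    return missing


obs = pd.DataFrame(
    {
        "tissue": ["lung", "lung"],
        "organism_ontology_term_id": ["NCBITaxon:9606", ""],
    }
)
# `tissue` is filled; `organism` has only an ID column with a blank value.
print(missing_general_metadata(obs, ["tissue", "organism"]))  # ['organism']
```

In the package itself, a non-empty result of this kind corresponds to the cases where the changelog says `AnnDataMissingObsColumns` or `AnnDataNoneInGeneralMetadata` is raised.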