Deep Mushroom Spain is a fungal classification project focused on Spanish mushroom observations and inspired by the original DeepMushroom repository.
This fork keeps the original idea as a reference point, but adapts the dataset workflow, project structure, and collection pipeline to support a cleaner Spain-focused training workflow.
The original project that motivated this fork is:
If you want to compare the initial layout, dataset assumptions, or earlier modeling approach, that repository is the right baseline.
Compared with the original repository, this project has already been restructured around dataset lifecycle stages and execution concerns:
    data/
        raw/inaturalist/   # downloaded observation exports and raw API snapshots
        interim/           # downloaded images and temporary working assets
        processed/         # model-ready training datasets
    docs/
        assets/            # figures used by the documentation
    src/
        collection/        # data acquisition scripts
        training/          # model training entry points
        tools/             # one-off utilities and maintenance scripts
The goal of this restructuring is to make the collection, preparation, and training stages easier to evolve independently than in the original repo layout.
iNaturalist.org is a citizen science platform where users upload organism observations and the community helps identify them. In this fork, the working scope is Spanish mushroom observations between 2000-01-01 and 2026-03-30.
The historical CSV exports are stored under data/raw/inaturalist/. The image download utility lives in src/collection/download_images.go, and the FastAI training entry point lives in src/training/train_fastai.py.
This is the exporter query used to download Spanish fungal observations from 2000-01-01 to 2026-03-30, restricted to the species rank and excluding the lichen class Lecanoromycetes:
quality_grade=any&identifications=any&iconic_taxa[]=Fungi&place_id=6774&without_taxon_id=54743&rank=species&d1=2000-01-01&d2=2026-03-30
The key constraints in that exporter query are:
- place_id=6774 limits results to Spain
- rank=species excludes genus-level and variety-level observations
- without_taxon_id=54743 excludes Lecanoromycetes, the lichen class
- iconic_taxa[]=Fungi keeps the export within fungi
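As a minimal sketch, the exporter query above can be assembled from named parameters instead of being edited by hand. The `EXPORT_PARAMS` list and the `build_export_query` helper are hypothetical names used for illustration and are not part of this repository.

```python
from urllib.parse import urlencode

# Parameters copied from the exporter query above, in the same order.
EXPORT_PARAMS = [
    ("quality_grade", "any"),
    ("identifications", "any"),
    ("iconic_taxa[]", "Fungi"),     # keep the export within fungi
    ("place_id", "6774"),           # Spain
    ("without_taxon_id", "54743"),  # exclude Lecanoromycetes
    ("rank", "species"),            # species rank only
    ("d1", "2000-01-01"),
    ("d2", "2026-03-30"),
]


def build_export_query(params=EXPORT_PARAMS):
    """Return the exporter query string for the given (key, value) pairs."""
    # safe="[]" leaves the brackets in iconic_taxa[] unescaped,
    # so the result matches the query shown above.
    return urlencode(params, safe="[]")


query = build_export_query()
print(query)
```

Keeping the parameters in a list makes it easier to change one constraint, for example the date window, without touching the rest of the query.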
The same data can also be retrieved through the API: the official observations API supports filtering by place, taxon, date range, and photo availability, so the website export page is not required for this workflow.
Relevant identifiers validated for this fork:
- place_id=6774 for Spain
- taxon_id=50814 for Agaricomycetes when you want a mushroom-oriented subset
- taxon_id=47170 for all fungi if you want the broader fungal kingdom
- taxon_id=54743 for Lecanoromycetes, which can be excluded in exporter-based workflows
Example API query for Spanish mushrooms in the requested date window:
https://api.inaturalist.org/v1/observations?place_id=6774&taxon_id=50814&d1=2015-01-01&d2=2026-01-01&photos=true&verifiable=true&per_page=200
Notes:
- The API returns paginated JSON, not a CSV export.
- The public API is rate-limited. iNaturalist documents a hard cap of 100 requests per minute and asks clients to stay at 60 requests per minute or lower and under 10,000 requests per day.
- A broader fungi query with taxon_id=47170 also works for Spain and the same date range.
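The notes above suggest a paginated, throttled client. The following is a rough sketch under stated assumptions, not the collection script used in this repository: the helper names (`page_url`, `fetch_json`, `iter_observations`), the `max_pages` cap, and the one-second delay are all illustrative choices. It assumes the JSON payload carries the observations in a `results` list and that pages are selected with a `page` parameter.

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://api.inaturalist.org/v1/observations"

# Same filters as the example API query above.
BASE_PARAMS = {
    "place_id": 6774,
    "taxon_id": 50814,
    "d1": "2015-01-01",
    "d2": "2026-01-01",
    "photos": "true",
    "verifiable": "true",
    "per_page": 200,
}


def page_url(page, params=BASE_PARAMS):
    """Build the request URL for one page of results."""
    return API_URL + "?" + urlencode({**params, "page": page})


def fetch_json(url):
    """Download one page and decode the JSON body."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def iter_observations(fetch=fetch_json, delay_seconds=1.0, max_pages=5):
    """Yield observations page by page, pausing between requests.

    A one-second pause keeps the client at or below 60 requests per minute.
    """
    for page in range(1, max_pages + 1):
        payload = fetch(page_url(page))
        results = payload.get("results", [])
        if not results:
            break  # no more pages
        yield from results
        time.sleep(delay_seconds)
```

Passing `fetch` as an argument is a design choice that lets the pagination logic be exercised with a stub instead of live requests.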
Not all of the current CSV columns are required for the image-only workflow.
The current downloader only needs a small subset of fields such as:
- id
- image_url
- scientific_name
The remaining fields are intentionally kept so the dataset can support future models that may use metadata beyond the image itself, such as location, coordinates, observation date, or other contextual signals.
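To illustrate the column subset, here is a small Python sketch that reads an export and keeps only the three fields listed above. It is not the repository's downloader, which is written in Go. The `iter_image_rows` helper and the sample rows, including the `observed_on` and `latitude` columns, are made up for the example.

```python
import csv
import io

# The only columns the image-only workflow needs.
REQUIRED_FIELDS = ("id", "image_url", "scientific_name")


def iter_image_rows(csv_file):
    """Yield one dict per usable row, restricted to REQUIRED_FIELDS."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_FIELDS if name not in header]
    if missing:
        raise ValueError(f"CSV is missing required columns: {missing}")
    for row in reader:
        # Skip rows without an image URL or a species name.
        if not row["image_url"] or not row["scientific_name"]:
            continue
        yield {name: row[name] for name in REQUIRED_FIELDS}


# Hypothetical sample data standing in for a real export file.
sample = io.StringIO(
    "id,observed_on,latitude,image_url,scientific_name\n"
    "1,2020-10-01,40.4,https://example.org/1.jpg,Amanita muscaria\n"
    "2,2021-11-05,41.3,,Boletus edulis\n"
)
rows = list(iter_image_rows(sample))
print(rows)
```

The extra columns stay in the CSV untouched; the sketch only narrows the view that the image pipeline sees.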
This distribution view is part of the original DeepMushroom line of work and should be read in the context of the original Olament/DeepMushroom project that inspired this fork.
The data distribution is heavily skewed toward a relatively small number of common species. Species with fewer than 10 images are removed for two reasons:
- A very small number of observations usually indicates that the species is not common enough to provide much practical value in the current identification workflow.
- There is not enough image data to train the classifier reliably, and those classes tend to reduce overall model quality.
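The threshold described above can be sketched in a few lines. This example assumes one row per image and a `scientific_name` key on each row; the `filter_rare_species` helper is a hypothetical name, not a function from this repository.

```python
from collections import Counter

MIN_IMAGES_PER_SPECIES = 10


def filter_rare_species(rows, min_images=MIN_IMAGES_PER_SPECIES):
    """Drop every row whose species has fewer than min_images rows."""
    counts = Counter(row["scientific_name"] for row in rows)
    return [row for row in rows if counts[row["scientific_name"]] >= min_images]


# Toy data: 12 images of one species, 3 of another.
rows = (
    [{"scientific_name": "Amanita muscaria"}] * 12
    + [{"scientific_name": "Boletus edulis"}] * 3
)
kept = filter_rare_species(rows)
print(len(kept))  # 12: only the species with at least 10 images remains
```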
Since the images from MushroomExpert were identified by mycologists, they can be used as a reliable external validator when testing the performance of the model.
At this stage, the project still uses the fast.ai library as the main experimentation layer. Over time, the training stack may move toward more custom models built directly on PyTorch.
| Architecture | Validation Accuracy (%) | Validation Top-5 Accuracy (%) | Test Accuracy (%) | Test Top-5 Accuracy (%) |
|---|---|---|---|---|
| ResNet34 | 70.68 | 86.36 | 31.94 | 48.11 |
| ResNet50 | 79.67 | 91.76 | 38.77 | 59.14 |
| ResNet50+Focal Loss | 80.24 | 92.32 | 39.48 | 60.45 |
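For readers unfamiliar with the top-5 columns, the sketch below shows how top-k accuracy is commonly computed: a prediction counts as correct when the true class is among the k highest-scoring classes. This is a plain-Python illustration on made-up scores, not the evaluation code that produced the numbers in the table.

```python
def top_k_accuracy(scores, labels, k=5):
    """Return the percentage of samples whose true label is in the top k scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return 100.0 * hits / len(labels)


# Made-up scores for 4 samples over 3 classes.
scores = [
    [0.10, 0.70, 0.20],
    [0.50, 0.30, 0.20],
    [0.20, 0.30, 0.50],
    [0.40, 0.35, 0.25],
]
labels = [1, 1, 2, 2]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(top1, top2)  # 50.0 75.0
```

Top-k accuracy is always at least as high as top-1 accuracy, which matches the pattern in every row of the table.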
| Prediction | Ground Truth |
|---|---|
| Fomitopsis mounceae | Fomitopsis pinicola |
| Pleurotus pulmonarius | Pleurotus ostreatus |
| Dacrymyces chrysospermus | Tremella mesenterica |
| Tremella mesenterica | Dacrymyces chrysospermus |
| Laetiporus gilbertsonii | Laetiporus sulphureus |
| Stereum hirsutum | Stereum complicatum |
| Tremella aurantia | Tremella mesenterica |
| Ganoderma megaloma | Ganoderma applanatum |
| Laetiporus cincinnatus | Laetiporus sulphureus |
| Ganoderma applanatum | Ganoderma brownii |
This repository is distributed under the GNU General Public License v3.0. See LICENSE for the full text.
Thanks to Olament, the owner of the original DeepMushroom repository, for creating such a strong reference project.
That work caught my eye while its Reddit bot was still running, and it became the clearest reference point for building this Spain-focused fork. I remember training a small model for Spain back then, and now I would like to turn it into a serious project.
