Free Data Dumps from AppGoblin. AppGoblin is a free resource for Android and iOS app intelligence. It has the biggest free dashboard for browsing mobile apps and the SDKs and open source libraries they use for advertising, product analytics, and more. AppGoblin also has live information on which apps are currently advertising.
Due to 100 MB file size limits, the files are compressed and do not contain all of the data. If you don't see the data you're looking for, feel free to ask.
This repo includes these files in /data:
- live_store_apps.tsv.xz: Apps that are currently live on Google Play and the Apple App Store. This TSV includes names and categories.
- store_apps.tsv.xz: All 4m+ Android and iOS app store IDs known to AppGoblin, many of which are no longer live on the app stores.
- store_apps_metrics.tsv.xz (limited): Live apps with metrics such as installs, rating, and review count from AppGoblin.
A fuller store_apps list, fuller metrics export, and descriptions dataset are available for free at https://appgoblin.info/free-app-datasets.
Apps-per-company data is available on AppGoblin in the B2B datasets.
- store: The app store (e.g., "google" for Google Play Store)
- store_id: Unique identifier/package name for the app
- app_name: Display name of the app
- app_category: Category of the app (e.g., tools, video_players)

- developer_id: Unique identifier for the developer
- developer_name: Name of the developer/company

- review_count: Number of written reviews
- rating_count: Total number of ratings
- installs: Number of installations

- free: Boolean indicating if the app is free
- price: Price of the app (0 for free apps)
- minimum_android: Required Android version
- ad_supported: Boolean indicating if the app contains ads
- in_app_purchases: Boolean indicating if the app has IAPs
- editors_choice: Boolean indicating if the app was selected as editor's choice

- store_last_updated: When the app was last updated in the store
- release_date: Initial release date
- appgoblin_created_at: When the app was first added to AppGoblin
- appgoblin_updated_at: When the app was last updated in AppGoblin
- last_crawl_result: Status of the most recent data crawl
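As a minimal sketch of how one of the compressed TSV files could be read, the snippet below streams rows with Python's standard library only. It assumes the files are UTF-8, tab-separated, start with a header row, and use standard quoting; none of that is confirmed here. The helper names `read_tsv_xz` and `to_int` are hypothetical and are not part of AppGoblin.

```python
import csv
import lzma
from typing import Iterator, Optional


def read_tsv_xz(path: str) -> Iterator[dict]:
    """Yield one dict per data row from an xz-compressed, tab-separated file.

    Assumption: the first line of the file is a header row with column names.
    """
    # lzma.open in text mode decompresses on the fly, so the whole file
    # does not need to fit in memory.
    with lzma.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        yield from csv.DictReader(handle, delimiter="\t")


def to_int(value: Optional[str]) -> Optional[int]:
    """Convert an integer text field such as installs to int.

    Empty or missing fields become None instead of raising an error.
    """
    if value is None or value == "":
        return None
    return int(value)
```

Iterating over `read_tsv_xz("data/store_apps_metrics.tsv.xz")` would then give dictionaries keyed by column name, with every value still a string until it is converted, for example with `to_int(row["installs"])`.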
- store_id: Unique identifier/package name for the app
- app_category: Category of the app (e.g., game_action, tools)
- company_domain: Primary web domain of the company
- company_name: Name of the company
- parent_company_name: Parent company or holding company name
- tag_source: Source of the company data (app_ads_direct, app_ads_reseller, or sdk)
The data comes from three tag_sources:
- app_ads_direct: Direct advertising relationships
- app_ads_reseller: Indirect advertising through resellers
- sdk: Software Development Kits integrated in the app
| store_id | app_category | company_domain | company_name | parent_company_name | tag_source |
|---|---|---|---|---|---|
| com.example.app | game_action | yandex.com | Yandex | Yandex | app_ads_direct |
| com.example.app | game_action | yandex.com | Yandex | Yandex | app_ads_reseller |
| com.example.app | game_action | ironsrc.com | ironSource | Unity Ads | app_ads_direct |
| com.example.app | game_action | yahoo.com | Yahoo! | Yahoo! | app_ads_direct |
| com.example.app | game_action | yahoo.com | Yahoo! | Yahoo! | app_ads_reseller |
| com.example.app | game_action | verve.com | Verve Group | Verve Group | app_ads_reseller |
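Because one app appears once per company and tag_source, a common first step is to group the rows. The sketch below does this for the example rows in the table above, keeping only three of the six columns for brevity. The function name `domains_by_source` is hypothetical.

```python
from collections import defaultdict

# Rows copied from the example table: (store_id, company_domain, tag_source).
ROWS = [
    ("com.example.app", "yandex.com", "app_ads_direct"),
    ("com.example.app", "yandex.com", "app_ads_reseller"),
    ("com.example.app", "ironsrc.com", "app_ads_direct"),
    ("com.example.app", "yahoo.com", "app_ads_direct"),
    ("com.example.app", "yahoo.com", "app_ads_reseller"),
    ("com.example.app", "verve.com", "app_ads_reseller"),
]


def domains_by_source(rows):
    """Map each store_id to {tag_source: sorted list of company domains}."""
    grouped = defaultdict(lambda: defaultdict(set))
    for store_id, domain, tag_source in rows:
        grouped[store_id][tag_source].add(domain)
    # Convert the nested defaultdicts and sets into plain dicts and lists.
    return {
        store_id: {source: sorted(domains) for source, domains in sources.items()}
        for store_id, sources in grouped.items()
    }


result = domains_by_source(ROWS)
print(result["com.example.app"]["app_ads_direct"])
# prints ['ironsrc.com', 'yahoo.com', 'yandex.com']
```

A company can show up under more than one tag_source for the same app, as yandex.com and yahoo.com do here, so counting distinct companies per app requires deduplicating on company_domain first.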
The export pipeline supports monthly, versioned TSV archives for:
- descriptions
- store-apps-metrics
Each monthly file is generated as:
YYYY_MM_01_<dataset>.tsv.xz
Examples:
2026_03_01_descriptions.tsv.xz
2026_03_01_store-apps-metrics.tsv.xz
Files are uploaded to:
downloads/<dataset>/year=YYYY/month=MM/YYYY_MM_01_<dataset>.tsv.xz
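The naming scheme above can be sketched as two small helpers, which may be useful when constructing download paths for a given month. The function names are hypothetical and are not taken from the agdata package. Zero-padding the month in the `month=MM` path segment is an assumption based on the two-letter placeholder.

```python
def monthly_file_name(dataset: str, year: int, month: int) -> str:
    """Build YYYY_MM_01_<dataset>.tsv.xz for the given month."""
    if not 1 <= month <= 12:
        raise ValueError(f"month must be between 1 and 12, got {month}")
    return f"{year:04d}_{month:02d}_01_{dataset}.tsv.xz"


def monthly_object_key(dataset: str, year: int, month: int) -> str:
    """Build downloads/<dataset>/year=YYYY/month=MM/<file name>.

    Assumption: the month in the path is zero-padded to two digits.
    """
    name = monthly_file_name(dataset, year, month)
    return f"downloads/{dataset}/year={year:04d}/month={month:02d}/{name}"


print(monthly_object_key("descriptions", 2026, 3))
# prints downloads/descriptions/year=2026/month=03/2026_03_01_descriptions.tsv.xz
```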
Run both datasets for the current month:
python -m agdata.upload_to_object_storage monthly

Run both datasets for a specific month:
python -m agdata.upload_to_object_storage monthly --year 2026 --month 3

Run one dataset only:
python -m agdata.upload_to_object_storage monthly --year 2026 --month 3 --datasets descriptions
python -m agdata.upload_to_object_storage monthly --year 2026 --month 3 --datasets store-apps-metrics

Overwrite existing objects (if needed):
python -m agdata.upload_to_object_storage monthly --year 2026 --month 3 --force

Example crontab entry to run at 03:20 UTC on the first day of each month:
20 3 1 * * cd /home/james/appgoblin-data && /usr/bin/python3 -m agdata.upload_to_object_storage monthly >> /home/james/.config/appgoblin/logs/monthly_exports.log 2>&1

Notes:
- Exports use chunked database reads and chunked CSV writes to avoid high memory usage.
- Uploads are idempotent by default (existing monthly object keys are skipped).
- Public bucket ACL changes are not required (bucket-level public access is assumed).
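The first two notes can be illustrated with a rough sketch. This is not the agdata implementation: the function names are hypothetical, the rows come from any Python iterable rather than a database cursor, and a plain set stands in for a listing of existing object keys.

```python
import csv
import lzma
from typing import Iterable, Iterator, List, Sequence, Set


def chunked(rows: Iterable[Sequence], size: int) -> Iterator[List[Sequence]]:
    """Group an iterable of rows into lists of at most `size` rows."""
    chunk: List[Sequence] = []
    for row in rows:
        chunk.append(row)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def export_rows(path: str, header: Sequence[str], rows: Iterable[Sequence],
                chunk_size: int = 50_000) -> int:
    """Write rows to an xz-compressed TSV one chunk at a time.

    Only one chunk is held in memory at once. Returns the number of data rows.
    """
    written = 0
    with lzma.open(path, "wt", encoding="utf-8", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        for chunk in chunked(rows, chunk_size):
            writer.writerows(chunk)
            written += len(chunk)
    return written


def should_upload(key: str, existing_keys: Set[str], force: bool = False) -> bool:
    """Skip keys that already exist unless force is set (idempotent by default)."""
    return force or key not in existing_keys
```

With this shape, rerunning a month that has already been uploaded is a no-op, and passing `force=True` mirrors the `--force` flag by allowing the existing object to be overwritten.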