This code accompanies the blog post "Full Guide: Creating a Flask Callback Server to Store LinkedIn Profiles in MySQL using the Crawlbase Crawler".
-- Create a user
CREATE USER 'linkedincrawler'@'localhost' IDENTIFIED BY 'linked1nS3cret';

-- Create a database
CREATE DATABASE linkedin_crawler_db;

-- Grant permissions
GRANT ALL PRIVILEGES ON linkedin_crawler_db.* TO 'linkedincrawler'@'localhost';

-- Set the current database
USE linkedin_crawler_db;

-- Create the tables
CREATE TABLE IF NOT EXISTS `crawl_requests` (
`id` INT AUTO_INCREMENT PRIMARY KEY,
`url` TEXT NOT NULL,
`status` VARCHAR(30) NOT NULL,
`crawlbase_rid` VARCHAR(255) NOT NULL
);
CREATE INDEX `idx_crawl_requests_status` ON `crawl_requests` (`status`);
CREATE INDEX `idx_crawl_requests_crawlbase_rid` ON `crawl_requests` (`crawlbase_rid`);
CREATE INDEX `idx_crawl_requests_status_crawlbase_rid` ON `crawl_requests` (`status`, `crawlbase_rid`);
CREATE TABLE IF NOT EXISTS `linkedin_profiles` (
`id` INT AUTO_INCREMENT PRIMARY KEY,
`crawl_request_id` INT NOT NULL,
`title` VARCHAR(255),
`headline` VARCHAR(255),
`summary` TEXT,
FOREIGN KEY (`crawl_request_id`) REFERENCES `crawl_requests`(`id`)
);
CREATE TABLE IF NOT EXISTS `linkedin_profile_experiences` (
`id` INT AUTO_INCREMENT PRIMARY KEY,
`linkedin_profile_id` INT NOT NULL,
`title` VARCHAR(255),
`company_name` VARCHAR(255),
`description` TEXT,
`is_current` BIT NOT NULL DEFAULT 0,
FOREIGN KEY (`linkedin_profile_id`) REFERENCES `linkedin_profiles`(`id`)
);

In PROJECT_FOLDER/settings.yml, configure the token and crawler values based on what you set up in the Crawlbase Crawler.
Example:
# PROJECT_FOLDER/settings.yml
token: mynormalcrawlbasetoken
crawler: linkedin-profile-crawler
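Since settings.yml is a flat two-key file, it can be read with a few lines of standard-library Python. This is a minimal sketch (the repo's actual code would more likely use a YAML library such as PyYAML; the function name here is an assumption):

```python
def load_settings(path="settings.yml"):
    """Parse simple `key: value` lines from the settings file.

    No YAML dependency; sufficient only for the flat token/crawler
    file shown above, not for general YAML.
    """
    settings = {}
    with open(path) as f:
        for line in f:
            line = line.split("#", 1)[0].strip()  # drop comments and whitespace
            if ":" in line:
                key, value = line.split(":", 1)
                settings[key.strip()] = value.strip()
    return settings
```

A script would then read `load_settings()["token"]` and `load_settings()["crawler"]` when talking to the Crawlbase API.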
Then make sure PROJECT_FOLDER/urls.txt contains the URLs you want to crawl.
Note that each line corresponds to one valid URL.
By default it is populated with the 5 most-followed people on LinkedIn.
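A script consuming urls.txt would typically read one URL per line, skipping blank lines. A minimal sketch (the filename matches the repo; the parsing details and function name are assumptions):

```python
from pathlib import Path

def load_urls(path="urls.txt"):
    """Read one URL per line, ignoring blank lines and surrounding whitespace."""
    return [
        line.strip()
        for line in Path(path).read_text().splitlines()
        if line.strip()
    ]
```

Each entry returned by `load_urls()` would then be submitted to the Crawlbase Crawler for asynchronous crawling.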
PROJECT_FOLDER$ python3 -m venv .venv
PROJECT_FOLDER$ . .venv/bin/activate
PROJECT_FOLDER$ pip install -r requirements.txt

Open a new terminal and run the command below:
$ ngrok http 5000

Then take note of the forwarding URL, which looks like this:
Forwarding https://4e15-180-190-160-114.ngrok-free.app -> http://localhost:5000
What we need is the https://4e15-180-190-160-114.ngrok-free.app value; we will use it later.
Open a new terminal and run the command below to activate the Python virtual environment for this terminal:

PROJECT_FOLDER$ . .venv/bin/activate

Then run the callback server:
PROJECT_FOLDER$ python callback_server.py

On a new terminal, run the following:

$ curl -i -X POST 'http://localhost:5000/crawlbase_crawler_callback' -H 'RID: dummyrequest' -H 'Accept: application/json' -H 'Content-Type: gzip/json' -H 'User-Agent: Crawlbase Monitoring Bot 1.0' -H 'Content-Encoding: gzip' --data-binary '"\x1F\x8B\b\x00+\xBA\x05d\x00\x03\xABV*\xCALQ\xB2RJ)\xCD\xCD\xAD,J-,M-.Q\xD2QJ\xCAO\xA9\x04\x8A*\xD5\x02\x00L\x06\xB1\xA7 \x00\x00\x00' --compressed

If it is running normally, you should see a message in the console:
[app][2023-08-10 17:42:16] Callback server is working
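The curl command above simulates a Crawlbase Crawler callback, which delivers gzip-compressed JSON (hence the Content-Encoding: gzip header and the binary payload). How such a payload is produced, and how a callback server reads it back, can be sketched with the standard library (the payload fields here are made up for illustration; real deliveries carry the crawled page data):

```python
import gzip
import json

# Hypothetical callback payload (illustrative fields only).
payload = {"rid": "dummyrequest", "body": "example"}

# Compress the JSON the way the crawler delivers it (Content-Encoding: gzip).
compressed = gzip.compress(json.dumps(payload).encode("utf-8"))

# What a callback server does on receipt: decompress, then parse the JSON.
decoded = json.loads(gzip.decompress(compressed).decode("utf-8"))
print(decoded)
```

The same decompress-then-parse step is what lets the server turn the binary body from the curl test into usable data.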
4. Finally, register the ngrok URL in the Crawlbase Crawler dashboard
Example:
https://4e15-180-190-160-114.ngrok-free.app/crawlbase_crawler_callback
Open a new terminal and run the command below to activate the Python virtual environment for this terminal:

PROJECT_FOLDER$ . .venv/bin/activate

Then run the processor:

PROJECT_FOLDER$ python process.py

This will keep looping, waiting for data from Crawlbase to process.
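The processor's internals are not shown here, but a loop that waits for pending work can be sketched generically (the function names, parameters, and structure below are assumptions, not the repo's actual process.py):

```python
import time

def poll_pending(fetch_pending, handle, interval=5.0, max_rounds=None):
    """Repeatedly fetch pending items and handle each one.

    fetch_pending: callable returning a list of pending items
                   (e.g. crawl_requests rows with a waiting status).
    handle:        callable invoked per item (e.g. parse and store a profile).
    interval:      seconds to sleep between polling rounds.
    max_rounds:    stop after this many rounds (None = run forever).
    """
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        for item in fetch_pending():
            handle(item)
        rounds += 1
        if max_rounds is None or rounds < max_rounds:
            time.sleep(interval)
```

In this sketch, `fetch_pending` would query the crawl_requests table by status, and `handle` would insert the extracted data into linkedin_profiles and linkedin_profile_experiences.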
Open a new terminal and run the command below to activate the Python virtual environment for this terminal:

PROJECT_FOLDER$ . .venv/bin/activate

Then run the crawl script:

PROJECT_FOLDER$ python crawl.py

Copyright 2026 Crawlbase