Collects posts/pages from a CSV list of WordPress URLs, spins them, then prepares them in a JSON file.
This set of scripts is specifically designed to run on:
- Python 3
- Windows 10 (although it should work on Vista, 7 and 8)
- macOS Monterey
- Install Python for Windows
- From the project root, run `python setup.py`
- Add appropriate values to the `.env` file
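
As a rough illustration of how a script could read values from the `.env` file, here is a minimal sketch that parses plain `KEY=VALUE` lines using only the standard library. This is an assumption about the file's format, not the project's actual loader: the project does not list which keys it expects, the helper name `load_env` is hypothetical, and the scripts may well load the file differently (for example with a third-party package).

```python
from pathlib import Path


def load_env(path=".env"):
    """Parse simple KEY=VALUE lines into a dict (hypothetical helper).

    Blank lines, lines starting with '#', and lines without '=' are skipped.
    Surrounding single or double quotes are removed from values.
    """
    values = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, so values may contain '=' themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values
```

Calling `load_env()` from the project root would return a dictionary of whatever keys the file contains, which makes it easy to check that nothing was left blank before running the scraper.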
This is done in 3 parts...
- Compile a list of the URLs of all articles or pages you want to pull content from
- Add a CSV file listing all of the URLs to the `./sources` folder
- Using `terminal`, `bash`, `PowerShell` or similar, navigate to `./scrapers`
- Run `python scrape-press.py`
- Wait for the script to finish compiling the JSON file to the `./data` folder
- Install a processor / importer on your blogging platform (if you're using WordPress, WP All Import is brilliant)
- Upload the `./data/____.json` file to the importer
- Map the appropriate fields
- Run your importer
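
The data flow described above (a CSV of URLs in, a JSON file out, then field mapping in an importer) can be sketched roughly as follows. This is not the code in `scrape-press.py`; it is a minimal standard-library sketch under some assumptions: the CSV has one URL per row in its first column with no header row, the JSON file is a list of objects, and the helper names `read_urls`, `write_records` and `field_names` are hypothetical. Fetching and spinning the content are deliberately left out.

```python
import csv
import json
from pathlib import Path


def read_urls(csv_path):
    """Return the non-empty values from the first column of a CSV file.

    Assumes one URL per row and no header row.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return [
            row[0].strip()
            for row in csv.reader(handle)
            if row and row[0].strip()
        ]


def write_records(records, json_path):
    """Write a list of dicts to a JSON file, creating the folder if needed."""
    path = Path(json_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(
        json.dumps(records, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )


def field_names(json_path):
    """Return the sorted set of keys used across all records in a JSON file."""
    records = json.loads(Path(json_path).read_text(encoding="utf-8"))
    return sorted({key for record in records for key in record})
```

Running `field_names` on the generated JSON file before opening the importer gives a quick list of the fields that will need mapping.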