Traditional print media is a treasure trove of untapped potential. Charpie brings structure to this unstructured treasure and sets it free for the digital age. This freedom brings availability and accessibility.
produce and USE a ton of information. What about the times before our age? A few decades ago there were no opportunities like these days. There was no method to store the data efficiently but it doesn’t mean that knowledge from the past should pass. That’s why we have developed a platform to stop the old information from vanishing completely.
charpie is a platform which scan, store, index and share the data from the many newspapers and magazines. It’s a powerful tool to make unobtainable data accessible to everyone in the most convenient way. We believe that our platform can help people do more accurate research and make it easier for our humanity to learn from the past.
Back-end Infra: Python, Django (server) Content Extraction: OpenCV, PyTesseract, PDFminer Search Engine: Whoosh Front-end Infra: HTML5, Vanilla.js. The website was matchedto match with Galledia’s website who is the organizer of the challenge we took.
Hosted on: Azure (thanks for the credits!)
The next goal for charpie is to:
- open it for the broadest range of formats (Calibre is a huge inspiration as an accessible Free and Open source software)
- Share the wisdom extracted from these treasured archives.
- Give the books a second life in the cyber space.