charpie

Inspiration

Traditional print media is a treasure trove of untapped potential. Charpie brings structure to this unstructured material and frees it for the digital age, making it available and accessible. Today we produce and use an enormous amount of information, but what about earlier times? A few decades ago, there was no efficient way to store data digitally. That does not mean knowledge from the past should be lost. That is why we developed a platform to keep old information from vanishing completely.

What it does

charpie is a platform that scans, stores, indexes, and shares content from newspapers and magazines. It is a powerful tool for making hard-to-obtain data accessible to everyone in a convenient way. We believe our platform can help people do more accurate research and make it easier for humanity to learn from the past.

How we built it

We built charpie with the following stack:

  • Back-end infrastructure: Python, Django (server)
  • Content extraction: OpenCV, PyTesseract, PDFMiner
  • Search engine: Whoosh
  • Front-end infrastructure: HTML5, Vanilla.js

The website design matches that of Galledia, the organizer of the challenge we took on.
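To illustrate the "index and search" part of the stack, here is a minimal sketch of an inverted index over extracted page text. It is not charpie's code and it does not use Whoosh's API: it is a standard-library stand-in for the idea a search engine library implements. The `InvertedIndex` class, the `tokenize` helper, the page ids, and the sample texts are all hypothetical, and the text is assumed to have already been extracted by the OCR or PDF step.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"\w+", text.lower())


class InvertedIndex:
    """Toy in-memory index (hypothetical): token -> ids of pages containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add_page(self, page_id, text):
        """Register every token of an already-extracted page under its id."""
        for token in tokenize(text):
            self.postings[token].add(page_id)

    def search(self, query):
        """Return the sorted ids of pages that contain every query token."""
        tokens = tokenize(query)
        if not tokens:
            return []
        hits = set(self.postings.get(tokens[0], set()))
        for token in tokens[1:]:
            hits &= self.postings.get(token, set())
        return sorted(hits)


# Made-up sample pages standing in for scanned newspaper text.
index = InvertedIndex()
index.add_page("gazette-1964-p3", "New railway bridge opens to the public")
index.add_page("weekly-1971-p1", "Railway strike ends after two weeks")

print(index.search("railway"))
print(index.search("railway bridge"))
```

A real search library adds ranking, stemming, and on-disk storage on top of this basic token-to-document mapping.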

Hosted on: Azure (thanks for the credits!)

What's next for charpie

The next goals for charpie are to:

  • Open it to the broadest range of formats (Calibre, an accessible free and open-source application, is a huge inspiration).
  • Share the wisdom extracted from these treasured archives.
  • Give books a second life in cyberspace.