Manga Reader
Azure text read API + box middle/text height
Grouping the text boxes into blobs + smart BBOX formation

Manga-Reader

Detects the text under the mouse cursor in an image, translates it, and reads it aloud. Useful for reading manga in Japanese, for learning Japanese, and for kids learning English by reading manga. It also works for any other kind of image whose text you want extracted and read aloud.

How it works:

A Chrome extension, written in JavaScript, sends the URLs of all images on the current webpage to a local Python server. The extension also tracks the mouse position on the page.
The Python server calls modules that communicate with Azure to obtain the text in the images.
These modules use the Microsoft Read Text and OCR APIs to extract the bounding boxes and text from a given image.
These boxes are grouped into "blobs": structures that each hold one separate piece of text.
The blobs are passed to a service that determines whether the mouse is hovering over one of them. If it is, an audio file containing the text-to-speech rendering of the blob's text is played. The audio file is obtained from the Azure Speech API.
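
The grouping and hover steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Box` class, the function names, and the rule "two boxes belong together when the gap between them is at most one text height" are all assumptions made for the example. The real grouping may use a different distance rule or box representation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Box:
    """Axis-aligned bounding box of one OCR text line, in pixels (hypothetical type)."""
    left: float
    top: float
    right: float
    bottom: float
    text: str = ""

    @property
    def height(self) -> float:
        return self.bottom - self.top


def boxes_are_close(a: Box, b: Box, factor: float = 1.0) -> bool:
    """Assumed rule: the horizontal and vertical gaps are both at most
    `factor` times the smaller of the two text heights."""
    gap_x = max(a.left - b.right, b.left - a.right, 0.0)
    gap_y = max(a.top - b.bottom, b.top - a.bottom, 0.0)
    limit = factor * min(a.height, b.height)
    return gap_x <= limit and gap_y <= limit


def group_into_blobs(boxes: List[Box], factor: float = 1.0) -> List[List[Box]]:
    """Group boxes into connected components ("blobs") using union-find."""
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if boxes_are_close(boxes[i], boxes[j], factor):
                parent[find(i)] = find(j)

    groups: Dict[int, List[Box]] = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)
    return list(groups.values())


def blob_bbox(blob: List[Box]) -> Box:
    """Smallest box enclosing every box in the blob, with the texts joined."""
    return Box(
        left=min(b.left for b in blob),
        top=min(b.top for b in blob),
        right=max(b.right for b in blob),
        bottom=max(b.bottom for b in blob),
        text=" ".join(b.text for b in blob),
    )


def blob_under_mouse(blobs: List[List[Box]], x: float, y: float) -> Optional[Box]:
    """Return the enclosing box of the first blob containing (x, y), or None."""
    for blob in blobs:
        bbox = blob_bbox(blob)
        if bbox.left <= x <= bbox.right and bbox.top <= y <= bbox.bottom:
            return bbox
    return None
```

With this sketch, two lines stacked inside the same speech bubble end up in one blob, while a line elsewhere on the page forms its own. Hovering anywhere inside the merged bounding box selects the whole blob's text for text-to-speech.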

Features:

Heavily optimized preprocessing - data for images that have already been seen is cached and saved locally, so if a URL is encountered again, all of its blobs and their responses are already computed.
Supports reading in English and Japanese and translating between the two - the text can be in English while the speech is in Japanese, or vice versa.
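
The URL-based caching described above could look something like the sketch below. It is only an assumption about one way to do it: the `cache` directory, the JSON file layout, the SHA-256 file naming, and the `get_blobs` and `compute` names are hypothetical and not taken from the project.

```python
import hashlib
import json
from pathlib import Path
from typing import Any, Callable


def cache_path(cache_dir: Path, url: str) -> Path:
    """Map an image URL to a stable, filesystem-safe cache file name."""
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    return cache_dir / f"{digest}.json"


def get_blobs(
    url: str,
    compute: Callable[[str], Any],
    cache_dir: Path = Path("cache"),
) -> Any:
    """Return the blobs for `url`, running `compute` only on a cache miss.

    `compute` stands in for the expensive OCR and grouping step; its result
    must be JSON-serializable.
    """
    path = cache_path(cache_dir, url)
    if path.exists():
        # Cache hit: reuse the stored result without calling any remote API.
        return json.loads(path.read_text(encoding="utf-8"))

    blobs = compute(url)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # ensure_ascii=False keeps Japanese text readable in the cache file.
    path.write_text(json.dumps(blobs, ensure_ascii=False), encoding="utf-8")
    return blobs
```

Hashing the URL avoids problems with characters that are not valid in file names, and a second visit to the same image is served from disk instead of another round trip to Azure.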