Currently the program includes the Tesseract binaries but does not include any traineddata files.
Ref - #19
https://github.com/tesseract4java/tesseract4java/wiki/Usage says
Before you can start creating new projects, you have to set the environment variable TESSDATA_PREFIX, which points at the directory that contains your "tessdata" directory. Usually this environment variable is set when you install Tesseract.
This requires the user to either install Tesseract or download the "tessdata" directory and set the environment variable TESSDATA_PREFIX.
Installing Tesseract separately can lead to conflicts when running the program. Asking the user to download "tessdata" is also not straightforward - eg.
Hence I would suggest that, at a minimum, osd.traineddata, the eng.* files and any other required tessdata files be included with the program and kept in a tessdata subdirectory relative to the program's binaries.
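A minimal sketch of how the lookup could work, assuming the bundled directory should take precedence over TESSDATA_PREFIX so that a separately installed Tesseract does not interfere. The class name `TessdataLocator` and the method `resolve` are hypothetical and not part of tesseract4java:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper, not existing tesseract4java code.
class TessdataLocator {

    /**
     * Returns the tessdata directory to use, or an empty Optional if none exists.
     *
     * @param binDir    directory that holds the program's binaries
     * @param envPrefix value of TESSDATA_PREFIX (may be null), i.e. the
     *                  directory that contains a "tessdata" directory
     */
    static Optional<Path> resolve(Path binDir, String envPrefix) {
        // 1. Prefer the tessdata directory shipped next to the binaries.
        Path bundled = binDir.resolve("tessdata");
        if (Files.isDirectory(bundled)) {
            return Optional.of(bundled);
        }
        // 2. Fall back to the directory named by TESSDATA_PREFIX.
        if (envPrefix != null && !envPrefix.isEmpty()) {
            Path fromEnv = Paths.get(envPrefix).resolve("tessdata");
            if (Files.isDirectory(fromEnv)) {
                return Optional.of(fromEnv);
            }
        }
        return Optional.empty();
    }
}
```

It could be called as `TessdataLocator.resolve(binDir, System.getenv("TESSDATA_PREFIX"))`, with the program showing an error message only when the result is empty.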
Secondly, I would suggest that the program add an option to download the tessdata files for languages the user selects (a few languages have multiple files in tessdata - eg. ara, hin etc).
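A rough sketch of what such a download option might do, under stated assumptions: the class `TessdataDownloader`, its methods, and the `baseUrl` parameter are all hypothetical, and the caller has to supply the location that hosts the files. For brevity it only handles a single `<lang>.traineddata` file per language, so languages with several files would need a longer file list:

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not existing tesseract4java code.
class TessdataDownloader {

    /** Lists the "<lang>.traineddata" files that are not yet in tessdataDir. */
    static List<String> missingFiles(Path tessdataDir, List<String> languages) {
        List<String> missing = new ArrayList<>();
        for (String lang : languages) {
            String name = lang + ".traineddata";
            if (!Files.isRegularFile(tessdataDir.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    /** Copies baseUrl/fileName into tessdataDir, replacing any existing file. */
    static void download(String baseUrl, String fileName, Path tessdataDir)
            throws IOException {
        Files.createDirectories(tessdataDir);
        String separator = baseUrl.endsWith("/") ? "" : "/";
        URI source = URI.create(baseUrl + separator + fileName);
        try (InputStream in = source.toURL().openStream()) {
            Files.copy(in, tessdataDir.resolve(fileName),
                    StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

The program could call `missingFiles` for the languages ticked in a settings dialog and then `download` each returned name, so that files already present are not fetched again.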
Both VietOCR and Tesseract from UB Mannheim include these options - please see