Inspiration
Immigration is at the center of much of today's news, and according to a recent Gallup poll, 76% of Americans believe that immigration is a positive thing for our country. However, the road to asylum or citizenship is long and filled with obstacles. Requirements include lengthy forms that must be submitted in English, and hiring an attorney to proofread these all-important documents can be expensive. The name Kuratoro comes from Esperanto, a language intended for universal use, and translates to "curator" or "helper". With this application, we want to ease at least some of the obstacles in the immigration process, in the hope of giving everyone a fair chance at experiencing our great country.
What it does
Our application translates two major immigration forms (I-589 and I-485) into any language supported by the Google Cloud Platform, lets users fill them out in their native language, then translates the responses back into English and places them on the proper form for submission to immigration services. Additionally, Kuratoro lets users upload their own form; its computer vision classification models do their best to reproduce the form and allow input in the selected language.
How we built it
Kuratoro uses AWS Textract optical character recognition to detect text and patterns in the documents, and the extracted text is structured into a standardized format. This text is sent to our back-end API to be organized into individual questions and subfields, which are then sent to the Google Translate API to be translated into the chosen language. The resulting text is formatted into a web form for the user to complete. The user's responses are translated back into English, then added to the PDF form, which can be downloaded to the user's device. The back end, API, and classification models are built with Python (Flask), AWS, and GCP, while the front end was made with Bootstrap CSS.
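To make the middle of that pipeline concrete, here is a minimal sketch of how OCR output might be grouped into questions and subfields and then translated. It is an illustration under stated assumptions, not our production code: the input mimics the `Blocks` list that Textract returns (entries with `BlockType` and `Text`), the "a numbered line starts a new question" rule is a hypothetical heuristic, and `group_questions`, `translate_questions`, and the `translate` callable are made-up names. The translator is passed in as a plain function so the sketch runs without cloud credentials.

```python
import re
from typing import Callable, Dict, List

# Hypothetical heuristic: a line such as "1. Full name" opens a new question.
QUESTION_RE = re.compile(r"^\s*(\d+)\.\s+(.*)$")


def group_questions(blocks: List[Dict]) -> List[Dict]:
    """Group Textract-style LINE blocks into questions with subfields."""
    questions: List[Dict] = []
    for block in blocks:
        # Skip anything that is not a line of text (PAGE, WORD, ...).
        if block.get("BlockType") != "LINE":
            continue
        text = block.get("Text", "").strip()
        if not text:
            continue
        match = QUESTION_RE.match(text)
        if match:
            questions.append(
                {
                    "number": int(match.group(1)),
                    "prompt": match.group(2),
                    "subfields": [],
                }
            )
        elif questions:
            # Unnumbered lines are treated as subfields of the last question.
            questions[-1]["subfields"].append(text)
    return questions


def translate_questions(
    questions: List[Dict], translate: Callable[[str], str]
) -> List[Dict]:
    """Return a translated copy; `translate` stands in for a translation API call."""
    return [
        {
            "number": q["number"],
            "prompt": translate(q["prompt"]),
            "subfields": [translate(s) for s in q["subfields"]],
        }
        for q in questions
    ]
```

In a real deployment the `translate` argument would wrap a call to a translation service, and the same function could be reused in the opposite direction to bring the user's answers back into English.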
Challenges we ran into
The biggest challenge was accurately and automatically mapping the user's responses back onto the original document. PDF is a difficult file format to work with, and immigration forms are not very standardized in their structure.
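One small piece of that mapping problem is coordinate conversion. The sketch below shows one plausible approach, not necessarily the one Kuratoro uses: it assumes OCR bounding boxes are normalized to the 0-1 range with the origin at the top-left of the page (the convention of Textract's `Geometry.BoundingBox`), and converts them to PDF points, where the origin is at the bottom-left. The function name and the US Letter default page size (612 x 792 points) are assumptions for illustration.

```python
from typing import Dict, Tuple


def bbox_to_pdf_points(
    bbox: Dict[str, float],
    page_width: float = 612.0,   # assumed US Letter width in points
    page_height: float = 792.0,  # assumed US Letter height in points
) -> Tuple[float, float, float, float]:
    """Convert a normalized, top-left-origin box to (x0, y0, x1, y1) in PDF points.

    `bbox` uses the keys Left, Top, Width, Height, each in the range 0-1.
    The result has a bottom-left origin, so y0 is the bottom edge of the box.
    """
    x0 = bbox["Left"] * page_width
    x1 = (bbox["Left"] + bbox["Width"]) * page_width
    # Flip the vertical axis: the OCR box measures down from the top,
    # while PDF coordinates measure up from the bottom.
    y1 = (1.0 - bbox["Top"]) * page_height
    y0 = (1.0 - bbox["Top"] - bbox["Height"]) * page_height
    return (x0, y0, x1, y1)
```

With the box in PDF points, a translated answer can be drawn at or near `(x0, y0)`. Deciding which box belongs to which question is the harder part, and it is where non-standard form layouts cause the most trouble.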
Accomplishments that we're proud of
We are proud of pulling the entire project together into a sleek and streamlined application, with relatively low load times and a simple user interface. The classification and automatic parsing of PDF forms is also what sets our application apart. It would be relatively easy to build a program that modifies an interactive PDF, but the ability to potentially take in any form and make sure it reaches whoever it needs to makes this project unique.
What we learned
John: I learned much more about back-end development in Flask. To make this project possible, I had to learn how to use cookies, create an API endpoint for asynchronous calls, and manage files server-side. Although it took a long time and a lot of effort to get there, it works reliably, and with all that I have learned, I'm confident I could build a similar project even more efficiently now.
Edward: Being somewhat of a beginner to web development, I learned a lot about Bootstrap and HTML, as we used them throughout our application. Now that we have a finished, working product, I feel confident in my ability to continue in front-end development as I grow beyond the hackathon. Overall, I believe this project was an incredible learning experience, and coming from a family of immigrants, I hope that it can help those facing issues in the U.S. immigration system.
What's next for Kuratoro
A wider variety of text classification templates. Because of the limited time in a hackathon environment, we weren't able to expand the range of form types this program can translate. Further development of the classification system would result in much better accuracy, especially for forms unrelated to immigration.