Speech23d

example of usage

Inspiration

Sometimes it's hard to visualize your thoughts, yet you still want to make your ideas tangible. That is a hard problem, but we decided to take a small step towards solving it.

What it does

This is a website where you can draw and edit 3d models using voice and text commands. For instance, you can say "draw a cat", "create a dinosaur", "make dinosaur bigger", "move cat to the left", "put a cat on a table" and so on. You can also download the 3d model you have drawn.

How we built it

We use JavaScript on the frontend and Flask on the backend. The backend runs on a Google Cloud virtual machine. On the frontend, Recorder.js records the audio and sends it to the backend, where the Google Speech API converts the voice to text. The Google Natural Language API then parses the text, and we build a command from the result. The command is returned to the frontend, which either edits existing models (moving, rotating, recoloring, or rescaling them) or creates new ones. New 3d models come from the Google Poly API, which lets us search for 3d models by text. We render the models with three.js.
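To make the "edit or create" step concrete, here is a minimal sketch of how a parsed command could be applied to a scene. It is written in Python purely for illustration (the real frontend is JavaScript with three.js), and the command fields (action, target, offset, factor, color), the Model class, and apply_command are all hypothetical names, not the project's actual code. Rotation is left out to keep the sketch short.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    """Hypothetical stand-in for a 3d model in the scene."""
    name: str
    position: list = field(default_factory=lambda: [0.0, 0.0, 0.0])
    scale: float = 1.0
    color: str = "default"


def apply_command(scene: dict, command: dict) -> dict:
    """Apply one parsed command to the scene (a dict of name -> Model)."""
    action = command["action"]
    target = command["target"]
    if action == "create":
        # In the real app the mesh would come from a Poly search;
        # here we only record that the object exists.
        scene[target] = Model(name=target)
    elif action == "move":
        dx, dy, dz = command["offset"]
        model = scene[target]
        model.position = [
            model.position[0] + dx,
            model.position[1] + dy,
            model.position[2] + dz,
        ]
    elif action == "scale":
        scene[target].scale *= command["factor"]
    elif action == "color":
        scene[target].color = command["color"]
    else:
        raise ValueError(f"unsupported action: {action}")
    return scene
```

For example, applying {"action": "create", "target": "cat"} followed by {"action": "move", "target": "cat", "offset": [-1.0, 0.0, 0.0]} leaves a "cat" model one unit to the left of the origin.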

Challenges we ran into

One of the main problems was that nobody on our team is keen on frontend work. So we had to use Stack Overflow a lot, even more than usual :)

Weak sentence-parsing APIs were another serious issue. To build a command we need to parse the spoken sentence: which object to target, what to do with it, how exactly, and which other objects take part in the action. The APIs we tried did not work well enough for this, so we wrote this complicated logic ourselves.
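As a rough illustration of what such hand-written parsing can look like, here is a small rule-based sketch. It is not our actual logic: the verb lists, the output fields, and the parse_command function are assumptions made for this example, and it only handles single-word object names and the handful of phrasings shown earlier.

```python
import re

# Hypothetical vocabularies, chosen only to cover the example phrases.
CREATE_VERBS = {"draw", "create", "add"}
ARTICLES = {"a", "an", "the"}
DIRECTIONS = {
    "left": [-1, 0, 0],
    "right": [1, 0, 0],
    "up": [0, 1, 0],
    "down": [0, -1, 0],
}


def parse_command(text):
    """Turn a sentence into a command dict, or None if it is not understood."""
    # Lowercase, keep only alphabetic words, and drop articles.
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in ARTICLES]
    if not words:
        return None
    verb, rest = words[0], words[1:]

    # "draw a cat", "create a dinosaur"
    if verb in CREATE_VERBS and rest:
        return {"action": "create", "target": rest[0]}

    # "make dinosaur bigger"
    if verb == "make" and len(rest) >= 2 and rest[1] in ("bigger", "smaller"):
        factor = 2.0 if rest[1] == "bigger" else 0.5
        return {"action": "scale", "target": rest[0], "factor": factor}

    # "move cat to the left"
    if verb == "move" and len(rest) >= 2 and rest[-1] in DIRECTIONS:
        return {"action": "move", "target": rest[0],
                "offset": list(DIRECTIONS[rest[-1]])}

    # "put a cat on a table", "put a lamp in front of the table"
    if verb == "put" and len(rest) >= 3:
        if rest[1] == "on":
            relation = "on"
        elif rest[1:4] == ["in", "front", "of"] and len(rest) >= 5:
            relation = "in_front_of"
        else:
            return None
        return {"action": "place", "target": rest[0],
                "relation": relation, "reference": rest[-1]}

    return None
```

A rule table like this is easy to debug during a hackathon, but every new phrasing needs a new rule, which is why the number of supported phrases stays limited.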

Besides that, there were of course a bunch of technical issues, especially with setting up an HTTPS connection instead of HTTP. We also put off making a text input form, so we had some problems testing our speech-to-text in a noisy room :)

Accomplishments that we're proud of

We created an application where you can draw 3d models via voice commands! We also came up with an interesting solution for object movement: we have commands that put an object, for example, in front of another object or on top of it. For instance, you can easily put a lamp on a table or a hat on a man. For 3d printing this idea becomes even more useful, because you care a lot about space and want objects to be as close to each other as possible.
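One simple way to implement this kind of relative placement is with axis-aligned bounding boxes. The sketch below is an assumption about how it could be done, not a description of our exact code: the Box class and both functions are hypothetical, the y axis is taken to point up, and +z is taken to be the "front" direction.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned bounding box given by its two opposite corners."""
    lo: tuple  # (x, y, z) of the minimum corner
    hi: tuple  # (x, y, z) of the maximum corner

    def center(self, axis: int) -> float:
        return (self.lo[axis] + self.hi[axis]) / 2


def place_on_top(obj: Box, ref: Box) -> tuple:
    """Translation that rests obj on top of ref, centered in x and z."""
    dx = ref.center(0) - obj.center(0)
    dy = ref.hi[1] - obj.lo[1]  # bottom of obj touches top of ref
    dz = ref.center(2) - obj.center(2)
    return (dx, dy, dz)


def place_in_front(obj: Box, ref: Box) -> tuple:
    """Translation that puts obj directly in front of ref, touching it."""
    dx = ref.center(0) - obj.center(0)
    dy = ref.lo[1] - obj.lo[1]  # both objects stand on the same level
    dz = ref.hi[2] - obj.lo[2]  # back of obj touches front of ref
    return (dx, dy, dz)
```

Because the translation makes the two boxes touch without a gap, the objects end up as close together as their bounding boxes allow, which is the property that matters for 3d printing.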

What we learned

We learned how to work with several cool libraries and frameworks like three.js and Flask, and how to work as a team! We supported each other, pushed through all the difficulties, and in the end created a quite solid project.

What's next for Speech23d

We could create an ML algorithm that generates 3d models and train it on examples from Google Poly. We could also use ML to parse text commands, which could improve command understanding and increase the variety of supported phrases.