Speech to text service, also available to other apps#109

Merged: Stypox merged 8 commits into master from stt-service on Dec 20, 2022

Stypox (Owner) commented Dec 13, 2022

Speech to text service

This PR implements a speech-to-text service that is available to other apps, fixing #54. Here is a preview of the feature, after pressing the microphone button in Google Maps:

The service can also be opened from Dicio's navigation drawer, which lets the user take dictation and then copy the text to the clipboard or share it, fixing #33.

Testing APK

app-debug.zip

Technical details

This PR supersedes #100 by @nebkrid. #100 implemented the service as a skill, while this PR implements it as its own activity. The research done in #100 was really helpful though! I also kept the TODOs left behind there for later: for example, the result intent from the activity might contain multiple speech interpretations, each with a different accuracy. Vosk does provide that information, but for simplicity it is currently not added to the result intent.

Speech-to-text functionality is exported to other apps, which can invoke it by calling startActivityForResult with an Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH).

The extra RecognizerIntent.EXTRA_PROMPT is supported.

This PR includes #111, thanks to @nebkrid again :-)

  • The prompt message is shown as a hint (if none is provided, the default is still "Say something...").
  • An auto-finish preference was added. Reason: Vosk is good, but at least in German it is not perfect. It is therefore easier (and faster, since it avoids waiting for the Vosk model to load again) if the user can confirm the result or speak again before it is reported back to the requesting app.
  • The TODO from #100 (Implementing Speech-To-Text Service for other Apps) was added for optional (and seldom used, if ever) extras like EXTRA_BIASING_STRINGS and EXTRA_LANGUAGE, as a future reference and a reminder of which extras may help Vosk improve its recognition results.

This PR also fixes a random crash when cleaning up Vosk, and sets the theme color used in e.g. button texts to a sensible value.

@Stypox Stypox merged commit 2ab3251 into master Dec 20, 2022
@Stypox Stypox deleted the stt-service branch December 20, 2022 15:10
sudomain commented

Is there an example of starting this activity using am? I've tried many variations of the following, but to no avail:

$ am start -a RecognizerIntent.ACTION_RECOGNIZE_SPEECH -e RecognizerIntent.EXTRA_PROMPT test
Starting: Intent { act=RecognizerIntent.ACTION_RECOGNIZE_SPEECH (has extras) }
Error: Activity not started, unable to resolve Intent { act=RecognizerIntent.ACTION_RECOGNIZE_SPEECH flg=0x10000000 (has extras) }

nebkrid (Contributor) commented May 25, 2023

@sudomain I have no experience with am, but judging from the error "Activity not started, unable to resolve Intent { act=RecognizerIntent.ACTION_RECOGNIZE_SPEECH flg=0x10000000 (has extras) }", you may have to pass the string "android.speech.action.RECOGNIZE_SPEECH" directly (as in the activity's manifest definition). That is the actual value of RecognizerIntent.ACTION_RECOGNIZE_SPEECH; am does not resolve the Java constant name.
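
As a rough sketch of that suggestion, the command from the question could be rewritten with the literal string values of both constants. The action string is the one quoted above; "android.speech.extra.PROMPT" is the documented value of RecognizerIntent.EXTRA_PROMPT. The variable names and the guard around am are only illustrative, and this has not been verified against a Dicio install.

```shell
# Run on the device, for example inside `adb shell`.
# Literal string values of the RecognizerIntent constants from the thread:
ACTION='android.speech.action.RECOGNIZE_SPEECH'   # RecognizerIntent.ACTION_RECOGNIZE_SPEECH
PROMPT_KEY='android.speech.extra.PROMPT'          # RecognizerIntent.EXTRA_PROMPT

if command -v am >/dev/null 2>&1; then
    # -a sets the intent action, -e adds a string extra (key, then value)
    am start -a "$ACTION" -e "$PROMPT_KEY" 'test'
else
    echo "am not found; run this on an Android device (for example via adb shell)" >&2
fi
```

Note that am start only launches the activity and does not receive an activity result, so this is mainly a way to check that the intent resolves and the popup opens. If several installed apps handle this action, Android may show a chooser first.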

@MakeMeCookie MakeMeCookie mentioned this pull request Jul 6, 2023
rkagerer commented Dec 3, 2024

Rather than a wakeword, I'd like to set up Dicio to start listening when I hit the Bixby button on my S10+. I've installed Button Mapper Pro and run the required ADB steps to allow it to control that button. Could someone help me figure out what fields to enter below to trigger the correct Dicio intent?

image

Stypox (Owner, Author) commented Dec 3, 2024

The activity you need to start is https://github.com/Stypox/dicio-android/blob/master/app%2Fsrc%2Fmain%2Fkotlin%2Forg%2Fstypox%2Fdicio%2Fio%2Finput%2Fstt_popup%2FSttPopupActivity.kt, so I think you just need to put org.stypox.dicio.io.input.stt_popup.SttPopupActivity under "package"?
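
One way to check the activity name before entering it in a button-mapping app might be to start it explicitly from a shell. This is only a sketch: the class name is the one given above, while the package name org.stypox.dicio is an assumption (a debug or forked build may use a different application id), and the variable names are illustrative.

```shell
# Run on the device, for example inside `adb shell`.
PKG='org.stypox.dicio'   # assumed application id; adjust if your build differs
ACTIVITY='org.stypox.dicio.io.input.stt_popup.SttPopupActivity'
COMPONENT="$PKG/$ACTIVITY"

if command -v am >/dev/null 2>&1; then
    # -n starts an explicit component given as <package>/<fully qualified class>
    am start -n "$COMPONENT"
else
    echo "am not found; run this on an Android device (for example via adb shell)" >&2
fi
```

If the popup opens, the same package and class values are probably what the mapping app expects in its package and class (or activity) fields.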


Successfully merging this pull request may close these issues.

  • Use Dicio as system STT / voice recognition service
  • [Feature Request]: Take dictation