Using the Voice Assist pipeline via the HASS cloud subscription works a heck of a lot better than running it locally. Locally it takes about 15 seconds to respond; via the Nabu Casa server it’s about 1 second. I’ve considered dedicating a box to the containers it spins up for this, to get a faster response.
What hardware is it running on that takes 15 seconds? I’ve not actually tried it myself as I’ve got a poor little RPi 3, and I don’t want to scare it.
The M5Stack ATOM Echo. The hardware is the same either way; the delay comes from which pipeline you pick on the back end. You can run the Whisper stack on the same machine or on another box on your network, but I think you’d want a good GPU in it to offload the NL processing to, which is probably what happens when you’re using the Nabu Casa pipeline.
Do you think throwing a coral TPU on there would help?
I saw it helps a ton with Frigate facial recognition.
I was planning to do that on my Yellow once I can get the display thing that’s pictured in the article.
Idk if any LLMs are set up to run on anything except GPUs. It’s an interesting question.