Interesting approach using Signal for the transport layer. I've been working with real-time audio pipelines (chrome.tabCapture → Whisper) and the latency tradeoff between STT chunk size and accuracy is always tricky. What's the end-to-end latency like on a video call?
Updating the title to say that this is specifically voice and video would make it quite a bit more interesting. Using text to invoke an AI from Signal is not really interesting at all; that’s basically OpenClaw. Using voice and video to do so, however, is quite a bit more interesting.
Hi! Thanks for the feedback!
I think you are right. I thought "Call" would clearly get the idea of voice across at least, but it can be confused with a function call, or simply read as "invoke".
I don't have the ability to change the title, but if someone else wants to:
"Video and Voice Call an AI from Signal"
Otherwise, maybe I will submit it again in a few days.
Thanks