Edited on 11/30/22

How we prototyped Voice Chat (cross-browser, realtime) in a weekend

Hi, I’m Mehdi! I’m an engineer at Threads. 
I want to tell you the story of how we built out Voice Chat. It started out as a massive, daunting project that no engineer on the team wanted to touch. We brought it to reality using rapid prototyping over a weekend, and I'm going to dive into how we made it happen.

Context

Earlier this year, we kept hearing the same feedback from customers who were trying to replace Slack. A few of them kept going back to Slack because they needed Huddles, the synchronous audio chats they had learned to rely on.
We only had text chats at the time, so we needed to support some type of voice chat. Otherwise, teams that wanted to stay on Threads would have to go back to Slack and miss out on the asynchronous benefits we also offer.

The Problem

The thing is, voice chat is a huge and complex feature.
With voice chat, people from all over the world should be able to speak with each other in realtime. You need to support not only multiple browsers but also multiple microphone and speaker devices.
Additionally, it needs to scale to large groups, and the audio quality needs to be high.
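
To make the device problem concrete, here is a minimal TypeScript sketch of sorting a device list into microphones and speakers, and falling back when a saved device is no longer plugged in. It is illustrative only, not our production code: `DeviceLike`, `groupAudioDevices`, and `pickDevice` are hypothetical names, and `DeviceLike` just mirrors a few fields of the browser's `MediaDeviceInfo`.

```typescript
// Hypothetical shape mirroring the fields of the browser's MediaDeviceInfo
// that matter here; defined locally so the sketch also runs outside a browser.
interface DeviceLike {
  deviceId: string;
  kind: string; // "audioinput", "audiooutput", or "videoinput"
  label: string;
}

interface AudioDevices {
  microphones: DeviceLike[];
  speakers: DeviceLike[];
}

// Split a device list into microphones and speakers, ignoring cameras.
function groupAudioDevices(devices: DeviceLike[]): AudioDevices {
  return {
    microphones: devices.filter((d) => d.kind === "audioinput"),
    speakers: devices.filter((d) => d.kind === "audiooutput"),
  };
}

// Use the saved device if it is still available, otherwise the first one.
// Returns undefined when there are no devices of that type at all.
function pickDevice(
  available: DeviceLike[],
  preferredId?: string
): DeviceLike | undefined {
  return available.find((d) => d.deviceId === preferredId) ?? available[0];
}
```

In a browser, the input list would come from `navigator.mediaDevices.enumerateDevices()`. Device labels are typically empty until the user grants microphone permission, and support for choosing an output device varies by browser, which is part of why this is harder than it looks.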

With a small team of engineers at a startup, it seemed nearly impossible to tackle and maintain. We had long, open debates about when we should build it. We thought it’d take months to build and add a ton of complexity to our frontend and backend.

The thing was, we kept hearing that customers wanted it, so I decided to test our assumptions. Was it true that it’d take months to build? What exactly were the long poles? What were the riskiest parts, and the parts that’d be hardest to maintain?

The Plan

The best way to find out was to stop theorizing and just dig in. I planned out what a Minimum Viable Product (MVP) would look like:
Two people initiate and end a voice chat inside of our product
Membership of the voice chat is constrained to a thread, a channel, or a chat
The audio quality is on par with other products

We didn’t need to support screen recording. We didn’t need a plan for mobile yet. We didn’t need to support more than two people. If I could build that, then I could start answering questions like what infrastructure we were missing, how long it’d take to get it to production, and what the hardest parts were.
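
The membership rule and the two-person cap from the MVP list can be sketched as a tiny data model. This is a hedged illustration in TypeScript, not our actual schema: every type and function name here (`Scope`, `VoiceChat`, `startVoiceChat`, `canJoin`, `join`, `leave`) is hypothetical.

```typescript
// All names are hypothetical; they illustrate the rule, not a real schema.
type ScopeKind = "thread" | "channel" | "chat";

// The place a voice chat lives in, along with who is allowed to see it.
interface Scope {
  kind: ScopeKind;
  id: string;
  memberIds: Set<string>;
}

interface VoiceChat {
  scope: Scope;
  participantIds: Set<string>;
  maxParticipants: number; // 2 for the MVP
}

// Only a member of the scope may start a voice chat in it.
function startVoiceChat(
  scope: Scope,
  starterId: string,
  maxParticipants = 2
): VoiceChat {
  if (!scope.memberIds.has(starterId)) {
    throw new Error("only members of the scope can start a voice chat");
  }
  return { scope, participantIds: new Set([starterId]), maxParticipants };
}

// Membership is constrained to the scope, and the chat has a size cap.
function canJoin(chat: VoiceChat, userId: string): boolean {
  if (!chat.scope.memberIds.has(userId)) return false;
  if (chat.participantIds.has(userId)) return true; // already in the chat
  return chat.participantIds.size < chat.maxParticipants;
}

// Returns true if the user is now a participant.
function join(chat: VoiceChat, userId: string): boolean {
  if (!canJoin(chat, userId)) return false;
  chat.participantIds.add(userId);
  return true;
}

// Returns true when the last participant has left, i.e. the chat has ended.
function leave(chat: VoiceChat, userId: string): boolean {
  chat.participantIds.delete(userId);
  return chat.participantIds.size === 0;
}
```

Keeping the cap as a field rather than a constant is one way to leave room for larger groups later without changing the membership check.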

In addition, my coworker Suman had posted about a voice chat API called daily.co. That was my first point of investigation. It fit into our stack well and was a stable, well-known service.
I read their blog posts and documentation to understand how WebRTC works, and what the pitfalls are.
My confidence in building a prototype was increasing... 

The Prototype

I freed up a weekend and did a two-day sprint on the plan.
About 60% of my time went to reading documentation and articles and learning more about WebRTC, its pitfalls, and cross-browser support.
Another 10% went to getting a test setup working.
About another 10% went to figuring out how Voice Chats should work in our system: what the data model is, and how membership and privacy should work.
The final 20% was writing code, testing it out, and integrating all of the pieces.

At the end of the weekend, I had a voice call working! I’d executed on the plan above.
Here’s what I’d done: [embedded demo]

Getting Buy-In

After I got the prototype working, I started demoing it to stakeholders at Threads. We jumped on voice chats to show the experience, quality, and membership/privacy rules.
It was much easier to understand how it fit into our system and how valuable it’d be, since we had something tangible to play with and talk about.

I made a list of what else we’d need to build to bring it to production. The list didn’t have many unknowns; we knew we could execute on it. That let us build confidence that the prototype was close, and we made time to finish out the remaining list of work.

Delivering Value

We first opened up voice chats to one of our customers, Panther. They were extremely excited that we built it out. The big value was that they could stay within the same product for all of their communication needs: sync, async, voice, and text.
We continued dogfooding with them and fixed a long tail of bugs. Finally, we released it to all of our customers.

As we’re growing, it’s crazy to see how many people are using voice chats. One team had a 3-hour voice chat with 6 people, and calls are spanning our desktop app and the browser, across operating systems.

In Summary

We might not have prioritized building voice chats for another half a year, because it seemed too daunting and too unscoped to even fathom. By taking the initiative to prototype an MVP, we de-risked the project and better understood the technical hurdles and pitfalls.
Our customers love it, and we can convert even more customers from Slack to Threads. And it's all thanks to rapid prototyping.

Interested in trying it out for yourself? Sign up for early access at threads.com. 
