I started this as a white paper on our tech stack that we would email out to interested parties. It turned into this outline.

Tom Andersen, CTO HearMeCheer
HearMeCheer uses a patent-pending technology stack to deliver audio processing on a scale never before seen in the audio industry. This allows HearMeCheer to provide chat rooms in which up to thousands of people contribute at the same time via live microphone input, all with only one audio stream coming in and one going out of each client's computer. The server also sends back a customized stream for each client, with feedback cancellation applied per client. We do this with WebRTC and our patent-pending algorithms running as extremely high performance compiled code. This is not another JS or Java application.
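One common way to cancel a client's own audio in a server-side mix is the classic "mix-minus" approach: compute the full mix once, then subtract each client's own contribution. The sketch below is illustrative only, assuming that approach; the function and variable names (mix_minus, client_frames) are hypothetical and not HearMeCheer's actual implementation.

```python
def mix_minus(client_frames):
    """Given one audio frame (a list of float samples) per client id,
    return a per-client output frame equal to the sum of everyone
    else's audio.

    Computing the total once and subtracting each client's own
    samples costs O(N) per sample, instead of the O(N^2) cost of
    summing N-1 streams separately for each of N clients.
    """
    n_samples = len(next(iter(client_frames.values())))
    total = [0.0] * n_samples
    for frame in client_frames.values():
        for i, s in enumerate(frame):
            total[i] += s
    return {
        cid: [total[i] - frame[i] for i in range(n_samples)]
        for cid, frame in client_frames.items()
    }

frames = {"alice": [0.1, 0.2], "bob": [0.3, 0.0], "carol": [0.0, 0.5]}
out = mix_minus(frames)
# Alice's downlink contains only bob + carol: [0.3, 0.5]
```

The key design point is that the per-client work is a single subtraction from a shared total, which is what makes one customized downlink per client feasible at thousand-person scale.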
Why thousands?
Hundreds or thousands of people with their microphones enabled can produce genuine crowd feedback: cheering, clapping, general attitude, and more. It lets the audience take part. Think of a music concert, with fan feedback in real time. How would a remote comedy act work without aggregate crowd audio?
Latency
Total latency, as measured via an external microphone, is ~500ms or better (~280ms of that is server latency). Latency measured by external, independent audio recorders is the definitive metric.
Audio Quality
Audio quality is high because each client connects to a nearby server in a major cloud (AWS) region, rather than over variable peer-to-peer links. The result is better than peer-to-peer WebRTC.
Privacy
Our servers do not record anything and thus meet many privacy requirements. In addition, we offer bespoke servers configured to run your authentication, under your own IT. You have control.

Summary

We know we have a solid, stable platform to build your ideas on. Please contact me for more information.
Tom Andersen, CTO, HearMeCheer

Comparisons

Agora

Agora limits rooms to 17 active microphones at once.

Our tests with a top tier Agora client indicate good latency on sparsely populated Agora voice chat rooms, with latency as indicated below.

Feature                           | Agora | HearMeCheer
Speaker capacity                  | 17    | Unlimited (tested to 5,000)
Latency on the audience's client  | 800ms | 400ms

If the number of users sending streams concurrently exceeds the recommended value, each user in the channel can only see or hear a random subset of the users who are sending streams. For example, if 18 hosts send streams concurrently in a live streaming channel, each user will be unable to see or hear one of the 18 hosts, chosen at random.

Google Meet / WebRTC peer to peer

Bandwidth: These services open an audio stream for each person talking, so with about 10 people making sounds at once, each client ends up with roughly 10 audio streams coming in and going out.

Quality: Google Meet deals with these bandwidth issues by picking "winners", which results in a choppy audio experience. Other peer-to-peer solutions simply let latency build up, causing the "frog voice" problem, among others.
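The bandwidth scaling above can be illustrated with a toy calculation. This sketch assumes a full-mesh peer-to-peer topology for the unmixed case (exact counts vary by topology), and the function name is hypothetical:

```python
def streams_per_client(n_active, server_mixed):
    """Count audio streams entering and leaving one client when
    n_active participants are all producing sound at once."""
    if server_mixed:
        # One uplink (your microphone) plus one mixed downlink.
        return 1 + 1
    # Full mesh: send to every other peer and receive from each.
    return (n_active - 1) + (n_active - 1)

streams_per_client(10, server_mixed=False)  # 18 streams per client
streams_per_client(10, server_mixed=True)   # 2 streams per client
```

With server-side mixing the per-client stream count stays constant no matter how large the room grows, which is why the comparison above matters at crowd scale.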

Twilio Voice

Twilio Voice is a thin layer on top of WebRTC, combining the issues of a peer-to-peer network with a very high cost.