SandLogic Introduces

Speech-to-Text Accuracy Benchmark

Speech recognition uses AI to transcribe human speech into text for a wide range of use cases, including voice assistants, sentiment analysis and more.

Enterprises frequently ask which ASR (Automatic Speech Recognition) service offers the highest accuracy, and this time they have struck gold!

SandLogic’s LINGO is a state-of-the-art Speech/NLP platform that harnesses the power of AI, Machine Learning (ML), and Natural Language Processing (NLP) to provide superior results compared to companies that focus solely on speech recognition, such as Speechmatics and Voicegain.

In response, we conducted a Word Error Rate (WER) analysis in which LINGO AI achieves a WER as low as 6.3%, about 50% lower than the industry's leading ASRs. Detailed data from this benchmark indicates that SandLogic LINGO AI is more accurate than leading ASR/Speech-to-Text engines. You can see the results below.
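For readers unfamiliar with the metric, WER is conventionally defined as the number of word substitutions, deletions and insertions needed to turn the engine's transcript into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that standard definition only; it is not SandLogic's benchmarking code, and the function name `word_error_rate` and the simple lowercase-and-split normalization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    # Assumed normalization: lowercase and split on whitespace.
    # Real benchmarks usually also strip punctuation and normalize numbers.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One wrong word out of four reference words gives a WER of 0.25 (25%).
print(word_error_rate("the quick brown fox", "the quick brown dog"))
```

Because insertions count as errors, WER can exceed 100% when a transcript contains many extra words.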

Lingo benchmark

We are releasing our speech-to-text benchmarking results for speech recognition accuracy. For this benchmark we used a sample of 44 files from the Jason Kincaid data set and 20 files published by rev.ai. Our goal in sharing these results is to provide a better and more useful tool for vendors, developers and analysts.

Dataset Reference (Jason Kincaid data set): Jason Kincaid's article is dedicated to the challenges of testing speech recognition accuracy. In it, the Google Video model wins easily on the tested samples, with a relatively low WER.

Dataset Reference (rev.ai): Rev tested the WER of their ASR alongside Google, Amazon, and Microsoft's speech-to-text services. The results showed rev.ai's new ASR service outperforming the other major players.

Lingo - 6.3%

Voicegain - 11.2%

Google Speech-Video - 16%

Temi - 18%

Amazon - 22%

Speechmatics - 23%

Trint - 23%

Microsoft - 24%

IBM Watson - 29%

Google Speech - Standard - 37%

Figure shown: Median file word error rate (WER) for the overall aggregate across all audio domains. This highlights the outstanding accuracy of SandLogic's LINGO and establishes it as a leading choice for diverse speech recognition applications.
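As a rough illustration of how a median file WER summarizes a benchmark, the sketch below takes one WER value per audio file and reports the median, which is less sensitive to a single badly transcribed file than the mean. The per-file values are hypothetical and are not taken from the benchmark; the helper names `median_file_wer` and `relative_reduction` are assumptions for illustration. The last line applies the same arithmetic to two of the published medians listed above.

```python
from statistics import mean, median


def median_file_wer(per_file_wer: list[float]) -> float:
    """Median of per-file WER values (one value per audio file)."""
    if not per_file_wer:
        raise ValueError("at least one per-file WER is required")
    return median(per_file_wer)


def relative_reduction(baseline_wer: float, candidate_wer: float) -> float:
    """Fraction by which candidate_wer is lower than baseline_wer."""
    return (baseline_wer - candidate_wer) / baseline_wer


# Hypothetical per-file WERs for one engine; one noisy file scores 0.30.
sample = [0.05, 0.30, 0.08, 0.06, 0.12]
print(median_file_wer(sample))  # the outlier barely moves the median
print(mean(sample))             # the mean is pulled up by the outlier

# Published medians from the list above: Voicegain 11.2%, Lingo 6.3%.
print(relative_reduction(11.2, 6.3))  # roughly 0.44
```

Reporting the median per file rather than a pooled WER gives every file equal weight regardless of its length, which is one reasonable design choice when the files span different audio domains.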

When it comes to selecting speech recognition or ASR software, there’s more to consider than just out-of-the-box accuracy.

Ability to customize Acoustic Model

The ability to customize the acoustic model is essential for achieving high accuracy in domain-specific use cases. We have several write-ups and success stories describing real-world model customization.

Ease of integration

Consider the ease with which the speech-to-text software can be integrated into your existing systems, especially if you require interfacing with telephony or on-premise contact center platforms.

Cost-effective

SandLogic is 80% less expensive than other leading Speech-to-Text/ASR software providers. We offer affordable pricing and flexibility to our customers and partners.

On-premise/Edge deployment support

SandLogic offers highly reliable on-premise and edge deployment support.

Latency

SandLogic offers latency of under 0.3 seconds to ensure a smooth and seamless experience for users.

Security

SandLogic's speech-to-text platform LINGO puts information security at the forefront. Our information security initiative has been designed using industry best practices and standards (such as HIPAA, VAPT, GDPR).