Papercup was founded in 2017 with the mission of making all of the world's videos available in any language. The technology behind the mission is state-of-the-art machine learning capable of translating a person's voice into another language, capturing expressiveness and intonation so that it is indistinguishable from a human voice. Here we talk to Jesse about how the platform is helping companies – spanning media, brands and enterprises – scale globally.

Tell us about Papercup

There are many billions of hours of video content out there, whether on streaming sites, social media platforms like YouTube, Facebook or Snapchat, media platforms like TED, Vox or Four Nine, educational platforms like Udemy or Coursera, or within companies as corporate training content. The world is truly global, which means content owners, no matter what their industry, are scrambling to go global, yet there is still no simple, fast and cost-effective way to translate content beyond subtitling. The problem was extremely clear; the question was whether we could build the sophisticated technology needed to chip away at it. That is what we set out to do and it's what we've done.

Our machine learning speech technology translates people's voices into other languages, and the output audio retains characteristics of the original speaker so that it sounds like a real voice. The technology solves many of the problems associated with traditional dubbing, namely speed and cost. By automating dubbing using AI, we're able to charge per minute of dubbed content and turn around translation in a matter of days, both essential ingredients for companies looking to scale internationally. Expert translators still check the output to ensure we're maintaining the quality our clients need, but working with an accurate version from the beginning makes the process incredibly quick.

Who should use the Papercup platform?

We're working with media companies, enterprises and content creators at the moment, but the usability of the platform extends to anyone who produces video content. The media companies and corporates we work with are already reaping the rewards of our technology. The former are reaching audiences they previously couldn't access, like Business Insider with Spanish-speaking audiences. The latter, corporates, can suddenly communicate more effectively with their non-native English speakers. We charge on the basis of throughput, e.g. per minute of output video, or by a revenue share model. With either pricing option, we're making voice-overs and dubbing accessible to the 99% of content owners who historically couldn't even afford to localize their content.

Key milestones on the Papercup journey

There are many. The evolution of the technology is one, and the successes of various projects along the way punctuate the history of the company. The expressivity of the voices keeps improving, and the tool that overlays the voices becomes ever more intuitive. It is testament to the incredible work the machine learning and product teams do that we’ve had four papers accepted at prestigious machine learning and speech conferences despite being a young company.

On the commercial side, translating Sky News into Spanish every day has been an incredible achievement. It’s an iconic news source that had never penetrated the Spanish market. We made that a reality and, in just over 18 months since launch, reached over 32 million people!

What is the biggest challenge you face as a company?

The reality is that we’re creating a new category, a new product that is unfamiliar to people. People are mainly accustomed to traditional translation services, so the concept of using AI to translate video content at scale requires education. People ask: How do you generate highly natural voices? Can you ensure you hit a certain quality level? Where should I distribute the content? We have answers to all these questions of course, but the education process takes time. We learned early that people like to see working examples of what they can expect from Papercup, so we create samples as a matter of routine, and their persuasive power, when decision makers hear the quality for themselves, is very strong.

Plans for the future

From the start, we’ve been on a mission to make all videos across the world watchable in every language. We're doing this in two ways. First, through fundamental research in machine learning, which improves the expressivity and naturalness of our synthetic voices while we roll out new languages. Second, by educating a whole new market on dubbing and localization. Many had not considered this before because it was too expensive, but now we’re tackling more regions, content types and languages to make video content accessible to global audiences.

Eventually we want to be able to extend our state-of-the-art technology to any form of human dialogue, allowing any two people to engage in a conversation regardless of what language they happen to speak. In other words: your voice, in another language.

Anything exciting coming up you can tell us about?

I think one of the more exciting shifts we’re seeing is the maturation of the FAST market: free ad-supported streaming TV. It’s not premium streaming like Netflix or Hulu, but is instead another home for catalogues of content that don’t fit the bill for Netflix and would be under-utilized on something like YouTube. We’re already working with content partners distributing on platforms such as Pluto, and I'm super excited to see where we can take this.

Want to have a chat with Jesse to find out how Papercup can help you localize your video content at scale? Scroll down this page and book a demo!