Data transformation startup Tobiko may not be a household name yet, but you may have seen co-founder and CEO Tyson Mao on “Beauty and the Geek” back in the aughts and his co-founder, brother and CTO Toby Mao, on the speedcubing circuit. (Both have held world records in the past, and Tyson co-founded the World Cube Association.) Since then, the brothers, together with their co-founder Iaroslav Zeigerman, have worked at a wide variety of companies, ranging from Apple to Airbnb, Google and Netflix, where Tyson and Zeigerman first met.
Now, with Tobiko, they aim to reimagine how teams work with data by offering a dbt-compatible data transformation platform, with the popular SQLMesh and SQLGlot open-source projects at its core and an intuitive low-code user interface to build data pipelines and transformations.
The company on Tuesday is launching its cloud platform and announcing a total of $21.8 million in funding, split between a $4.5 million seed round and a $17.3 million Series A round led by Theory Ventures. 20Sales, Fivetran CEO George Fraser, Census CEO Boris Jabes, and MotherDuck CEO Jordan Tigani also invested in the company.
While at Airbnb, Toby led Minerva, the company’s internal metrics semantic layer. While working on that, though, he says he realized that the real power of Minerva wasn’t the semantics but its data transformation capabilities.
“The steps from getting from raw data to actual business value — there’s a lot of stuff going on there,” he told me. “It’s a lot of hard work. And so we wanted to eventually build a semantics company, but first we want to solve transformation. And so at Airbnb, I got a demo of the industry standard tools, dbt, and that gave me the inspiration to start this.”
Toby acknowledged the popularity and functionality of dbt, which has become somewhat of an industry standard. But he argued that it’s not the right solution for every company. “DBT was really designed to accelerate Series A companies’ data stacks,” he said. “We wanted to make a data platform, a data transformation tool, that could work at any company, even FAANG-style. So we took our experience, our collective knowledge, and built a system that would scale with both large amounts of data and large amounts of people.”
As Zeigerman explained, at the core of this modern platform is SQLMesh, an open-source tool that allows developers to build data pipelines with built-in tools for data transformation, testing and collaboration. This is also where the team’s background in semantics comes in. “SQLMesh understands SQL, as opposed to treating it as a piece of text,” he explained. And that understanding comes from SQLGlot, which Toby created during his time at Airbnb. “This ability to understand SQL unlocks a bunch of things that significantly boost the speed of developing and engineering productivity.”
This tool enables Tobiko to do syntax checking on SQL queries, for example, before they are sent to the database. It also categorizes and tracks all of the changes that engineers make in the development process and tells them whether they break anything in relation to other datasets and transformations in the system.
“We truly believe that this is going to be one of the first observability tools that not only understands that something broke, but why it broke, because we understand your code, we understand every version of every code you’ve ever written, and we can tie every failure to that change,” Tyson said.
Tobiko also offers businesses the ability to create what the team calls “virtual data environments” that developers can use during the development phase and then reuse for other projects (or even in production).
The team tells me that it is mostly targeting data engineering teams right now and that it is working with customers of all sizes, including some unicorn startups. A lot of them are bringing entirely new applications to the service, but since it is compatible with dbt, there are also a number of dbt users who have made the switch.