Is Your Data Infrastructure Stifling Innovation? (Kitterman)

Craig Kitterman - Oct 20 '22 - Dev Community

There are myriad reasons why an estimated 90% of startups fail. You need a great idea (and not just one great idea), you need inspiration, funding, smart people—and a fair amount of luck. Miss any one of these factors, and failure might be a foregone conclusion.

For young companies or small teams that build applications, data can be another stumbling block. The databases they rely on have historically stymied innovation by being complex and costly to spin up, manage, and maintain. Proofs of concept—ideas with the potential to turn into something big—can die before even being tested due to a lack of funding or database capacity.

But recent advances in database technology have begun making a real difference to a startup’s potential success. The complexities of setting up and maintaining a scale-out, distributed NoSQL database like Apache Cassandra™ can be shouldered by database-as-a-service (DBaaS) providers. And the availability of pay-as-you-go, serverless database offerings has opened new doors to experimenting with and testing new product or service ideas that might have been off the table before.

Let’s look at how engineering leads at two startups, for whom data is a critical piece of their success, are harnessing these advances to focus on what matters most: building, testing, and getting to market fast with new features and products that address their customers’ ever-changing needs.

Ankeri: Sailing away from the relational world

Ankeri, founded in Iceland in 2016, provides data services to companies that manage container ship fleets. It unifies commercial and technical ship data from thousands of vessels and disparate technology platforms, letting ship owners and charterers manage and share data from their ships and collaborate on better decisions (think better fuel cost management and vessel selection).

It’s a cutting-edge platform built on data—lots of it. Ankeri runs roughly 4 million database reads and 2.4 million writes per hour, and the company is growing fast. The company decided early on that relational databases would not provide the performance or scalability needed to serve an expanding set of data sources.

“We needed something that could work for 100 ships as well as 10,000,” states Ankeri Vice President of Engineering Nanna Einarsdóttir. The choice of relying on a NoSQL database was an obvious one, she said. After some research, Cassandra proved to be the best candidate. But the move wasn’t without challenges.

Although made up of seasoned developers, her team had little experience with NoSQL databases; Einarsdóttir acknowledges that designing a product that runs on a non-relational database required getting over a few hurdles, at least initially.

For one, most NoSQL databases require building tables based upon access patterns (rather than doing so based upon the nature and structure of the data, as is the case for relational databases), and this means more up-front time investment, Einarsdóttir says. It also translates into more time investment for product design changes early in the product life cycle. There is, however, a silver lining in this extra required effort for the Ankeri team.

“Spending that extra time at the design table is, at its core, beneficial to the product, and it decreases the number of U-turns you might otherwise take during product development,” she mentions.
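
To make the idea of building tables around access patterns concrete, here is a minimal sketch in plain Python. It is not Ankeri's schema and it does not use a real database driver; the class name, the fields (`ship_id`, `fuel_tonnes`), and the query it serves ("all readings for one ship on one day, in time order") are assumptions chosen purely for illustration. The point is that the storage layout is picked to fit the query, which is why the query has to be known at design time.

```python
from bisect import insort
from collections import defaultdict
from datetime import date, datetime


class ReadingsByShipAndDay:
    """Toy stand-in for a query-first table (hypothetical, in-memory only).

    Rows are grouped into one partition per (ship_id, day) and kept sorted
    by timestamp inside each partition, so the one query the table was
    designed for is a single lookup with no filtering or sorting afterwards.
    """

    def __init__(self) -> None:
        self._partitions: dict = defaultdict(list)

    def insert(self, ship_id: str, ts: datetime, fuel_tonnes: float) -> None:
        # The partition key is derived from the access pattern, not from
        # the "natural" shape of the data.
        key = (ship_id, ts.date())
        insort(self._partitions[key], (ts, fuel_tonnes))

    def readings_for(self, ship_id: str, day: date) -> list:
        # One partition read answers the query; other ships and other days
        # are never touched.
        return list(self._partitions.get((ship_id, day), []))


if __name__ == "__main__":
    table = ReadingsByShipAndDay()
    table.insert("ship-a", datetime(2022, 10, 20, 12, 0), 3.5)
    table.insert("ship-a", datetime(2022, 10, 20, 6, 0), 2.0)
    table.insert("ship-b", datetime(2022, 10, 20, 9, 0), 9.9)
    for ts, fuel in table.readings_for("ship-a", date(2022, 10, 20)):
        print(ts.isoformat(), fuel)
```

A different question, such as "all ships that reported on a given day", would not be served well by this layout and would typically get its own table. That is the extra up-front design effort described above.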

With the initial database set-up phase behind them, the Ankeri team needed to focus on building and improving its platform. Particularly for small teams, spending manpower and resources on database maintenance and monitoring can be difficult to justify. That’s a big reason that Einarsdóttir says Ankeri decided to go with a database-as-a-service; the company landed on Astra DB from DataStax, a serverless, multi-cloud DBaaS built on Cassandra.

“A DBaaS means that minimal time is spent on setup before we can start working on the product itself,” she states. “We are a start-up company, and as such need to be focused on features rather than infrastructure. The path from an idea to customer feedback must be short, and we need to be agile and forward thinking.”

Circle Media: Focusing on family, not data

For Circle Media Labs, a provider of apps and devices to help parents manage their family’s time online, choosing a DBaaS was a relatively easy decision. Circle Principal Engineer Nathan Bak was intimately involved with the development of Astra DB during his tenure as a senior software engineer at DataStax (Circle is also a DataStax customer).

Even with his awareness of the workings of NoSQL databases, Bak opted to use a service provider, for a reason similar to Einarsdóttir’s thinking:

“Especially at small startups, do you really want to spend money to find and hire a person that’s going to be running a handful of databases? And what if that person goes on vacation, what are you going to do?” Bak asks.

But there’s another development offered by a select group of database services that is making it easier to focus on building products instead of data management: the availability of “serverless” data. Modern databases can be challenging to scale up and down, which often leads to costly overprovisioning.

But by separating the compute and storage functions, scaling up or down becomes simpler and faster. A serverless architecture matches data usage to workload peaks and valleys, no matter how spiky. This eliminates the costly and labor-intensive task of estimating peak loads and enables developers to pay only for what they use, no matter how many database clusters they create and deploy.
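
The arithmetic behind overprovisioning can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the hourly load profile is invented, the unit price is arbitrary, and the two pricing functions are deliberately simplified (real providers price provisioned and usage-based capacity differently, and this is not Astra DB's pricing model). The sketch only shows why a spiky workload is expensive when capacity must be sized for the peak.

```python
def provisioned_cost(hourly_load: list, price_per_unit_hour: int) -> int:
    """Cost when capacity is fixed at the peak load for every hour."""
    return max(hourly_load) * len(hourly_load) * price_per_unit_hour


def usage_based_cost(hourly_load: list, price_per_unit_hour: int) -> int:
    """Cost when each hour is billed for the load actually served."""
    return sum(hourly_load) * price_per_unit_hour


if __name__ == "__main__":
    # Hypothetical day: 23 quiet hours at 10 units, one spike at 100 units.
    spiky_day = [10] * 23 + [100]
    price = 1  # arbitrary cost per unit of capacity per hour

    print("sized for peak:", provisioned_cost(spiky_day, price))
    print("pay per use:   ", usage_based_cost(spiky_day, price))
```

With a flat workload the two figures converge; the gap grows as the load gets spikier.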

For Circle, the fact that DataStax’s Astra DB is serverless simplifies their ability to test out new product and service ideas. If, say, someone came up with a new concept, the question might be posed: “But is it good enough to spin up a new database?” This isn’t an issue with serverless databases, however. Every developer can have a database for their own proof of concept; the contention that existed previously over who gets access to a testing database just goes away.

“I probably have half a dozen serverless databases with POCs running on them that might not go anywhere, but I can keep them running because it’s costing just pennies, and the data isn’t lost,” Bak says. In Circle’s case, one of those projects eventually became a new and valuable service: emails sent to users detailing their families’ weekly online device usage.

“This project went from me working on it on and off—with maybe a megabyte or two of data. But then it pretty quickly multiplied 1,000-fold—and then 10,000-fold,” Bak says. “There were plenty of things to worry about as that project grew. The database wasn’t one of them.”

The ability to spin up new testing databases for POCs also prevents latency side effects that can affect the customer experience, Bak adds.

“Even if you do have your database right-sized, with a POC suddenly you’re doing something different. If you’re hitting your database with a bunch of new requests, that can affect the functionality of your core product,” he says. “You don’t want your customer to have a bad experience.”

Data is a key part of many enterprises’ success. Particularly for those that are early in their journey, it’s the last thing that should get in the way.

Einarsdóttir maintains, “It makes all the difference to be able to create new databases in a matter of minutes to test out a new idea, without speculations about provisioning and commitment. For innovation, time is of the essence.”

Learn more about DataStax Astra DB, the serverless, multi-cloud DBaaS built on Cassandra.
