Recent Posts

Trust But Verify. Data Access best practices for large organizations

8 minute read Published: 2021-06-10

We discuss why data access is a surprisingly difficult problem to solve for large organizations. Neglecting this problem poses underappreciated yet existential risks to a company's ML strategy. We propose a few principles for granting data access in a way that both fosters innovation and manages risk effectively.

A short intro to MLOps

6 minute read Published: 2021-06-10

We look at the benefits of MLOps and why you might need it for your company.

Introducing TileDB Rust bindings

3 minute read Published: 2021-05-27

We have built Rust bindings to the TileDB tensor storage engine. As good Rustaceans do, we're open-sourcing the bindings and are quite excited about the possibilities that universal tensor storage opens up.

Introducing the hub!

1 minute read Published: 2021-04-10

Many data science projects start out with a boring task: downloading your data. We don't like being bored, which is why we built the hub.

Introducing Walden

5 minute read Published: 2021-02-15

We have built Walden, a small data lake for (mostly) solitary use, consisting of a set of configurations and images for deployment into a Kubernetes cluster. We are releasing the code as free and open source software, hoping to lower some of the barriers to entry to the world of big data and AI. Check it out on our GitHub, or read below for more info!