Recent Posts

Best practices in data quality management

8 minute read Published: 2022-09-01

Good data quality management is expensive. Poor data quality management is “expensiver”.

Deploying Walden on AKS

3 minute read Published: 2022-06-14

Walden is our reference implementation of a data warehouse. After adding instructions for its deployment on Amazon's EKS last month, we are now also supporting it on Microsoft's Azure Kubernetes Service (AKS).

Deploying Walden on EKS

3 minute read Published: 2022-05-17

Walden is our reference implementation of a data warehouse. We are now supporting it on Amazon's Elastic Kubernetes Service. Follow deployment instructions here, or read more information about our experience deploying a data warehouse on AWS below.

Adding Alluxio to Walden

4 minute read Published: 2022-04-18

We have added Alluxio to Walden, our reference implementation of a small data lake. Alluxio provides a unified view into one or more underlying storage sources, adding caching and translation on top of them. This can greatly improve overall Trino performance across queries, while also enabling support for external storage types like NFS that are not supported natively by Trino.

How to get the most out of your data science initiatives?

9 minute read Published: 2022-03-31

“Every business is a software business,” proclaimed Watts S. Humphrey, the “Father of Software Quality”, more than 20 years ago. A cursory look at organizations today, whether big or small, is enough to confirm his prediction. In the 2020s we could even go one step further and say that “Every business is a data business.”