At Moat we add ~500 million rows each day to just a few tables in our data warehouse. Until recently we used Postgres to store the most recent month of data (~15TB of disk), & we love its ability to handle concurrency & its cost compared to the commercial column-stores that we use for our historical data.
By using a combination of composite-types, arrays & Postgres' TOAST mechanism we were able to build a custom data warehouse that combines the best aspects of Postgres & a commercial column-store database, tailored specifically for our needs without modifying any of the Postgres source.
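The core trick, packing many logical rows into one physical row whose columns are arrays so that TOAST compresses each column's values together, can be sketched in plain Python. The schema and names below are invented for illustration only; the actual implementation uses Postgres composite types, arrays & set-returning functions, not Python:

```python
# Toy sketch of the row-batching idea behind the composite-type/array
# approach: instead of one table row per event, many source rows are
# accumulated into a batch and stored column-by-column as arrays.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class ColumnBatch:
    """One physical 'row' in the store: each field holds an entire
    column of values for a batch of logical rows (hypothetical schema)."""
    timestamps: List[int]
    campaign_ids: List[int]
    impressions: List[int]

def to_batch(rows: List[Tuple[int, int, int]]) -> ColumnBatch:
    """Pivot row-oriented records into a single columnar batch."""
    return ColumnBatch(
        timestamps=[r[0] for r in rows],
        campaign_ids=[r[1] for r in rows],
        impressions=[r[2] for r in rows],
    )

def from_batch(batch: ColumnBatch) -> List[Tuple[int, int, int]]:
    """Unnest a batch back into logical rows, analogous to reading
    the batch out through a set-returning function in Postgres."""
    return list(zip(batch.timestamps, batch.campaign_ids, batch.impressions))

rows = [(1000, 7, 3), (1001, 7, 5), (1002, 9, 1)]
batch = to_batch(rows)
assert from_batch(batch) == rows
```

In Postgres terms, `ColumnBatch` plays the role of a composite type whose fields are array columns; because each array holds many similar values, TOAST's compression works far better than it would on individual rows.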
Our new store uses ~1TB of disk/month, has sped up our ETL considerably (we're now compute- rather than I/O-bound) & will allow us to migrate our full historical data warehouse to a single Postgres instance.
This talk will cover:
- An overview of our data problems & why we chose to go this route
- A deep dive into the different Postgres mechanisms we use (TOAST, arrays, composite types, SRFs) to make our solution work, how we put them together, & how others might apply similar techniques to their own use cases
- Strengths & weaknesses of the approach, & lessons learned
- Upcoming Postgres features that are going to make this approach even more attractive
- Future work & things that aren't upcoming (yet!?) that would really help us out
Speaker: David Kohn is a Data Engineer focusing mostly on Postgres at Moat. How he came to do data engineering at an advertising analytics company from battery-engineering & electrochemistry startups is a story too long for a short bio, but you can ask him if you'd like.