At Moat we add ~500 million rows each day to just a few tables in our data warehouse. Until recently we used Postgres to store the most recent month of data (~15TB of disk), & we love its ability to handle concurrency & its cost compared to the commercial column-stores that we use for our historical data.
By using a combination of composite-types, arrays & Postgres' TOAST mechanism we were able to build a custom data warehouse that combines the best aspects of Postgres & a commercial column-store database, tailored specifically for our needs without modifying any of the Postgres source.
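The core trick, packing many logical rows into one physical row whose columns are arrays so that TOAST compresses each column's values together, can be sketched in plain Python. The schema and names below are invented for illustration only; the actual implementation uses Postgres composite types, arrays & set-returning functions, not Python:

```python
# Toy sketch of the row-batching idea behind the composite-type/array
# approach: instead of one table row per event, many source rows are
# accumulated into a batch and stored column-by-column as arrays.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class ColumnBatch:
    """One physical 'row' in the store: each field holds an entire
    column of values for a batch of logical rows (hypothetical schema)."""
    timestamps: List[int]
    campaign_ids: List[int]
    impressions: List[int]

def to_batch(rows: List[Tuple[int, int, int]]) -> ColumnBatch:
    """Pivot row-oriented records into a single columnar batch."""
    return ColumnBatch(
        timestamps=[r[0] for r in rows],
        campaign_ids=[r[1] for r in rows],
        impressions=[r[2] for r in rows],
    )

def from_batch(batch: ColumnBatch) -> List[Tuple[int, int, int]]:
    """Unnest a batch back into logical rows, analogous to reading
    the batch out through a set-returning function in Postgres."""
    return list(zip(batch.timestamps, batch.campaign_ids, batch.impressions))

rows = [(1000, 7, 3), (1001, 7, 5), (1002, 9, 1)]
batch = to_batch(rows)
assert from_batch(batch) == rows
```

In Postgres terms, `ColumnBatch` plays the role of a composite type whose fields are array columns; because each array holds many similar values, TOAST's compression works far better than it would on individual rows.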
Our new store uses ~1TB of disk/month, has sped up our ETL considerably (we're now compute- rather than I/O-bound) & will allow us to migrate our full historical data warehouse to a single Postgres instance.
This talk will cover:
- An overview of our data problems & why we chose to go this route
- A deep dive into the different Postgres mechanisms we use (TOAST, arrays, composite types, SRFs) to make our solution work, how we put them together, & how others might apply similar techniques to their own use cases
- Strengths & weaknesses of the approach, & lessons learned
- Upcoming Postgres features that are going to make this approach even more attractive
- Future work & things that aren't upcoming (yet!?) that would really help us out
Speaker: David Kohn is a Data Engineer focusing mostly on Postgres at Moat. How he came to do data engineering at an advertising analytics company from battery-engineering & electrochemistry startups is a story too long for a short bio, but you can ask him if you'd like.