Birchbox was founded in 2010 to redefine the way consumers discover & shop for beauty & grooming. The company quickly grew from an exciting idea into a business that has materially shaped the beauty industry: we've activated an enormous group of underserved, untapped consumers, awakening their relationship with beauty & grooming by making the experience relevant, easy & fun. Our innovation isn't the simple concept of delivering a box of samples; it's understanding that although not everyone is passionate about beauty & grooming, everyone still deserves a great experience finding, trying & buying it.
Birchbox welcomes people from all backgrounds, ethnicities, cultures, & experiences. We celebrate how different perspectives benefit our employees, our products, & our community. We are focused on cultivating a diverse & inclusive work environment that encourages collaboration, creativity, & innovation. We are proud to be an equal opportunity employer. See here to find out how Birchbox is making a difference.
Birchbox is seeking an ambitious & experienced Data Engineer to help evolve our data systems. In addition to fueling our BI platform, recommender systems, & email marketing, our data services inform decision-making company-wide. This role will be highly impactful as we build out the next generation of our data services, & will have considerable leeway in driving future strategy & architecture.
You'll be joining a lean Agile team supporting Data Infrastructure & Machine Learning. Primary upcoming initiatives include retooling our event data collection systems, supporting the next generation of our A/B testing machinery, & iterating on our data pipelines. Each of these projects will primarily involve data extraction, transformation, & loading, as well as DevOps work & vendor management. You will train junior developers & work with the technical leadership of the company to drive best practices for data.
Our Stack: Fivetran, dbt, Redshift, & Metabase; Airflow (astronomer.io); Databricks; Python & a little bit of Scala.
**This role is located in New York, NY & will be remote until 2021 due to COVID.**
- Build & maintain fault-tolerant, scalable batch data pipelines
- Architect & maintain soft real-time event analytics pipelines
- Design & implement cloud-based data & machine learning pipelines (Databricks, S3 data lake)
- Implement fault-tolerant data integrations between internal systems & with third-party APIs, supporting product & marketing needs
- Create & integrate internal A/B testing tooling for engineering & product teams
- Work with the Business Intelligence team, advising on data provenance & reliability, & integrating data sources for analysts
- Manage vendor interactions related to our data stack
- Design & build components of internal tooling used by our subscription operations team (Python)
- Tool our data systems for observability, including logging, metrics monitoring, & dashboarding
- Train & upskill future data engineers
- Contribute to an open, empowering, responsible, & proactive engineering culture
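To give candidates a flavor of the first responsibility above, here is a purely illustrative sketch (not actual Birchbox code; all names are hypothetical) of the kind of fault-tolerant batch ETL step this role involves, written in Python, the stack's primary language:

```python
import time


def run_with_retries(step, max_attempts=3, backoff_s=0.0):
    """Run one pipeline stage, retrying transient failures with linear backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except Exception:
            if attempt == max_attempts:
                raise  # retries exhausted; surface the failure to the scheduler
            time.sleep(backoff_s * attempt)


def etl(extract, transform, load):
    """Minimal batch pipeline: each stage is retried independently."""
    raw = run_with_retries(extract)
    clean = run_with_retries(lambda: transform(raw))
    return run_with_retries(lambda: load(clean))
```

In practice, orchestration (scheduling, retries, alerting) would be handled by Airflow rather than hand-rolled helpers like this, but the idea of isolating and retrying each extract/transform/load stage is the same.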
- 3+ years of professional software experience (or equivalent).
- Degree in Computer Science or related field (or equivalent experience).
- Expert in Python & comfortable with at least two other languages (e.g., Java, Ruby, PHP).
- Strong command line skills for working within virtualized machines (bash, tmux / screen, vim / emacs).
- Experience orchestrating data infrastructure (Spark, S3, Kafka/Kinesis, Redshift).
- Experience with modern workflow management systems (e.g., Airflow, Luigi).
- Advanced SQL skills (MySQL and/or Postgres), familiarity with data warehousing.
- Familiarity with non-relational data stores and/or indexes (e.g., MongoDB, DynamoDB, Elasticsearch).
- Experience working on teams using distributed version control (e.g., Git, Mercurial).
***To support an inclusive & equitable hiring process, we are eliminating resume submissions. To complete your application, answer the questions below.***