As a Python Backend Engineer on our team, you will accelerate feature engineering to get the most out of multiple data sources & data types. The people you will work with own the entire feature learning experience for DataRobot, & are responsible for making sure our feature discovery automation & resulting models are the best in the world.
You will build out the Data Management & ETL systems inside DataRobot, utilizing distributed frameworks such as Spark, Hadoop & Kubernetes. You will build scalable solutions to process high data volumes, with the main focus on crafting robust components for data intake, cleanup & a full range of data transformations. You will also be responsible for designing & implementing features from start to finish, including clean, easy-to-use APIs, automated tests & deployment infrastructure.
The ideal candidate can bring new ideas from concept to implementation, write high-quality, testable code, & participate in design & development discussions.
5+ years of experience in Python
3+ years of experience in architecting & developing distributed systems
Experience with data processing using at least one of the following tools:
Experience with Hadoop ecosystem
In the interview process, you will be evaluated on your performance in a number of coding & design scenarios - be prepared to think!
Ability to communicate about technical topics
Willingness to learn about new technologies
Experience in some or all of the following:
System/performance evaluation, such as profiling process memory/CPU/IO/network usage & using language-specific debugging tools
Document-oriented databases, ideally MongoDB
Distributed search engines such as ElasticSearch or Solr
Messaging services such as RabbitMQ
Hadoop services such as Yarn, Spark & HDFS