Senior Data Engineer
New York, NY
Who we are
DoubleVerify is the leading independent provider of marketing measurement software, data & analytics that authenticates the quality & effectiveness of digital media for the world's largest brands & media platforms. DV provides media transparency & accountability to deliver the highest level of impression quality for maximum advertising performance. Since 2008, DV has helped hundreds of Fortune 500 companies gain the most from their media spend by delivering best-in-class solutions across the digital ecosystem, helping to build a better industry. Learn more at www.doubleverify.com.
As a Senior Data Engineer, you will own new initiatives and design & build world-class platforms to measure & optimize ad performance. You will ensure industry-leading scalability & reliability for mission-critical systems processing billions of real-time transactions a day, applying state-of-the-art technologies, frameworks, & strategies to address complex challenges in Big Data processing & analytics.
What you'll do
- Architect, design & build big data processing platforms that handle tens of TBs per day, serve thousands of clients & support advanced analytics workloads
- Explore the technological landscape for new ways of producing, processing, & analyzing data in order to gain insights into both our users & our product features
- Design, develop, & test data-driven products, features, & APIs that scale
- Continuously improve the quality of deliverables & SDLC processes
- Operate production environments, investigate issues, assess their impact, & propose feasible solutions
- Understand business needs & work with product owners to establish priorities
- Bridge the gap between Business / Product requirements & technical details
- Work in multi-functional agile teams with end-to-end responsibility for product development & delivery
Who you are
- Lead by example - design, develop & deliver quality solutions
- Love what you do & are passionate about crafting clean code
- A solid foundation, with 5+ years of programming experience in object-oriented design and/or functional programming
- Deep understanding of distributed system technologies, standards, & protocols
- 2+ years of experience working with distributed systems such as Hadoop, BigQuery, Spark, & the Kafka ecosystem (Kafka Connect, Kafka Streams), & building data pipelines at scale
- Hands-on experience building low-latency, high-throughput APIs, & comfort consuming external platform APIs
- Excellent SQL query-writing skills & a strong understanding of data
- Care about agile software processes, data-driven development, reliability, & responsible experimentation
- Genuine desire to automate decision making, processes, & workflows
- Experience with workflow orchestration & dependency management tools such as Luigi or Airflow
- Experience in the DevOps domain - working with build servers, Docker, & container clusters (Kubernetes)
- Experience mentoring & growing a diverse team of talented data engineers
- B.S./M.S. in Computer Science or a related field
- Excellent communication skills & a team player
Nice to have
- Vertica or other columnar data stores
- Google BigQuery
- Spark Streaming or other live stream processing technology
- Cloud environments, e.g. Google Cloud Platform
- Container technologies - Docker / Kubernetes
- Ad serving technologies & standards
- Experience with Avro, Parquet, or ORC