GIPHY is the world's largest platform to search, discover, & create all the GIFs. We're the leading brand in short-form entertainment & visual communication.
GIPHY is integrated in iMessage, Facebook, Instagram, Snapchat, Twitter, Tinder, Slack, WhatsApp, & many more, serving over 7 billion GIFs daily to over 500 million people across the globe. We are also creating the largest distributed ad platform for short-form media.
We're a creative & passionate group of GIF-obsessed individuals continuing to build out what we believe is the future of communication. We have big goals & are looking for like-minded, talented people to join us.
GIPHY serves over 7 billion GIFs per day, our mobile products have millions of users, & our API is integrated into some of the biggest digital platforms in the world. We're looking for a Senior Site Reliability Engineer to help run & improve our different platforms so we can provide the best, most relevant content in real time. You will join a team supporting & scaling the production infrastructure for the delivery of these applications.
What You'll Do:
- Build & manage software delivery, systems integration, & developer support tools
- Design & deploy applications using components of the AWS stack, focusing on high availability, fault tolerance, & disaster recovery
- Work with developers to troubleshoot, monitor, analyze, & optimize microservices to maintain SLOs
- Conduct performance tuning, load testing, optimization of information/data processing, maintenance, & support of production & development environments
- Serve as a technical expert on the most difficult support & troubleshooting problems
Who You Are:
- 3+ years of experience in Site Reliability or Infrastructure Engineering
- Expert-level knowledge of the Amazon Web Services (AWS) ecosystem
- Strong background in Linux/Unix Administration
- Experience monitoring & supporting 24x7, high-availability systems that include web, application, & database servers & load-balancing systems
- Hands-on experience with a container orchestration system (especially Kubernetes) with experience running, deploying, & debugging containerized microservice deployments in production
- Experience with a scripting language (Python, Ruby, Bash) is required; background in software development (Go, Python, Scala, or Java) is a plus
- Experience with configuration management (Ansible, SaltStack or an equivalent) & defining infrastructure-as-code (Terraform or CloudFormation)
- Experience with relational databases (MySQL, Postgres) is required; experience managing & optimizing databases & data models is a plus
- Working knowledge of web servers, proxies, & caches (e.g., Nginx, Varnish, HAProxy)
- Experience with build (Jenkins, Travis) & deployment automation (Spinnaker, CircleCI) tools for managing software delivery
- Experience with log aggregation tools (Splunk, Elasticsearch, Scalyr)
- Experience with metrics infrastructure tools (DataDog, New Relic, Prometheus)
This is a full-time salaried position, including stock options, fully covered health insurance, 4% 401(k) match, 4-month maternity leave (additional 2 months of transition), 1-month paternity leave (additional 2 months of transition), free lunch every day, free gym membership, & lots of other fun perks.