WHAT IS BOX
Box is the market leader for Cloud Content Management. Our mission is to power how the world works together. Box is partnering with enterprise organizations to accelerate their digital transformation by creating a single platform for secure content management, collaboration & workflow. We have an amazing opportunity to further establish ourselves as leaders in the space, & we need strong advocates to help us achieve that goal.
By joining Box, you will have the unique opportunity to help capture a majority of this developing market & define what content management looks like for the digital enterprise. Today, Box powers over 97,000 businesses, including 70% of the Fortune 500, who trust Box to manage their content in the cloud.
WHY BOX NEEDS YOU
Box is looking for a dynamic Technical Duty Officer to help lead our Network Operations Center & support an industry-leading platform. It is the responsibility of the NOC team to monitor, troubleshoot, & resolve issues that affect the availability & quality of the Box platform. The NOC team is the front line of defense in making sure our customers like GE, Pandora, Apple & Gap have a seamless experience when accessing their content on Box.
This is an integral job function within the NOC that ensures overall production site health & the performance of core customer-facing journeys. This role will help maintain total site awareness by detecting metric & service deviations, monitoring changes, & proactively identifying potential issues & resolving them before they escalate to customer-impacting levels.
We are building a world-class NOC & need the best talent possible to get us there. That's where you come in!
WHAT YOU'LL DO
- Own live-site Incident Management
- Spring into action during customer-impacting events & lead a team to quickly solve the problem
- Operate across interpersonal boundaries to protect our customers, their data, & the availability of all Box services
- Troubleshoot critical problems through applications, systems, clouds, & networks
- Provide technical leadership & key insights to improve Box's Reliability Engineering capabilities
- Build tools & processes to improve manageability, observability, resiliency & time to restore service for critical incidents
WHO YOU ARE
- You have 5+ years of large-scale production operations or development experience & enjoy talking reliability engineering
- You take initiative when you see a problem; you are a life-long learner who seeks out knowledge
- You are confident & comfortable communicating from the individual-contributor level up through C-level staff
- You have a rock-solid command presence & are calm & collected in stressful situations, such as a major service outage
- You're driven to learn new skills & technologies
- Bachelor's degree in Computer Science, Information Systems, or an equivalent technical field, or similar work experience in a large-scale 24/7 production environment supporting critical, real-time applications
- Solid grasp of Red Hat Linux, Unix, Perl & shell scripting
- Experience working in virtualized environments & cloud implementations
- Solid understanding of the TCP/IP suite, DNS, & routing protocols such as BGP & OSPF
- Kubernetes/Public Cloud experience
- Outstanding interpersonal & communication skills
- Incident management experience in a large-scale, high-uptime environment is a plus
- Flexibility to work in a shift model
We are an equal opportunity employer & value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
For details on how we protect your information when you apply, please see our Personnel Privacy Notice.