SRE Infrastructure Operations Engineer

POST DATE 9/2/2016
END DATE 10/25/2016

Everbridge Pasadena, CA

Full Time
Entry Level (0 - 2 years)


About the Team: As the Everbridge Infrastructure Operations team, we are responsible for ensuring overall service quality and availability of Everbridge's global SaaS offerings. The network, compute, and application platforms that we support enable our international unified critical communications services to get the right messages to the right people at the right time. We are a 24x7x365 distributed team that can do our job anytime, anywhere on the planet with an Internet connection. Our holistic understanding of OSI layers 0 through 8 allows us to effectively maintain a heterogeneous blend of worldwide public and private cloud infrastructure platforms where lives and livelihoods are at stake in the event of failures. We are dedicated, passionate people who are committed to internal/external customer service and doing the right thing.

About the Job: This position offers the right candidate an opportunity in a hands-on contributor role to help architect, build, maintain, and scale the next generation of public and private cloud infrastructure required for our growing worldwide SaaS platforms. We are looking for proactive, detail-oriented engineers with diverse technical skills ranging from basic network administration techniques to public cloud infrastructure architecture and deployment. Specifically, candidates should have a T-shaped skill set focusing on modern Linux system administration leveraging automation and industry-standard tools.

Job Duties:

* Collaborate with Architects, Developers, DBA, Application, Security, and NOC teams on designing and maintaining scalable and highly available SaaS infrastructure platforms

* Engage in all aspects of building and maintaining the production infrastructure and service

* Ensure proper security, monitoring, alerting, and reporting for SaaS infrastructure

* Troubleshoot and resolve production issues

* Help drive the capacity planning process

* Help develop and maintain processes, tools, and documentation in support of production

* Participate in the evaluation of new software, hardware, and infrastructure solutions

* Work non-traditional hours when necessary, including a rotating on-call pager duty schedule


Basic Qualifications:

* Previous experience operating in a NOC or Technical Operations/Site Reliability environment

* Strong scripting skills for task automation and tool framework extension (Python, Perl, bash)

* Experience with deployment/configuration automation and software-defined infrastructure management techniques (PXE/TFTP-based kickstart/preseed, SaltStack preferred, other automation tool experience also acceptable)

* Solid background in UNIX/Linux operating system and security maintenance (especially Ubuntu and Debian GNU/Linux)

* Experience deploying highly scalable and fault-tolerant services within public and private cloud infrastructure (AWS, VMware)

* Experience with infrastructure monitoring and trending software (Nagios, Cacti, Graphite/Grafana, Logstash/commercial ELK stack derivatives)

* Sharp troubleshooting faculties, deductive reasoning, and careful attention to detail

* Independent and self-directed work ethic when participating in a collaborative environment

* Dedicated commitment to service availability and quality customer experience

* Ability to communicate clearly in writing and verbally

* Must have proof of US Citizenship for this role

Desired Qualifications:

* Familiarity with ITIL/ITSM processes

* Familiarity with Agile/Kanban/Lean methodologies applied to IT/Operations workflow

* Basic familiarity with load balancing and maintenance of high availability network services on local and global scales (F5 BIG-IP LTM/GTM)

* Basic familiarity with email transport software and deliverability management concepts (Postfix/Sendmail and derivative commercial MTAs, SPF, DomainKeys/DKIM, IP reputation)

* Basic familiarity with VoIP and traditional TDM telephony infrastructure (FreeSWITCH w/ SIP, T1/DS3/OC3 PRI)

* Basic familiarity with Cisco IOS/NX-OS, Juniper JUNOS, and related hardware device families (Cisco Catalyst/Nexus/ISR/ASR, Juniper routing/switching platforms, Brocade Vyatta SDN)

* Basic familiarity with IPv4/6 routing and dynamic routing protocols (OSPF, BGP)

* Basic familiarity with Ethernet switching and related protocols (802.1Q, 802.1D/w/s & 802.1Q-2005, 802.3ad)

* Basic familiarity with secure firewall and dynamic site-to-site IPsec VPN deployments

* Basic familiarity with rackmount and blade server hardware and software maintenance, including all appropriate best practices (Dell rackmount, HP C7000 blade chassis)

* Basic familiarity with datacenter facilities management (electrical power, HVAC, cable distribution plant)

* Basic awareness of, and engagement with, system administration community culture and current events

About the company:

Everbridge is the leading critical communications platform trusted by corporations and communities of all sizes to connect the right people for real-time collaboration and response. Connecting more than 100 million people and internet-enabled devices, the company assures that secure, compliant communications are delivered and confirmed, whether locally or globally. Everbridge was recently named one of the Boston Business Journal's Best Places to Work for 2015!

Everbridge is an Equal Opportunity/Affirmative Action employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex including sexual orientation and gender identity, national origin, disability, protected Veteran Status, or any other characteristic protected by applicable federal, state, or local law.