Lead Engineer for Site Reliability and Automation
This job is no longer active.
POST DATE 9/2/2016
END DATE 3/3/2017
The University of Chicago
Campus - Hyde Park, IL
JOB DESCRIPTION
We're looking for a problem solver with a highly technical background to work closely with our development and system infrastructure teams to build out and refine the automation methods for our large-scale, data-intensive systems. You will join the team as the primary engineer leading this work, and soon have an opportunity to build out your group as we continue to grow. Elevate your career with this opportunity to work with a number of automation tools across the full stack and use the latest technologies. You will join a team of innovative engineers and intelligent research scientists who will keep you challenged in our demanding environment.
This role focuses on the Genomic Data Commons, which by its nature lies at the intersection of cutting-edge research and production systems, both in terms of the bioinformatics and the computer science principles being utilized. The Genomic Data Commons is one of the world's largest collections of harmonized cancer genomics data. Developing a deep technical and quantitative understanding of the system, software, and security architecture will be critical to success in this role.
You will focus on system availability, performance, and capacity monitoring, along with installation, configuration, and operations procedures. You will be given broadly defined goals and expected to work collaboratively across functional teams to determine best methods for achieving objectives. You will be expected to use quantitative models for understanding and improving the overall performance of the system. You will identify, establish, and manage proof of concept environments and report on design outcomes to inform rapid technology advancement. Key responsibilities include:
Automation Frameworks - Build out and maintain automation frameworks across systems, software, data management, and security aspects of a complex platform across on-premise and public cloud environments with a mix of best practices and custom solutions
Production Support - Triage, research, communicate, and address production incidents
Production Monitoring - Wrangle disparate system monitoring assets and develop common analytics to inform optimization, define benchmarks and confidence intervals, and forecast to proactively mitigate production incidents
Build Monitoring - Troubleshoot source code management and deployment issues and participate in continuous delivery objectives
Security Automation - Assist with the automation of our security and compliance procedures.
Technical Writing - Contribute written knowledge and expertise to system documentation, security documentation, scientific manuscripts, reporting, grant proposals and reports, and presentation materials.
Maintain broad technical knowledge of existing and emerging technologies, including public cloud offerings from Amazon Web Services, Microsoft Azure, and Google Cloud.
Other duties as assigned.
This at-will position is wholly or partially funded by contractual grant funding which is renewed under provisions set by the grantor of the contract. Employment will be contingent upon the continued receipt of these grant funds and satisfactory job performance.
All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, protected veteran status, or status as an individual with a disability.
The University of Chicago is an Affirmative Action / Equal Opportunity / Disabled / Veterans Employer.
Job seekers in need of a reasonable accommodation to complete the application process may contact Human Resources by calling 773-834-1841 or by emailing email@example.com with their request.