Senior Software Engineer
Apply your software engineering skills to solving the hardest problems in big-data genomics and have a broad impact on science and clinical practice, including the treatment of cancer and other diseases. Join a lively team of software engineers and computational biologists dedicated to creating the GATK (http://www.broadinstitute.org/gatk/), a widely used and successful software toolkit for applying next-generation DNA sequencing to medical genetics. The GATK team is an integral part of the Data Sciences and Data Engineering group at the Broad Institute, a nonprofit research institution that is transforming medicine and human health by building software to organize, process, and visualize scientific data on an unprecedented scale. Our tools are used by tens of thousands of researchers across the globe, processing petabytes of data, consuming hundreds of thousands of core-hours weekly, and informing life-saving clinical decisions in medical and cancer genetics.
You will design and implement core capabilities in the GATK framework that allow our tools and algorithms to run effectively at massive scale in distributed, cloud-based environments, and help solve the key engineering challenges of the next generation of genomic analysis tools. The amount of data we process is growing exponentially, and innovative solutions and talented engineers are required to ensure that our tools can handle the ever-increasing workload. You will work in close collaboration with computational biologists and other software engineers in a fast-paced research environment, acting as an evangelist for software engineering best practices and design principles, and building a framework that enables the rapid development of usable, scalable, and reliable scientific tools. You will also collaborate with a large and varied team of industry partners working to allow the GATK to take advantage of the latest hardware and software innovations. Your work will directly enable cutting-edge research in genomics and medicine, and will be used by scientists all over the world.
* Implement new framework-level subsystems and capabilities in response to the needs of tool authors and computational biologists.
* Develop strategies for running tools efficiently at scale and performing distributed computation over multi-terabyte datasets in the cloud.
* Work with a large team of industry collaborators to enable our tools to leverage the latest hardware and software technologies.
* Profile existing tools and workflows to identify performance bottlenecks, and devise solutions.
* Promote effective testing strategies and quality control measures.
* Educate and mentor computational biologists and junior software engineers in software engineering best practices.
* B.S. degree in Computer Science or a related field, with 5-10 years of professional work experience, is required.
* Expert-level proficiency in Java or another high-level language.
* Experience with Apache Spark, distributed computing, and cloud-based environments such as Google Cloud Platform and Amazon Web Services is a plus.
* Thorough knowledge of software testing methodologies and best practices.
* Experience working in Unix/Linux environments.
* Ability to solve complex problems individually and as part of a team.
* Excellent oral and written English communication skills.
EOE / Minorities / Females / Protected Veterans / Disabilities