Genomics Data Engineer (Data Science) - Regeneron Genetics Center
Compensation: $90,875.00 - $148,460.00 /year *
Employment Type: Full-Time
Industry: Information Technology
Summary: The Regeneron Genetics Center's Genome Informatics & Data Engineering team is looking for an R&D Data Engineer/Data Scientist to work at the interface of genomics, big data engineering, and advanced analytics. The candidate will contribute to the expansion of our Apache Spark-based distributed analytics platform, building production-quality data processing infrastructure and developing scalable algorithms to leverage genomic and health data from millions of individuals. This role encompasses engineering of end-to-end solutions that 1) unify and structure diverse data sets efficiently, and 2) perform downstream analyses at scale and derive genomic health insights that support Regeneron's drug development pipelines. The ideal candidate will have a strong background in computer science, data mining, machine learning, or a related field, with demonstrated experience in engineering scalable and performant data processing software in Spark or another distributed compute environment. Background in bioinformatics or another life sciences domain is a plus, but not essential. This job requires strong communication skills in order to collaborate effectively with multiple cross-functional teams of scientists, analysts, and engineers. This position will provide exciting opportunities to work on the bleeding edge of big data analytics and genomic medicine. The RGC hosts one of the world's largest data sets encompassing paired genomic and health data, presenting a unique opportunity to incorporate modern big data technologies into the field of precision medicine and to drive novel therapeutic discovery efforts forward.
Responsibilities:
o Build out a big data distributed architecture capable of efficiently processing terabytes of genomic and clinical data
o Develop algorithms and tools to analyze large data sets consisting of billions of rows
o Develop and deploy machine learning algorithms at scale
o Develop new web applications used by Regeneron scientists to analyze genomic and clinical datasets
o Build automated and production-quality data processing systems
o Interact and collaborate with other scientists to clearly define and iterate on requirements
o Keep abreast of new state-of-the-art software data engineering and data science technologies
Qualifications: This position requires a B.S. (M.S. or Ph.D. preferred) with experience in computer science, specializing in high-performance/distributed computing, data mining, machine learning, bioinformatics, or a related STEM discipline. Additional requirements include:
o Experience in developing scalable, high-performance software, with a deep understanding of algorithm design principles and data processing pipelines
o Knowledge of distributed compute technologies, such as Spark, Hadoop, map-reduce, MPI, or other parallel computing frameworks is essential
o Strong foundation in data engineering, data science, and/or machine learning, with demonstrated experience applying these technologies at scale on real-world data sets
o 3+ years of software engineering experience in a modern Object Oriented or Functional language
o Knowledge of database technologies, indexing/partitioning, and SQL
o Excellent communication and presentation skills required
o Experience with cloud computing (AWS preferred)
o Familiarity with genomics and bioinformatics is preferred, but not required
This is an opportunity to join our select team that is already leading the way in the Pharmaceutical/Biotech industry.
Apply today and learn more about Regeneron's unwavering commitment to combining good science & good business.
Salary Range: NA
Minimum Qualification: Not Specified
Associated topics: data analyst, data engineer, data management, data warehousing, database, etl, erp, mongo database, sql, sybase
* The salary listed in the header is an estimate based on salary data for similar jobs in the same area. Salary or compensation data found in the job description is accurate.