
Data Scientist - II
Highlights
- Job Number: 37702
- Location: West Point, PA
- Pay Rate: Up to $68/hour
- Start Date: Oct 27, 2025 to Oct 23, 2026
Description
Position Description:
The Data Scientist will join the Digital Sciences team within the Analytical Enabling Capabilities sub-department of Analytical Research & Development (AR&D) at Merck. This group plays a critical role in establishing data workflows and predictive tools to accelerate the identification, characterization, and development of novel medicines and vaccines.
The successful candidate will design, develop, and implement data workflows and pipelines in support of scientific research. This is a highly collaborative role, requiring close partnership with scientists, IT colleagues, and digital/data teams across Merck Research Laboratories. The work spans diverse therapeutic modalities, including small molecules, peptides, biologics, and vaccines.
This is not a typical IT position: it requires a professional who can understand scientific data, partner with scientists to automate electronic laboratory notebooks, and apply analytical expertise to genomics and experimental outputs.
Location: West Point, PA (onsite 2–3 days per week)
Openings: 2 positions
Key Responsibilities:
- Design, develop, and implement data workflows and pipelines in Python.
- Collaborate with scientists to understand data generated from experiments and support automation of electronic laboratory notebooks.
- Partner with IT to integrate workflows into production environments.
- Manage projects and timelines, and provide accurate effort estimates.
- Participate in daily standups and present updates to collaborators.
- Contribute to the continuous improvement of data engineering practices and propose innovative solutions to common workflow challenges.
Skills & Qualifications
Required:
- Bachelor’s degree in Computer Science or related field; or a degree in Chemistry or related discipline with strong programming capabilities.
- 4–6 years of relevant data engineering experience.
- Strong expertise with AWS cloud services (Lambda Functions, S3, CloudFormation Templates, RDS, ECR).
- Proficiency in developing ETL processes, data workflows, pipelines, wrangling, and ingestion.
- Python 3.9+ software development, including:
  - Packages: Boto3, Pandas, pyodbc, openpyxl
  - Virtual environments: conda
  - IDEs: Visual Studio Code or PyCharm
- Experience with software design, development, and testing (unit and system testing).
- Proficiency with version control (Git, GitHub) and CI/CD workflows (GitHub Actions).
- Strong database skills (relational databases, SQL, data modeling and design).
- Familiarity with multiple file formats (XLSX, YAML, JSON, CSV, TSV).
- Excellent verbal and written communication skills.
- Ability to work independently and collaboratively in team settings.
- Demonstrated drive for continuous improvement and innovation in data workflows.
Preferred:
- Additional AWS cloud services experience: SQS (including dead-letter queues, DLQ), SNS, EventBridge, API Gateway.
- Python packages (Cerberus, PyYAML, logging), linters, type hints, and regular expressions.
- Experience with data pipeline tools such as Dataiku or Trifacta.
- Previous IT or data engineering experience in pharmaceutical research.
- Analytical or genomics experience related to scientific data generation and interpretation.
Education:
Bachelor’s degree in Computer Science or related field; OR
Bachelor’s degree in Chemistry (or related discipline) with strong programming capabilities.