Senior Data Engineer (with Data Science Skills)
PeakData
About PeakData
PeakData provides AI-powered market intelligence to optimize drug launch execution and resource allocation for pharmaceutical companies. Our platform delivers actionable insights on healthcare professionals (HCPs) and healthcare organizations (HCOs), enabling commercial leaders to make real-time, data-driven decisions.
Role Overview
We're looking for a Senior Data Engineer with strong data science capabilities to join our Data Platform team. In this role, you'll design and build cloud-native data solutions that support large-scale processing, analytics, and AI-powered automation across our platform.
This is a hands-on, senior-level role. You will be expected to work independently, own end-to-end pipelines and infrastructure, and drive initiatives forward both individually and within the team. You should have a strong foundation in Python, SQL, AWS, and/or GCP, along with experience integrating LLMs into data workflows.
Tech Environment
You’ll work with and expand upon:
- Python for data pipelines and automation
- SQL (PostgreSQL) for transformation and analytics
- AWS (S3, Glue, Lambda, ECS, Bedrock) as primary cloud environment
- GCP (Vertex AI) for select workloads and integrations
- Medallion architecture with RAW/CLEANED/CURATED layers
- LLM integrations for automation, enrichment, and insight generation
- Data quality frameworks and orchestration tools (e.g., Argo)
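As a rough illustration of the medallion layering above, a RAW-to-CLEANED promotion step might look like the sketch below. The record fields and cleaning rules are hypothetical assumptions for illustration, not PeakData's actual schema or code.

```python
# Hypothetical sketch of a RAW -> CLEANED medallion step: normalize and
# de-duplicate raw HCP records before promoting them to the CLEANED layer.
# Field names and rules are illustrative, not a real production schema.

def clean_hcp_records(raw_records):
    """Normalize names/emails and drop records with missing or duplicate emails."""
    seen = set()
    cleaned = []
    for rec in raw_records:
        email = (rec.get("email") or "").strip().lower()
        name = " ".join((rec.get("name") or "").split()).title()
        if not email or email in seen:  # skip empty or duplicate emails
            continue
        seen.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned

if __name__ == "__main__":
    raw = [
        {"name": "  jane  doe", "email": "Jane.Doe@Example.com"},
        {"name": "Jane Doe", "email": "jane.doe@example.com"},  # duplicate
        {"name": "John Smith", "email": None},  # missing email
    ]
    print(clean_hcp_records(raw))
```

In a real pipeline, steps like this would typically run under an orchestrator such as Argo, reading from and writing to the S3-backed RAW and CLEANED layers.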
Key Responsibilities
Engineering Ownership
- Design, implement, and maintain scalable and efficient data pipelines across AWS and GCP
- Build data products and services supporting both internal analytics and client-facing insights
- Own ETL/ELT workflows from ingestion to curation
- Implement observability and alerting for pipeline health and data integrity
- Integrate LLMs into workflows to support enrichment, automation, or intelligent data handling
Team Leadership & Initiative
- Act as a technical lead for data engineering projects, driving execution independently
- Collaborate cross-functionally with Data Science, Product, and Engineering teams
- Contribute to architectural decisions and long-term data platform evolution
- Champion best practices for performance, security, and scalability
Data Science & LLM Integration
- Apply data science techniques where appropriate (e.g., clustering, statistical inference)
- Prototype and validate LLM-powered solutions using tools like AWS Bedrock or Vertex AI
- Apply prompt engineering and evaluation frameworks to refine and assess LLM interactions
- Help bridge engineering and AI innovation across the platform
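At its simplest, the kind of evaluation loop mentioned above can be sketched as follows. The model call here is a stand-in stub; a real integration would invoke AWS Bedrock or Vertex AI, and the classification task is a hypothetical example.

```python
# Minimal sketch of an LLM evaluation loop (hypothetical; a real setup would
# call AWS Bedrock or Vertex AI instead of the stub below).

def stub_model(prompt):
    """Stand-in for an LLM call; classifies by a crude keyword rule."""
    return "HCP" if "dr." in prompt.lower() else "HCO"

def evaluate(model, cases):
    """Return the fraction of (prompt, label) cases the model gets right."""
    hits = sum(1 for prompt, label in cases if model(prompt) == label)
    return hits / len(cases)

if __name__ == "__main__":
    cases = [
        ("Classify: Dr. Jane Doe, cardiologist", "HCP"),
        ("Classify: St. Mary's Hospital, Boston", "HCO"),
    ]
    print(evaluate(stub_model, cases))
```

Swapping the stub for a real model call turns this into a basic regression harness for comparing prompt variants before they ship.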
Qualifications
Required Skills & Experience
- 6+ years of experience in data engineering or back-end systems with data-heavy workloads
- Strong hands-on skills with Python and SQL
- Deep understanding of AWS cloud data tooling (S3, Lambda, Glue, Step Functions, etc.)
- Working experience with GCP services, especially BigQuery and Vertex AI
- Exposure to LLMs and how they integrate into data workflows
- Experience building data pipelines at scale with monitoring and alerting
- Ability to work independently and take ownership of technical topics
Bonus Skills
- Experience with Argo, Airflow, or similar orchestration frameworks
- Familiarity with IaC tools (Terraform) for deploying infrastructure
- Experience with data quality monitoring, validation frameworks, or anomaly detection
- Previous work in healthcare, life sciences, or regulated data environments
Personal Attributes
- Proactive: You take initiative and don’t wait for tasks to be assigned
- Autonomous: You can own projects from design to production with minimal oversight
- Curious: You explore new approaches (especially LLMs/AI) and bring them to the table
- Collaborative: You work well with cross-functional teams
- Customer-aware: You understand the real-world impact of your pipelines and models
What We Offer
- Purpose-driven work: support pharmaceutical innovation and better patient outcomes
- Ownership: real autonomy in shaping our data systems and how they scale
- Innovation: work on LLM integration and next-gen data workflows
- A collaborative, fast-moving environment
- Competitive compensation
- Access to both AWS and GCP ecosystems in production
If you're a hands-on data engineer who enjoys owning end-to-end systems, loves solving real business problems, and thrives in a hybrid cloud + AI environment — we want to talk to you.