GenAI Data Engineer

Our client, a leading global supplier of IT services, requires an experienced GenAI Data Engineer to be based at their client’s offices in London or Edinburgh.

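This is a hybrid role: you can work remotely within the UK and attend the London or Edinburgh office two days per week.

This is a temporary contract of 6 months or more, to start as soon as possible.

Day rate: Competitive market rate

Key Responsibilities: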

  • Design and maintain scalable data pipelines using PySpark, Python, and distributed computing frameworks to support high‑volume data processing.
  • Architect and optimise AWS-based data and AI infrastructure, ensuring secure, performant, and cost‑efficient ingestion, transformation, and storage.
  • Develop, fine-tune, benchmark, and evaluate GenAI/LLM models, including custom training and inference optimisation.
  • Implement and maintain RAG pipelines, vector databases, and document-processing workflows for enterprise GenAI applications.
  • Build reusable frameworks for prompt management, evaluation, and GenAI operations.
  • Collaborate with cross-functional teams to integrate GenAI capabilities into production systems and ensure high-quality data, governance, and operational reliability.

Key Requirements:

  • Strong experience with PySpark, distributed data processing, and large-scale ETL/ELT pipelines.
  • Strong SQL expertise, including star/snowflake schema design, indexing strategies, writing optimised queries, and implementing CDC and SCD Type 1/2/3 patterns for reliable data warehousing.
  • Advanced proficiency in Python for data engineering, automation, and ML/GenAI integration.
  • Hands-on expertise with AWS services (S3, Glue, Lambda, EMR, Bedrock / custom model hosting).
  • Practical experience with GenAI/LLM model creation, fine-tuning, benchmarking, and evaluation.
  • Solid understanding of RAG architectures, embeddings, vector stores, and LLM evaluation methods.
  • Experience working with structured and unstructured datasets (documents, logs, text, images).
  • Familiarity with scalable data storage solutions (Delta Lake, Parquet, Redshift, DynamoDB).
  • Understanding of model optimisation techniques (quantisation, distillation, inference optimisation).
  • Strong capability to debug, tune, and optimise distributed systems and AI pipelines.

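Due to the high volume of applications, we regret that we are unable to respond to every applicant individually.

If you have not heard from us within 7 days of submitting your application, please assume that it has been unsuccessful on this occasion.

Please keep an eye on our website https://projectrecruit.com/jobs/ for future roles.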
