GenAI Data Engineer
Our client, a leading global supplier of IT services, requires an experienced GenAI Data Engineer to be based at their client’s offices in London or Edinburgh.
This is a hybrid role: you can work remotely in the UK and attend the London or Edinburgh office two days a week.
This is a temporary contract of 6 months or more, starting as soon as possible.
Day rate: Competitive market rate
Key Responsibilities:
- Design and maintain scalable data pipelines using PySpark, Python, and distributed computing frameworks to support high‑volume data processing.
- Architect and optimise AWS-based data and AI infrastructure, ensuring secure, performant, and cost‑efficient ingestion, transformation, and storage.
- Develop, fine-tune, benchmark, and evaluate GenAI/LLM models, including custom training and inference optimisation.
- Implement and maintain RAG pipelines, vector databases, and document-processing workflows for enterprise GenAI applications.
- Build reusable frameworks for prompt management, evaluation, and GenAI operations.
- Collaborate with cross-functional teams to integrate GenAI capabilities into production systems and ensure high-quality data, governance, and operational reliability.
Key Requirements:
- Strong experience with PySpark, distributed data processing, and large-scale ETL/ELT pipelines.
- Strong SQL expertise, including star/snowflake schema design, indexing strategies, writing optimised queries, and implementing CDC and SCD Type 1/2/3 patterns for reliable data warehousing.
- Advanced proficiency in Python for data engineering, automation, and ML/GenAI integration.
- Hands-on expertise with AWS services (S3, Glue, Lambda, EMR, Bedrock / custom model hosting).
- Practical experience with GenAI/LLM model creation, fine-tuning, benchmarking, and evaluation.
- Solid understanding of RAG architectures, embeddings, vector stores, and LLM evaluation methods.
- Experience working with structured and unstructured datasets (documents, logs, text, images).
- Familiarity with scalable data storage solutions (Delta Lake, Parquet, Redshift, DynamoDB).
- Understanding of model optimisation techniques (quantisation, distillation, inference optimisation).
- Strong capability to debug, tune, and optimise distributed systems and AI pipelines.
Due to the high volume of applications we receive, we are unfortunately unable to respond to everyone.
If you have not heard back from us within 7 days of submitting your application, please assume that it has not been successful on this occasion.
Please keep an eye on our website https://projectrecruit.com/jobs/ for future vacancies.

