Data Pipeline Engineer

Salary: 

£60-65k + benefits (incl high equity)

Position type: 

Permanent

Job Location: 

London

Data Pipeline Engineer - NLP, Infrastructure, GCP/GKE, Docker, Python

This research-led AI software start-up - an entrepreneurial PhD team - is developing new information-retrieval and machine-learning-based approaches to web search, powered by a proprietary real-time task and text processing engine.

We’re seeking an exceptional, technically hands-on Data Engineer to own and manage the development of this data processing engine and keep it running smoothly, which is critical to the product and the business.

In this role, you will: 

  • Take on the management of the data pipeline 
  • Work closely with other engineers to continuously integrate components and features, helping with development where necessary 
  • Architect and implement improvements to the pipeline to make it more scalable and secure 
  • Lead and enforce coding standards and best practice in the engineering team

We're looking for a Data Engineer with 5+ years of experience: 

  • Working on cutting-edge data and text processing pipelines 
  • Working in a process-driven agile environment
  • Developing in Python/Django, ES6/7 and React 
  • Thriving in a start-up environment, with ambition and self-drive
  • Communicating effectively with product teams and researchers

And who has started working with technologies such as:

Google Cloud Platform (GCP) / Google Kubernetes Engine (GKE), Docker, K8s 

Experience with some of the following is highly desirable: 

  • Working with machine learning, information retrieval, web search, NLP, knowledge bases and graph frameworks 
  • Making product prototypes and research prototypes production-ready 
  • Integrations with APIs 
  • Ensuring systems are secure and compliant with data protection regulations 
  • Working on virtual assistants 

Here’s a flavour of the existing pipeline: 

  • A data collection component that collects web-based data 
  • A state-of-the-art NLP system that performs real-time entity extraction, workflow identification and knowledge graph construction 
  • A third-party integration architecture built in Django and running on GKE/GPE
  • A user-facing product built using React

Visa sponsorship is available for the right candidate.

Data Engineer / Data Pipeline Engineer

Category: