Company: BP
Skills: IT - Analysis & Management, IT - Software Development
Experience: 5+ Years
Education: Bachelors/3-5 yr Degree
Location: Pune, Maharashtra, India


Role Synopsis:
Developing and maintaining data infrastructure, and writing, deploying and maintaining software to build, integrate, manage, maintain, and quality-assure data. Plan and build compelling data products and services in collaboration with business stakeholders, Data Managers, Data Scientists, Software Engineers and Architects. This role sits in bp's Data and Analytics Platform team, responsible for the platforms and services that underpin bp's data supply chain. These cover technologies that support the life cycle of critical data products, bringing together data producers and consumers through enablement and industrial-scale operation of data ingestion, processing, storage and publishing, including data visualisation, advanced analytics, data science and data discovery platforms.
Key Accountabilities:
  • Design and develop industrial-scale data pipelines on Azure and AWS data platforms and services: build data ingestion and publishing pipelines, and develop and provision data nodes for wide-scale access by data professionals
  • Design and develop software for distributed systems and data warehouses; execute on GDPR and other privacy requirements from digital security; maintain business context and knowledge of the relevant data domains
  • Own the end-to-end technical data lifecycle and corresponding data technology stack for their data domain, and have a deep understanding of the bp technology stack
  • Mentor and coach other data engineers
Desirable Education and Experience:
  • BS degree in computer science or related field
  • 5 to 10 years of experience, with a minimum of 3 to 5 years of relevant experience
Required Criteria:
Experience in:
  • Designing, planning, implementing, maintaining & documenting reliable & scalable data infrastructure & data products in complex environments
  • Development experience in one or more programming languages (Python, Go, Java, C++)
  • Designing and implementing large-scale distributed systems
  • Technologies across all data lifecycle stages
Preferred Criteria:
  • Data Manipulation: debug and maintain the end-to-end data engineering lifecycle of the data products; design and implementation of the end-to-end data stack, including designing complex data systems, e.g. interoperability across cloud platforms
  • Software Engineering: SQL and NoSQL database fundamentals, query structures & design best practices, incl. scalability, readability, and reliability; proficient in at least one object-oriented programming language, e.g. Python [specifically data manipulation and visualisation packages - Pandas, seaborn, matplotlib] or Scala, and in Apache Spark
  • Scalability, Reliability, Maintenance: building scalable & re-usable systems; automating operations, identifying & building for long-term productivity over short-term speed/gains, executing on those opportunities to improve products or services
  • Data Domain Knowledge: understanding of data sources, data & analytics requirements, and typical SLAs associated with data provisioning and consumption at enterprise scale
  • Standards/Best Practices: understanding of industry and tech trends and of best practices for the data product life cycle; demonstrable knowledge of data engineering best practices
  • Right approach/tool choice: knowledge of a wide range of data engineering and data infrastructure approaches and tools and of the latest developments in the field, and ability to mentor others in selecting the right approaches to solve problems
  • Agile: modern development methodologies (Agile using Scrum and/or Kanban)
Key Behaviors:
  • Empathetic: Cares about our people, our community & our planet
  • Curious: Seeks to explore & excel
  • Creative: Imagines the extraordinary
  • Inclusive: Brings out the best in each other
