I. Core responsibilities
Ensures sustained production by engineering high-availability systems, eliminating chronic failures, and optimizing maintenance for wells, subsea/topsides equipment, and onshore processing assets.
- 1.1 Define and own production reliability KPIs: availability, production efficiency, deferment, bad-actor rates, and maintenance effectiveness.
- 1.2 Perform RCM/FMEA for wells, subsea architecture, separators, compressors, pumps, valves, E&I systems; develop maintenance strategies and critical spares policies.
- 1.3 Lead RCA on unplanned shutdowns and chronic losses; generate corrective actions, management of change, and verification-of-effectiveness plans.
- 1.4 Build and maintain RAM models/RBDs to predict uptime, optimize redundancy, and support brownfield/greenfield design decisions.
- 1.5 Drive bad-actor elimination: Pareto analyses, defect elimination sprints, re-engineering of failure-prone components, and operating envelope tuning.
- 1.6 Execute condition monitoring programs (vibration, thermography, oil analysis, corrosion/erosion monitoring) and trend failure precursors.
- 1.7 Optimize preventive/predictive maintenance intervals using statistical reliability data (Weibull, exponential) and cost–risk trade-offs.
- 1.8 Quantify and report production deferment; separate planned from unplanned losses; prioritize removal of deferment causes by value-at-risk.
- 1.9 Develop and steward SCE performance standards and barrier health monitoring; interface with SIL/LOPA for high-integrity protection layers.
- 1.10 Plan reliability scope for turnarounds/campaigns; define inspection test plans, end-of-life replacements, and post-startup validation.
- 1.11 Lead spares criticality/stocking studies (ABC, risk-based) and obsolescence management for long-lead items.
- 1.12 Govern data quality in CMMS/APM aligned to ISO 14224 (taxonomy, failure codes); close feedback loop from execution to strategy.
- 1.13 Produce monthly reliability reports to asset leadership with trends, forecasts, and prioritized improvement actions.
- 1.14 Contribute to Front-End Loading (FEL) by embedding operability, maintainability, sparing, and testability into design.
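The Pareto ranking behind bad-actor elimination (1.5) and deferment reporting (1.8) can be sketched as follows. This is a minimal illustration, not a prescribed method: the equipment tags, volumes, and the `bad_actor_pareto` helper are all hypothetical, and real records would come from the CMMS or deferment register.

```python
from collections import defaultdict

# Hypothetical deferment records: (equipment tag, unplanned?, deferred volume in boe).
records = [
    ("K-101 compressor", True, 4200.0),
    ("P-205 export pump", True, 1500.0),
    ("K-101 compressor", True, 3100.0),
    ("V-300 separator", False, 2500.0),  # planned loss, excluded from the ranking
    ("ESP well A-07", True, 900.0),
    ("P-205 export pump", True, 300.0),
]


def bad_actor_pareto(records):
    """Rank equipment by unplanned deferment, with cumulative share of the total."""
    totals = defaultdict(float)
    for tag, unplanned, volume in records:
        if unplanned:
            totals[tag] += volume

    grand_total = sum(totals.values())
    if grand_total == 0:
        return []  # no unplanned deferment recorded

    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    result, running = [], 0.0
    for tag, volume in ranked:
        running += volume
        result.append((tag, volume, running / grand_total))
    return result


for tag, volume, cum_share in bad_actor_pareto(records):
    print(f"{tag:20s} {volume:8.0f} boe  cumulative {cum_share:.0%}")
```

With these made-up numbers, the compressor alone accounts for 73% of unplanned deferment, which is the kind of concentration that justifies a focused defect-elimination effort.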
I.A Key reliability equations (selected)
- 1.A.1 Failure rate: \( \lambda = \dfrac{N_{\text{fail}}}{T_{\text{oper}}} \)
- 1.A.2 MTBF/MTTR: \( \text{MTBF} = \dfrac{T_{\text{oper}}}{N_{\text{fail}}},\quad \text{MTTR} = \dfrac{T_{\text{repair}}}{N_{\text{rep}}} \)
- 1.A.3 Availability: \( A = \dfrac{\text{MTBF}}{\text{MTBF} + \text{MTTR}} \)
- 1.A.4 Reliability (Weibull): \( R(t) = \exp\!\left[-\left(\dfrac{t}{\eta}\right)^{\beta}\right] \)
- 1.A.5 Systems: \( R_{\text{series}} = \prod_i R_i;\quad R_{\text{parallel}} = 1 - \prod_i (1 - R_i) \)
- 1.A.6 Deferment: \( D = Q_{\text{potential}} - Q_{\text{actual}};\quad D_{\text{cum}} = \sum_t D_t \)
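The equations above translate directly into a few small functions. The sketch below is illustrative only: the function names and the sample figures (8,000 operating hours, 4 failures, 96 repair hours) are assumptions chosen for the example, not values from any asset.

```python
import math
from itertools import accumulate


def failure_rate(n_fail, t_oper):
    """1.A.1: failures per unit operating time."""
    return n_fail / t_oper


def mtbf(t_oper, n_fail):
    """1.A.2: mean time between failures."""
    return t_oper / n_fail


def mttr(t_repair, n_rep):
    """1.A.2: mean time to repair."""
    return t_repair / n_rep


def availability(mtbf_h, mttr_h):
    """1.A.3: steady-state availability."""
    return mtbf_h / (mtbf_h + mttr_h)


def weibull_reliability(t, eta, beta):
    """1.A.4: probability of surviving to time t (scale eta, shape beta)."""
    return math.exp(-((t / eta) ** beta))


def series(reliabilities):
    """1.A.5: all blocks must work."""
    return math.prod(reliabilities)


def parallel(reliabilities):
    """1.A.5: at least one block must work (full redundancy)."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)


def cumulative_deferment(q_potential, q_actual):
    """1.A.6: running total of per-period deferment D_t."""
    return list(accumulate(p - a for p, a in zip(q_potential, q_actual)))


# Hypothetical pump: 8,000 operating hours, 4 failures, 96 hours of repair.
pump_mtbf = mtbf(8000, 4)   # 2000 h
pump_mttr = mttr(96, 4)     # 24 h
print(f"lambda = {failure_rate(4, 8000):.5f} per hour")
print(f"A      = {availability(pump_mtbf, pump_mttr):.4f}")
print(f"2 in series (R=0.9 each):   {series([0.9, 0.9]):.2f}")
print(f"2 in parallel (R=0.9 each): {parallel([0.9, 0.9]):.2f}")
```

Note that at \( t = \eta \) the Weibull reliability is \( e^{-1} \approx 0.368 \) for any \( \beta \), which is a convenient sanity check on fitted parameters.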
II. Required technical skills, soft skills, and physical demands
II.A Technical skills
- 2.1 RCM/FMEA/FMECA application to wells, subsea, rotating/static equipment, E&I, and control systems.
- 2.2 RAM modeling, reliability block diagrams, Monte Carlo simulation, and sensitivity analyses.
- 2.3 Data analytics on historian/CMMS data; Weibull analysis, Bayesian updating, regression, SPC.
- 2.4 Condition-based maintenance techniques: vibration, ultrasonic, oil/grease tribology, thermography, motor current signature, corrosion probes.
- 2.5 Process/production systems knowledge: wells, ESPs/gas lift, flowlines, subsea controls, separation, compression, dehydration, flaring, power/instrument air.
- 2.6 SCE/barrier management, SIL targets, proof-test optimization, and functional safety interfaces.
- 2.7 CMMS/APM master data: equipment hierarchy, criticality, failure coding, PM optimization, and work management KPIs.
- 2.8 Cost–risk optimization: LPO/NPV of redundancy, repair vs. replace, maintenance interval economics.
- 2.9 Standards familiarity: ISO 14224, IEC 61508/61511, API 580/581, API 670/671 (estimated).
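As a rough illustration of the Monte Carlo simulation named in 2.2, the sketch below estimates the availability of a single repairable unit and compares it with the analytic value \( \text{MTBF}/(\text{MTBF}+\text{MTTR}) \). It assumes exponentially distributed up and down times, and the `simulate_availability` function and its parameters are hypothetical; commercial RAM tools model far more (logistics delays, redundancy, flow networks).

```python
import random


def simulate_availability(mtbf_h, mttr_h, horizon_h, seed=0):
    """Estimate availability by alternating random up and down periods.

    Assumes exponential times to failure (mean mtbf_h) and times to repair
    (mean mttr_h); both are simplifying assumptions for this sketch.
    """
    rng = random.Random(seed)
    clock = 0.0
    uptime = 0.0
    while clock < horizon_h:
        # expovariate takes a rate, so the mean duration is 1 / rate.
        up = min(rng.expovariate(1.0 / mtbf_h), horizon_h - clock)
        uptime += up
        clock += up
        if clock >= horizon_h:
            break
        down = min(rng.expovariate(1.0 / mttr_h), horizon_h - clock)
        clock += down
    return uptime / horizon_h


estimate = simulate_availability(mtbf_h=2000, mttr_h=24, horizon_h=1_000_000)
analytic = 2000 / (2000 + 24)
print(f"simulated {estimate:.4f} vs analytic {analytic:.4f}")
```

A long horizon is used so that the estimate covers several hundred failure cycles; shorter horizons give noticeably noisier results, which is itself a useful reminder when interpreting RAM forecasts.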
II.B Soft skills
- 2.10 Facilitation of cross-disciplinary RCA/RCM workshops; conflict resolution; consensus building.
- 2.11 Decision framing with uncertainty; clear risk communication to leadership.
- 2.12 Influence without authority across Operations, Maintenance, Projects, and HSE.
- 2.13 Report writing and visualization for non-technical stakeholders.
II.C Physical demands and certifications
- 2.14 Periodic offshore/site visits; climbing stairs/ladders, confined space entry oversight, field walkdowns.
- 2.15 PPE usage in hazardous areas; fit for H2S environments; ability to read P&IDs in situ.
- 2.16 Typical industry training: BOSIET/HUET, H2S, lockout/tagout, electrical area classification (estimated).
III. Typical tools/software/equipment used
- 3.1 CMMS: SAP PM, Maximo; APM suites for asset health dashboards.
- 3.2 Historian/analytics: PI/real-time data historians; SQL, Python/R, and BI visualization tools.
- 3.3 RAM/RBD and availability simulators (e.g., MAROS/AvSim, BlockSim).
- 3.4 RCM/FMEA facilitation and LOPA tools; bow-tie analysis software.
- 3.5 Condition monitoring: vibration analyzers, portable data collectors, ultrasonic detectors, IR cameras, oil analysis kits, corrosion/erosion probes.
- 3.6 NDT methods/interfaces: UT, RT, PT/MT; thickness gauges for corrosion monitoring.
- 3.7 Engineering: PFD/P&ID and 3D model viewers; cause-and-effect and SIF proof-test management tools.
III.A Toolchain Snapshot
- 3.A.1 CMMS/APM: SAP PM/Maximo + APM dashboards
- 3.A.2 RAM/RBD: MAROS/AvSim, BlockSim (or equivalent)
- 3.A.3 Data: PI historian, SQL, Python, BI tools
- 3.A.4 RCA/RCM: LOPA/bow-tie/RCM platforms
- 3.A.5 CM: Vibration/IR/ultrasonic/oil labs, corrosion probes
- 3.A.6 Documentation: 3D model/P&ID viewers; cause-and-effect matrices
IV. Work environment
- 4.1 Onshore-based in asset support or central reliability team with regular offshore or plant visits (1–4 trips/quarter typical).
- 4.2 For offshore roles, campaign/rotation work during major outages or projects (e.g., 14 days on/14 off or 28 on/28 off; estimated, varies by asset).
- 4.3 Mixed tempo: desk analysis, site walkdowns, workshop facilitation, and leadership briefings.
- 4.4 Travel to vendors for FAT/SAT, overhaul inspections, and critical spares audits as needed.
V. Reporting lines and cross-functional interfaces
- 5.1 Reports to: Reliability & Maintenance Manager or Production/Operations Manager (asset-level).
- 5.2 Interfaces with: Operations (production supervisors, control room), Maintenance (mechanical, E&I, I&C), Process Engineering, Wells (production technology/artificial lift), Subsea, Integrity/Inspection, Projects, Supply Chain, and HSE.
- 5.3 External: OEMs, repair shops, inspection/NDT contractors, and engineering consultancies.
V.A Deliverables & Interfaces
- 5.A.1 To Asset Leadership: monthly reliability report, deferment register, risk register updates, RAM forecasts.
- 5.A.2 To Operations/Maintenance: optimized PM routines, condition-monitoring routes, bad-actor action plans, shutdown/startup risk controls.
- 5.A.3 To Projects: RCM/FMEA outputs, sparing philosophy, RBD/RAM results, maintainability and accessibility requirements.
- 5.A.4 To Supply Chain: critical spares lists, stocking policies, repair/replace criteria, vendor performance feedback.
- 5.A.5 To HSE/Integrity: SCE performance standards, barrier health KPIs, proof-test intervals, LOPA inputs.
VI. Career ladder
- 6.1 Production Reliability Engineer (this role): asset-level ownership of reliability KPIs and improvement program.
- 6.2 Senior Production Reliability Engineer: leads multi-asset initiatives, mentors engineers, governs standards and data models.
- 6.3 Reliability Lead/Team Lead: manages reliability team, sets strategy, integrates across Maintenance/Operations/Projects.
- 6.4 Production Assurance/Operations Excellence Manager: portfolio-wide reliability, RAM in projects, performance management, and budgeting.
- 6.5 Adjacent pathways: Asset Integrity Manager, Maintenance Manager, or Project Engineering Manager (estimated; varies by organization).
VI.A Progression Trigger
- 6.A.1 Typically promoted to Senior after 3–5 years in-role with ≥3 major RCAs closed, ≥2 RAM/RCM studies delivered, and certifications such as CMRP/CRE (or equivalent).
- 6.A.2 Advancement to Lead after demonstrating multi-asset impact (e.g., sustained availability uplift of ≥2–3 percentage points) and governance of CMMS/APM master data.

