Participate in the on-call rotation supporting Digital platform services and solutions.
Improve the reliability of our systems and processes with a keen focus on ensuring built-in quality.
Partner with Product, Architects, and Engineering to help define and measure KPIs, SLIs, and SLOs.
Strive to reduce toil through automation initiatives.
Create and maintain operational documentation.
Run the daily HoTo (handover/takeover) call with SRE team members.
Attend the daily connect call with the SRE Manager and SRE Lead.
Attend the daily connect call with the customer, SRE Lead, and SRE Shift Lead to review day-to-day progress.
Ensure SLAs are met for Incidents and Service Requests.
For any priority tickets or escalated issues, inform the SRE Lead, SRE Manager, and onsite stakeholders.
Manage a team of SREs to proactively ensure the stability, resilience, and scale of our services through automation, testing, and engineering, and to simplify and automate highly complex manual processes.
Provide coaching and mentoring to the SRE team to improve their skill sets.
Perform daily alert analysis and create SOPs for escalated alerts.
Perform incident post-mortems to analyze system failures, identify areas for improvement, and minimize downtime and disruptions.
Ensure adequate staffing is maintained across all shifts to meet offshore deliverables.