8+ Years Relevant Experience
We are seeking an experienced Application Support Lead with 8+ years of hands-on experience in production support, incident management, and performance monitoring. The ideal candidate has strong problem-solving and leadership skills and a deep understanding of modern monitoring and alerting tools, API troubleshooting, and SLA/KPI compliance. You will lead a team of support engineers and ensure the stability, performance, and reliability of production systems.
Key Responsibilities:
- Lead and coordinate day-to-day application support activities for production environments.
- Use tools such as Postman to troubleshoot API and web page issues, and interpret HTTP response codes (2xx, 3xx, 4xx, 5xx).
- Leverage monitoring and reporting tools such as Grafana, Coralogix, and Datadog for system performance monitoring and health checks.
- Analyze and optimize business key performance indicators (KPIs) and service level metrics.
- Collect and analyze performance data to perform root cause analysis (RCA) and implement corrective and preventive actions (CAPA).
- Drive incident resolution and ensure SLA adherence with proper documentation and tracking.
- Train new team members, set clear expectations on SOPs and SLAs, and run regular knowledge-sharing sessions.
- Proactively identify and suggest process improvements or automation opportunities.
- Collaborate with cross-functional teams and stakeholders to resolve issues and implement improvements.
- Maintain a clear view of personal and team priorities and tasks; manage a 3–5 member team effectively.
- Participate in Agile ceremonies, and contribute support documentation and backlog grooming as needed.
Required Skills & Qualifications:
- 8+ years of experience in application/production support and leading support teams.
- Hands-on experience with Postman and API troubleshooting.
- Familiarity with HTTP status codes and common server error diagnostics.
- Proficiency in tools such as Grafana, Datadog, Coralogix, or similar.
- Strong troubleshooting, debugging, and performance-tuning abilities.
- Experience in incident management, change control processes, and technical documentation.
- Strong leadership, organizational, and communication skills.
- Ability to function both independently and collaboratively in a fast-paced environment.
Good to Have:
- Experience with automation scripting/tools to reduce manual effort.
- Working knowledge of AWS cloud services, including alert configuration and tracing in APM/log aggregation tools (e.g., ELK, Splunk).
- Familiarity with Agile methodologies, Kanban, and JIRA.
- Understanding of CI/CD pipelines and DevOps culture.