8 to 10 Years of Relevant Experience
We are seeking a highly skilled and motivated Data Engineer with 8–10 years of experience in building scalable data pipelines and implementing robust data engineering solutions. This role involves working with modern data tools and frameworks such as Apache Airflow, Python, and PySpark to support the reliable delivery, transformation, and integration of data.
Key Responsibilities
- Design, develop, and maintain data pipelines and ELT processes using Apache Airflow, Python, and PySpark to ensure efficient and reliable data delivery (see the sketch after this list).
- Build custom data connectors to ingest structured and unstructured data from diverse sources and formats.
- Collaborate with cross-functional teams to gather requirements and translate business needs into scalable technical solutions.
- Implement DataOps principles and best practices for efficient and resilient data operations.
- Design and deploy CI/CD pipelines for automated data integration, transformation, and deployment.
- Monitor and troubleshoot data workflows to proactively resolve ingestion, transformation, and loading issues.
- Perform data validation and testing to ensure accuracy, consistency, and compliance.
- Stay current with emerging trends and best practices in data engineering and analytics.
- Maintain comprehensive documentation of data workflows, pipelines, and technical specifications to support governance and knowledge sharing.
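For illustration, here is a minimal sketch of the kind of pipeline this role would own, assuming Airflow 2.4+ with the Apache Spark provider installed. The DAG id, schedule, script path, and connection name are hypothetical placeholders, not part of this posting:

```python
# Illustrative only: DAG id, paths, and connection name are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator


def validate_load(**context):
    # Placeholder validation step: a real pipeline would check row counts,
    # schema, and freshness here before downstream consumption.
    print("validating load for", context["ds"])


with DAG(
    dag_id="daily_sales_elt",          # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    # Transform raw files with a PySpark job submitted to the cluster.
    transform = SparkSubmitOperator(
        task_id="transform_sales",
        application="/opt/jobs/transform_sales.py",  # hypothetical PySpark script
        conn_id="spark_default",
    )

    # Post-load validation, per the data validation responsibility above.
    validate = PythonOperator(
        task_id="validate_load",
        python_callable=validate_load,
    )

    transform >> validate
```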
Qualifications
- Must have: an AWS certification, such as AWS Certified Solutions Architect Associate.
- Hands-on AWS data engineering experience with several of the services listed under Required Skills below (a brief sketch follows that list).
- Experience with Apache Airflow is a plus.
- Strong Python background.
Required Skills
- Athena
- Glue ETL & Crawlers
- Redshift
- RDS
- DynamoDB
- Lambda
- AWS CLI
- EC2
- Step Functions
- CloudWatch
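As a flavor of the hands-on service experience expected, a minimal boto3 sketch that starts an Athena query against a Glue Data Catalog database; the region, database, table, and S3 output location are hypothetical placeholders:

```python
# Illustrative only: region, database, table, and bucket are hypothetical.
import boto3

athena = boto3.client("athena", region_name="us-east-1")

# Start an Athena query over a Glue Data Catalog table and record the
# execution id; in practice Step Functions and CloudWatch would orchestrate
# and monitor this step.
response = athena.start_query_execution(
    QueryString="SELECT COUNT(*) FROM sales",         # hypothetical table
    QueryExecutionContext={"Database": "analytics"},  # hypothetical database
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
print("query execution id:", response["QueryExecutionId"])
```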
Nice to Have
- Linux
- OpenShift
- Kubernetes
- Apache Superset