Spark (PySpark) Developer

Full time | Work From Office
Position Filled
Listed on Feb 22, 2023

Job Description

Experience: 6 Years

  • The developer must have sound knowledge of Apache Spark and Python programming.
  • Deep experience developing data processing tasks in PySpark, such as reading data from external sources, merging data, performing data enrichment, and loading into target data destinations.
  • Experience deploying and operationalizing code is an added advantage, including knowledge of and skills in DevOps, version control, and containerization. Deployment knowledge is preferred.
  • Create Spark jobs for data transformation and aggregation
  • Produce unit tests for Spark transformations and helper methods
  • Write Scaladoc-style documentation with all code
  • Design data processing pipelines to perform batch and real-time/stream analytics on structured and unstructured data
  • Spark query tuning and performance optimization, with a good understanding of file formats (ORC, Parquet, Avro) and compression techniques for optimizing queries and processing.
  • SQL database integration (Microsoft, Oracle, Postgres, and/or MySQL)
  • Experience working with data stores such as HDFS, S3, Cassandra, and/or DynamoDB
  • Deep understanding of distributed systems (e.g. CAP theorem, partitioning, replication, consistency, and consensus)
  • Experience building scalable, high-performance data lake solutions in the cloud
  • Hands-on expertise in cloud services such as AWS and/or Microsoft Azure.

Required Skills

Hive
Spark
SQL

Hiring Process

  • Screening (HR round)
  • Technical Round 1
  • Technical Round 2
  • Final HR round