Experience: 5+ years
Must Have:
- Hadoop/Big Data: HDFS, UNIX, MapReduce, Sqoop, Pig, ZooKeeper
- Databases: Hive, HBase, Cassandra, Elasticsearch
- Experience in Python, Spark, Hive, Sqoop
- Spark/Scala, Java
- Understanding of data warehousing and data modeling techniques
- Expertise in ETL solutions
- Knowledge of analytical tools such as Tableau and R
- Strong data engineering skills on the Azure platform
- Experience with streaming frameworks such as Kafka
- Knowledge of programming languages such as core Java and Python, plus Linux, SQL, and scripting languages
- Knowledge of tools such as StreamSets, AutoSys, and Cloudera products
- Knowledge of relational databases (e.g., Oracle) and SQL
- Real-time scenarios/use cases for the above components
- Good programming knowledge of Spark with Scala or Spark with Java, plus some Python, with 4-5 years of experience
- Work on ingesting, storing, processing and analyzing large data sets
- Develop and maintain data pipelines implementing ETL processes
- Take responsibility for Hadoop/Spark development and implementation
- Build a Scala/Spark-based data pipeline framework for the project, including any Scala/Spark code developed from scratch (a minimal sketch follows this list)
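To illustrate the kind of pipeline work the duties above describe, here is a minimal, hypothetical Scala/Spark sketch: a Structured Streaming job that ingests events from a Kafka topic and appends them to a date-partitioned Parquet dataset. It assumes the spark-sql-kafka connector is on the classpath; the broker address, topic name, object name, and output paths are illustrative placeholders, not details from this posting.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object EventIngestJob {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("event-ingest")
          .getOrCreate()

        // Extract: read a stream from Kafka; the value column arrives as raw bytes.
        val raw = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092") // placeholder broker
          .option("subscribe", "events")                    // placeholder topic
          .load()

        // Transform: cast the payload to text and derive a date column for partitioning.
        val events = raw
          .selectExpr("CAST(value AS STRING) AS payload", "timestamp")
          .withColumn("dt", to_date(col("timestamp")))

        // Load: append to Parquet, partitioned by date, with checkpointing for recovery.
        val query = events.writeStream
          .format("parquet")
          .option("path", "/data/events")               // placeholder output path
          .option("checkpointLocation", "/chk/events")  // placeholder checkpoint dir
          .partitionBy("dt")
          .start()

        query.awaitTermination()
      }
    }

The checkpoint directory lets the stream recover its Kafka offsets after a restart, which is what makes such a pipeline safe to maintain in production.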
Generic Managerial Skills:
- Experience with Agile development methodology.
- Experience with Cloud Computing
- Team Management
- Project Management
- Good communication and interpersonal skills