Key Skills: Big Data, Java, Hadoop, Cassandra, AWS
Roles and Responsibilities:
- Design, implement, and support large-scale data processing systems using Hadoop (MapReduce, Hive, HDFS); see the MapReduce sketch after this list
- Build and optimize data lakes on AWS to store and process massive data sets
- Work with Cassandra to manage distributed, high-volume, and low-latency data workloads
- Collaborate with data scientists, analysts, and product teams to understand data needs and deliver reliable solutions
- Ensure data quality, governance, and consistency across systems and environments
- Monitor and troubleshoot data pipelines, ensuring reliability and scalability
- Write performant Java code, applying a solid understanding of data structures and multithreading
- Use hands-on experience with Big Data technologies such as Hadoop, Hive, HDFS, and Spark
- Apply ETL processes and data architecture principles when designing and maintaining pipelines
- Work in agile environments with CI/CD practices, following data security, compliance, and privacy best practices
- Exposure to DevOps tools (Docker, Kubernetes, Terraform) is a plus
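For reference, the Hadoop MapReduce work described above follows the standard mapper/reducer pattern. Below is a minimal, illustrative Java word-count job; class names, tokenizing logic, and input/output paths are placeholders, not a project-specific implementation:

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every token in the input split
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      for (String token : value.toString().split("\\s+")) {
        if (!token.isEmpty()) {
          word.set(token);
          context.write(word, ONE);
        }
      }
    }
  }

  // Reducer: sums the per-word counts produced by the mappers
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);   // combiner reduces shuffle volume
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));    // HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1]));  // output directory must not exist
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

The job reads text from an HDFS input directory, counts word occurrences across mappers and reducers, and writes the totals to a new output directory.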
Skills Required:
- Strong hands-on experience with Hadoop (MapReduce, Hive, HDFS)
- Proficiency in Java with a good understanding of data structures and multithreading
- Experience with Big Data tools and frameworks such as Hive and Spark
- Working knowledge of Cassandra for distributed database management (see the driver sketch after this list)
- Exposure to AWS services and data lake architectures
- Familiarity with ETL processes and data architecture principles
- Knowledge of CI/CD, agile practices, and DevOps tools (e.g., Docker, Kubernetes, Terraform)
- Awareness of data governance, security, and compliance standards
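For the Cassandra requirement, a minimal read-path sketch using the DataStax Java driver 4.x is shown below; the contact point, datacenter, keyspace, table, and column names are assumptions for illustration only:

```java
import com.datastax.oss.driver.api.core.CqlSession;
import com.datastax.oss.driver.api.core.cql.PreparedStatement;
import com.datastax.oss.driver.api.core.cql.ResultSet;
import com.datastax.oss.driver.api.core.cql.Row;
import java.net.InetSocketAddress;

public class CassandraReadExample {
  public static void main(String[] args) {
    // Contact point, datacenter, and keyspace below are illustrative assumptions
    try (CqlSession session = CqlSession.builder()
        .addContactPoint(new InetSocketAddress("127.0.0.1", 9042))
        .withLocalDatacenter("datacenter1")
        .withKeyspace("events")                      // hypothetical keyspace
        .build()) {

      // Prepared statements are reused, reducing parsing overhead on hot paths
      PreparedStatement byUser = session.prepare(
          "SELECT event_time, payload FROM user_events WHERE user_id = ?");

      ResultSet rs = session.execute(byUser.bind("user-123"));
      for (Row row : rs) {
        System.out.printf("%s -> %s%n",
            row.getInstant("event_time"), row.getString("payload"));
      }
    }
  }
}
```

Prepared statements like this are typically reused across requests to keep latency low on high-volume, low-latency workloads.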
Education: B.E., B.Tech, B.Tech + M.Tech (Dual Degree), M.E., MCA, or M.Tech in Computer Science and Engineering, Computer Science, or Computer Engineering