Sr. Site Reliability Engineering - Hadoop, Spark, Hive (Platform Support)

Job Description

A Senior Site Reliability Engineer must perform a variety of tasks and demonstrate a profound understanding of Hadoop and its related tools, such as Hive, Spark, and HDFS. The primary responsibilities include:

Single Window Support: Use an in-depth understanding of Hadoop and its related tools, especially Hive, Spark, and HDFS, to conduct comprehensive root cause analyses, whether the issue lies in the platform, the data, or user code.
System Configuration: Recommend necessary system changes to the DAP platform engineering team by examining system activity and user logs for triaging and troubleshooting.
Performance Tuning: Guide team members in crafting efficient queries by applying expertise in performance tuning and optimization strategies for big data technologies.
Issue Resolution Across Tech Teams: Troubleshoot and resolve complex technical issues. Identify root causes, determine which tech/data platform team can rectify them, and coordinate among those teams.
Reliability Engineering: Create reports that define performance and resolution metrics for proactively identifying issues and generating alerts.
Office Hours and Liaising: Participate in calls across regions and time zones to ensure timely client delivery.
Knowledge Cataloging and Sharing: Share knowledge and cross-train peers across geographic regions using wikis and communication tools. Provide communication around issues and outages affecting multiple users.
Develop Standards: Prepare standard configurations for a variety of VCA workloads so that jobs run with optimal settings and execute efficiently while maintaining good cluster health.
Continuous Learning of VCA Workload: Stay up to date with the changing nature of data science jobs to help improve cluster utilization.
Additionally, through active engagement, collaboration, effective communication, quality, integrity, and reliable delivery, develop and maintain a trusted and valued relationship with the team, customers, and business partners.

This is a hybrid position. Hybrid employees can alternate between working remotely and working in the office. Employees in hybrid roles are expected to work from the office 2-3 set days a week (determined by leadership/site), with a general guidepost of being in the office 50% or more of the time based on business needs.

Qualifications

Basic Qualifications:
A minimum of 3 years of relevant work experience and a Bachelor's degree, or 2 or more years of relevant work experience with an Advanced Degree (e.g., Master's, MBA).

Preferred Qualifications:
Practical experience as a Hadoop system engineer, specifically in managing Hadoop platforms.
An ability to solve intricate production problems and debug code.
A strong understanding of data pipelines built using PySpark, Hive, and Airflow.
Experience with scheduling tools (such as Airflow or Oozie) or in building data processing orchestration workflows.
Proficiency in tuning the performance of applications on Hadoop platforms.
Good knowledge of the Hadoop ecosystem, including ZooKeeper, HDFS, YARN, Hive, and Spark.
Hands-on experience in debugging Hadoop issues at both the platform and application level.
An understanding of Linux, networking, CPU, memory, and storage.
Knowledge of or experience in Python.
Excellent written and verbal communication skills.
A strong work ethic, the ability to work quickly and smartly, and a capacity for understanding complex concepts and functionalities.

Additional Information

Visa is an EEO Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability or protected veteran status.
Visa will also consider for employment qualified applicants with criminal histories in a manner consistent with EEOC guidelines and applicable local law.

Visa
Bengaluru, Karnataka
Full time

Published on 07/09/2024
