Skills:
Business Intelligence and Analytics Tools
- Power BI:
  - Proficiency in DAX (Data Analysis Expressions) and Power Query for data transformation.
  - Experience in connecting Power BI to various data sources (SQL databases, Snowflake, APIs, etc.).
  - Knowledge of Power BI administration, including workspace management and governance.
- Looker:
  - Expertise in Looker modeling, including LookML development and dashboard creation.
  - Experience in integrating Looker with Snowflake or other cloud databases.
  - Familiarity with Looker API for automation and embedding analytics.
  - Knowledge of Looker administration and user management.
Key Responsibilities
- Infrastructure and Operations: Ensure the reliability and scalability of critical systems by designing and managing robust infrastructure solutions.
- System Monitoring: Proactively monitor system health, using performance metrics and automated tools to detect potential issues before they impact users.
- Incident Management: Lead response efforts during service disruptions, ensuring swift resolution and minimal downtime.
- Problem Solving: Analyze root causes of system failures and implement long-term fixes to enhance system reliability.
- Automation: Develop scripts and tools to automate repetitive tasks, improving operational efficiency and reducing manual interventions.
- Collaboration: Partner with development teams to align on reliability goals and integrate best practices into software design and deployment.
- Documentation: Maintain comprehensive system documentation to support consistent and efficient troubleshooting and knowledge sharing.
- Continuous Improvement: Drive innovation by identifying areas for enhancement and applying cutting-edge technologies and operational practices.
Qualifications:
- Service Reliability: Experience managing and maintaining highly available systems, including cloud-based infrastructure.
- Programming: Proficiency in programming to automate repetitive tasks ("toil"), reducing manual effort and human error.
- Monitoring & Observability: Solid understanding of monitoring tools, incident management platforms, and metrics analysis.
- Technical Depth: Deep knowledge of system performance optimization and troubleshooting methodologies. Experience with cloud platforms, databases, CI/CD, distributed systems, and security best practices.
- Communication & Collaboration: Strong communication skills (written and verbal) to effectively collaborate across cross-functional teams.
- Problem Solving: Ability to thrive in high-pressure situations and demonstrate a calm, methodical approach to problem-solving. Analytical mindset for interpreting data, metrics, and patterns to make informed decisions and predict future issues.
- Systemic Thinking: Ability to view interconnected systems holistically, anticipating the broader impact of changes and designing for resilience.
- Ownership and Proactiveness: Take responsibility for the reliability and performance of services. Proactively identify potential problems, performance bottlenecks, and areas for improvement before they impact users.
Education & Experience
- Education: Bachelor’s or Master’s degree in Information Systems, Computer Science, Computer Engineering, or equivalent.
- Experience: 11-15 years of experience.