Site Reliability Engineer – Operations
RealPage
- Manila City, Metro Manila; Pasig City, Metro Manila
- Permanent
- Full-time
- Manage and support Windows-based production environments, including IIS, Windows Services, Active Directory, and related infrastructure
- Build, maintain, and enhance monitoring, alerting, and observability frameworks using ELK or equivalent platforms
- Lead incident response, troubleshooting, and root cause analysis (RCA) for customer-impacting issues
- Improve system reliability by reducing critical incidents and driving down Mean Time to Resolution (MTTR)
- Develop and maintain automation using scripting tools such as PowerShell, Python, or similar technologies
- Support high-availability, high-performance production systems and participate in on-call rotations
- Collaborate with cross-functional teams to ensure platform stability, security, and reliability
- Contribute to platform upgrades, patching, modernization initiatives, and operational best practices
- Create and maintain runbooks, operational standards, and documentation
- 5+ years of experience in Windows Server environments, including IIS and Windows Services
- 5+ years of experience with monitoring and observability tools (ELK stack or equivalent)
- Strong experience with incident management, troubleshooting, and root cause analysis
- Hands-on experience with automation and scripting (PowerShell, Python, etc.)
- Working knowledge of Linux systems for basic administration and troubleshooting
- Strong understanding of system performance, scalability, and operational best practices
- Experience supporting production systems with high availability requirements
- Familiarity with cloud platforms (AWS, GCP, Azure) is a plus
- Exposure to CI/CD tools and DevOps practices
- Strong communication and collaboration skills, with an ownership mindset
- Ability to operate effectively in a fast-paced, production-focused environment