Senior Site Reliability Engineer
Site Reliability Engineer
Solas IT Recruitment Dublin, County Dublin, Ireland (Remote)
We're seeking an experienced Site Reliability Engineer (SRE) with a DevSecOps background to join our growing team. You’ll ensure high availability and performance of internal and external services, while working with real-time data from large-scale distributed systems. You’ll tackle challenges in building fault-tolerant, secure, microservice-based systems.
Key Responsibilities:
- Analyze system metrics to improve performance and fault detection.
- Collaborate with engineering teams to improve service quality through testing and automation.
- Ensure reliability and minimal downtime by balancing feature velocity with system stability.
- Implement best practices for security, compliance, and availability.
- Plan and execute system upgrades.
- Mentor fellow engineers and participate in the on-call rotation.
Qualifications:
- Kubernetes: Expertise in managing and troubleshooting production clusters. Experience with Amazon EKS is a plus.
- Configuration Management: Skilled with tools like Ansible, Helm, and Kustomize.
- Monitoring: Familiar with Prometheus, Grafana, and similar tools.
- AWS: Strong knowledge of AWS services (EC2, S3, VPC, etc.).
- Infrastructure as Code (IaC): Experience with Terraform for cloud resource management.
- Queuing Systems: Experience with RabbitMQ, Kafka, or Amazon MQ.
- Database Management: Experience with MySQL and Amazon RDS.
- Networking & Security: Knowledge of network design and security protocols.
- High-Uptime Systems: Expertise in maintaining high-availability environments.
- Collaboration: Ability to work across departments to meet project goals.
- Programming: Proficient in Python, Go, or JavaScript. Familiar with CI/CD pipelines.
- Problem-Solving: Skilled in identifying and fixing performance issues.