Seeking a Senior Infrastructure Systems Architect & Technical Lead to design, implement, and maintain scalable, fault-tolerant infrastructure and distributed systems. This role requires deep expertise in cloud-native technologies, container orchestration, automation, and database management, along with strong technical leadership and problem-solving abilities.
We are seeking a highly experienced and technically adept individual to join our team and play a pivotal role in the global design and implementation of robust, scalable, and fault-tolerant infrastructure systems. Your contributions will be instrumental in supporting critical engineering and operational needs.
This role involves the deployment, meticulous configuration, and ongoing maintenance of complex distributed storage and database systems, ensuring their optimal performance and reliability. A key responsibility will be the thorough analysis of system failures, performance degradations, and misconfigurations that may arise across hardware, software, and network layers. Furthermore, you will be expected to mentor and guide our team of computer systems engineers, fostering their growth and technical expertise, and actively participate in strategic technical planning initiatives to shape the future of our infrastructure.
To be successful in this position, candidates will typically possess a strong academic foundation, with a Bachelor's degree (BTech/BEng Hons) in Computer Science, Software Engineering, Information Systems, Electronic Engineering, or an equivalent discipline, combined with a minimum of 13 years of relevant professional experience. Alternatively, candidates holding a Master's degree (MTech/MEng) in a similar field with 9 years of experience, or an MEng with 7 years of experience, will be considered. A PhD in a related discipline, coupled with 5 years of experience, is also highly valued.
Beyond formal education and years of service, we require a minimum of 3 years of experience in a technical leadership or software/system architecture role, where you have had direct responsibility for large-scale, platform-based distributed systems. Demonstrated hands-on experience is paramount, encompassing infrastructure design and automation, a deep understanding of distributed systems, robust observability practices, proficiency in CI/CD pipelines, container orchestration technologies such as Kubernetes, and a solid grasp of DevOps/SRE principles and cloud-native technologies.
You will need an in-depth understanding of core systems engineering principles, including performance optimization strategies, fault tolerance mechanisms, and efficient resource scheduling within Linux-based environments. Proficiency in container engines like Docker and Podman, orchestration and packaging tools such as Kubernetes and Helm, and container runtimes and runtime interfaces including containerd and CRI is essential. Furthermore, expertise in infrastructure-as-code, continuous integration/deployment (CI/CD) workflows, and configuration management, using tools like GitLab CI, Ansible, Terraform, and ArgoCD, is expected.
An advanced understanding of distributed computing and storage architectures, including Ceph, S3, NFS, and various local/clustered file systems, is also a key requirement. You should possess operational and architectural fluency in both relational and NoSQL database systems, such as PostgreSQL, MySQL, and MongoDB, with a comprehensive knowledge of replication, backup strategies, and performance tuning techniques. A working knowledge of fundamental networking concepts, security protocols, and systems-level observability tools like Prometheus, Grafana, and the ELK/EFK stack will be highly beneficial.
Your technical leadership will be demonstrated through a proven ability to spearhead cross-functional initiatives spanning systems, storage, and database infrastructure, confidently driving technical decisions from initial architecture through to successful implementation. A strong background in Linux administration, infrastructure automation, service orchestration, and performance optimization across diverse environments is crucial. You will draw on extensive experience in designing and deploying scalable, resilient services using microservices, event-driven, and cloud-native design patterns. Proficiency with Kubernetes, Docker, and Helm in production-grade environments, for both system and application deployments, is a must.
Your hands-on experience with infrastructure automation and CI/CD tools like GitLab CI, ArgoCD, FluxCD, Jenkins, or GitHub Actions will be vital for enhancing and securing platform operations. A solid understanding of infrastructure-as-code, configuration management, and release automation (DevOps), coupled with expertise in incident response, monitoring, defining and tracking SLIs/SLOs, and site reliability engineering (SRE) practices, is expected. Advanced Linux expertise, including troubleshooting, kernel tuning, systemd orchestration, and large-scale system optimization, will be a significant asset.
Experience in backlog management, fostering cross-team collaboration, and Agile sprint execution will be necessary for effective technical delivery and planning. Practical experience managing both relational and NoSQL databases, including ensuring high availability, implementing robust backup and replication strategies, and performing detailed performance tuning, is essential.
You must possess strong diagnostic and problem-solving skills, with a commitment to a root-cause-first approach, demonstrating a strong sense of ownership, accountability, and an unwavering focus on long-term operational stability. Your technical leadership will be characterized by your ability to lead architectural discussions, effectively influence design decisions, and mentor junior engineers across various infrastructure streams. Resource management and leadership will be demonstrated through your ability to foster innovation and support the development of emerging skills within the team. You will build trust through consistency, integrity, understanding, and patience, while effectively planning, allocating, and monitoring resources to achieve desired project outcomes. Finally, your strong problem-solving and analytical skills will be vital for effective root cause analysis, systems troubleshooting, and resolving complex performance bottlenecks.
Infrastructure Systems, Distributed Systems, Technical Leadership, Cloud-Native, DevOps/SRE
