Sr. DevOps Engineer
Akash Network
Software Engineering
United States · Remote
Posted on Apr 18, 2024
Want to build the future of the cloud?
Overclock Labs is developing the world's first open-source and decentralized marketplace for cloud compute, enabling anyone to become a cloud provider by offering unused compute on an open marketplace. As more of the internet moves to the cloud, we are materializing the vision of the first decentralized cloud in an industry projected to reach $2.4 trillion in total market capitalization by 2030. Our team includes renowned open-source and blockchain developers, and seasoned experts from leading technology and platform companies. We are a remote-first, distributed, and growing team — offering unparalleled opportunities for career growth at the emerging intersection between crypto, AI, and permissionless compute.
The Role:
As a Sr. DevOps Engineer at Akash Network, you will help manage, grow and maintain the network of compute providers. You will have an excellent opportunity to expand your scope and help build the Web 3.0 decentralized infrastructure of the future.
Our Stack:
Golang, JavaScript, Postgres, Kubernetes and virtualization technologies (https://github.com/akash-network)
Your Sphere of Impact:
- Build, configure, manage and maintain Overclock Labs owned infrastructure, including Akash compute providers that the core team owns and manages
- Support community network providers by helping them onboard, troubleshoot and manage their providers on the network
- Work with customers looking to automate deployments on Akash Network and/or migrate infrastructure from another public cloud to Akash
- Participate in technical discussions on public community forums like Discord, GitHub and other places
- Identify areas for improvement, both bugs and process gaps, and work with the core team to document and address them
- Help improve documentation
Requirements:
- 8+ years of relevant work experience
- 5+ years of experience working with Kubernetes & Docker in production
- Strong Linux system administration skills
- Advanced knowledge of cloud infrastructure (networking, cloud services, orchestration tools, containerization, compute, and storage systems)
- Experience utilizing automation tools like Terraform and Ansible
- Understanding of and experience with telemetry and monitoring tools like Grafana, Datadog, Splunk and others
- Experience with CI/CD tooling similar to Jenkins
- Excellent written and verbal communication skills
- Passion for automation and DRY best practices
- Ability to work independently, take initiative when necessary and take ownership of your work
- Must be based in North America or be willing to work North America hours
Nice to Have:
- Ability to program in one or more programming languages like Python, JavaScript or Go
- Experience managing data center and/or public cloud infrastructure