We invite you to join our Ops team, at the heart of the Diabolocom service, to cover day-to-day operations and architecture. The Ops team takes part in the deployment and operational maintenance of our SaaS platforms, which run on a hybrid cloud.
The Diabolocom hybrid cloud combines on-premise infrastructure, distributed across 5 dedicated colocations around Paris and in Germany, with public clouds (AWS / GCP / OVH).
We offer solutions with a high level of security and availability in certified environments, notably PCI-DSS and HDS (health data). The Ops team is responsible for making sure everything runs smoothly, scalably and securely, 24/7.
In a context of strong growth, Diabolocom is evolving its infrastructure from a Proxmox-based setup, with VMs deployed via Ansible, to a Kubernetes-oriented architecture that deploys applications with ArgoCD. We are looking for people who can maintain the legacy infrastructure while it is being migrated, and who can also architect and build the future with new tools and state-of-the-art technologies and methodologies.
Our legacy infrastructure still has some interesting features, including full IaC management.
Among other things, our servers and VMs boot over PXE from images that are generated entirely as code by our Ansible playbooks.

At Diabolocom, you will:

● Co-architect the new infrastructure and contribute to its evolution and migration to Kubernetes
● Develop PoCs to test and validate new decisions
● Be in charge of the infrastructure, both legacy and new, and of its overall vision
● Follow the evolution of standards in order to advise on and propose upgrade plans
● Maintain, upgrade and secure the infrastructure already deployed
● Use modern tools and good practices to deploy scalable, reproducible infrastructure without SPoFs

What we are looking for:

● You are curious and always ready with proposals
● You are on the lookout for new technologies and are always ready to learn new things
● You have mastered Ansible, Docker and the other standard tools our Ops team uses every day
● You are experienced with Bash and you know your way around a Linux-based OS (Debian)
● You are familiar with some of the tools or technologies listed below (we don’t expect anyone to master everything, of course)
● You have an appetite for optimizing and automating systems
● You keep the security of the infrastructure in mind
● You are fluent in English, both written and spoken
● You have excellent interpersonal skills and are not afraid to solve problems
● You are autonomous and able to tackle problems by yourself

What’s in it for you:

● We offer a multicultural environment with teams in 5 countries across Europe (and expanding)
● A working environment where your ideas are listened to and valued, and where you can easily contribute and make a difference
● We are at a pivotal stage in our development, with a significant acceleration of our growth and a change of paradigm
● We offer opportunities to learn and grow
● A great work atmosphere and regular company events, barbecues, team-building activities…
● Partial remote work possible (2-3 days a week)

The technical scope of the Ops team:

● IaC scripting and automation with Ansible, Bash and Python
● IaC scripting for the modern Kubernetes environment with ArgoCD and Helm
● Monitoring and observability of deployed infrastructure with Netdata, Prometheus and Grafana
● Working with the dev and QA teams to provide tools and infrastructure changes that fit their needs
● Advising the dev and QA teams on CI/CD and Kubernetes usage
● Daily day-2 operations on existing infrastructure
● Incident detection and resolution, with thorough post-mortem analysis to prevent the same class of incidents from recurring

Our technical stack:

● Kubernetes with ArgoCD and Helm
● Ceph
● nginx ingress controller, HAProxy
● Ansible (heavily used), Terraform
● Proxmox cluster, Debian-based VMs booting over PXE
● Netdata, Prometheus, VictoriaMetrics, Grafana
● Docker and docker-compose
● PostgreSQL
● GitLab and GitLab CI
● Various deployed services, including: Sentry, Harbor, cert-manager, Renovate, OpenLDAP, ExternalDNS, PowerDNS, BIND 9, ISC DHCPD, Loki, rsyslog, MySQL, authentik, RabbitMQ, HashiCorp Vault, Passbolt, OpenReplay, Weblate, …
● Bare-metal infrastructure (90% of the load) and cloud-based infrastructure (AWS)
● Slack, G Suite
● Small Python and Bash scripts

