Senior Database Reliability Engineer (DBRE)
Cognite, Norway

Experience: 1 Year
Salary: 0 - 0
Traveling: No
Qualification: As mentioned in job details
Total Vacancies: 1 Job
Posted on: Mar 16, 2021
Last Date: Apr 16, 2021

Job Description

Want to help us bring our fundamental data stores to multiple clouds - public and private?

We are looking to hire a Senior or Principal Engineer to join our Database Reliability Engineering team and work on delivering FoundationDB. We recognise that direct FoundationDB skills are very rare, and instead are looking for people with an interest in growing their knowledge, who have experience in SRE, DevOps, database administration, large scale system administration or other similar roles at the intersection of operations, human factors and software development.

The FoundationDB Kubernetes Operator is written in Go, and FoundationDB itself is written in Flow, an actor-based extension to C++ that a preprocessor translates into standard C++ code.


About the job to be done:

  • Join Cognite’s DBRE team in the FoundationDB sub-team, owning the full lifecycle of all of our FoundationDB clusters, on both public clouds and private Kubernetes deployments.
  • Establish robust reliability engineering to support these clusters, managing aspects like monitoring, chaos testing, alerting, on-call rotations, internal best-practices education, and capacity forecasting.
  • Enable product teams to focus on using the databases rather than running them, while engaging closely with those teams to make sure the products are operable at scale.


About you:

  • A master's degree in Computer Science or a similar amount of experience
  • Broad experience with DevOps practices such as CI/CD and Infrastructure as code
  • Experience with large Cloud deployments on any of AWS, GCP, or Azure
  • Familiarity with C++, Go or other programming languages
  • 6+ years of Linux operations experience
  • 2+ years working with similar distributed systems
  • Familiarity and experience with our tech stack are beneficial

About the Database Reliability Engineering Team:

Cognite Data Fusion contextualizes operational data at scale, enabling asset-intensive industries to make data-driven decisions. Our platform is built on many different technologies, each good at solving different problems. Some of these are absolutely fundamental, and the Database Reliability Engineering team will be responsible for the continuous well-being of our portfolio of PostgreSQL, Elasticsearch, FoundationDB and Kafka clusters. We expect to run thousands of some of these in the years to come, in both public and private clouds, through managed services and on self-managed Kubernetes clusters.

Even when using mature as-a-Service offerings and Kubernetes operators, there are many things that can and will go wrong. Herding clusters that need upgrades, scaling, cost trimming, and recovery, all while continuously serving heavy workloads with tight SLOs, requires solid reliability engineering.

About our Tech stack:

We work with open source technologies that need to run in multiple cloud environments: both public clouds (like Google Cloud Platform and Azure) and private clouds with customer-provided Kubernetes.

Managed Kubernetes (GKE, AKS, OpenShift) forms the base that we build our products on. To prove the market, we initially built on PaaS offerings to store state, such as Google Bigtable, Spanner and Pub/Sub. We replicate data to different storage systems to be able to answer different types of queries. As we diversify the platforms our offering runs on, we are migrating to a self-run, FoundationDB-based scale-out data store for managing time-series data. PostgreSQL and Elasticsearch are other important examples.

Our backend developer teams work with Java, Scala, Python, and Rust. CI/CD is handled by a combination of GitHub, Jenkins, and Spinnaker to test and deploy code to production. The infrastructure is managed as code with Terraform and Atlantis, and services are monitored using Prometheus, Grafana, and Lightstep.

About Cognite:

Cognite is a global industrial Software-as-a-Service (SaaS) leader, with an eye on the future and a drive to digitalize the industrial world. We enable the full-scale digital transformation of heavy-asset industries, and we’re growing fast. We’ve won a lot of awards we’re quite proud of [link forthcoming], and we’re backed by Accel, one of the world’s leading venture capital firms.

Our core software product is Cognite Data Fusion (CDF), designed to quickly contextualize OT/IT data to develop and scale company solutions. In other words, we make data do more. Our partners and clients use CDF to increase safety, improve sustainability, optimize efficiency, and drive revenue.

Our people are our strength. We recruit from around the world, and our team is quite diverse because of it: We count over 50 nationalities througho

Cognite

Information Technology and Services - Oslo, Norway