Luigi Zuccarelli, a principal software engineer, and Florian Moss, a solution architect, told SiliconRepublic.com about the work they were doing.

“My team is essentially a customer task force that takes on customer feedback and then builds enhancements for one of our flagship products, OpenShift Container Platform,” said Zuccarelli.

“The operator we are building will be used to gather debugging and profiling data over a custom period of time so that we can better help our customers troubleshoot their clusters.”

Moss said that because customers all over the world run critical infrastructure on the OpenShift Container Platform, every second of downtime has a huge impact on them.

“Sometimes, as it happens with technology, things go wrong and we need to ensure that the platform gets back up and running as quickly as possible. Or we proactively collect specific data that would enable us to prevent this from happening to begin with,” he said.

“The operator will essentially achieve just that. Rather than collecting a generic amount of data and reports, we are looking to implement a mechanism that helps us collect near real-time data in a specific time window.”

The way an operator like this is built is interesting in its own right, but this project has an added layer: Red Hat takes an open source approach.

According to Zuccarelli, this means that all of its software is developed “in the open” and anyone in the world can “essentially follow us in real time”.

For this particular operator, Zuccarelli said the team received a lot of feedback from Red Hat customers and support organisations in a process the company calls a ‘request for enhancement’, or RFE.

“Essentially everyone can open an RFE. This is one of the great things about open source development,” he said.

“Once our product team and/or the wider community agree that this feature would in fact make OpenShift Container Platform better, a team gets assigned to work on it. Because of the nature of these types of requests, they are assigned to my team.”

Zuccarelli said in order to build the feature, it’s important that the team understand what is required and why it’s needed before they start working on a design document.

This document usually outlines the problem, where the request comes from and what the team wants to do on an architectural level to address the problem. It can also sometimes include alternative implementation ideas or risks to the development process.

“Once our team has agreed on a design, we start breaking the development down into smaller tasks and essentially a typical Agile development process follows, usually in two-week sprints.”

However, even with a clear design, a map of development tasks and a process to follow, predicting how long an engineering project like this will take can be very difficult because it’s impossible to know what issues might crop up.

“It is not like physics where you can calculate exactly how long it will take a car to get from A to B,” said Zuccarelli.

“Sometimes we get problems that look rather trivial but once we start working on it, we might encounter a specific problem that we can’t solve. And then this problem requires its own design document or needs to be broken down into a whole list of sub-tasks.

“On the other hand, you might also look at a problem sometimes and think that it will take months to get this done, but all the right pieces come together in a way you wouldn’t expect and after a few weeks, you are done.”

As engineering professionals gain more experience, they can often become better at judging timelines. In the case of this particular operator, the Red Hat team believe they can deliver it in eight to 12 weeks.

Apart from timelines and unexpected issues, Moss said one of the biggest challenges of an engineering project like this is having to think of as many use cases as possible.

“OpenShift deployments can exist in environments that have no internet connection at all, or sit on top of public cloud provider platforms, or somewhere in an edge location that doesn’t have a reliable internet connection,” he said.

“We need to make sure that our design addresses all of these use cases equally because we can’t just develop something and exclude 20pc of our users.”

Moss added that reliability and security are further considerations the team has to take into account. “When you know that banks, insurance companies or utility providers rely on your software, you always have to make sure to adhere to best practices. We can’t just skip corners because we are under pressure or have deadlines.”

For fellow engineers reading about this project and thinking about the kind of engineering project they could start, Moss said the first thing to do is to find what you’re passionate about.

“Some people prefer to work on front-end technologies, others love working with back-end technologies such as Golang or Java, and then you have people that feel much more comfortable in more low-level languages,” he said.

“Once you know what you like, it is very easy to find projects in the open-source community that you could contribute to. Just think about the tools you are using yourself. Most of them will be open-source-based.”