Customer Success Story
Automating infrastructure in a highly secure environment
A new way of managing infrastructure
Bank Hapoalim is Israel’s largest bank and was named by The Banker magazine as Bank of the Year in Israel for 2015. The bank’s Unix and Linux team has been using Chef for only about five months and they have made enormous progress automating their infrastructure in a highly secure environment. The team uses Chef for configuration management and, less conventionally, they are also using their Chef server as a package and proxy package repository. The Chef server deploys and installs the application packages on production servers.
The bank has heterogeneous infrastructure that includes various flavors of Linux and Unix as well as Windows. Currently, the team is using Chef to manage hundreds of Linux nodes, with plans to start managing hundreds more Windows nodes and, of course, new Linux/Unix servers as they are added.
The decision to automate with Chef was driven by concerns that will sound familiar to many who are responsible for an organization’s infrastructure. Oz Sharon, the team’s manager, explains, “We wanted to focus on doing standard, repeatable work. We saw that we were doing the same tasks again and again but each time the result was a bit different because the person who was doing the job now was not the person who did the job before. Consequently, we had a lot of services that were all slightly different from each other. We wanted to reach a point where we could do the same thing repeatedly, without any changes between deployments. We needed to know that our servers were always in compliance with the bank’s standards. The other thing we wanted to do was get rid of the boring things that take a lot of time like creating a special server with software on top of it or hardening servers.”
In the short time that they have been using Chef, the team has already seen significant improvements in deployment time. For example, creating a stack, which includes bootstrapping a Linux server, hardening it, and installing the application server along with other software, used to take days. It now takes minutes.
Not only have they significantly reduced the number of human errors, but they can also manage their infrastructure consistently across several DMZs. Chef has given them better control across the organization, even in areas that aren’t obvious. For example, the bank has operators who handle calls and problems from internal customers at night and, with Chef, the team has been able to standardize their menus.
The team first heard about Chef when a developer saw IBM demonstrate how to use it to install a WebSphere Application Server. The team initially thought they could use Chef to replace IBM’s Tivoli Provisioning Manager (TPM), which was going out of support. They quickly realized that Chef could also solve the problems that Oz outlined.
They did a small proof-of-concept to demonstrate to stakeholders that Chef could install all the prerequisites for one of the bank’s applications. They then used Chef to deploy the application itself. Once the stakeholders were on board, the team presented the plans for how they would deploy Chef, which needed approval from the security group. After that, they installed the production Chef server.
Creating a secure environment with Chef
Pavel Jeloudovski is the team’s senior systems administrator. He was largely responsible for the design of the network topology around the Chef server and also for its installation. He says that one of the things that made Chef appealing is that the server is accessible over a single, well-known port (443). He says, “We have a lot of security constraints and the fact that Chef runs on only one port means that we are able to create secure tunnels using F5. We installed an NGINX proxy in every segment. It forwards requests to F5, which analyzes them and then forwards them to the Chef server. Our security people love that they can use industry-standard tools like F5.” Everything is installed and configured with Chef.
Here’s an example of the topology.
There are three organizations, and each project has several environments. The environments are for integration, pre-production and production.
Chef’s single point of access gave the team an innovative idea. Pavel says, “Because of our environment, opening even a single firewall port requires a security review. We’re using, or possibly abusing, Chef by making it our first central management system. We’ve never had anything that all the networks in the bank could access. Without a central repository, we were always making copies and mounting ISO images manually.
“Now, we use the Chef server as a file server and, because everyone is using the same code and not their own homebrewed scripts, as our central yum repository. We also use it as a proxy for other services that use HTTP, such as our Artifactory repository.”
Securing the servers
When thinking about the most important security issue that Chef has helped address, Pavel immediately mentions hardening. “Our most critical job is to button down the security requirements. For every piece of technology that we have, the security people give us a manual on how to harden it: how to harden Apache, how to harden WebSphere. Before Chef, we did everything manually. There might be some scripts but they weren’t consistent across the organization.”
Naftali Burnham is a systems administrator who is a member of the team. He says, “I work with Unix and Linux. My job is basically systems engineering and administration. Recently, I’ve started to develop on Chef, and I like to describe my work as ‘systems ops’.”
Naftali worked on writing the cookbook to harden Linux systems and he’s now doing the same for AIX. He says, “We were able to write the Linux cookbook within a week, using the manual we had from the security team. It was around 100 pages—a pretty serious document.”
“When we began, we found some things in open source cookbooks but, in this case, we wrote most of it from scratch. Parts that were already available we used and, for the things that we had to tweak, change and add, we put wrappers around the existing cookbook.”
Commenting on how much easier it is now to remain compliant, Pavel says, “Having the cookbook means that the person who hardens the servers doesn’t have to go through the hassle of learning the security document, which was the situation we had before. They have the code that Naftali wrote and they can focus on other projects. Also, now we know for a fact that all our existing Linux servers are security compliant and that will also be true for the servers we add.”
The workflow for cookbooks
The team uses local development but, because their workstations generally don’t have enough RAM, they run Test Kitchen on dedicated servers. Locally, they run both RuboCop and Foodcritic. They are planning to increase their test coverage and begin using unit tests. Major cookbooks are version pegged according to the environment. If a cookbook is promoted to a higher environment, the team changes the version. Once it’s tested, the cookbook gets uploaded to production.
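The version pegging described above can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the bank's actual setup: the cookbook name `linux_hardening`, the version numbers, and the helpers `allowed?` and `promote` are all hypothetical, and the sketch reads "changes the version" as updating the version pinned for the target environment. In a real Chef deployment, pins like these would normally live in environment definitions as cookbook version constraints rather than in a Ruby hash.

```ruby
require 'rubygems' # Gem::Version and Gem::Requirement ship with Ruby

# Hypothetical pins: environment name => { cookbook name => version constraint }.
# Environment names follow the article; the cookbook name and versions are invented.
PINS = {
  'integration'    => { 'linux_hardening' => '= 1.3.0' },
  'pre-production' => { 'linux_hardening' => '= 1.2.0' },
  'production'     => { 'linux_hardening' => '= 1.2.0' }
}.freeze

# True when `version` of `cookbook` satisfies the pin for `environment`.
def allowed?(pins, environment, cookbook, version)
  constraint = pins.fetch(environment).fetch(cookbook)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

# Returns a new pin table in which `to` uses the same constraint as `from`
# for `cookbook`. The input table is left unchanged.
def promote(pins, cookbook, from:, to:)
  constraint = pins.fetch(from).fetch(cookbook)
  pins.merge({ to => pins.fetch(to).merge({ cookbook => constraint }) })
end

promoted = promote(PINS, 'linux_hardening', from: 'integration', to: 'pre-production')
puts allowed?(PINS, 'pre-production', 'linux_hardening', '1.3.0')     # before promotion
puts allowed?(promoted, 'pre-production', 'linux_hardening', '1.3.0') # after promotion
```

Exact-match constraints mean that a newer cookbook version reaches an environment only through an explicit promotion step, which fits the tested-then-uploaded flow the team describes.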
Pavel was the first in the bank to use Chef, followed by Naftali and another team member. They educated themselves by several methods. They read Chef documentation, they used the Learn Chef tutorials, and they took a third-party, online course. Of course, looking at community cookbooks also helped.
Now they are the bank’s technical champions whose mission is to persuade their colleagues to adopt a new way of managing infrastructure. This will involve both online classes from Chef and interaction with more experienced users. Oz says, “We’ll let others on the team get started and go at their own pace. This is a very new idea. Administrators are used to going to the console and doing things. Now, instead of going to the console, we do things centrally and deploy them. Adopting this new approach can take quite some time.”
Pavel, Naftali and Oz all think it’s important to realize that the world of system administration is changing. Pavel says, “I was able to identify the people on the teams who want new technology. There was a bit of excitement and some fear. I said that if they don’t learn to automate now they will be pretty much irrelevant in five years and we have examples of that in the organization. I bluntly came to people and said, ‘Hey, it’s either you doing it or the next person but the one who learns to automate has the better chance of being able to stay relevant.’ People who are concerned about their professional future are jumping on the bandwagon. Plus, the fact that two really good guys left the bank to do automation, one with Chef and the other with Puppet, for better pay is an incentive, too.”
Oz says, “I very much agree. I think that people want to learn new things, and they want to do things effectively. Lucky for us, we have a team of people who are very smart and know that Chef will give them a huge benefit. If, in an upcoming day, they choose to leave us, they have a new tool to use out there.”
The DevOps community
Tel Aviv, where the team members work, has many groups actively focused on DevOps. For example, there is a Meetup group called Devops-Israel. There are other activities sponsored by individual companies, such as those that use Chef. These companies host their own internal DevOps days but people who work elsewhere can also attend.
Until this article was first published, Pavel hadn’t engaged with the community, other than attending Tel Aviv DevOps Days last year. Since then, people have reached out and offered to help the team members with their DevOps journey. Pavel says, “It’s really inspiring getting this warm welcome.”
What’s in the works?
When asked what sort of Chef projects the bank has planned, Pavel says, “What do we plan to do with Chef? The answer is simple: everything. We are going to have a cutoff at the end of the year where nothing goes into production unless it’s coded. Our goal is to code everything, check it in test, check it in pre-production, and do it all with Chef.”
Currently, the biggest project for the bank is that they are moving to a new data center. Of course, everything in the new location is automated with Chef.
Another smaller project that Naftali is working on is the bank’s AIX cookbook. He says, “There’s a very nice AIX cookbook that provides a lot of resources and that’s what we initially started to work with. It handles, for example, the inittab service and the TCP/IP services. We’ve been working to implement it and, at the same time, we’re also developing our own ideas on top of it.”
A third project is automating the Windows servers. The Windows administrators are a separate group but they work closely with Oz’s team and have adjacent office space. The same group is also responsible for managing the VMware installations. It has allocated a couple of people to use Chef for IIS configuration and other tasks.
Right now, this team uses System Center Configuration Manager (SCCM). While they like it, it only covers services up to the system level. Most of the actual work involves product-specific configuration, and that’s where Chef will come in. The team will continue to use SCCM for lower-level configuration.
The bank has big plans for the future. The first project is to automate everything that the Unix and middleware administrators do. This includes installing a big data analytics platform as well as installing Redis.
A second project is to migrate from WebSphere to Tomcat. All the Tomcat servers are already automated.
Another large project, which Oz’s team is working on with another group, is to use Chef as the bank’s software delivery system. This will deliver applications that are developed in house. Chef will replace TPM. Because Chef automation uses version control, it will be easy to create a true DevOps-style process for managing changes.
Finally, there is a project that also involves the VMware and Windows people. The goal is to use Chef provisioning to orchestrate and automate VM provisioning and bootstrapping. For example, for the big data platform, Pavel says, “The platform will probably be KVM based. We will write our own backend because the platform we intend to use has a rich REST API. I don’t know if we’ll be able to write the backend driver but custom resources, definitely.”
When asked to sum up their experiences with Chef, Pavel, Oz and Naftali all agree that Chef is central to how the bank will handle infrastructure and software deployment going forward.
Naftali says, “Obviously, it’s great to be able to automate things that take up a lot of time, and Chef enables us to focus on other skill sets and to work on other important areas where we need to devote our attention. Also, it’s a new, great technology for us. We’re jumping all over it. I’ve basically integrated all my projects into it already and I’m already finding great benefits. It’s the future. If your eyes are open to that, you should pick it up.”