Installing and configuring multi-node systems are common enterprise tasks that can require a tremendous amount of time and resources. As technology continues to evolve, so will the need to effectively deploy and manage these systems. Chef and CloudBolt work together to simplify the installation and increase the scalability of these complex systems.
Challenge: Scale, Automation & Self-Service IT
A global B2C company needed to process an ever-increasing amount of social media data so that it could measure and analyze its customers’ ‘likes’ and ‘dislikes’. The organization identified this insight as critical to meeting its future growth goals.
The company used the Apache Hadoop framework for distributed processing of large data sets across clusters of servers. Hadoop can scale from a single server to thousands of machines and includes an open source implementation of the MapReduce programming model introduced by Google. Each server also required a base Linux OS, prerequisites (MySQL, yum, rpm, etc.), and Ambari, which provides a web interface for Hadoop and manages the clusters from a central location.
The company was using Chef, which excelled at consistent, automated installation and configuration of the OS and middleware components. The remaining challenge was rolling Chef out across 40 servers while also providing a user-friendly portal to provision and manage the Hadoop clusters over time.
Solution: Cloud Management Platform
The B2C company was already using CloudBolt, a hybrid cloud management platform, to provision servers in diverse environments. The company recognized that a CloudBolt blueprint, which automates multistep installations, could install a 40+ node Hadoop cluster on Google Cloud with just a few clicks.
A CloudBolt blueprint specifies how to deploy and manage complex, multi-tier applications. Build steps can run in parallel or sequentially. User options in the portal can be fixed values, or a multiple-choice drop-down can limit users to only the items they need. Blueprints can also specify teardown procedures and drive manual or automatic scaling of deployed services. Finally, blueprints can be imported and exported as JSON and stored in source control.
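To make the idea of ordered build steps and JSON round-tripping concrete, here is a minimal Python sketch. It is only an illustration: the dictionary layout, the stage and step names, and the helper functions are hypothetical and do not reflect CloudBolt's actual blueprint schema or API.

```python
import json
from concurrent.futures import ThreadPoolExecutor

# Hypothetical blueprint layout; this is NOT CloudBolt's actual JSON schema.
BLUEPRINT = {
    "name": "hadoop-cluster",
    "stages": [
        {"mode": "sequential", "steps": ["provision_servers", "bootstrap_chef"]},
        {"mode": "parallel", "steps": ["install_jdk", "install_mysql"]},
        {"mode": "sequential", "steps": ["install_hadoop_ambari"]},
    ],
    "teardown": ["deregister_chef_nodes", "delete_servers"],
}


def run_step(name):
    """Stand-in for real work; returns a record of the step that ran."""
    return f"done:{name}"


def run_blueprint(blueprint):
    """Run stages in order; steps inside a parallel stage run concurrently."""
    results = []
    for stage in blueprint["stages"]:
        if stage["mode"] == "parallel":
            with ThreadPoolExecutor() as pool:
                # map() yields results in input order even though work overlaps.
                results.extend(pool.map(run_step, stage["steps"]))
        else:
            results.extend(run_step(step) for step in stage["steps"])
    return results


def export_blueprint(blueprint):
    """Serialize to stable, diff-friendly JSON suitable for source control."""
    return json.dumps(blueprint, indent=2, sort_keys=True)


def import_blueprint(text):
    """Load a blueprint back from its JSON form."""
    return json.loads(text)
```

Sorting keys on export keeps the JSON stable between runs, which makes changes to a blueprint easier to review in source control.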
The Hadoop blueprint utilizes Chef and the Google Cloud Platform to install:
- Software requirements for the chosen Linux OS (curl, rpm, yum, etc.)
- Java Development Kit
- MySQL database
- Hadoop and Ambari
Overview of how a user deploys the Hadoop cluster blueprint:
When deploying the Hadoop cluster blueprint from CloudBolt, the user enters a cluster name and selects the number of servers for each tier of the Hadoop cluster.
CloudBolt then builds the Linux instances in Google Compute Engine and bootstraps a Chef agent onto each machine. Chef drives the installation and configuration of the JDK, MySQL, and Hadoop/Ambari.
Once Chef and CloudBolt have finished the heavy lifting, the user can log into the Ambari portal.
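The deployment flow above starts from two user inputs: a cluster name and a server count per tier. The Python sketch below shows one way such inputs could be expanded into a per-server build plan. The tier names, hostname pattern, and cookbook names in the run lists are assumptions made for illustration (only the `recipe[...]` run-list notation is standard Chef syntax); the actual blueprint may assign components differently.

```python
import re

# Hypothetical tiers and Chef run lists; cookbook names are illustrative only.
TIER_RUN_LISTS = {
    "master": ["recipe[java]", "recipe[mysql]", "recipe[ambari_server]"],
    "worker": ["recipe[java]", "recipe[ambari_agent]"],
}


def plan_cluster(cluster_name, tier_counts):
    """Validate the user's inputs and return one entry per server to build."""
    if not re.fullmatch(r"[a-z][a-z0-9-]*", cluster_name):
        raise ValueError("cluster name must use lowercase letters, digits, or hyphens")
    plan = []
    for tier, count in tier_counts.items():
        if tier not in TIER_RUN_LISTS:
            raise ValueError(f"unknown tier: {tier}")
        if count < 1:
            raise ValueError(f"tier {tier} needs at least one server")
        for index in range(1, count + 1):
            plan.append({
                "hostname": f"{cluster_name}-{tier}-{index:02d}",
                "tier": tier,
                "run_list": TIER_RUN_LISTS[tier],
            })
    return plan
```

For example, `plan_cluster("social", {"master": 1, "worker": 39})` would describe a 40-node cluster, with each entry carrying the run list that Chef would apply after the agent is bootstrapped.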
CloudBolt manages the entire lifecycle of the application. Users can scale the cluster up or down manually, or set up autoscaling. Expiration dates can be set to automatically power down or remove a service when it is no longer needed. Additionally, CloudBolt can deregister the nodes from Chef and clean up all remaining components appropriately.
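As a rough illustration of expiration handling, the Python sketch below picks out the services whose expiration date has passed and reports the action configured for each. The record fields (`expires_on`, `on_expiry`) and the action names are hypothetical; they are not taken from CloudBolt.

```python
from datetime import date


def due_actions(services, today):
    """Return (service name, action) pairs for services that have expired."""
    actions = []
    for service in services:
        expires_on = service.get("expires_on")
        # Services without an expiration date are never touched.
        if expires_on is not None and expires_on <= today:
            actions.append((service["name"], service["on_expiry"]))
    return actions


# Example records; names, dates, and actions are made up for illustration.
SERVICES = [
    {"name": "hadoop-dev", "expires_on": date(2024, 1, 31), "on_expiry": "delete"},
    {"name": "hadoop-test", "expires_on": date(2024, 3, 1), "on_expiry": "power_off"},
    {"name": "hadoop-prod", "expires_on": None, "on_expiry": "power_off"},
]
```

A scheduled job could call `due_actions(SERVICES, date.today())` and hand each result to the matching teardown or power-off procedure.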
Time To Value
As this success story shows, CloudBolt can simplify almost any software deployment. Installations went from taking days to taking minutes. Developers received the tools they needed and could begin value-generating projects faster. CloudBolt also reduced cloud spend, because users were more likely to delete unused resources that they could easily recreate when needed. Last but not least, CloudBolt freed IT from tedious setup tasks so that staff could focus on projects that move the business forward.