We've been cooking up several more AMIs to get you started on AWS quickly. This time we are introducing the Nvidia Digits 3 AMI, which is designed to get you started quickly with Nvidia's deep learning package. The package includes their branched versions of Caffe and Torch, as well as a browser-accessible interface for quick experimentation. The second AMI is built on top of Ubuntu 14, the Cuda 7.5 Toolkit, and the latest Nvidia drivers, and is targeted at Cuda developers and those intending to deploy GPU applications with ease.
We recently presented at the GPU Technology Conference, where we demonstrated how to containerize GPU applications with Docker and utilize Bitfusion Boost. This week, at the SaltConf 16 conference, we will be taking this concept a step further and demonstrating GPU-accelerated containers through a complete Docker ecosystem under SaltStack control. In particular, we will show how we utilize both of these technologies to create virtual GPU clusters that provide maximum performance and data center utilization for compute-intensive applications.
In early April 2016, we started offering Monster GPU Machines on Amazon Web Services (AWS) powered by Bitfusion Boost and have seen great interest. In the last couple of weeks alone we have seen massive usage, and it is growing at an even faster rate.
FREMONT, CA - APRIL 5, 2016 – AMAX, a leading provider of HPC, Cloud/IaaS, GPU and Data Center solutions, will demonstrate GPU virtualization technology at the GPU Technology Conference (GTC) 2016 on April 5-7, 2016. The demo will feature AMAX's award-winning Deep Learning Platforms running Bitfusion Boost to virtualize GPU resources from multiple nodes for rendering and deep learning applications.
You may have seen our recent post on enabling customers to create Monster GPU Machines on AWS using our Boost technology. Ready to see some real applications using Boost, meet some of our partners utilizing Boost on their systems, and find out what else we can do with Boost? Then please come join us next week at the GPU Technology Conference (GTC) in sunny Silicon Valley, California.
At Bitfusion, our job is to know how well various compute-intensive workloads scale on different infrastructures and to help people maximize performance. Since we launched our Deep Learning and CUDA AMIs in the AWS Marketplace we’ve heard many of our customers ask for bigger GPU instances, but the largest Amazon EC2 instance, the g2.8xlarge, currently maxes out at just 4 GPUs.
Many workloads require significant image manipulation, such as the visualization and analysis of geospatial data to generate georeferenced imagery and terrain data. These workloads can be found in a wide variety of industries, ranging from aerospace and defense to security and planetary research. One tool commonly used to tackle such tasks across a vast spectrum of Linux distributions is ImageMagick. ImageMagick is also found in just about all of the most popular web stacks, where it handles image transformations such as resizing, contrast enhancement, and the application of various filters.
Our AWS Marketplace AMIs have been updated since this post to make launching them with Boost even easier. Please refer to our latest tutorial post, titled "Deploy Bitfusion Boost on AWS faster than ever".

We recently published our Boost AMIs to the AWS Marketplace and walked through potential cluster configurations. Today, we are going to expand on that and set up a Bitfusion Boost cluster on AWS, configured specifically for the Caffe Deep Learning Framework. At the end of this tutorial, you will have a cluster consisting of:

- One g2.8xlarge as a client, where the application runs
- Three g2.8xlarges as servers

With 4 GPUs in each g2.8xlarge, this configuration will give your application a total of 16 GPUs!

1. Subscribe to the Bitfusion AMIs

This walkthrough leverages AWS's CloudFormation (CFN) templates. Using our template will enable you to get a Bitfusion Boost cluster up and running in minutes. In order to utilize the CloudFormation template, you need to be signed in to the AWS console and be an active subscriber to the AMIs used in the template.

What does it mean to Subscribe to a Product?

Subscribing to a product means that you have accepted the terms of the product as shown on the product's listing page, including pricing terms and the software seller's End User License Agreement, and that you agree to use such product in accordance with the AWS Customer Agreement. All Bitfusion AMIs are priced on an hourly basis, and you will only incur charges on top of the base AWS instance charges while the cluster is up and running. Simply subscribing to one of our AMIs does not cost you anything.

WARNING: In Step 1 and Step 2, DO NOT launch directly using the "1-Click Launch" option, as this will automatically launch an instance. This is not required, as all instances will launch via the CFN. For both AMIs below, make sure the "Manual Launch" tab is selected, then simply click on "Accept Software Terms."
[container]
[row]
[column md="6"]
Step 1: Accept Bitfusion Boost Server Software Terms
Boost Server AMI Software Terms
[/column]
[column md="6"]
Step 2: Accept Bitfusion Boost Caffe Client Software Terms
Boost Caffe Client AMI Software Terms
[/column]
[/row]
[/container]

2. Create an AWS Key Pair

The AWS key pair uses public-key cryptography to provide secure login to your AWS cluster. You will need to create one to access the Bitfusion Client. If you have created one previously, you can re-use that key and skip directly to Section 3.

Create Key Pair

[container]
[row]
[column md="4"]
Step 1: Select us-east-1 as your region
Our CFN template currently only supports us-east-1.
[/column]
[column md="4"]
Step 2: Create and name your key pair
In the navigation pane, under "Network & Security", select "Key Pairs". Then choose the "Create Key Pair" button.
[/column]
[column md="4"]
Step 3: Download and save the key pair
The key pair will automatically download. Make sure you keep this file, as it is required to log in to the client machine.
[/column]
[/row]
[/container]

3. Create a Bitfusion Boost Cluster

The Bitfusion Boost template is specifically configured for running a Boost Cluster. If you modify any of the AWS template configurations, you may be unable to run the cluster or tools.

Launch Bitfusion AWS Template

[container]
[row]
[column md="4"]
Step 1: Accept the template
Accept the template already specified, and click "Next".
[/column]
[column md="4"]
Step 2: Specify the template parameters
On the specify details page, enter a name for your cluster (e.g. BitfusionCluster), accept the default parameters, select your "KeyName", and then click "Next".
[/column]
[column md="4"]
Step 3: Accept default options
On the options page, accept the defaults and click "Next".
[/column]
[/row]
[/container]

4. Launch the Cluster

Finish creating your AWS cluster and log in to the Bitfusion Boost client.
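Logging in over SSH requires the key pair file saved in Section 2, and OpenSSH typically refuses a private key whose file permissions allow other users to read it. The following is a minimal sketch of tightening those permissions before the login step. `KEY_FILE` and the file name `BitfusionKey.pem` are hypothetical placeholders rather than names from this tutorial, and `secure_key` is a helper introduced here purely for illustration.

```shell
# Sketch only. KEY_FILE is a hypothetical name; point it at the .pem file
# you downloaded when creating the key pair in Section 2.
KEY_FILE="${KEY_FILE:-BitfusionKey.pem}"

# Restrict the private key to owner read-only so that ssh will accept it.
# Returns non-zero (and changes nothing) if the file does not exist.
secure_key() {
  if [ ! -f "$1" ]; then
    echo "key file not found: $1" >&2
    return 1
  fi
  chmod 400 "$1"
}
```

Calling `secure_key "$KEY_FILE"` once after the download should be enough; the file can then be passed to `ssh` with the `-i` option in the login step.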
[container]
[row]
[column md="4"]
Step 1: Create the cluster
On the review page, check the box that allows CloudFormation to create the necessary IAM roles and click "Create".
[/column]
[column md="4"]
Step 2: Monitor Provisioning Process
The cluster stack spins up over a period of 10 to 15 minutes. Watch for the status to change from CREATE_IN_PROGRESS to CREATE_COMPLETE. You may need to refresh the page to see the status change.
[/column]
[column md="4"]
Step 3: Log in to the Bitfusion Client
From the Amazon EC2 console page, click on the client. Copy the IP address and log in via SSH:
    ssh -l ubuntu -i
[/column]
[/row]
[/container]

5. Take It for a Spin

Once you have logged in, you can query how many GPUs you have and test out Caffe.

How many GPUs do you have?

You can query the number of GPUs available to you with the following command:

    bfboost client /usr/local/cuda-7.0/samples/bin/x86_64/linux/release/deviceQuery

Caffe

Run the following commands to test out Caffe and see it running on all 16 GPUs:

    cd /opt/caffe-gpu
    ./data/mnist/get_mnist.sh
    ./examples/mnist/create_mnist.sh
    bfboost client "./build/tools/caffe train --solver=examples/mnist/lenet_solver.prototxt -gpu all"

For more information on using Boost, please refer to our official documentation.

6. Deleting the Cluster

Select your cluster on the CloudFormation Management page and click Delete Stack. For more information, see Deleting a Stack on the AWS CloudFormation Console.
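The tutorial drives everything through the AWS console, but the status check from Section 4 and the teardown from Section 6 can also be scripted. The sketch below assumes the AWS CLI is installed and configured with credentials for the account that owns the stack; that setup is not covered in this post. `STACK_NAME` defaults to the example cluster name used earlier and should be replaced with whatever name you entered, and the helper names `stack_status`, `delete_stack`, and `expected_gpus` are introduced here for illustration only.

```shell
# Sketch only: assumes the AWS CLI is installed and configured.
# STACK_NAME is a placeholder; use the cluster name you entered in Section 3.
STACK_NAME="${STACK_NAME:-BitfusionCluster}"
REGION="us-east-1"   # the template in this tutorial supports only us-east-1

# Print the current stack status, e.g. CREATE_IN_PROGRESS or CREATE_COMPLETE.
stack_status() {
  aws cloudformation describe-stacks \
    --region "$REGION" \
    --stack-name "$1" \
    --query 'Stacks[0].StackStatus' \
    --output text
}

# Request deletion of the stack, then block until CloudFormation reports
# that the deletion has finished.
delete_stack() {
  aws cloudformation delete-stack --region "$REGION" --stack-name "$1" &&
    aws cloudformation wait stack-delete-complete \
      --region "$REGION" --stack-name "$1"
}

# Expected GPU total for a cluster of identical instances:
# (clients + servers) * GPUs per instance.
expected_gpus() {
  echo $(( ($1 + $2) * $3 ))
}
```

For example, `stack_status "$STACK_NAME"` prints the same status string shown in the console, `expected_gpus 1 3 4` prints 16 for the one-client, three-server g2.8xlarge cluster built here, and `delete_stack "$STACK_NAME"` tears the cluster down so that hourly charges stop accruing.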