.. _cloudbench_setup:

Benchmark Harness: CBTOOL and Benchmark Drivers Setup and Preparation
**********************************************************************

.. role:: bash(code)
   :language: bash

.. include:: ./cloudbench_intro.rst

.. include:: ./prepare_cbtool/cloudbench_prepare_ubuntu_qcow2.rst

.. include:: ./prepare_cbtool/cloudbench_prepare_centos_qcow2.rst

.. include:: ./prepare_cbtool/cloudbench_image_qcow2.rst

.. include:: ./prepare_cbtool/cloudbench_prepare_ec2.rst

.. include:: ./prepare_cbtool/cloudbench_prepare_gcloud.rst

.. include:: ./prepare_cbtool/cloudbench_prepare_digitalocean.rst

.. include:: ./configure_cloud/configure_cloud_openstack.rst

.. include:: ./configure_cloud/configure_cloud_ec2.rst

.. include:: ./configure_cloud/configure_cloud_gcloud.rst

.. include:: ./configure_cloud/configure_cloud_digitalocean.rst

Multiple Network Interfaces on Benchmark Harness Machine
=============================================================

If there is more than one network interface on the benchmark harness machine, it is best to configure CBTOOL with the network interface that is used to communicate with the cloud API. Set the following configuration in the CBTOOL configuration file, assuming it was set up on an Ubuntu machine with :bash:`ubuntu` as the Linux user::

  vi ~/osgcloud/cbtool/configs/ubuntu_cloud_definitions.txt

  MANAGER_IP = IPADDRESS_OF_INTERFACE_FOR_CLOUDAPI
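The value of ``MANAGER_IP`` should be the IPv4 address bound to the interface that reaches the cloud API endpoint. If in doubt, that address can be read from the output of ``ip -4 addr show``, or looked up with a small helper such as the sketch below (assuming a Linux harness; the interface name ``eth1`` is only an illustration and should be replaced with the interface that reaches the cloud API)::

  #!/usr/bin/env python
  # Sketch: print the IPv4 address of a given Linux network interface,
  # e.g. to fill in MANAGER_IP. The default interface name "eth1" is an
  # assumption; pass the real interface name as the first argument.
  import fcntl
  import socket
  import struct
  import sys

  def interface_ipv4(ifname):
      s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
      # The SIOCGIFADDR ioctl (0x8915) returns the interface's primary
      # IPv4 address inside a packed ifreq structure.
      packed = fcntl.ioctl(s.fileno(), 0x8915,
                           struct.pack('256s', ifname[:15].encode('utf-8')))
      s.close()
      return socket.inet_ntoa(packed[20:24])

  if __name__ == '__main__':
      ifname = sys.argv[1] if len(sys.argv) > 1 else 'eth1'
      print(interface_ipv4(ifname))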
Timezone and NTP server
=======================

It is highly recommended that the timezone configured on the benchmark harness machine is UTC. Set the timezone to UTC by running the following command::

  sudo dpkg-reconfigure tzdata

Scroll to the bottom of the list and select "None of the above". Then, select "UTC".

It is also highly recommended that the benchmark harness machine and the instances use the same timezone. If, for a compliant run, an NTP server other than the benchmark harness machine is used, it must be manually configured on the benchmark harness machine.

Adding a New Cloud Adapter
========================================

As previously mentioned, CBTOOL's layered architecture was intended to allow the framework to be expanded in a non-intrusive (i.e., minimal to no changes to the existing "core" code) and incremental manner. While multiple Cloud Adapters are already available, new adapters are constantly added. These adapters can be divided into two broad classes, following the classification of clouds: white-box (i.e., private) and black-box (i.e., public). It is recommended that, for the addition of a new cloud adapter, one uses either OpenStack (in the white-box case) or EC2 (in the black-box case) as an example.

Assuming that an adapter for a "New Public Cloud" (npc) will be written, the required steps are summarized below.

1. Using ``~/osgcloud/cbtool/configs/templates/_ec2.txt`` as an example, create ``~/osgcloud/cbtool/configs/templates/_npc.txt``:

   - A simple ``cp ~/osgcloud/cbtool/configs/templates/_ec2.txt ~/osgcloud/cbtool/configs/templates/_npc.txt`` should be enough.

2. Using ``~/osgcloud/cbtool/lib/clouds/ec2_cloud_ops.py`` as an example, create ``~/osgcloud/cbtool/lib/clouds/npc_cloud_ops.py``:

   - Again, simply ``cp ~/osgcloud/cbtool/lib/clouds/ec2_cloud_ops.py ~/osgcloud/cbtool/lib/clouds/npc_cloud_ops.py``.
   - Open the file ``~/osgcloud/cbtool/lib/clouds/npc_cloud_ops.py`` and start by changing lines 37-38 (import the New Public Cloud's native python client) and line 40 (rename ``Ec2Cmds`` to ``NpcCmds``).

3. CBTOOL's abstract operations are mapped to five mandatory methods in the (newly created by the implementer) class ``NpcCmds`` (a minimal skeleton is sketched after this list):

   - ``vmccleanup``
   - ``vmcregister``
   - ``vmcunregister``
   - ``vmcreate``
   - ``vmdestroy``

4. In addition to the mandatory mapping methods, the following methods are also part of each Cloud Adapter:

   - ``test_vmc_connection``
   - ``is_vm_running``
   - ``is_vm_ready``

5. From the cloud's native python client standpoint, an implementer needs to determine how to:

   - connect to the cloud
   - list images
   - list SSH keys
   - list security groups (if applicable)
   - list networks (if applicable)
   - list instances
   - get all relevant information about instances (e.g., state, IP addresses)

6. The parameters in ``_npc.txt`` will have to be changed taking into account the specific features of this cloud.

7. In addition to the "mandatory" methods, one might opt (as shown in the aforementioned table of already existing Cloud Adapters) to implement "optional" operations, such as ``vmcapture`` and ``vmrunstate`` (both additional methods in the same class).

   - It is also possible to add the ability for persistent storage attachment and detachment (i.e., "virtual volumes") through the methods ``vvcreate`` and ``vvdestroy``.

8. These optional methods will require, from the cloud's native python client, an understanding of how to:

   - create an "image" from an "instance"
   - alter and list the instance power state (e.g., suspend, resume, power on/off)
   - get information about volumes and attach/detach them from instances

9. Finally, it is important to remember that the parameters in ``_npc.txt`` will have to be changed taking into account the specific features of this cloud.

10. The recommended way to test the new adapter is to start with a simple ``cldattach npc TESTNPCCLOUD``, followed by ``vmcattach all``, directly on CBTOOL's CLI.

    - This operation will ensure that the ``vmccleanup`` and ``vmcregister`` methods are properly implemented.

11. At this point, the implementer should prepare an image on the New Public Cloud.

12. After that, the implementer can continue by issuing ``vmattach`` and ``vmdetach`` directives on the CLI.
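To make the structure of the new adapter concrete, the skeleton below sketches, in highly abbreviated form, what ``NpcCmds`` in ``npc_cloud_ops.py`` could look like after step 4. It is only a sketch: the actual method signatures, base class, and return conventions must be copied from ``Ec2Cmds`` in ``ec2_cloud_ops.py``; ``obj_attr_list`` stands for the attribute dictionary that CBTOOL passes to each operation, and the docstrings only paraphrase the responsibilities listed above::

  # npc_cloud_ops.py -- abbreviated sketch, not the actual CBTOOL code.
  # The New Public Cloud's native python client would be imported here
  # (around lines 37-38 of the real file, per step 2 above).

  # In the real file, NpcCmds inherits from the same base class as Ec2Cmds.
  class NpcCmds:

      def __init__(self, pid, osci, expid=None):
          # Keep a handle to the cloud's native python client connection.
          self.npcconn = None

      # ----- five mandatory methods (step 3) -----
      def vmccleanup(self, obj_attr_list):
          """Remove leftover instances/volumes from previous experiments."""
          raise NotImplementedError

      def vmcregister(self, obj_attr_list):
          """Register a VMC (e.g., a region) with CBTOOL."""
          raise NotImplementedError

      def vmcunregister(self, obj_attr_list):
          """Undo whatever vmcregister did."""
          raise NotImplementedError

      def vmcreate(self, obj_attr_list):
          """Create one instance and wait until it is usable."""
          raise NotImplementedError

      def vmdestroy(self, obj_attr_list):
          """Delete one instance (and any associated volumes)."""
          raise NotImplementedError

      # ----- additional methods present in every Cloud Adapter (step 4) -----
      def test_vmc_connection(self, obj_attr_list):
          """Check connectivity, images, SSH keys, security groups, networks."""
          raise NotImplementedError

      def is_vm_running(self, obj_attr_list):
          """Return the cloud's view of the instance state."""
          raise NotImplementedError

      def is_vm_ready(self, obj_attr_list):
          """Return True once the instance has its IP addresses and is reachable."""
          raise NotImplementedError

With these methods filled in, the test described in step 10 (``cldattach`` followed by ``vmcattach all``) can be used to verify the ``vmccleanup`` and ``vmcregister`` implementations before any instances are created.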
How Provisioning Scripts are Executed
===================================================

CBTOOL decides which (and how many) instances to create based on the "Application Instance" (AI) template. For an AI of type "Hadoop", there will be five instances with the role ``hadoopslave`` and one with the role ``hadoopmaster``. For a "Cassandra YCSB" AI, there will be five instances with the role ``seed`` (all seed nodes) and one instance with the role ``ycsb``. The CBTOOL orchestrator node composes the list of instance creation requests into cloud-specific API calls (or commands) and issues these to the cloud under test. It then waits for the instances to fully boot, and collects all relevant IP addresses.

After the instances are booted, the orchestrator node, again following the AI template, logs in to each instance through ``ssh`` and configures the applications by executing scripts specific to each instance role. Taking a Cassandra YCSB AI as an example, it executes (in parallel) scripts to form a Cassandra cluster on all five instances with the ``seed`` role, and a different script to configure YCSB on the instance that will generate load. After the actual Application Instance is fully deployed (i.e., the Cassandra or Hadoop clusters are fully formed, and the load-generating application clients are fully configured), the orchestrator node starts the process designated Load Manager (LM) in one of the instances of the AI.

The activities described in the two previous paragraphs are represented in the following picture.

.. image:: images/cloudbench_application_instance_deployment.png
   :width: 750pt

Once the LM is started, the whole Application Instance becomes self-sufficient, i.e., the orchestrator node is not required to start any connections to any of the instances that compose the AI throughout the rest of the experiment. The LM contacts the Object Store (typically residing on the orchestrator node), obtains all relevant information about load profile, load duration and load level (i.e., intensity), and executes a load-generating process through a script also specified in the AI template. The Load Manager waits until the process ends, collects all information from either the process' standard output or an output file, and then processes the results and submits a new sample containing application performance results. These results are written, in the form of time series with multiple key-value pairs (some applications report multiple metrics, such as read and write throughput, read and write latency, etc.), to CBTOOL's Metric Store. While the layered architecture of CBTOOL allows the use of multiple data stores for this purpose, the only implementation currently available is MongoDB.

The continuous execution/results collection is depicted in the figure below.

.. image:: images/cloudbench_application_instance_execution.png
   :width: 750pt

.. include:: ./instance_config.rst