.. _cloudbench_setup:

Benchmark Harness: CBTOOL and Benchmark Drivers Setup and Preparation
**********************************************************************

.. role:: bash(code)
   :language: bash

.. include:: ./cloudbench_intro.rst

.. include:: ./prepare_cbtool/cloudbench_prepare_ubuntu_qcow2.rst

.. include:: ./prepare_cbtool/cloudbench_prepare_centos_qcow2.rst

.. include:: ./prepare_cbtool/cloudbench_image_qcow2.rst

.. include:: ./prepare_cbtool/cloudbench_prepare_ec2.rst

.. include:: ./prepare_cbtool/cloudbench_prepare_gcloud.rst

.. include:: ./prepare_cbtool/cloudbench_prepare_digitalocean.rst

.. include:: ./configure_cloud/configure_cloud_openstack.rst

.. include:: ./configure_cloud/configure_cloud_ec2.rst

.. include:: ./configure_cloud/configure_cloud_gcloud.rst

.. include:: ./configure_cloud/configure_cloud_digitalocean.rst

Multiple Network Interfaces on Benchmark Harness Machine
=============================================================

If there is more than one network interface on the benchmark harness machine, it is best to configure CBTOOL with the network interface that is used to communicate with the cloud API. Set the following configuration in the CBTOOL configuration file, assuming it was set up on an Ubuntu machine with :bash:`ubuntu` as the Linux user::

  vi ~/osgcloud/cbtool/configs/ubuntu_cloud_definitions.txt

  MANAGER_IP = IPADDRESS_OF_INTERFACE_FOR_CLOUDAPI
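The value of ``MANAGER_IP`` should be the IPv4 address bound to the interface that reaches the cloud API endpoint. If in doubt, that address can be read from the output of ``ip -4 addr show``, or looked up with a small helper such as the sketch below (assuming a Linux harness; the interface name ``eth1`` is only an illustration and should be replaced with the interface that reaches the cloud API)::

  #!/usr/bin/env python
  # Sketch: print the IPv4 address of a given Linux network interface,
  # e.g. to fill in MANAGER_IP. The default interface name "eth1" is an
  # assumption; pass the real interface name as the first argument.
  import fcntl
  import socket
  import struct
  import sys

  def interface_ipv4(ifname):
      s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
      # The SIOCGIFADDR ioctl (0x8915) returns the interface's primary
      # IPv4 address inside a packed ifreq structure.
      packed = fcntl.ioctl(s.fileno(), 0x8915,
                           struct.pack('256s', ifname[:15].encode('utf-8')))
      s.close()
      return socket.inet_ntoa(packed[20:24])

  if __name__ == '__main__':
      ifname = sys.argv[1] if len(sys.argv) > 1 else 'eth1'
      print(interface_ipv4(ifname))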
Timezone and NTP server
=======================

It is highly recommended that the timezone configured on the benchmark harness machine is UTC. Set the timezone to UTC by running the following command::

  sudo dpkg-reconfigure tzdata

Scroll to the bottom of the list and select "None of the above". Then, select "UTC".

It is also highly recommended that the benchmark harness machine and the instances use the same timezone. If, for a compliant run, an NTP server other than the benchmark harness machine is used, it must be manually configured on the benchmark harness machine.

Adding a New Cloud Adapter
========================================

As previously mentioned, CBTOOL's layered architecture was intended to allow the framework to be expanded in a non-intrusive (i.e., minimal to no changes to the existing "core" code) and incremental manner. While multiple Cloud Adapters are already available, new adapters are constantly added. These adapters can be divided into two broad classes, following the classification of clouds: white-box (i.e., private) and black-box (i.e., public). It is recommended that, for the addition of a new cloud adapter, one uses either OpenStack (in the white-box case) or EC2 (in the black-box case) as an example.

Assuming that an adapter for a "New Public Cloud" (npc) will be written, the required steps are summarized below.

1. Using ``~/osgcloud/cbtool/configs/templates/_ec2.txt`` as an example, create ``~/osgcloud/cbtool/configs/templates/_npc.txt``:

   - A simple ``cp ~/osgcloud/cbtool/configs/templates/_ec2.txt ~/osgcloud/cbtool/configs/templates/_npc.txt`` should be enough.

2. Using ``~/osgcloud/cbtool/lib/clouds/ec2_cloud_ops.py`` as an example, create ``~/osgcloud/cbtool/lib/clouds/npc_cloud_ops.py``:

   - Again, simply ``cp ~/osgcloud/cbtool/lib/clouds/ec2_cloud_ops.py ~/osgcloud/cbtool/lib/clouds/npc_cloud_ops.py``.
   - Open the file ``~/osgcloud/cbtool/lib/clouds/npc_cloud_ops.py`` and start by changing lines 37-38 (import the New Public Cloud's native python client) and line 40 (rename ``Ec2Cmds`` to ``NpcCmds``).

3. CBTOOL's abstract operations are mapped to five mandatory methods in the (newly created by the implementer) class ``NpcCmds`` (a minimal skeleton is sketched after this list):

   - ``vmccleanup``
   - ``vmcregister``
   - ``vmcunregister``
   - ``vmcreate``
   - ``vmdestroy``

4. In addition to the mandatory mapping methods, the following methods are also part of each Cloud Adapter:

   - ``test_vmc_connection``
   - ``is_vm_running``
   - ``is_vm_ready``

5. From the cloud's native python client standpoint, an implementer needs to determine how to:

   - connect to the cloud
   - list images
   - list SSH keys
   - list security groups (if applicable)
   - list networks (if applicable)
   - list instances
   - get all relevant information about instances (e.g., state, IP addresses)

6. The parameters in ``_npc.txt`` will have to be changed taking into account the specific features of this cloud.

7. In addition to the "mandatory" methods, one might opt (as shown in the aforementioned table of already existing Cloud Adapters) to implement "optional" operations, such as ``vmcapture`` and ``vmrunstate`` (both additional methods in the same class).

   - It is also possible to add the ability for persistent storage attachment and detachment (i.e., "virtual volumes") through the methods ``vvcreate`` and ``vvdestroy``.

8. These optional methods will require, from the cloud's native python client, an understanding of how to:

   - create an "image" from an "instance"
   - alter and list the instance power state (e.g., suspend, resume, power on/off)
   - get information about volumes and attach/detach them from instances

9. Finally, it is important to remember that the parameters in ``_npc.txt`` will have to be changed taking into account the specific features of this cloud.

10. The recommended way to test the new adapter is to start with a simple ``cldattach npc TESTNPCCLOUD``, followed by ``vmcattach all``, directly on CBTOOL's CLI.

    - This operation will ensure that the ``vmccleanup`` and ``vmcregister`` methods are properly implemented.

11. At this point, the implementer should prepare an image on the New Public Cloud.

12. After that, the implementer can continue by issuing ``vmattach`` and ``vmdetach`` directives on the CLI.
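To make the structure of the new adapter concrete, the skeleton below sketches, in highly abbreviated form, what ``NpcCmds`` in ``npc_cloud_ops.py`` could look like after step 4. It is only a sketch: the actual method signatures, base class, and return conventions must be copied from ``Ec2Cmds`` in ``ec2_cloud_ops.py``; ``obj_attr_list`` stands for the attribute dictionary that CBTOOL passes to each operation, and the docstrings only paraphrase the responsibilities listed above::

  # npc_cloud_ops.py -- abbreviated sketch, not the actual CBTOOL code.
  # The New Public Cloud's native python client would be imported here
  # (around lines 37-38 of the real file, per step 2 above).

  # In the real file, NpcCmds inherits from the same base class as Ec2Cmds.
  class NpcCmds:

      def __init__(self, pid, osci, expid=None):
          # Keep a handle to the cloud's native python client connection.
          self.npcconn = None

      # ----- five mandatory methods (step 3) -----
      def vmccleanup(self, obj_attr_list):
          """Remove leftover instances/volumes from previous experiments."""
          raise NotImplementedError

      def vmcregister(self, obj_attr_list):
          """Register a VMC (e.g., a region) with CBTOOL."""
          raise NotImplementedError

      def vmcunregister(self, obj_attr_list):
          """Undo whatever vmcregister did."""
          raise NotImplementedError

      def vmcreate(self, obj_attr_list):
          """Create one instance and wait until it is usable."""
          raise NotImplementedError

      def vmdestroy(self, obj_attr_list):
          """Delete one instance (and any associated volumes)."""
          raise NotImplementedError

      # ----- additional methods present in every Cloud Adapter (step 4) -----
      def test_vmc_connection(self, obj_attr_list):
          """Check connectivity, images, SSH keys, security groups, networks."""
          raise NotImplementedError

      def is_vm_running(self, obj_attr_list):
          """Return the cloud's view of the instance state."""
          raise NotImplementedError

      def is_vm_ready(self, obj_attr_list):
          """Return True once the instance has its IP addresses and is reachable."""
          raise NotImplementedError

With these methods filled in, the test described in step 10 (``cldattach`` followed by ``vmcattach all``) can be used to verify the ``vmccleanup`` and ``vmcregister`` implementations before any instances are created.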
How Provisioning Scripts are Executed
===================================================

CBTOOL decides which (and how many) instances to create based on the "Application Instance" (AI) template. For an AI of type "Hadoop", there will be five instances with the role ``hadoopslave`` and one with the role ``hadoopmaster``. For a "Cassandra YCSB" AI, there will be five instances with the role ``seed`` (all seed nodes) and one instance with the role ``ycsb``. The CBTOOL orchestrator node composes the list of instance creation requests into cloud-specific API calls (or commands) and issues these to the cloud under test. It then waits for the instances to fully boot, and collects all relevant IP addresses.

After the instances are booted, the orchestrator node, again following the AI template, logs in to each instance through ``ssh`` and configures the applications by executing scripts specific to each instance role. Taking a Cassandra YCSB AI as an example, it executes (in parallel) scripts to form a Cassandra cluster on all five instances with the ``seed`` role, and a different script to configure YCSB on the instance that will generate load. After the actual Application Instance is fully deployed (i.e., the Cassandra or Hadoop clusters are fully formed, and the load-generating application clients are fully configured), the orchestrator node starts the process designated Load Manager (LM) in one of the instances of the AI.

The activities described in the two previous paragraphs are represented in the following picture.

.. image:: images/cloudbench_application_instance_deployment.png
   :width: 750pt

Once the LM is started, the whole Application Instance becomes self-sufficient, i.e., the orchestrator node is not required to start any connections to any of the instances that compose the AI throughout the rest of the experiment. The LM contacts the Object Store (typically residing on the orchestrator node), obtains all relevant information about load profile, load duration and load level (i.e., intensity), and executes a load-generating process through a script also specified in the AI template. The Load Manager waits until the process ends, collects all information from either the process' standard output or an output file, and then processes the results and submits a new sample containing application performance results. These results are written, in the form of time series with multiple key-value pairs (some applications report multiple metrics, such as read and write throughput, read and write latency, etc.), to CBTOOL's Metric Store. While the layered architecture of CBTOOL allows the use of multiple data stores for this purpose, the only implementation currently available is MongoDB.

The continuous execution/results collection is depicted in the figure below.

.. image:: images/cloudbench_application_instance_execution.png
   :width: 750pt

.. include:: ./instance_config.rst