.. _baseline_measurements:

**********************************************************
Running Baseline Phase For the First Time With Your Cloud
**********************************************************

.. _prepare_baseline:

This section assumes that CBTOOL is already started and has successfully connected with your cloud.

Setting Up Parameters
=====================

In the baseline phase, application instances for the two workloads, K-Means and YCSB, are created five times. That is, instances are provisioned, data is generated, the load generator is run, data is deleted, and then the instances are deleted. This is controlled by the following parameters::

    iteration_count: 5
    run_count: 1
    destroy_ai_upon_completion: true

Thus, a total of 35 and 30 instances are created and destroyed for the YCSB and K-Means workloads, respectively (seven instances per YCSB application instance and six per K-Means application instance, times five iterations). Creation of data, running of the load generator, and deletion of data comprise a run, which is controlled by the ``run_count`` parameter. If a tester knows that in their cloud the baseline results will be worse than the elasticity phase results (due to performance isolation, etc.), they must set ``run_count`` to five or higher before starting a compliant run. For a compliant run, ``iteration_count`` must be 5 and ``destroy_ai_upon_completion`` must be true.

Cloud Name
----------

Please make sure that the cloud name in ``osgcloud_rules.yaml`` matches the cloud name in the CBTOOL configuration::

    cloud_name: MYOPENSTACK

The CBTOOL configuration file is present at ``~/osgcloud/cbtool/configs/*_cloud_definitions.txt``.

YCSB Baseline Measurement
=========================

Preparation
-----------

Set the appropriate thread count for YCSB in the ``osgcloud_rules.yaml`` file (for both CentOS and Ubuntu images), e.g.::

    thread_count: 8

For CentOS images, also uncomment the ``cassandra_conf_path`` line under the ``cassandra`` section, so that::

    #uncomment this for centos images
    #cassandra_conf_path: /etc/cassandra/conf/cassandra.yaml

becomes::

    #uncomment this for centos images
    cassandra_conf_path: /etc/cassandra/conf/cassandra.yaml

The tester will have to measure the appropriate thread count for their cloud. The default thread count is 8. In general, the higher the thread count, the higher the throughput (the application instance will reach its capacity at some number of threads). Consequently, the scalability results of the cloud under test may be higher, provided there is no drastic decrease in the elasticity measurements.

Running
-------

The YCSB baseline script parameter description is as follows::

    usage: osgcloud_ycsb_baseline.py [-h]
                                     [--console_log_level CONSOLE_LOG_LEVEL]
                                     [--runrules_yaml RUNRULES_YAML]
                                     [--flush_log FLUSH_LOG] [--version]
                                     --exp_id EXP_ID

It is run as follows::

    python osgcloud_ycsb_baseline.py --exp_id SPECRUNID

where ``SPECRUNID`` indicates the run id that will be used across the baseline and elasticity + scalability phases.

By default, the script logs the run to a file. If you would like to show the run output on the console, type the following::

    python osgcloud_ycsb_baseline.py --exp_id SPECRUNID --console_log_level DEBUG

By default, the results for this experiment are placed in::

    ~/results/SPECRUNID/perf/

If five iterations are run (as required for a compliant run), the tester should expect to find five directories starting with ``SPECRUNIDYCSB`` in the ``~/results/SPECRUNID/perf`` directory.
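As a quick sanity check (a hypothetical helper command, not part of the harness), the result directories can be listed and counted from the shell; ``SPECRUNID`` below is a placeholder for your actual run id::

    # list the YCSB baseline result directories and count them; expect 5
    ls -d ~/results/SPECRUNID/perf/SPECRUNIDYCSBBASELINE*
    ls -d ~/results/SPECRUNID/perf/SPECRUNIDYCSBBASELINE* | wc -l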
The following files and directories will be present in that directory; the date/time in the file and directory names will match the date/time of your run::

    baseline_SPECRUNID.yaml
    osgcloud_ycsb_baseline_SPECRUNID-20150811233732UTC.log
    SPECRUNIDYCSBBASELINE020150811233732UTC
    SPECRUNIDYCSBBASELINE120150811233732UTC
    SPECRUNIDYCSBBASELINE220150811233732UTC
    SPECRUNIDYCSBBASELINE320150811233732UTC
    SPECRUNIDYCSBBASELINE420150811233732UTC

K-Means Baseline Measurement
============================

Preparation
-----------

The following parameters may be changed in ``osgcloud_rules.yaml`` depending on how Hadoop was set up in the instance image. The default values of the parameters are shown below::

    centos images:
        java_home: /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.85-2.6.1.2.el7_1.x86_64
        hadoop_home: /usr/local/hadoop
        dfs_name_dir: /usr/local/hadoop_store/hdfs/namenode
        dfs_data_dir: /usr/local/hadoop_store/hdfs/datanode

    ubuntu images:
        java_home: /usr/lib/jvm/java-7-openjdk-amd64
        hadoop_home: /usr/local/hadoop
        dfs_name_dir: /usr/local/hadoop_store/hdfs/namenode
        dfs_data_dir: /usr/local/hadoop_store/hdfs/datanode

Running
-------

The K-Means baseline script parameter description is as follows::

    usage: osgcloud_kmeans_baseline.py [-h]
                                       [--console_log_level CONSOLE_LOG_LEVEL]
                                       [--runrules_yaml RUNRULES_YAML]
                                       [--flush_log FLUSH_LOG] [--version]
                                       --exp_id EXP_ID

It is run as follows::

    python osgcloud_kmeans_baseline.py --exp_id SPECRUNID

where ``SPECRUNID`` indicates the run id that will be used across the baseline and elasticity + scalability phases.

By default, the script logs the run to a file. If you would like to show the run output on the console, type the following::

    python osgcloud_kmeans_baseline.py --exp_id SPECRUNID --console_log_level DEBUG

By default, the results for this experiment are placed in::

    ~/results/SPECRUNID/perf/

If five iterations are run (as required for a compliant run), the tester should expect to find five directories starting with ``SPECRUNIDKMEANS`` in the ``~/results/SPECRUNID/perf`` directory. The following files and directories will be present in that directory; the date/time in the file and directory names will match the date/time of your run::

    baseline_SPECRUNID.yaml
    osgcloud_kmeans_baseline_SPECRUNID-20150811233302UTC.log
    SPECRUNIDKMEANSBASELINE020150811233302UTC
    SPECRUNIDKMEANSBASELINE120150811233302UTC
    SPECRUNIDKMEANSBASELINE220150811233302UTC
    SPECRUNIDKMEANSBASELINE320150811233302UTC
    SPECRUNIDKMEANSBASELINE420150811233302UTC

Configuring Supporting Evidence Collection
==========================================

Make sure that the supporting evidence parameters are set correctly in the ``osgcloud_rules.yaml`` file::

    support_evidence:
        instance_user: cbuser
        instance_keypath: HOMEDIR/osgcloud/cbtool/credentials/cbtool_rsa
        support_script: HOMEDIR/osgcloud/driver/support_script/collect_support_data.sh
        cloud_config_script_dir: HOMEDIR/osgcloud/driver/support_script/cloud_config/
        ###########################################
        # START instance support evidence flag is
        # true for public and private clouds. host
        # flag is true only for private clouds or
        # for those clouds where host information
        # is available.
        ###########################################
        instance_support_evidence: true
        host_support_evidence: false
        ###########################################
        # END
        ###########################################

The ``instance_user`` parameter indicates the Linux user that is used to SSH into the instance. It is also set in the cloud configuration text file for CBTOOL.

``instance_keypath`` indicates the SSH key that is used to SSH into the instance. Please make sure that the permissions of this file are set to 400 (``chmod 400 KEYFILE``).
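For example, assuming that ``HOMEDIR`` in the snippet above stands for your home directory, the key permissions can be set and verified as follows::

    # restrict the key to read-only for the owner and confirm the result
    chmod 400 $HOME/osgcloud/cbtool/credentials/cbtool_rsa
    ls -l $HOME/osgcloud/cbtool/credentials/cbtool_rsa   # should show -r-------- (i.e., 400)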
``support_script`` indicates the path of the script that is used to gather supporting evidence.

``cloud_config_script_dir`` indicates the path where scripts relevant to gathering cloud configuration are present. These scripts differ from one cloud to another.

``instance_support_evidence`` indicates whether to collect supporting evidence from instances. This flag is ignored for simulated clouds. For testing of the baseline phase, it is recommended to set this flag to false.
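To confirm that these settings will allow supporting evidence to be collected, SSH access to a running instance can be verified manually. This is an optional, hypothetical check; ``INSTANCE_IP`` is a placeholder for the address of one of your instances, and the user and key path are the values configured above (again assuming ``HOMEDIR`` is your home directory)::

    # log in as the configured instance user with the configured key
    ssh -i $HOME/osgcloud/cbtool/credentials/cbtool_rsa cbuser@INSTANCE_IP 'hostname; uptime'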