<?xml version="1.0"?>
<!DOCTYPE flagsdescription SYSTEM "http://www.spec.org/dtd/cpuflags1.dtd">
<flagsdescription>

<!-- filename to begin with "Intel-Windows-Platform-Settings" -->
<filename>Intel-Windows-Platform-Settings.xml</filename>

<title>SPEC CPU2006 Platform Settings for Windows-based systems</title>
<header>
<![CDATA[
<p style="text-align: left; color: red; font-size: larger; background-color: black">
 Copyright &copy; 2006 Intel Corporation.  All Rights Reserved.</p>
]]>
</header>

<platform_settings>
 <![CDATA[ 
	     <p><b>Platform settings</b></p>
             <p>One or more of the following settings may have been set.  If so, the "General Notes" section of the
             report will say so; and you can read below to find out more about what these settings mean.</p>

             <p><b>KMP_STACKSIZE </b></p>
             <p>
             Specifies the stack size to be allocated for each thread.
             </p>

             <p><b>KMP_AFFINITY </b></p>
             <p>
             KMP_AFFINITY  =  &lt; physical | logical &gt;, starting-core-id <br/>
             Specifies the static mapping of user threads to physical cores. For example,
             if you have a system configured with 8 cores, OMP_NUM_THREADS=8, and
             KMP_AFFINITY=physical,0, then thread 0 will be mapped to core 0, thread 1 will be mapped to core 1, and
             so on in a round-robin fashion.
		 </p>
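             <p>
             The round-robin mapping described above can be sketched as follows. This is only an
             illustration of the description in this section, not the algorithm used by the Intel OpenMP
             runtime; the function name <tt>physical_mapping</tt> and its parameters are hypothetical.
             </p>
<pre>
```python
def physical_mapping(num_threads: int, num_cores: int, start_core: int = 0) -> dict:
    """Map thread ids to core ids round-robin, beginning at start_core.

    Assumption: cores are numbered 0 .. num_cores - 1, as in the example
    KMP_AFFINITY=physical,0 on an 8-core system.
    """
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")
    # Thread t lands on the core t positions after the starting core,
    # wrapping around once the last core has been used.
    return {t: (start_core + t) % num_cores for t in range(num_threads)}


if __name__ == "__main__":
    # OMP_NUM_THREADS=8 with KMP_AFFINITY=physical,0 on 8 cores.
    for thread_id, core_id in physical_mapping(8, 8, 0).items():
        print(f"thread {thread_id} on core {core_id}")
```
</pre>
             <p>
             With 8 threads, 8 cores and a starting core of 0, each thread id maps to the core with the same number.
             </p>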

             <p>
             KMP_AFFINITY = granularity=fine,scatter <br/>
             The value of the environment variable KMP_AFFINITY affects how the threads from an auto-parallelized program are scheduled across processors. <br/>
             Specifying granularity=fine selects the finest granularity level and causes each OpenMP thread to be bound to a single thread context. <br/>
             This ensures that there is only one thread per core on cores supporting Hyper-Threading Technology. <br/>
             Specifying scatter distributes the threads as evenly as possible across the entire system. <br/>
             Hence, a combination of these two options will spread the threads evenly across sockets, with one thread per physical core. <br/>
             </p>
 
             <p><b>OMP_NUM_THREADS </b></p>
             <p>
	      Sets the maximum number of threads to use for OpenMP* parallel regions if no 
              other value is specified in the application. This environment variable 
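              applies to programs built with either -openmp or -parallel (Linux and Mac OS X), or with either /Qopenmp or /Qparallel (Windows). <br/>
              Example syntax on a Linux system with 8 cores: <br/>
              export OMP_NUM_THREADS=8 <br/>
              Equivalent syntax in a Windows command prompt: <br/>
              set OMP_NUM_THREADS=8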
             </p>

		 <p><b>Hardware Prefetch:</b></p> 
		 <p>
		 This BIOS option enables or disables a processor mechanism that
		 prefetches data into the cache according to a pattern-recognition algorithm.
		 </p>
		 <p>                
		 In some cases, setting this option to Disabled may improve
		 performance. Users should only disable this option 
		 after performing application benchmarking to verify improved
		 performance in their environment.
		 </p>

		 <p><b>Adjacent Sector Prefetch:</b></p> 
		 <p>
		 This BIOS option enables or disables a processor mechanism that
		 fetches the adjacent cache line within a 128-byte sector that contains
		 the data needed after a cache line miss.
		 </p>
		 <p>                
		 In some cases, setting this option to Disabled may improve
		 performance. Users should only disable this option 
		 after performing application benchmarking to verify improved
		 performance in their environment.
		 </p>

                 
             <p><b>submit= specperl -e "system sprintf qq{start /b /wait /affinity %x %s}, (1&lt;&lt;$SPECCOPYNUM), qq{ $command } " </b></p>
	 	 <p>When running multiple copies of benchmarks, the SPEC config file feature 
		 <b>submit</b> is used to cause individual jobs to be bound to 
		 specific processors. This specific submit command is used for Windows.
		 The elements of the command are described below:</p>
		 <ul>
		 <li><b>start /b /wait /affinity mask command </b>: <br/>
             The start command launches the given command with a given CPU
             affinity. The /b switch starts the command without opening a new window, and
             /wait makes start wait until the command terminates. The CPU affinity is
             represented as a bitmask, with the lowest-order bit corresponding to the first
             logical CPU and the highest-order bit corresponding to the last logical CPU.
             The process is only allowed to run on a particular logical processor when the
             corresponding bit in the mask is set to 1.
             <br/>
        	 </li>
		 <li><b>mask</b>: The bitmask (in hexadecimal) corresponding to a specific
        	 SPECCOPYNUM, computed as 1 shifted left by the copy number. Copy numbers start
        	 at 0, so the mask value for the first copy of a rate run will be 0x00000001,
        	 for the second copy it will be 0x00000002, for the third copy 0x00000004, etc.
        	 Thus, the first copy of the rate run will have a CPU affinity of CPU0, the
        	 second copy will have the affinity CPU1, etc.</li>
		 <li><b>command</b>: Program to be started, in this case, the benchmark instance 
        	 to be started.</li>
		 </ul>		 
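		 <p>
		 The mask arithmetic can be checked with a short sketch. The Python code below mirrors the
		 Perl expression (1 &lt;&lt; $SPECCOPYNUM) formatted with %x; it is an illustration only and is
		 not part of the SPEC tools. The function names and the command name <tt>benchmark.exe</tt>
		 are hypothetical, and the exact spacing of the generated command line is simplified.
		 </p>
<pre>
```python
def affinity_mask(copy_num: int) -> str:
    """Return the hexadecimal affinity mask for a zero-based copy number."""
    if copy_num < 0:
        raise ValueError("copy number must be non-negative")
    # Same arithmetic as (1 << $SPECCOPYNUM) printed with %x:
    # exactly one bit is set, at the position given by the copy number.
    return format(1 << copy_num, "x")


def submit_line(copy_num: int, command: str) -> str:
    """Build a command line of the form used by the submit entry (simplified)."""
    return f"start /b /wait /affinity {affinity_mask(copy_num)} {command}"


if __name__ == "__main__":
    # "benchmark.exe" is a placeholder for the real benchmark command.
    for copy_num in range(6):
        print(copy_num, submit_line(copy_num, "benchmark.exe"))
```
</pre>
		 <p>
		 Because the mask is written in hexadecimal, the fifth copy (copy number 4) gets the mask 10,
		 which is bit 4, or CPU4, and not CPU1.
		 </p>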

	  

  ]]> 
  </platform_settings></flagsdescription>

