OS Images |
os_Image_1(1)
|
Hardware Description |
hw_1
|
Number of Systems |
1
|
SW Environment |
non-virtual
|
Tuning |
BIOS Settings:
- NUMA nodes per socket = NPS2
- Determinism Control = Manual
- Determinism Slider = Power
- cTDP Control = Manual
- cTDP = 240
- Package Power Limit Control = Manual
- Package Power Limit = 240
- Memory Clock Speed = 1467MHz
- L1 Stream HW Prefetcher = Disable
- L2 Stream HW Prefetcher = Disable
|
Notes |
None
|
|
JVM Instances |
jvm_Ctr_1(1), jvm_Backend_1(12), jvm_TxInjector_1(12)
|
OS Image Description |
os_1
|
Tuning |
- cpupower -c all frequency-set -g performance
- tuned-adm profile throughput-performance
- echo 10000000 > /proc/sys/kernel/sched_min_granularity_ns
- echo 15000000 > /proc/sys/kernel/sched_wakeup_granularity_ns
- echo 3000 > /proc/sys/kernel/sched_migration_cost_ns
- echo 990000 > /proc/sys/kernel/sched_rt_runtime_us
- echo 100000 > /proc/sys/kernel/sched_latency_ns
- echo 10000 > /proc/sys/vm/dirty_expire_centisecs
- echo 1500 > /proc/sys/vm/dirty_writeback_centisecs
- echo 40 > /proc/sys/vm/dirty_ratio
- echo 10 > /proc/sys/vm/dirty_background_ratio
- echo 10 > /proc/sys/vm/swappiness
- echo 0 > /proc/sys/kernel/numa_balancing
- echo always > /sys/kernel/mm/transparent_hugepage/defrag
- echo always > /sys/kernel/mm/transparent_hugepage/enabled
- Add cgroup_disable=memory,cpu,cpuacct,blkio,hugetlb,pids,cpuset,perf_event,freezer,devices,net_cls,net_prio to GRUB_CMDLINE_LINUX_DEFAULT
- ulimit -n 1024000
- UserTasksMax=970000
- DefaultTasksMax=970000
|
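The `echo ... > /proc/sys/...` entries in the OS tuning list above map one-to-one onto dotted `sysctl` keys, which can be convenient when the same settings are to be kept in a configuration file. The following is a minimal sketch of that mapping, not part of the disclosed configuration: the helper names and the subset of settings shown are illustrative assumptions, and the `/sys/kernel/mm/transparent_hugepage/*` entries are not sysctl keys, so they are left out.

```python
# Sketch: translate some of the /proc/sys writes from the tuning list
# into sysctl.conf-style "key = value" lines.
# Helper names are hypothetical; values are copied from the list above.

PROC_SYS_PREFIX = "/proc/sys/"


def proc_path_to_sysctl_key(path: str) -> str:
    """Map a /proc/sys path to the dotted key used by sysctl."""
    if not path.startswith(PROC_SYS_PREFIX):
        raise ValueError(f"not a /proc/sys path: {path}")
    # Strip the prefix and turn the remaining path separators into dots.
    return path[len(PROC_SYS_PREFIX):].replace("/", ".")


# A few (path, value) pairs taken from the OS tuning list.
SETTINGS = [
    ("/proc/sys/kernel/numa_balancing", "0"),
    ("/proc/sys/vm/dirty_ratio", "40"),
    ("/proc/sys/vm/dirty_background_ratio", "10"),
    ("/proc/sys/vm/swappiness", "10"),
]


def to_sysctl_lines(settings):
    """Render (path, value) pairs as sysctl.conf-style lines."""
    return [f"{proc_path_to_sysctl_key(p)} = {v}" for p, v in settings]


if __name__ == "__main__":
    for line in to_sysctl_lines(SETTINGS):
        print(line)
```

Whether a given key exists depends on the kernel version in use, so each line is worth checking against the running system before relying on it.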
Notes |
None
|
Parts of Benchmark |
Controller
|
JVM Instance Description |
jvm_1
|
Command Line |
-Xms3g -Xmx3g -Xmn2g -XX:+UseParallelOldGC -XX:ParallelGCThreads=1 -XX:CICompilerCount=2
|
Tuning |
None
|
Notes |
Used numactl to interleave memory on all NUMA nodes
|
Parts of Benchmark |
Backend
|
JVM Instance Description |
jvm_1
|
Command Line |
-Xms31g -Xmx31g -Xmn29g -server -XX:MetaspaceSize=256m -XX:AllocatePrefetchInstr=2 -XX:LargePageSizeInBytes=2m -XX:-UsePerfData -XX:-UseAdaptiveSizePolicy -XX:+AlwaysPreTouch -XX:-UseBiasedLocking -XX:+UseLargePages -XX:+UseParallelOldGC -XX:SurvivorRatio=23 -XX:TargetSurvivorRatio=98 -XX:ParallelGCThreads=8 -XX:MaxTenuringThreshold=5 -XX:InitialCodeCacheSize=25m -XX:MaxInlineSize=900 -XX:FreqInlineSize=900 -XX:LoopUnrollLimit=30 -XX:LoopMaxUnroll=6 -XX:CICompilerCount=2 -XX:+UseGCTaskAffinity -XX:+UseTransparentHugePages -XX:ParGCArrayScanChunk=3584 -XX:InlineSmallCode=3000 -XX:AutoBoxCacheMax=5000
|
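As a rough reading of the Backend heap flags above: in HotSpot, `-XX:SurvivorRatio=N` sets the ratio of eden to one survivor space, and the young generation holds eden plus two survivor spaces. The sketch below works through that arithmetic for `-Xmx31g -Xmn29g -XX:SurvivorRatio=23`. The function name is hypothetical and the results are approximate, since the JVM aligns space sizes internally.

```python
# Sketch: approximate young-generation layout implied by the Backend flags.
# Assumes SurvivorRatio = eden / one survivor space, and
# young generation = eden + 2 survivor spaces. Actual sizes are aligned
# by the JVM, so treat these numbers as estimates.

HEAP_GIB = 31        # -Xms31g -Xmx31g
YOUNG_GIB = 29       # -Xmn29g
SURVIVOR_RATIO = 23  # -XX:SurvivorRatio=23
BACKEND_JVMS = 12    # jvm_Backend_1(12)


def young_gen_layout(young_gib: float, survivor_ratio: int):
    """Return (eden, survivor) sizes in GiB for one JVM."""
    survivor = young_gib / (survivor_ratio + 2)
    eden = young_gib - 2 * survivor
    return eden, survivor


eden_gib, survivor_gib = young_gen_layout(YOUNG_GIB, SURVIVOR_RATIO)
old_gib = HEAP_GIB - YOUNG_GIB
total_backend_heap_gib = HEAP_GIB * BACKEND_JVMS

if __name__ == "__main__":
    print(f"eden per JVM:      {eden_gib:.2f} GiB")
    print(f"survivor per JVM:  {survivor_gib:.2f} GiB (x2)")
    print(f"old gen per JVM:   {old_gib} GiB")
    print(f"all Backend heaps: {total_backend_heap_gib} GiB")
```

Under these assumptions almost the entire heap is young generation, leaving about 2 GiB of old generation per Backend JVM.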
Tuning |
None
|
Notes |
Used numactl to affinitize each Backend JVM to 4 cores/8 threads:
- --physcpubind=0-3,48-51 --localalloc
- --physcpubind=4-7,52-55 --localalloc
- --physcpubind=8-11,56-59 --localalloc
- --physcpubind=12-15,60-63 --localalloc
- --physcpubind=16-19,64-67 --localalloc
- --physcpubind=20-23,68-71 --localalloc
- --physcpubind=24-27,72-75 --localalloc
- --physcpubind=28-31,76-79 --localalloc
- --physcpubind=32-35,80-83 --localalloc
- --physcpubind=36-39,84-87 --localalloc
- --physcpubind=40-43,88-91 --localalloc
- --physcpubind=44-47,92-95 --localalloc
|
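The twelve Backend bindings above follow a regular pattern: each JVM gets four consecutive CPU IDs plus the same four IDs offset by 48. A minimal sketch that regenerates the list is shown below. It assumes, based only on the listed ranges, that the offset of 48 identifies the sibling hardware threads of the same cores. The function name and its parameters are hypothetical.

```python
# Sketch: regenerate the numactl arguments listed above.
# Assumption (inferred from the listed ranges): CPU n and CPU n + 48
# are the two hardware threads of the same core.

def numactl_bindings(groups: int = 12,
                     cores_per_group: int = 4,
                     sibling_offset: int = 48):
    """Return one numactl argument string per JVM."""
    bindings = []
    for g in range(groups):
        first = g * cores_per_group
        last = first + cores_per_group - 1
        cpus = (f"{first}-{last},"
                f"{first + sibling_offset}-{last + sibling_offset}")
        bindings.append(f"--physcpubind={cpus} --localalloc")
    return bindings


if __name__ == "__main__":
    for args in numactl_bindings():
        print(f"numactl {args}")
```

On a different machine the sibling numbering may differ, so the offset is worth checking against the actual CPU topology before reusing this pattern.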
Parts of Benchmark |
TxInjector
|
JVM Instance Description |
jvm_1
|
Command Line |
-Xms3g -Xmx3g -Xmn2g -XX:+UseParallelOldGC -XX:ParallelGCThreads=1 -XX:CICompilerCount=2
|
Tuning |
None
|
Notes |
Used numactl to affinitize each Transaction Injector JVM to 4 cores/8 threads:
- --physcpubind=0-3,48-51 --localalloc
- --physcpubind=4-7,52-55 --localalloc
- --physcpubind=8-11,56-59 --localalloc
- --physcpubind=12-15,60-63 --localalloc
- --physcpubind=16-19,64-67 --localalloc
- --physcpubind=20-23,68-71 --localalloc
- --physcpubind=24-27,72-75 --localalloc
- --physcpubind=28-31,76-79 --localalloc
- --physcpubind=32-35,80-83 --localalloc
- --physcpubind=36-39,84-87 --localalloc
- --physcpubind=40-43,88-91 --localalloc
- --physcpubind=44-47,92-95 --localalloc
|
|