Author Topic: runspecvirt fail  (Read 46563 times)

zgy

  • Newbie
  • *
  • Posts: 42
  • Karma: +0/-0
runspecvirt fail
« on: September 07, 2017, 05:06:37 AM »
HI all:
as i run ./runspecvirt.sh , it fail, shows like this:

[root@client1 SPECvirt]# tail -50 primectrl.out
2017-09-07 16:38:05:383 PrimeControl: Master 1: client1:1098
2017-09-07 16:38:05:383 PrimeControl: Master 2: client1:1096
2017-09-07 16:38:05:383 PrimeControl: Master 3: client1:1094
2017-09-07 16:38:05:383 PrimeControl: Master 4: client1:1092
2017-09-07 16:38:05:385 PrimeControl: adding host client1:1098
2017-09-07 16:38:05:394 PrimeControl: adding host client1:1096
2017-09-07 16:38:05:399 PrimeControl: adding host client1:1094
2017-09-07 16:38:05:405 PrimeControl: adding host client1:1092
2017-09-07 16:38:05:420 First client for 0: 172.21.128.219:1091
2017-09-07 16:38:05:434 First client for 1: 172.21.128.219:1010
2017-09-07 16:38:05:444 First client for 2: 172.21.128.219:1200
2017-09-07 16:38:05:450 First client for 3: 172.21.128.219:1902
2017-09-07 16:38:05:454 PrimeControl: starting clients...
2017-09-07 16:38:05:454 PrimeControl: clients.length = 4
2017-09-07 16:38:05:454 PrimeControl: clients[0].length = 1
2017-09-07 16:38:05:454 PrimeControl: starting clients[0][0]: 172.21.128.219:1091
2017-09-07 16:38:05:475 PrimeControl: started client: 172.21.128.219:1091
2017-09-07 16:38:05:475 PrimeControl: clients[1].length = 1
2017-09-07 16:38:05:476 PrimeControl: starting clients[1][0]: 172.21.128.219:1010
2017-09-07 16:38:05:482 PrimeControl: started client: 172.21.128.219:1010
2017-09-07 16:38:05:482 PrimeControl: clients[2].length = 1
2017-09-07 16:38:05:482 PrimeControl: starting clients[2][0]: 172.21.128.219:1200
2017-09-07 16:38:05:491 PrimeControl: started client: 172.21.128.219:1200
2017-09-07 16:38:05:491 PrimeControl: clients[3].length = 1
2017-09-07 16:38:05:491 PrimeControl: starting clients[3][0]: 172.21.128.219:1902
2017-09-07 16:38:05:510 PrimeControl: started client: 172.21.128.219:1902
2017-09-07 16:38:05:517 PrimeControl: PTDs not used for this benchmark run!
2017-09-07 16:38:05:517 PrimeControl: starting 4 masters.
2017-09-07 16:38:39:648 PrimeControl: master[0][0] sleeping 20 sec.
2017-09-07 16:38:39:648 PrimeControl: master[0][1] sleeping 20 sec.
2017-09-07 16:38:39:648 PrimeControl: master[0][2] sleeping 20 sec.
2017-09-07 16:38:39:648 PrimeControl: master[0][3] sleeping 20 sec.
2017-09-07 16:38:59:687 PrimeControl: waiting on 4 prime client(s).
2017-09-07 16:39:00:146 Sending config to client1:1092
2017-09-07 16:39:00:151 Sending config to client1:1098
2017-09-07 16:39:00:158 Sending config to client1:1094
2017-09-07 16:39:00:200 Sending config to client1:1096
2017-09-07 16:39:00:344 specvirt: waiting on 3 prime clients.
2017-09-07 16:39:00:618 specvirt: waiting on 2 prime clients.
.
2017-09-07 16:39:00:721 setting hostsReady = true
2017-09-07 16:39:09:684 specvirt: waiting on 1 prime clients.
2017-09-07 16:41:15:791 [ERROR] Received abort signal from client1:1098. Terminating.
2017-09-07 16:41:15:791 PrimeControl: sending abortTest() to prime clients.
2017-09-07 16:41:15:792 PrimeControl: id=1, abortID=0
2017-09-07 16:41:15:792 PrimeControl: masters[1]=client1:1096
2017-09-07 16:41:15:792 PrimeControl: id=3, abortID=0
2017-09-07 16:41:15:792 PrimeControl: id=2, abortID=0
2017-09-07 16:41:15:792 PrimeControl: masters[2]=client1:1094
2017-09-07 16:41:15:792 PrimeControl: masters[3]=client1:1092

so i touch client-1098 log, and it shows:
prime-client1_1098:
[root@client1 SPECvirt]# tail -50 logs/20170907-163805/prime-client1_1098.log
2017-09-07 16:38:59:890 Looking up SPECvirt controller: client1
2017-09-07 16:39:00:150 masterID: 0, tile: 0, workload: 0
2017-09-07 16:39:00:151 hostname: client1
Hostname of prime client: client1
2017-09-07 16:39:00:219 Thu Sep 07 16:38:05 CST 2017
2017-09-07 16:39:00:272 RMI server started: client1:9900
2017-09-07 16:39:00:273 Total clients: 1
2017-09-07 16:39:00:273 Adding host client1:1091
2017-09-07 16:39:00:291 Setting up clients...
2017-09-07 16:39:00:717 calling getHostVM() on jappclient...
2017-09-07 16:41:15:786 Remote exception in setup() from client1:1091
java.rmi.ServerException: RemoteException occurred in server thread; nested exception is:
        java.rmi.RemoteException: Connection refused (Connection refused); nested exception is:
        java.lang.Exception: Connection refused (Connection refused)

2017-09-07 16:41:15:787 Aborting...
2017-09-07 16:41:15:791 abortTest() signal sent...
2017-09-07 16:41:15:791 clientsRunning = true
2017-09-07 16:41:15:792 calling client.exit()...

PLS, the pollme is running in ALL vms, iptables is stop, and SSLv3 is removed  from the jdk.tls.disabledAlgorithms property in the java.security file( /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.141-3.b16.el6_9.x86_64/jre/lib/security/java.security),


Run pollmecheck.sh from client shows:
[root@client1 SPECvirt]# ./pollmecheck.sh
Tile 1 VMs:
dbserver1:
root@dbserver1's password:
root      3643     1  0 16:31 pts/0    00:00:01 java -jar /opt/SPECpoll/pollme.jar -n dbserver -p 8001
appserver1:
root@appserver1's password:
root      4025     1  0 16:30 pts/0    00:00:01 java -jar /opt/SPECpoll/pollme.jar -n appserver -p 8001
batchserver1:
root@batchserver1's password:
root      2362     1  0 16:27 ?        00:00:01 java -jar /opt/SPECpoll/pollme.jar -n batchserver -p 8001
mailserver1:
root@mailserver1's password:
root      2351     1  0 16:26 ?        00:00:02 java -jar /opt/SPECpoll/pollme.jar -n mailserver -p 8001
infraserver1:
root@infraserver1's password:
root      3375     1  0 16:32 pts/0    00:00:01 java -jar /opt/SPECpoll/pollme.jar -n infraserver -p 8001
webserver1:
root@webserver1's password:
root      3512     1  0 16:29 pts/0    00:00:02 java -jar /opt/SPECpoll/pollme.jar -n webserver -p 8001
 

Polleme.out:
[root@infraserver ~]# cat /tmp/pollme.out
Creating RMI listener using RMI Registry port 8001
infraserver/172.21.128.220:8001 ready...
infraserver: 1504773548135

[root@webserver ~]# cat /tmp/pollme.out
Creating RMI listener using RMI Registry port 8001
webserver/172.21.128.221:8001 ready...
2017-09-07 16:39:07:755 contacting host 'infraserver'...
webserver: 1504773547774
infraserver: 1504773548135

[root@mailserver ~]# cat /tmp/pollme.out
Creating RMI listener using RMI Registry port 8001
mailserver/172.21.128.222:8001 ready...
mailserver: 1504773534638
[root@mailserver ~]#

[root@appserver ~]# cat /tmp/pollme.out
Creating RMI listener using RMI Registry port 8001
appserver/172.21.128.224:8001 ready...
[root@appserver ~]#

[root@dbserver ~]# cat /tmp/pollme.out
Creating RMI listener using RMI Registry port 8001
dbserver/172.21.128.225:8001 ready...
[root@dbserver ~]#

[root@batchserver ~]# cat /tmp/pollme.out
Creating RMI listener using RMI Registry port 8001
batchserver/172.21.128.227:8001 ready...
batchserver: 1504773537995
[root@batchserver ~]#

My etc/hosts is:
# External VM-to-client communications
172.21.128.220     infraserver infraserver1
172.21.128.221     webserver webserver1
172.21.128.222     mailserver mailserver1
172.21.128.224     appserver appserver1 specdelivery specemulator
172.21.128.225     dbserver dbserver1
172.21.128.227     batchserver batchserver1

172.21.128.219  client1 specdriver

# Internal VM-to-VM only communications
172.21.128.240       infraserver1-int
172.21.128.241       webserver1-int
172.21.128.244       appserver1-int
172.21.128.245       dbserver1-int specdb

java version is 1.8.0_141 expect appserver is 1.7.0_151, because java 1.8 is incompatible with glassfish3, so i use 1.7.0_151 in appserver.

How to deal whit it now?

attachments is the client log and primectl.out,  also my Control.config in it , please help have a look, thanks so much.
« Last Edit: September 07, 2017, 05:40:29 AM by zgy »

lroderic

  • Moderator
  • Full Member
  • *****
  • Posts: 167
  • Karma: +6/-0
Re: runspecvirt fail
« Reply #1 on: September 07, 2017, 11:55:49 AM »
Thanks for all the logs. The problem is with appserver's internal network connection to dbserver evidenced in prime-client1_1098.log which gets reported into primectrl.out. Let's work on that without the noise of the other workloads. In Control.config, please change NUM_WORKLOADS=1 to run only app/dbserver.

You're not using an internal network since both the internal vNICs and external vNICs are on 172.21.128. The point of using internal vNICs for communications between appserver and dbserver is to move that intra-VM network traffic off of the LAN between the client and workload VMs, reducing network traffic between the client and workload VMs. The error in client-172.21.128.219_1091.log means that SPECjAppServer is trying to connect to a hostname it can't find.

To make the example VM configuration work, please create a "dummy" internal network that lets VMs communicate only with each other. If you use vSphere, see https://communities.vmware.com/thread/234233 to set up a "host-only" network. If you use KVM, see http://www.linux-kvm.org/page/Networking#Private_Virtual_Bridge for setting up a private virtual bridge.

Once you create the internal network and assign a second vNIC on each VM to it, edit the VM's /etc/hosts to reflect the IP addresses for those vNICS. The Example VM guide states in Section 2.0:

Code: [Select]
Each VM must have at least one virtual network adapter associated with a physical network that can communicate
with the client. Review the /etc/hosts file for each cloned VM for the IP address assignments. Assign the first network
adapter (eth0) to the external network (for example 192.168). Assign the second network adapter (eth1) to the internal
network (for example 10.10).


Using this example, we'll use 10.10.1 for the internal network on app/db and web/infra VMs:

Code: [Select]
# External VM-to-client communications
172.21.128.220     infraserver infraserver1
172.21.128.221     webserver webserver1
172.21.128.222     mailserver mailserver1
172.21.128.224     appserver appserver1 specdelivery specemulator
172.21.128.225     dbserver dbserver1
172.21.128.227     batchserver batchserver1

# Internal VM-to-VM only communications
10.10.1.220       infraserver1-int
10.10.1.221       webserver1-int
10.10.1.224       appserver1-int
10.10.1.225       dbserver1-int specdb

  • Reboot the workload VMs and check that SPECpoll has started on each of them.
  • Make sure that you can ping between appserver1-int and dbserver1-int to see that the internal network is up.
  • Run the helper script at /opt/SPECvirt/helper-sc2013/jApp-testappserver.sh to see that SPECjAppServer emulator, delivery, and dbserver are up and working.
  • Then rerun the test with only app/dbserver (NUM_WORKLOADS=1).
You could also do a search on "SPECpoll" here in the forum to see others who've had problems with getting SPECpoll working.

Lisa

zgy

  • Newbie
  • *
  • Posts: 42
  • Karma: +0/-0
Re: runspecvirt fail
« Reply #2 on: September 08, 2017, 10:15:49 AM »
HI Lisa, Thanks for your help.

After I use 10.10.1 for the internal network, I rerun runspecvirt.sh , now the error info is:
2017-09-08 22:10:34:097 SslConnection: [ERROR] IOException during SSL handshake: javax.net.ssl.SSLHandshakeException: No appropriate protocol (protocol is disabled or cipher suites are inappropriate)
2017-09-08 22:10:34:097 Connection: [ERROR] createSocket() failed.
2017-09-08 22:10:34:097 HttpRequestSched: Creating socket...
2017-09-08 22:10:34:097 SslConnection: [ERROR] IOException during SSL handshake: javax.net.ssl.SSLHandshakeException: No appropriate protocol (protocol is disabled or cipher suites are inappropriate)
2017-09-08 22:10:34:097 Connection: [ERROR] createSocket() failed.
2017-09-08 22:10:34:097 HttpRequestSched: Creating socket...
2017-09-08 22:10:34:097 SslConnection: [ERROR] IOException during SSL handshake: javax.net.ssl.SSLHandshakeException: No appropriate protocol (protocol is disabled or cipher suites are inappropriate)
2017-09-08 22:10:34:098 Connection: [ERROR] createSocket() failed.
2017-09-08 22:10:34:098 HttpRequestSched: Creating socket...


After I did  below, but The error still exists
1:Remove SSLv3 and MD5withRSA from the jdk.tls.disabledAlgorithms property in the $JDK_HOME/jre/lib/security/java.security file (SPECvirt assumes an MD5-based SSLv3 cipher).
2:Add SSL_RSA_WITH_RC4_128_MD5 to the jdk.tls.legacyAlgorithms list in the $JDK_HOME/jre/lib/security/java.security file.
3:Remove !MD5 from the SSLCipherSuite list in /etc/httpd/conf.d/ssl.conf file.



lroderic

  • Moderator
  • Full Member
  • *****
  • Posts: 167
  • Karma: +6/-0
Re: runspecvirt fail
« Reply #3 on: September 08, 2017, 10:40:27 AM »
Hi. Your app/dbserver got further, but since you ran with NUM_WORKLOADS=4, we can't tell if your changes to the hosts file for app/dbserver worked. I would've made sure that was working before moving on to web/infraserver, but OK.

Thanks for reading the Technical Support FAQ. You're using Java 1.8.0_141 on all VMs except for the appserver? You made those three SSL changes on the client, correct?

Lisa


zgy

  • Newbie
  • *
  • Posts: 42
  • Karma: +0/-0
Re: runspecvirt fail
« Reply #4 on: September 09, 2017, 12:03:13 AM »
YES, I using java 1.8.0_141 on all VMs expect appserver, 

I made those threre SSL changes on all VMs,   it is right?

Or just for the client VM?

lroderic

  • Moderator
  • Full Member
  • *****
  • Posts: 167
  • Karma: +6/-0
Re: runspecvirt fail
« Reply #5 on: September 09, 2017, 12:54:08 AM »
You need those changes only on the client, not on any other VMs. As it says in https://www.spec.org/virt_sc2013/docs/SPECvirt_TechnicalSupport.html under Webserver/Infraserver:

To enable SSLv3, log into the client and: ...

You might consider using TLS_RSA_WITH_AES_128_CBC_SHA. See https://www.spec.org/forums/index.php?topic=58.msg358 for info on making it work. If you end up using TLS and intend to submit your test to SPEC for acceptance, make sure to apply the patch at https://www.spec.org/virt_sc2013/docs/SPECvirt_Patches.html.

Lisa

zgy

  • Newbie
  • *
  • Posts: 42
  • Karma: +0/-0
Re: runspecvirt fail
« Reply #6 on: September 11, 2017, 05:22:25 AM »
I use TLS_RSA_WITH_AES_128_CBC_SHA, now got futher, but anthor error....


Clientmgr1_1088.out:
-> 2017-09-11 17:32:33:537 SPECweb_Support: [ERROR] SocketTimeoutException encountered during run!
-> 2017-09-11 17:32:33:537 SPECweb_Support[653]: State Switch [4,3]
-> 2017-09-11 17:32:33:537 WorkloadScheduler[653]: Thinking for 28000 msec
-> 2017-09-11 17:32:33:562 Connection: SocketTimeoutException waiting for end-of-header
-> 2017-09-11 17:32:33:562 SPECweb_Support[687]: STATE 2; RESPONSE LENGTH = 20480
-> 2017-09-11 17:32:33:562 SPECweb_Support[687]: STATE 2; FILE BYTES READ = -4
-> 2017-09-11 17:32:33:562 SPECweb_Support: [ERROR] SocketTimeoutException encountered during run!
-> 2017-09-11 17:32:33:562 SPECweb_Support[687]: State Switch [2,3]
-> 2017-09-11 17:32:33:562 WorkloadScheduler[687]: Thinking for 12000 msec
-> 2017-09-11 17:32:33:565 Connection: SocketTimeoutException waiting for end-of-header
-> 2017-09-11 17:32:33:565 SPECweb_Support[2219]: STATE 1; RESPONSE LENGTH = 20480
-> 2017-09-11 17:32:33:565 SPECweb_Support[2219]: STATE 1; FILE BYTES READ = -4
-> 2017-09-11 17:32:33:565 SPECweb_Support: [ERROR] SocketTimeoutException encountered during run!
-> 2017-09-11 17:32:33:565 SPECweb_Support[2219]: State Switch [1,1]
-> 2017-09-11 17:32:33:565 WorkloadScheduler[2219]: Thinking for 6000 msec
-> 2017-09-11 17:32:33:575 Connection: SocketTimeoutException waiting for end-of-header
-> 2017-09-11 17:32:33:575 SPECweb_Support[688]: STATE 4; RESPONSE LENGTH = 20480
-> 2017-09-11 17:32:33:575 SPECweb_Support[688]: STATE 4; FILE BYTES READ = -4
-> 2017-09-11 17:32:33:575 SPECweb_Support: [ERROR] SocketTimeoutException encountered during run!
-> 2017-09-11 17:32:33:575 SPECweb_Support[688]: State Switch [4,3]
-> 2017-09-11 17:32:33:575 WorkloadScheduler[688]: Thinking for 29000 msec
-> 2017-09-11 17:32:33:625 Connection: SocketTimeoutException waiting for end-of-header
-> 2017-09-11 17:32:33:625 SPECweb_Support[423]: STATE 3; RESPONSE LENGTH = 20480
-> 2017-09-11 17:32:33:625 SPECweb_Support[423]: STATE 3; FILE BYTES READ = -4
-> 2017-09-11 17:32:33:625 SPECweb_Support: [ERROR] SocketTimeoutException encountered during run!
-> 2017-09-11 17:32:33:625 SPECweb_Support[423]: State Switch [3,2]
-> 2017-09-11 17:32:33:625 WorkloadScheduler[423]: Thinking for 14000 msec
-> 2017-09-11 17:32:33:714 Connection: SocketTimeoutException waiting for end-of-header
-> 2017-09-11 17:32:33:714 SPECweb_Support[816]: STATE 1; RESPONSE LENGTH = 20480
-> 2017-09-11 17:32:33:714 SPECweb_Support[816]: STATE 1; FILE BYTES READ = -4
-> 2017-09-11 17:32:33:714 SPECweb_Support: [ERROR] SocketTimeoutException encountered during run!
-> 2017-09-11 17:32:33:714 SPECweb_Support[816]: State Switch [1,3]
-> 2017-09-11 17:32:33:714 WorkloadScheduler[816]: Thinking for 3000 msec
-> 2017-09-11 17:32:33:720 Connection: SocketTimeoutException waiting for end-of-header
-> 2017-09-11 17:32:33:720 SPECweb_Support[57]: STATE 3; RESPONSE LENGTH = 20480
-> 2017-09-11 17:32:33:720 SPECweb_Support[57]: STATE 3; FILE BYTES READ = -4
-> 2017-09-11 17:32:33:720 SPECweb_Support: [ERROR] SocketTimeoutException encountered during run!
-> 2017-09-11 17:32:33:720 SPECweb_Support[57]: State Switch [3,2]
-> 2017-09-11 17:32:33:720 WorkloadScheduler[57]: Thinking for 5000 msec
-> 2017-09-11 17:32:33:774 Connection: SocketTimeoutException waiting for end-of-header
-> 2017-09-11 17:32:33:774 SPECweb_Support[217]: STATE 3; RESPONSE LENGTH = 20480
-> 2017-09-11 17:32:33:774 SPECweb_Support[217]: STATE 3; FILE BYTES READ = -4
-> 2017-09-11 17:32:33:774 SPECweb_Support: [ERROR] SocketTimeoutException encountered during run!
-> 2017-09-11 17:32:33:774 SPECweb_Support[217]: State Switch [3,4]
-> 2017-09-11 17:32:33:774 WorkloadScheduler[217]: Thinking for 7000 msec
-> 2017-09-11 17:32:33:821 SslConnection: [ERROR] IOException creating socket. Exception is:
-> java.net.ConnectException: Connection timed out (Connection timed out)
-> hostServer is: webserver; port is: 443
-> 2017-09-11 17:32:33:821 Connection: IOException closing SSL socket: java.net.SocketException: Socket is closed
-> 2017-09-11 17:32:33:821 Connection: IOException closing SSL socket: java.net.SocketException: Socket is closed
-> 2017-09-11 17:32:33:830 SslConnection: [ERROR] IOException creating socket. Exception is:
-> java.net.ConnectException: Connection timed out (Connection timed out)
-> hostServer is: webserver; port is: 443
-> 2017-09-11 17:32:33:830 Connection: IOException closing SSL socket: java.net.SocketException: Socket is closed
-> 2017-09-11 17:32:33:830 Connection: IOException closing SSL socket: java.net.SocketException: Socket is closed
-> 2017-09-11 17:32:33:831 SslConnection: [ERROR] IOException creating socket. Exception is:
-> java.net.ConnectException: Connection timed out (Connection timed out)
-> hostServer is: webserver; port is: 443
-> 2017-09-11 17:32:33:831 Connection: IOException closing SSL socket: java.net.SocketException: Socket is closed
-> 2017-09-11 17:32:33:831 Connection: IOException closing SSL socket: java.net.SocketException: Socket is closed
-> 2017-09-11 17:32:33:838 SslConnection: [ERROR] IOException creating socket. Exception is:
-> java.net.ConnectException: Connection timed out (Connection timed out)
-> hostServer is: webserver; port is: 443
-> 2017-09-11 17:32:33:838 Connection: IOException closing SSL socket: java.net.SocketException: Socket is closed
-> 2017-09-11 17:32:33:838 Connection: IOException closing SSL socket: java.net.SocketException: Socket is closed
-> 2017-09-11 17:32:33:856 SslConnection: [ERROR] IOException creating socket. Exception is:
-> java.net.ConnectException: Connection timed out (Connection timed out)
-> hostServer is: webserver; port is: 443
-> 2017-09-11 17:32:33:856 SPECweb_Support[613]: STATE 5; RESPONSE LENGTH = 20480
-> 2017-09-11 17:32:33:856 SPECweb_Support[613]: STATE 5; FILE BYTES READ = -1
-> 2017-09-11 17:32:33:856 SPECweb_Support: [ERROR] STATE 5; makeHttpRequest() failed.
-> 2017-09-11 17:32:33:856 SPECweb_Support[613]: State Switch [5,4]
-> 2017-09-11 17:32:33:858 Connection: SocketTimeoutException waiting for end-of-header
-> 2017-09-11 17:32:33:858 SPECweb_Support[795]: STATE 2; RESPONSE LENGTH = 20480
-> 2017-09-11 17:32:33:858 SPECweb_Support[795]: STATE 2; FILE BYTES READ = -4
-> 2017-09-11 17:32:33:858 SPECweb_Support: [ERROR] SocketTimeoutException encountered during run!
-> 2017-09-11 17:32:33:858 SPECweb_Support[795]: State Switch [2,3]
-> 2017-09-11 17:32:33:858 WorkloadScheduler[795]: Thinking for 9000 msec
-> 2017-09-11 17:32:33:859 SslConnection: [ERROR] IOException creating socket. Exception is:
-> java.net.ConnectException: Connection timed out (Connection timed out)
-> hostServer is: webserver; port is: 443
-> 2017-09-11 17:32:33:859 Connection: IOException closing SSL socket: java.net.SocketException: Socket is closed
-> 2017-09-11 17:32:33:859 Connection: IOException closing SSL socket: java.net.SocketException: Socket is closed
-> 2017-09-11 17:32:33:867 SslConnection: [ERROR] IOException creating socket. Exception is:
-> java.net.ConnectException: Connection timed out (Connection timed out)
-> hostServer is: webserver; port is: 443
-> 2017-09-11 17:32:33:867 SPECweb_Support[504]: STATE 0; RESPONSE LENGTH = 20480
-> 2017-09-11 17:32:33:867 SPECweb_Support[504]: STATE 0; FILE BYTES READ = -1
-> 2017-09-11 17:32:33:867 SPECweb_Support: [ERROR] STATE 0; makeHttpRequest() failed.
-> 2017-09-11 17:32:33:867 SPECweb_Support[504]: State Switch [0,2]
-> 2017-09-11 17:32:33:964 Connection: SocketTimeoutException waiting for end-of-header
-> 2017-09-11 17:32:33:964 SPECweb_Support[1810]: STATE 4; RESPONSE LENGTH = 20480
-> 2017-09-11 17:32:33:964 SPECweb_Support[1810]: STATE 4; FILE BYTES READ = -4
-> 2017-09-11 17:32:33:964 SPECweb_Support: [ERROR] SocketTimeoutException encountered during run!
-> 2017-09-11 17:32:33:964 SPECweb_Support[1810]: State Switch [4,3]
-> 2017-09-11 17:32:33:964 WorkloadScheduler[1810]: Thinking for 1000 msec
-> 2017-09-11 17:32:33:965 SslConnection: [ERROR] IOException creating socket. Exception is:
-> java.net.ConnectException: Connection timed out (Connection timed out)
-> hostServer is: webserver; port is: 443
-> 2017-09-11 17:32:33:965 Connection: IOException closing SSL socket: java.net.SocketException: Socket is closed
-> 2017-09-11 17:32:33:965 Connection: [ERROR] createSocket() failed.
-> 2017-09-11 17:32:33:965 HttpRequestSched: Creating socket...
-> 2017-09-11 17:32:33:965 Connection: IOException closing SSL socket: java.net.SocketException: Socket is closed



[root@client1 SPECvirt]# cat primectrl.out
2017-09-11 17:03:23:329 Mon Sep 11 17:03:23 CST 2017
2017-09-11 17:03:23:330 specvirt: maxPreRunTime = 1201
2017-09-11 17:03:23:330 specvirt: runTime = 7200
2017-09-11 17:03:23:330 specvirt: runTime = 7200
2017-09-11 17:03:23:331 specvirt: runTime = 7200
2017-09-11 17:03:23:331 specvirt: runTime = 7200
2017-09-11 17:03:23:331 specvirt: runTime = 600
2017-09-11 17:03:23:331 specvirt: runTime = 600
2017-09-11 17:03:23:331 specvirt: runTime = 600
2017-09-11 17:03:23:331 specvirt: runTime = 600
2017-09-11 17:03:23:397 RMI server started: client1:9990
2017-09-11 17:03:23:404 [INFO] This is a perf-only benchmark run. Skipping active idle polling interval.
2017-09-11 17:03:23:405 PrimeControl: preparing client drivers.
2017-09-11 17:03:23:405 PrimeControl: PRIME_HOST 0 = client1:1098
2017-09-11 17:03:23:405 PrimeControl: PRIME_HOST 1 = client1:1096
2017-09-11 17:03:23:405 PrimeControl: PRIME_HOST 2 = client1:1094
2017-09-11 17:03:23:405 PrimeControl: PRIME_HOST 3 = client1:1092
2017-09-11 17:03:23:407 PrimeControl: Master 1: client1:1098
2017-09-11 17:03:23:407 PrimeControl: Master 2: client1:1096
2017-09-11 17:03:23:407 PrimeControl: Master 3: client1:1094
2017-09-11 17:03:23:407 PrimeControl: Master 4: client1:1092
2017-09-11 17:03:23:409 PrimeControl: adding host client1:1098
2017-09-11 17:03:23:418 PrimeControl: adding host client1:1096
2017-09-11 17:03:23:424 PrimeControl: adding host client1:1094
2017-09-11 17:03:23:431 PrimeControl: adding host client1:1092
2017-09-11 17:03:23:448 First client for 0: 172.21.128.219:1091
2017-09-11 17:03:23:462 First client for 1: 172.21.128.219:1010
2017-09-11 17:03:23:473 First client for 2: 172.21.128.219:1200
2017-09-11 17:03:23:479 First client for 3: 172.21.128.219:1902
2017-09-11 17:03:23:484 PrimeControl: starting clients...
2017-09-11 17:03:23:484 PrimeControl: clients.length = 4
2017-09-11 17:03:23:485 PrimeControl: clients[0].length = 1
2017-09-11 17:03:23:485 PrimeControl: starting clients[0][0]: 172.21.128.219:1091
2017-09-11 17:03:23:512 PrimeControl: started client: 172.21.128.219:1091
2017-09-11 17:03:23:512 PrimeControl: clients[1].length = 1
2017-09-11 17:03:23:512 PrimeControl: starting clients[1][0]: 172.21.128.219:1010
2017-09-11 17:03:23:516 PrimeControl: started client: 172.21.128.219:1010
2017-09-11 17:03:23:516 PrimeControl: clients[2].length = 1
2017-09-11 17:03:23:517 PrimeControl: starting clients[2][0]: 172.21.128.219:1200
2017-09-11 17:03:23:524 PrimeControl: started client: 172.21.128.219:1200
2017-09-11 17:03:23:524 PrimeControl: clients[3].length = 1
2017-09-11 17:03:23:526 PrimeControl: starting clients[3][0]: 172.21.128.219:1902
2017-09-11 17:03:23:529 PrimeControl: started client: 172.21.128.219:1902
2017-09-11 17:03:23:530 PrimeControl: PTDs not used for this benchmark run!
2017-09-11 17:03:23:530 PrimeControl: starting 4 masters.
2017-09-11 17:10:55:297 PrimeControl: master[0][0] sleeping 20 sec.
2017-09-11 17:10:55:297 PrimeControl: master[0][2] sleeping 20 sec.
2017-09-11 17:10:55:298 PrimeControl: master[0][1] sleeping 20 sec.
2017-09-11 17:10:55:298 PrimeControl: master[0][3] sleeping 20 sec.
2017-09-11 17:11:15:349 PrimeControl: waiting on 4 prime client(s).
2017-09-11 17:11:15:850 Sending config to client1:1094
2017-09-11 17:11:15:858 Sending config to client1:1098
2017-09-11 17:11:15:859 Sending config to client1:1092
2017-09-11 17:11:15:893 Sending config to client1:1096
2017-09-11 17:11:16:209 specvirt: waiting on 3 prime clients.
.
2017-09-11 17:11:16:390 setting hostsReady = true
2017-09-11 17:11:16:539 specvirt: waiting on 2 prime clients.
2017-09-11 17:11:26:809 specvirt: waiting on 1 prime clients.
2017-09-11 17:13:32:065 PrimeControl: Workload and prime controller builds: 80
2017-09-11 17:13:32:065 PrimeControl: awaiting runtime started signal from prime clients
2017-09-11 17:13:49:066 specvirt: clock sync check completed successfully
2017-09-11 17:13:49:066 specvirt: initiating workload ramp-up.
2017-09-11 17:13:49:067 Polling start time = Mon Sep 11 17:33:50 CST 2017
2017-09-11 17:13:49:067 Polling end time   = Mon Sep 11 19:33:50 CST 2017
2017-09-11 17:18:50:103 PrimeControl: all workloads started.
2017-09-11 17:33:50:116 PrimeControl: all workloads in run time.
2017-09-11 17:33:50:117 PrimeControl: dumping polling start response times...
2017-09-11 17:33:50:117 client1:1098 (PRIME_HOST[0][0]) msec after pollStart: 14
2017-09-11 17:33:50:117 client1:1096 (PRIME_HOST[0][1]) msec after pollStart: 50
2017-09-11 17:33:50:117 client1:1094 (PRIME_HOST[0][2]) msec after pollStart: 26
2017-09-11 17:33:50:118 client1:1092 (PRIME_HOST[0][3]) msec after pollStart: 7
2017-09-11 17:33:50:118 PrimeControl: checking polling start response times...
2017-09-11 17:33:50:120 PrimeControl: sleeping for 0 sec
2017-09-11 17:33:50:120 PrimeControl: sending results counter reset command.
2017-09-11 17:33:50:120 PrimeControl: polling for 7200 sec


Client:
[root@client1 SPECvirt]# ulimit -n
999999
[root@client1 SPECvirt]# ulimit -u
unlimited
[root@client1 SPECvirt]#

My client run on a different physic whit other server VM, and I just use 1GE Network card.  Is it a  possibility
that resulting in the error( [ERROR] SocketTimeoutException encountered during run)?

Can I reduse the SIMULTANEOUS_SESSIONS to temply fix this error?  If so, what steps should i do ?

Or which log should i check?  thanks
« Last Edit: September 11, 2017, 07:31:26 AM by zgy »

zgy

  • Newbie
  • *
  • Posts: 42
  • Karma: +0/-0
Re: runspecvirt fail
« Reply #7 on: September 11, 2017, 07:48:17 AM »
At the end , error info is:
[root@client1 SPECvirt]# cat primectrl.out
2017-09-11 17:03:23:329 Mon Sep 11 17:03:23 CST 2017
2017-09-11 17:03:23:330 specvirt: maxPreRunTime = 1201
2017-09-11 17:03:23:330 specvirt: runTime = 7200
2017-09-11 17:03:23:330 specvirt: runTime = 7200
2017-09-11 17:03:23:331 specvirt: runTime = 7200
2017-09-11 17:03:23:331 specvirt: runTime = 7200
2017-09-11 17:03:23:331 specvirt: runTime = 600
2017-09-11 17:03:23:331 specvirt: runTime = 600
2017-09-11 17:03:23:331 specvirt: runTime = 600
2017-09-11 17:03:23:331 specvirt: runTime = 600
2017-09-11 17:03:23:397 RMI server started: client1:9990
2017-09-11 17:03:23:404 [INFO] This is a perf-only benchmark run. Skipping active idle polling interval.
2017-09-11 17:03:23:405 PrimeControl: preparing client drivers.
2017-09-11 17:03:23:405 PrimeControl: PRIME_HOST 0 = client1:1098
2017-09-11 17:03:23:405 PrimeControl: PRIME_HOST 1 = client1:1096
2017-09-11 17:03:23:405 PrimeControl: PRIME_HOST 2 = client1:1094
2017-09-11 17:03:23:405 PrimeControl: PRIME_HOST 3 = client1:1092
2017-09-11 17:03:23:407 PrimeControl: Master 1: client1:1098
2017-09-11 17:03:23:407 PrimeControl: Master 2: client1:1096
2017-09-11 17:03:23:407 PrimeControl: Master 3: client1:1094
2017-09-11 17:03:23:407 PrimeControl: Master 4: client1:1092
2017-09-11 17:03:23:409 PrimeControl: adding host client1:1098
2017-09-11 17:03:23:418 PrimeControl: adding host client1:1096
2017-09-11 17:03:23:424 PrimeControl: adding host client1:1094
2017-09-11 17:03:23:431 PrimeControl: adding host client1:1092
2017-09-11 17:03:23:448 First client for 0: 172.21.128.219:1091
2017-09-11 17:03:23:462 First client for 1: 172.21.128.219:1010
2017-09-11 17:03:23:473 First client for 2: 172.21.128.219:1200
2017-09-11 17:03:23:479 First client for 3: 172.21.128.219:1902
2017-09-11 17:03:23:484 PrimeControl: starting clients...
2017-09-11 17:03:23:484 PrimeControl: clients.length = 4
2017-09-11 17:03:23:485 PrimeControl: clients[0].length = 1
2017-09-11 17:03:23:485 PrimeControl: starting clients[0][0]: 172.21.128.219:1091
2017-09-11 17:03:23:512 PrimeControl: started client: 172.21.128.219:1091
2017-09-11 17:03:23:512 PrimeControl: clients[1].length = 1
2017-09-11 17:03:23:512 PrimeControl: starting clients[1][0]: 172.21.128.219:1010
2017-09-11 17:03:23:516 PrimeControl: started client: 172.21.128.219:1010
2017-09-11 17:03:23:516 PrimeControl: clients[2].length = 1
2017-09-11 17:03:23:517 PrimeControl: starting clients[2][0]: 172.21.128.219:1200
2017-09-11 17:03:23:524 PrimeControl: started client: 172.21.128.219:1200
2017-09-11 17:03:23:524 PrimeControl: clients[3].length = 1
2017-09-11 17:03:23:526 PrimeControl: starting clients[3][0]: 172.21.128.219:1902
2017-09-11 17:03:23:529 PrimeControl: started client: 172.21.128.219:1902
2017-09-11 17:03:23:530 PrimeControl: PTDs not used for this benchmark run!
2017-09-11 17:03:23:530 PrimeControl: starting 4 masters.
2017-09-11 17:10:55:297 PrimeControl: master[0][0] sleeping 20 sec.
2017-09-11 17:10:55:297 PrimeControl: master[0][2] sleeping 20 sec.
2017-09-11 17:10:55:298 PrimeControl: master[0][1] sleeping 20 sec.
2017-09-11 17:10:55:298 PrimeControl: master[0][3] sleeping 20 sec.
2017-09-11 17:11:15:349 PrimeControl: waiting on 4 prime client(s).
2017-09-11 17:11:15:850 Sending config to client1:1094
2017-09-11 17:11:15:858 Sending config to client1:1098
2017-09-11 17:11:15:859 Sending config to client1:1092
2017-09-11 17:11:15:893 Sending config to client1:1096
2017-09-11 17:11:16:209 specvirt: waiting on 3 prime clients.
.
2017-09-11 17:11:16:390 setting hostsReady = true
2017-09-11 17:11:16:539 specvirt: waiting on 2 prime clients.
2017-09-11 17:11:26:809 specvirt: waiting on 1 prime clients.
2017-09-11 17:13:32:065 PrimeControl: Workload and prime controller builds: 80
2017-09-11 17:13:32:065 PrimeControl: awaiting runtime started signal from prime clients
2017-09-11 17:13:49:066 specvirt: clock sync check completed successfully
2017-09-11 17:13:49:066 specvirt: initiating workload ramp-up.
2017-09-11 17:13:49:067 Polling start time = Mon Sep 11 17:33:50 CST 2017
2017-09-11 17:13:49:067 Polling end time   = Mon Sep 11 19:33:50 CST 2017
2017-09-11 17:18:50:103 PrimeControl: all workloads started.
2017-09-11 17:33:50:116 PrimeControl: all workloads in run time.
2017-09-11 17:33:50:117 PrimeControl: dumping polling start response times...
2017-09-11 17:33:50:117 client1:1098 (PRIME_HOST[0][0]) msec after pollStart: 14
2017-09-11 17:33:50:117 client1:1096 (PRIME_HOST[0][1]) msec after pollStart: 50
2017-09-11 17:33:50:117 client1:1094 (PRIME_HOST[0][2]) msec after pollStart: 26
2017-09-11 17:33:50:118 client1:1092 (PRIME_HOST[0][3]) msec after pollStart: 7
2017-09-11 17:33:50:118 PrimeControl: checking polling start response times...
2017-09-11 17:33:50:120 PrimeControl: sleeping for 0 sec
2017-09-11 17:33:50:120 PrimeControl: sending results counter reset command.
2017-09-11 17:33:50:120 PrimeControl: polling for 7200 sec
2017-09-11 19:33:50:083 PrimeControl: client1:1092 (PRIME_HOST[0][3]) run complete; numStarted = 3
2017-09-11 19:33:50:121 PrimeControl: stopping result polling.
2017-09-11 19:33:50:121 PrimeControl: waiting for all workloads to stop...
2017-09-11 19:33:50:123 PrimeControl: [ERROR] remote exception occured  while calling resetCounters() on 172.21.128.219: java.rmi.ConnectException: Connection refused to host: 172.21.128.219; nested exception is:
        java.net.ConnectException: Connection refused (Connection refused)
2017-09-11 19:33:50:171 PrimeControl: client1:1094 (PRIME_HOST[0][2]) run complete; numStarted = 2
2017-09-11 19:33:56:899 PrimeControl: client1:1098 (PRIME_HOST[0][0]) run complete; numStarted = 1
2017-09-11 19:34:00:122 PrimeControl: waiting for 1 masters to finish
2017-09-11 19:34:15:122 PrimeControl: waiting for 1 masters to finish
2017-09-11 19:34:30:123 PrimeControl: waiting for 1 masters to finish
2017-09-11 19:34:45:123 PrimeControl: waiting for 1 masters to finish
2017-09-11 19:35:00:124 PrimeControl: waiting for 1 masters to finish
2017-09-11 19:35:15:124 PrimeControl: waiting for 1 masters to finish
2017-09-11 19:35:30:125 PrimeControl: waiting for 1 masters to finish
2017-09-11 19:35:45:125 PrimeControl: waiting for 1 masters to finish
2017-09-11 19:36:00:126 PrimeControl: waiting for 1 masters to finish
2017-09-11 19:36:15:126 PrimeControl: waiting for 1 masters to finish
2017-09-11 19:36:30:127 PrimeControl: waiting for 1 masters to finish
2017-09-11 19:36:45:127 PrimeControl: waiting for 1 masters to finish
2017-09-11 19:37:00:128 PrimeControl: waiting for 1 masters to finish
2017-09-11 19:37:15:128 PrimeControl: waiting for 1 masters to finish
2017-09-11 19:37:30:129 PrimeControl: waiting for 1 masters to finish
2017-09-11 19:37:45:129 PrimeControl: waiting for 1 masters to finish
2017-09-11 19:38:00:130 PrimeControl: waiting for 1 masters to finish
2017-09-11 19:38:15:130 PrimeControl: waiting for 1 masters to finish
2017-09-11 19:38:30:131 PrimeControl: waiting for 1 masters to finish
2017-09-11 19:38:45:131 PrimeControl: waiting for 1 masters to finish
2017-09-11 19:39:00:132 PrimeControl: [ERROR] Maximum prime client wait time exceeded!
2017-09-11 19:39:00:133 PrimeControl: [ERROR] startMasters() failed!
2017-09-11 19:39:00:133 PrimeControl: sending abortTest() to prime clients.
2017-09-11 19:39:00:134 PrimeControl: [ERROR] Remote exception occured while calling getValidationRept() on client1:1096 for tile  0, benchmark 1: java.rmi.ConnectException: Connection refused to host: 172.21.128.219; nested exception is:
        java.net.ConnectException: Connection refused (Connection refused)
2017-09-11 19:39:00:135 PrimeControl: id=1, abortID=-1
2017-09-11 19:39:00:135 PrimeControl: masters[1]=client1:1096
2017-09-11 19:39:00:136 PrimeControl: [ERROR] exception occurred sending abortTest signal to specweb_Stub[UnicastRef [liveRef: [endpoint:[172.21.128.219:4188](remote),objID:[544cc4e3:15e70350181:-7ffe, -1590442930850674644]]]]. Exception was:
 java.rmi.ConnectException: Connection refused to host: 172.21.128.219; nested exception is:
        java.net.ConnectException: Connection refused (Connection refused)
2017-09-11 19:39:00:152 PrimeControl: id=3, abortID=-1
2017-09-11 19:39:00:152 PrimeControl: masters[3]=client1:1092
2017-09-11 19:39:00:160 PrimeControl: id=0, abortID=-1
2017-09-11 19:39:00:160 PrimeControl: masters[0]=client1:1098
2017-09-11 19:39:00:184 PrimeControl: id=2, abortID=-1
2017-09-11 19:39:00:184 PrimeControl: masters[2]=client1:1094
2017-09-11 19:39:01:810 PrimeControl: stopping remote client processes
2017-09-11 19:39:03:821 PrimeControl: stopping local client threads
2017-09-11 19:39:03:822 PrimeControl: terminating run. Please wait...
2017-09-11 19:39:04:828 specvirt: benchmark run failed!
2017-09-11 19:39:04:828 specvirt: Done!

I not sure the SSLv3 problem is Solved, because i found https://www.spec.org/forums/index.php?topic=40.0 is Similar with me, and the reason is still the SSLv3!!

Follwing is the log.

zgy

  • Newbie
  • *
  • Posts: 42
  • Karma: +0/-0
Re: runspecvirt fail
« Reply #8 on: September 11, 2017, 08:01:54 AM »
20170911-170323_part0

(include Control.config,  SPECweb2005/Test.confg,  java.security,  primectrl.out ... )
« Last Edit: September 11, 2017, 08:13:44 AM by zgy »

zgy

  • Newbie
  • *
  • Posts: 42
  • Karma: +0/-0
Re: runspecvirt fail
« Reply #9 on: September 11, 2017, 08:06:27 AM »
client-172.21.128.219_1010_part0

zgy

  • Newbie
  • *
  • Posts: 42
  • Karma: +0/-0
Re: runspecvirt fail
« Reply #10 on: September 11, 2017, 08:11:16 AM »
client-172.21.128.219_1010_part1

zgy

  • Newbie
  • *
  • Posts: 42
  • Karma: +0/-0
Re: runspecvirt fail
« Reply #11 on: September 11, 2017, 08:11:34 AM »
client-172.21.128.219_1010_part2

zgy

  • Newbie
  • *
  • Posts: 42
  • Karma: +0/-0
Re: runspecvirt fail
« Reply #12 on: September 11, 2017, 08:12:03 AM »
client-172.21.128.219_1010_part3
client-172.21.128.219_1010_part4

zgy

  • Newbie
  • *
  • Posts: 42
  • Karma: +0/-0
Re: runspecvirt fail
« Reply #13 on: September 11, 2017, 08:12:27 AM »
client-172.21.128.219_1010_part5

lroderic

  • Moderator
  • Full Member
  • *****
  • Posts: 167
  • Karma: +6/-0
Re: runspecvirt fail
« Reply #14 on: September 11, 2017, 12:09:04 PM »
Hi. Please simplify and set NUM_WORKLOADS=2 so we can debug what's failing without the extra log data from mailserver (client1:1094) and batchserver (client1:1092).

You're on the right track. Since the client requires ~1.4 GbE, you'll need to separate your web/infraserver client from the other workloads. Until you do that, please reduce the SIMULTANEOUS_SESSIONS to 1000 and rerun. On the client, what do you have in /etc/sysctl.conf? At a minimum:

Code: [Select]
fs.file-max=1000000
net.core.optmem_max = 20000000
net.core.rmem_default = 20000000
net.core.rmem_max = 20000000
net.core.wmem_default = 20000000
net.core.wmem_max = 20000000
net.core.somaxconn = 8192
net.ipv4.tcp_max_tw_buckets = 500000
net.ipv4.tcp_mem = 20000000 20000000 200000000
net.ipv4.tcp_rmem = 20000000 20000000 200000000
net.ipv4.tcp_wmem = 20000000 20000000 200000000
net.ipv4.ip_local_port_range = 4096 65535
net.ipv4.tcp_tw_reuse=1
vm.swappiness = 0
vm.overcommit_memory = 0

You can disable IPV6 if you like.

Lisa