Topic: "PrimeControl: terminating run" while running 2 tiles

Miles

  • Jr. Member
  • **
  • Posts: 72
  • Karma: +0/-0
"PrimeControl: terminating run" while running 2 tiles
« on: August 03, 2017, 01:42:12 AM »
Hi
I tried to run a 2T1W (batch workload) test, but it failed.
I use two clients, client1 and client2, and execute runspecvirt.sh on client1.
Should I correct the hosts files?
Thanks.

primectrl.out
2017-08-03 11:36:08:456 Thu Aug 03 11:36:08 CST 2017
2017-08-03 11:36:08:456 specvirt: maxPreRunTime = 1501
2017-08-03 11:36:08:456 specvirt: runTime = 7200
2017-08-03 11:36:08:456 specvirt: runTime = 7200
2017-08-03 11:36:08:457 specvirt: runTime = 600
2017-08-03 11:36:08:457 specvirt: runTime = 600
2017-08-03 11:36:08:458 Validator: [WARNING] WORKLOAD_LABEL[0] value is: Batch Server; should be Application Server
2017-08-03 11:36:08:458 Validator: [WARNING] WORKLOAD_SCORE_TMAX_VALUE[3] value is: 143.60; should be 0
2017-08-03 11:36:08:458 Validator: [WARNING] WORKLOAD_LABEL[3] value is: Mail Server; should be Batch Server
2017-08-03 11:36:08:458 Validator: [WARNING] WORKLOAD_SCORE_TMAX_VALUE[2] value is: 174.30; should be 143.60
2017-08-03 11:36:08:458 Validator: [WARNING] WORKLOAD_LABEL[2] value is: Application Server; should be Mail Server
2017-08-03 11:36:08:458 Validator: [WARNING] WORKLOAD_LOAD_LEVEL[0] value is: 0; should be 100
2017-08-03 11:36:08:458 Validator: [WARNING] WORKLOAD_LOAD_LEVEL[3] value is: 500; should be 0
2017-08-03 11:36:08:459 Validator: [WARNING] NUM_WORKLOADS value is: 1; should be 4
2017-08-03 11:36:08:459 Validator: [WARNING] WORKLOAD_LOAD_LEVEL[2] value is: 100; should be 500
2017-08-03 11:36:08:459 Validator: [WARNING] WORKLOAD_SCORE_TMAX_VALUE[0] value is: 0; should be 174.30
2017-08-03 11:36:08:459 Validator: [WARNING] RESULT_FILE_NAMES[0] must contain Atomicity.html
2017-08-03 11:36:08:459 Validator: [WARNING] RESULT_FILE_NAMES[0] must contain  Audit.report
2017-08-03 11:36:08:459 Validator: [WARNING] RESULT_FILE_NAMES[0] must contain  Dealer.detail
2017-08-03 11:36:08:460 Validator: [WARNING] RESULT_FILE_NAMES[0] must contain  Dealer.summary
2017-08-03 11:36:08:460 Validator: [WARNING] RESULT_FILE_NAMES[0] must contain  Mfg.detail
2017-08-03 11:36:08:460 Validator: [WARNING] RESULT_FILE_NAMES[0] must contain  Mfg.summary
2017-08-03 11:36:08:460 Validator: [WARNING] RESULT_FILE_NAMES[0] must contain  result.props
2017-08-03 11:36:08:460 Validator: [WARNING] RESULT_FILE_NAMES[0] must contain  SPECjAppServer.summary
2017-08-03 11:36:08:460 Validator: [WARNING] Non-compliant configuration.
2017-08-03 11:36:08:460 [WARNING] This will be a non-compliant benchmark result!
2017-08-03 11:36:08:480 RMI server started: client1:9990
2017-08-03 11:36:08:483 [INFO] This is a perf-only benchmark run. Skipping active idle polling interval.
2017-08-03 11:36:08:483 PrimeControl: preparing client drivers.
2017-08-03 11:36:08:483 PrimeControl: PRIME_HOST 0 = client1:1092
2017-08-03 11:36:08:483 PrimeControl: PRIME_HOST 0 = client2:1092
2017-08-03 11:36:08:484 PrimeControl: Master 1: client1:1092
2017-08-03 11:36:08:484 PrimeControl: Master 2: client2:1092
2017-08-03 11:36:08:485 PrimeControl: adding host client1:1092
2017-08-03 11:36:08:489 PrimeControl: adding host client2:1092
2017-08-03 11:36:08:497 First client for 0: 192.168.1.8:1902
2017-08-03 11:36:08:512 PrimeControl: [ERROR] exception  thrown:
java.lang.NullPointerException

   at org.spec.virt.clientmgr.getClients(clientmgr.java:232)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
   at sun.rmi.transport.Transport$1.run(Transport.java:177)
   at sun.rmi.transport.Transport$1.run(Transport.java:174)
   at java.security.AccessController.doPrivileged(Native Method)
   at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
   at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556)
   at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811)
   at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:744)
   at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:275)
   at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:252)
   at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:161)
   at org.spec.virt.clientmgr_Stub.getClients(Unknown Source)
   at org.spec.virt.PrimeControl.initClients(PrimeControl.java:600)
   at org.spec.virt.PrimeControl.runInterval(PrimeControl.java:326)
   at org.spec.virt.PrimeControl.access$800(PrimeControl.java:32)
   at org.spec.virt.PrimeControl$1.run(PrimeControl.java:201)
2017-08-03 11:36:08:513 PrimeControl: terminating run. Please wait...
2017-08-03 11:36:09:515 specvirt: Done!

lroderic

  • Moderator
  • Full Member
  • *****
  • Posts: 167
  • Karma: +6/-0
Re: "PrimeControl: terminating run" while running 2 tiles
« Reply #1 on: August 03, 2017, 10:42:55 AM »
Miles, if you're doing test runs, you might set POLL_INTERVAL_SEC to something shorter so you don't have to wait so long to see whether it failed. Batch needs more than a half hour to run, but to make sure it's working, you could set RAMP_SECONDS = 300, WARMUP_SECONDS = 600, and POLL_INTERVAL_SEC = 900.

I need the Clientmgr*.log files to diagnose further. What's in Clientmgr*_1092.log?

Is the specpoll process running on batchserver2?

Code: [Select]
ssh batchserver2 "ps -ef|grep -i poll"
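
One caveat with the check above: `grep -i poll` can match its own entry in the `ps` listing, so a hit does not always mean specpoll is up. Below is a minimal sketch of a stricter check. The function name `has_poll_process` is hypothetical, and it assumes the agent's command line contains the string "poll".

```shell
# Succeeds (exit 0) if the ps listing on stdin contains a line that
# mentions "poll" and is not the grep command itself.
has_poll_process() {
    grep -i 'poll' | grep -v 'grep' | grep -q .
}

# Hypothetical usage against the VM named in this post:
# ssh batchserver2 "ps -ef" | has_poll_process && echo "specpoll is up"
```

The filtering happens on the client side here, so only `ps -ef` runs on the VM.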
Lisa

Miles

Re: "PrimeControl: terminating run" while running 2 tiles
« Reply #2 on: August 03, 2017, 09:22:34 PM »
Hi Lisa
Yes, the specpoll process is running on batchserver2.
There is no error message in Clientmgr1_1092.out.

Another question: in exampleVM, there is a "specdriver" entry in /etc/hosts, and I don't know what it means.
In a 2-tile test, is client1 or client2 the specdriver, or both?

Clientmgr1_1092.out
2017-08-03 15:02:06:287 Creating clientmgr using RMI Registry port 1092
2017-08-03 15:02:06:306 client1:1092 ready...

Clientmgr1_1088.out
2017-08-03 15:02:06:301 Creating clientmgr using RMI Registry port 1088
2017-08-03 15:02:06:320 client1:1088 ready...

Thanks.
« Last Edit: August 04, 2017, 06:02:34 AM by Miles »

lroderic

Re: "PrimeControl: terminating run" while running 2 tiles
« Reply #3 on: August 04, 2017, 11:01:32 AM »
The specdriver alias on the client is needed for appserver. The client pieces on SPECjAppServer2004 look for this alias on the appserver VM the same way they look for the specdb alias for the dbserver.

Have you tried running only one tile with batchserver? Maybe the problem is with the client when you add the second tile.

Lisa


Miles

Re: "PrimeControl: terminating run" while running 2 tiles
« Reply #4 on: August 06, 2017, 09:23:32 PM »
Hi
I modified WORKLOAD_CLIENTS[x] to WORKLOAD_CLIENTS[x][y] in Control.config, and it now executes successfully.


Thanks.

Miles

Re: "PrimeControl: terminating run" while running 2 tiles
« Reply #5 on: August 15, 2017, 12:41:05 PM »
Hi
It passed with 2T1W, but failed with 2T4W.
I think the failure is in my web workload, because the run completed with only 3 workloads (without the web workload).

primectrl.out
2017-08-15 18:46:19:718 specvirt: waiting on 5 prime clients.
2017-08-15 18:46:19:725 setting hostsReady = true
2017-08-15 18:46:19:800 specvirt: waiting on 4 prime clients.
2017-08-15 18:46:20:258 RemoteException while trying to get workload build number: exception java.rmi.UnmarshalException: Error unmarshaling return header; nested exception is:
   java.io.EOFException

2017-08-15 18:46:20:259 PrimeControl: [ERROR] masters[0] build numbers (null) do not match the specvirt prime controller's (80). Please update complete harness and retry.
2017-08-15 18:46:20:259 PrimeControl: [ERROR] masters[1] build numbers (null) do not match the specvirt prime controller's (80). Please update complete harness and retry.
2017-08-15 18:46:20:259 PrimeControl: [ERROR] masters[4] build numbers (null) do not match the specvirt prime controller's (80). Please update complete harness and retry.
2017-08-15 18:46:20:259 PrimeControl: [ERROR] masters[5] build numbers (null) do not match the specvirt prime controller's (80). Please update complete harness and retry.
2017-08-15 18:46:20:259 PrimeControl: [ERROR] startMasters() failed!
2017-08-15 18:46:20:259 PrimeControl: sending abortTest() to prime clients.
2017-08-15 18:46:20:259 PrimeControl: id=1, abortID=-1
2017-08-15 18:46:20:259 PrimeControl: masters[1]=client1:1096
2017-08-15 18:46:20:260 PrimeControl: id=6, abortID=-1
2017-08-15 18:46:20:260 PrimeControl: id=7, abortID=-1
2017-08-15 18:46:20:260 PrimeControl: masters[6]=client2:1094
2017-08-15 18:46:20:260 PrimeControl: masters[7]=client2:1092
2017-08-15 18:46:20:260 PrimeControl: id=2, abortID=-1
2017-08-15 18:46:20:260 PrimeControl: id=3, abortID=-1
2017-08-15 18:46:20:260 PrimeControl: masters[2]=client1:1094
2017-08-15 18:46:20:260 PrimeControl: masters[3]=client1:1092
2017-08-15 18:46:20:260 PrimeControl: [ERROR] exception occurred sending abortTest signal to specweb_Stub[UnicastRef [liveRef: [endpoint:[192.168.1.8:39656](remote),objID:[-26afdb66:15de58053da:-7ffe, 3407266484835053569]]]]. Exception was:
 java.rmi.ConnectException: Connection refused to host: 192.168.1.8; nested exception is:
   java.net.ConnectException: Connection refused

2017-08-15 18:46:28:106 PrimeControl: id=5, abortID=-1
..

Clientmgr1_1096.out
-> 2017-08-15 18:46:19:523 SpecwebControl: * Running SPECweb_Support workload
-> 2017-08-15 18:46:19:523 Configuration: Clearing workload.
-> 2017-08-15 18:46:19:526 RemoteLoadGen: Total clients: 1
-> 2017-08-15 18:46:19:596 HttpRequestSched: [ERROR] initServers() exception reported making HTTP request:
-> java.lang.NullPointerException
-> 2017-08-15 18:46:19:596 HttpRequestSched: [ERROR]
initConorg.spec.specweb.Connection@46a329dc; threadByteArray: [B@2114ebf; useSSL: false
-> 2017-08-15 18:46:19:596 HttpRequestSched: [ERROR] response: HTTP/1.0 302 Redirect
-> Date: Fri Jun 16 16:29:40 2017
...
-> 2017-08-15 18:46:19:596 RemoteLoadGen: [ERROR] Unable to successfully initialize workload variables. Terminating.
-> 2017-08-15 18:46:19:596 SpecwebControl: [ERROR] Could not create all client threads.
-> 2017-08-15 18:46:19:596 SpecwebControl: [ERROR] setupWorkload() failed!
-> 2017-08-15 18:46:19:596 SpecwebControl: [ERROR] runTests() failed!
-> 2017-08-15 18:46:19:596 SpecwebControl: [ERROR] Benchmark run failed!
-> 2017-08-15 18:46:19:601 SpecwebControl: Terminating run. Please wait...



lroderic

Re: "PrimeControl: terminating run" while running 2 tiles
« Reply #6 on: August 15, 2017, 01:22:40 PM »
Hmm. From the prime client, please check the output of the following on all workload VMs, clients, and prime client. This case assumes that 211 is the last octet of infraserver1, 217 is the last octet of the prime client, and 10.140.3 is the network:

Code: [Select]
for i in `seq 211 217`; do  ssh 10.140.3.$i "java -jar /opt/SPECvirt/specvirt.jar -v"; echo $i;  done
Does this command report anything other than:

Code: [Select]
SPECvirt_sc2013 v1.1, build: 80
If any host reports something else, or nothing at all, you need to make sure SPECvirt is installed correctly on it. It's easiest to do a full installation on every VM.
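
To avoid comparing seven lines of output by eye, the same loop can print only the hosts that differ. This is a sketch, not part of the SPECvirt harness: `build_matches` and `report_mismatches` are hypothetical names, the address range is the one assumed above, and it assumes the version string is printed exactly as shown.

```shell
# Version string every host is expected to report (copied from above).
EXPECTED='SPECvirt_sc2013 v1.1, build: 80'

# Succeeds only if the given output is exactly the expected string.
build_matches() {
    [ "$1" = "$EXPECTED" ]
}

# Walks the example range 10.140.3.211 to 10.140.3.217 and prints only
# the hosts whose output differs, including hosts that return nothing.
report_mismatches() {
    for i in $(seq 211 217); do
        out=$(ssh "10.140.3.$i" "java -jar /opt/SPECvirt/specvirt.jar -v" 2>&1)
        build_matches "$out" || echo "10.140.3.$i: $out"
    done
}
```

Running `report_mismatches` from the prime client should print nothing when every host matches.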

Lisa

Miles

Re: "PrimeControl: terminating run" while running 2 tiles
« Reply #7 on: August 16, 2017, 12:11:13 AM »
Hi Lisa
Do you mean that I should copy the "SPECvirt" folder onto all VMs?
I did that, but the run still failed.

Thanks.

lroderic

Re: "PrimeControl: terminating run" while running 2 tiles
« Reply #8 on: August 16, 2017, 11:20:28 AM »
Yes, SPECvirt needs to be installed on every VM and client. You can't just copy the directory - you need the entire /opt directory with all the workload and harness software.

Rather than cloning a clean VM and installing the workload software every time, you can clone an existing, working tile. That's the easiest way to scale up. See https://www.spec.org/forums/index.php?topic=15.msg145#msg145 for the steps on cloning a tile.

Lisa

Miles

Re: "PrimeControl: terminating run" while running 2 tiles
« Reply #9 on: August 22, 2017, 05:25:43 AM »
Hi
It can start the run but aborts after the polling phase starts because of
Connection refused to host: 100.100.1.8 from webserver1.
(100.100.1.8 is the SPECVIRT_HOST)

But the connection between them works well.
The firewall is disabled.

primectrl.out
            ...
2017-08-22 16:08:12:627 PrimeControl: checking polling start response times...
2017-08-22 16:08:12:628 PrimeControl: sleeping for 0 sec
2017-08-22 16:08:12:628 PrimeControl: sending results counter reset command.
2017-08-22 16:08:12:629 PrimeControl: polling for 7200 sec
0,0,2017-08-22 16:08:22:691,14,10.0,11,10.0,18,10.0,298,17.25
0,1,2017-08-22 16:08:22:734,475,0,0,475,0,0,0,0,0,2147483647,0,50
0,2,2017-08-22 16:08:22:656,730,626,104,0,2037856,699,21459
0,3,2017-08-22 16:08:22:634,1,1,3487,3487,3487,3487,0,1
1,0,2017-08-22 16:08:20:148,14,10.0,16,10.0,29,10.0,0,0.00
1,1,2017-08-22 16:08:20:170,127,41,0,86,4748035,1620291,0,0,41,8558,417399,57
1,2,2017-08-22 16:08:20:122,757,670,87,0,1984379,712,21226
1,3,2017-08-22 16:08:20:098,1,1,1700,1700,1700,1700,1,0
                    (aborted)         

Clientmgr1_1096.out
-> 2017-08-22 15:53:12:610 SpecwebControl: Warming up for 900 seconds.
-> 2017-08-22 16:08:12:612 SpecwebControl: Clearing results.
-> 2017-08-22 16:08:12:614 SpecwebControl: Starting 7200-second runtime.
-> 2017-08-22 16:08:12:633 SpecwebControl: Clearing results.
-> 2017-08-22 16:08:22:734,475,0,0,475,0,0,0,0,0,2147483647,0,50
-> 2017-08-22 16:12:12:655 RemoteLoadGen: Warning: RMI exception trying to contact client1:1010. Retrying...
-> 2017-08-22 16:12:12:656 RemoteLoadGen: [ERROR] Unable to contact client1:1010
-> 2017-08-22 16:12:12:656 RemoteLoadGen: [ERROR] 1 remote clients, but only 0 responded
-> 2017-08-22 16:12:12:656 SpecwebControl: [ERROR] Client(s) not responding. Aborting test.
-> 2017-08-22 16:12:12:656 RemoteLoadGen: [ERROR] Remote exception setting server reset data collection from client1:1010
-> java.rmi.ConnectException: Connection refused to host: 100.100.1.8; nested exception is:
->    java.net.ConnectException: Connection refused
-> 2017-08-22 16:12:12:656 SpecwebControl: Stopping remote clients.
-> 2017-08-22 16:12:12:658 RemoteLoadGen: 180-second ramp-down starting.
-> 2017-08-22 16:12:12:658 RemoteLoadGen: stopping client client1:1010; abort=false
-> 2017-08-22 16:12:12:658 RemoteLoadGen: [ERROR] Remote exception stopping clients from client1:1010
-> java.rmi.ConnectException: Connection refused to host: 100.100.1.8; nested exception is:
->    java.net.ConnectException: Connection refused
-> 2017-08-22 16:12:12:659 SpecwebControl: Waiting for remote clients to stop.
-> 2017-08-22 16:12:12:659 RemoteLoadGen: [ERROR] Remote exception waiting for clients to complete from client1:1010
-> java.rmi.ConnectException: Connection refused to host: 100.100.1.8; nested exception is:
->    java.net.ConnectException: Connection refused
-> 2017-08-22 16:12:12:659 SpecwebControl: [ERROR] runWorkload() failed!
-> 2017-08-22 16:12:12:659 SpecwebControl: [ERROR] runTests() failed!
-> 2017-08-22 16:12:12:659 SpecwebControl: [ERROR] Benchmark run failed!
-> 2017-08-22 16:12:12:661 SpecwebControl: Terminating run. Please wait...
« Last Edit: August 22, 2017, 05:27:34 AM by Miles »

lroderic

Re: "PrimeControl: terminating run" while running 2 tiles
« Reply #10 on: August 22, 2017, 01:00:05 PM »
Looks like the SPECpolling agent is down on webserver1. What are the contents of:

Code: [Select]
ssh webserver1 "cat /tmp/pollme*"
Should look something like:

Code: [Select]
Creating RMI listener using RMI Registry port 8001
webserver1-int/10.100.1.8:8001 ready...

I stop and restart SPECpoll on each workload VM before every test. To automate this, have runspecvirt.sh call pollInit.sh in the helper directory, or you can run pollmecheck.sh manually to see that all SPECpoll processes are up.
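
The manual check of /tmp/pollme* can be scripted along these lines. This is only a sketch: `poll_listener_ready` is a hypothetical helper (it is not pollmecheck.sh), and it assumes the log contains a `host:port ready...` line like the sample above.

```shell
# Succeeds if the pollme log text on stdin contains a
# "<host>:<port> ready..." line.
poll_listener_ready() {
    grep -Eq ':[0-9]+ ready\.\.\.'
}

# Hypothetical usage for one workload VM:
# ssh webserver1 "cat /tmp/pollme*" | poll_listener_ready \
#     || echo "webserver1: SPECpoll listener not ready"
```

A passing check only shows that the listener started at some point. It does not prove the port is still reachable from the client.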

I recommend setting NUM_WORKLOADS = 2 to run only app/dbserver and web/infraserver until you get these working, and also setting RAMP_SECONDS = 600.

btw, looking through your Control.config, you can simplify some of the entries. Instead of having each tile and workload number in the indexes for PRIME_HOST_INIT_SCRIPT, you can set the value for the entire tile and/or workload this way:

Code: [Select]
PRIME_HOST_INIT_SCRIPT[0][0] = "jAppInitRstr.sh"
PRIME_HOST_INIT_SCRIPT[1][0] = "jAppInit.sh"
PRIME_HOST_INIT_SCRIPT[2][0] = "jAppInit.sh"
PRIME_HOST_INIT_SCRIPT[3][0] = "jAppInit.sh"
PRIME_HOST_INIT_SCRIPT[1] = "webInit.sh"
PRIME_HOST_INIT_SCRIPT[2] = "mailInit.sh"
PRIME_HOST_INIT_SCRIPT[3] = "batchInit.sh"


Same for RAMP_SECONDS:

Code: [Select]
RAMP_SECONDS[0] = 600
RAMP_SECONDS[1] = 600
RAMP_SECONDS[2] = 600
RAMP_SECONDS[3] = 600

Better yet:
Code: [Select]
RAMP_SECONDS= 600

lroderic

Re: "PrimeControl: terminating run" while running 2 tiles
« Reply #11 on: August 22, 2017, 01:03:19 PM »
Also, you're on the right track with testing web with WORKLOAD_LOAD_LEVEL[1] = 1000 vs. WORKLOAD_LOAD_LEVEL[1] = 2500. This shows you quickly if your SUT is under-configured. Are you running 10GbE between the clients and SUT? One tile uses about 1.4 Gb/s of network bandwidth. If you don't have 10GbE, you can split off webserver onto its own client.
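
As a rough illustration of that arithmetic, here is a sketch with hypothetical helper names. It assumes the figure of roughly 1.4 Gb/s per tile scales linearly with the number of tiles, which is an assumption; real traffic depends on the load levels.

```shell
# Estimated client-to-SUT traffic in Gb/s for a given number of tiles,
# using the rough 1.4 Gb/s per tile figure (assumed to scale linearly).
tile_bandwidth_gbps() {
    awk -v tiles="$1" 'BEGIN { printf "%.1f\n", tiles * 1.4 }'
}

# Succeeds if that estimate fits within a link of the given capacity in Gb/s.
fits_link() {
    awk -v tiles="$1" -v link="$2" 'BEGIN { exit !(tiles * 1.4 <= link) }'
}

# tile_bandwidth_gbps 2    # prints 2.8
# fits_link 2 1 || echo "2 tiles will not fit on a 1GbE link"
```

By this estimate, even a single tile would exceed a 1GbE link.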

Lisa

Miles

Re: "PrimeControl: terminating run" while running 2 tiles
« Reply #12 on: August 23, 2017, 01:59:07 PM »
Hi
I got "Creating RMI listener using RMI Registry port 8001 webserver1-int/10.100.1.8:8001 ready..."
 with the command: ssh webserver1 "cat /tmp/pollme*"

I changed the network to 10GbE, but it still failed.
The following is the main error message in Clientmgr1_1098.out:

WARNING: IOP00810011: Exception from readValue on ValueHandler in CDRInputStream
org.omg.CORBA.MARSHAL: WARNING: IOP00810011: Exception from readValue on ValueHandler in CDRInputStream  vmcid: OMG  minor code: 11 completed: Maybe
...

WARNING [javax.enterprise.resource.corba.ORBUtil]: IOP00810061: Could not
read exception from UEInfoServiceContext
...




lroderic

Re: "PrimeControl: terminating run" while running 2 tiles
« Reply #13 on: August 23, 2017, 02:56:08 PM »
The SPECpoll process is running on webserver1 - that's good.

The new error involves GlassFish and the dbserver. I run into this error on occasion, and the only fix I have found is to clone a dbserver VM if you have one, or to re-run the example VM scripts on a clean VM to remake it as a dbserver. I wish I had a better answer.

Make sure you don't run jAppInitRstr.sh on appserver2 or 3 or 4. Only run jAppInit.sh on them.

Are you restoring the dbserver between runs? Please post the contents of jAppInitRstr.sh and post the output of:

Code: [Select]
ssh dbserver1 "ls -lh /dbstore/backup"
It should be about 989MB.
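
If you'd rather script that size check, here is a small sketch. `backup_size_ok` is a hypothetical helper, the 50 MB tolerance is my own assumption rather than a SPECvirt value, and `du -sm` reports disk usage, which can differ slightly from the sizes `ls -lh` shows.

```shell
# Succeeds if a size in MB is within a tolerance of the ~989 MB figure.
backup_size_ok() {
    size_mb=$1
    tolerance_mb=${2:-50}        # assumed tolerance, not a SPECvirt value
    diff=$(( size_mb - 989 ))
    [ "${diff#-}" -le "$tolerance_mb" ]   # ${diff#-} drops the sign
}

# Hypothetical usage from the prime client:
# backup_size_ok "$(ssh dbserver1 "du -sm /dbstore/backup" | cut -f1)" \
#     || echo "dbserver1: unexpected backup size"
```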

Lisa

Miles

Re: "PrimeControl: terminating run" while running 2 tiles
« Reply #14 on: August 24, 2017, 03:18:27 AM »
Hi
Yes, I ran jAppInitRstr.sh only on appserver1.
I execute loaddb.sh before I start a run.
I cloned a new dbserver and continued with it, but got the following errors:

primectrl.out
2017-08-24 14:42:54:016 PrimeControl: checking polling start response times...
2017-08-24 14:42:54:018 PrimeControl: sleeping for 0 sec
2017-08-24 14:42:54:018 PrimeControl: sending results counter reset command.
2017-08-24 14:42:54:018 PrimeControl: polling for 7200 sec
2017-08-24 14:42:54:593 resetCounters() call failed on 100.100.1.8; aborting...
2017-08-24 14:43:23:919 [ERROR] One or more workloads exceeded maximum allowed polling response delay!
2017-08-24 14:43:23:919 PrimeControl: sending abortTest() to prime clients.
       ...
2017-08-24 14:43:23:919 PrimeControl: masters[2]=client1:1094
2017-08-24 14:43:23:922 PrimeControl: [ERROR] exception occurred sending abortTest signal to specweb_Stub[UnicastRef [liveRef: [endpoint:[100.100.1.8:6228](remote),objID:[ca345ed:15e12e621cd:-7ffe, -4227181212894052388]]]]. Exception was:
 java.rmi.ConnectException: Connection refused to host: 100.100.1.8; nested exception is:
   java.net.ConnectException: Connection refused
2017-08-24 14:43:24:923 specvirt: benchmark run failed!
2017-08-24 14:43:24:923 specvirt: Done!


I also still see "[ERROR] Remote exception clearing statistics from client1:1010"

-> java.rmi.ConnectException: Connection refused to host: 100.100.1.8;
in Clientmgr1_1096.out.

Should I increase any value in Control.config such as "POLL_INTERVAL_SEC"?
Thanks.
« Last Edit: August 24, 2017, 03:40:58 AM by Miles »