Product Support > SPECvirt_sc2013
problems running three tiles (RMI)
tdeneau:
I am currently trying to run with 3 tiles, each tile being driven from its own unique client VM. I have yet to get a successful 3-tile run. My problem is probably with the sequence of steps preparing things before the run. Can someone post what the recommended procedure is?
For example, I noticed that if I use the Control.config that came with the Example VM scripts, jAppInitRstr.sh runs simultaneously from each of the appservers, so 3 appservers all try to do a db restore at the same time, which seems clearly wrong. I have commented out the PRIME_HOST_INIT_SCRIPT[x] section of Control.config and am trying to do the inits from my own script before the run.
When I have completed my initialization, I check the following for x = 1, 2, 3:
* http://webserverx/Support
* http://appserverx:8000/SPECjAppServer/app?action=atomicityTests
These always look fine.
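The two per-tile checks above could be scripted roughly as follows. This is only a sketch: the host names follow the webserverx/appserverx pattern from this post, the function names are made up for illustration, and it assumes curl is installed on the machine running the check. It tests reachability and HTTP status only; it does not inspect the page content, so the atomicity test results still need to be read by eye.

```shell
#!/bin/sh
# Print the two check URLs for tile number $1 (naming taken from the post).
tile_urls() {
    x=$1
    echo "http://webserver${x}/Support"
    echo "http://appserver${x}:8000/SPECjAppServer/app?action=atomicityTests"
}

# Fetch one URL; -f makes curl fail on HTTP errors (status >= 400).
check_url() {
    if curl -fsS -o /dev/null --max-time 30 "$1"; then
        echo "OK   $1"
    else
        echo "FAIL $1"
        return 1
    fi
}

# Check tiles 1..$1; the return status is non-zero if any URL failed.
check_tiles() {
    tiles=$1
    failed=0
    x=1
    while [ "$x" -le "$tiles" ]; do
        # Read URLs from a here-document so the loop runs in this shell
        # and the "failed" flag survives.
        while IFS= read -r url; do
            check_url "$url" || failed=1
        done <<EOF
$(tile_urls "$x")
EOF
        x=$((x + 1))
    done
    return "$failed"
}
```

For a three-tile setup, `check_tiles 3` would probe all six URLs and return non-zero if any of them is unreachable.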
However, during the run something usually fails, for example:
--- Code: ---2017-01-02 19:46:04:095 Warning: .doMyLongCommand received an SocketTimeoutException exception
java.net.SocketTimeoutException: Read timed out
--- End code ---
For a three-tile run would the following init sequence be correct?
* jAppInitRstr.sh for appserver1 (restores dbserver1)
* jAppInit.sh for appserver2, 3 (without restore) (Can these all be run concurrently?)
* mailInitRstr.sh for each of mailserver 1,2,3 (assume all can be concurrent)
* webInit.sh for each of webserver 1,2,3 (assume all can be concurrent)
* batchInit.sh for each of batchserver 1,2,3 (assume all can be concurrent)
* Do all VMs need to be rebooted before each run?
ChrisFloyd:
Tom,
The jAppInitRstr.sh script should be run on the first app server (only) for each DB, e.g., Appserver1, Appserver5, Appserver9, etc.
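As a rough illustration of that rule, the sketch below picks the init script from the tile number. The value of 4 tiles per DB is inferred from the Appserver1/5/9 example rather than taken from the harness, and the function name is hypothetical.

```shell
#!/bin/sh
# Assumption: one DB is shared by every 4 consecutive tiles (1-4, 5-8, ...),
# inferred from the "Appserver1, Appserver5, Appserver9" example.
TILES_PER_DB=4

# Print which init script the app server of tile $1 should run.
app_init_script() {
    tile=$1
    if [ $(( (tile - 1) % TILES_PER_DB )) -eq 0 ]; then
        echo "jAppInitRstr.sh"   # first app server for its DB: init with restore
    else
        echo "jAppInit.sh"       # all others: init without restore
    fi
}

for t in 1 2 3 4 5; do
    echo "appserver$t: $(app_init_script "$t")"
done
```

With these assumptions, appserver1 and appserver5 get jAppInitRstr.sh and appservers 2 through 4 get jAppInit.sh, which matches the three-tile sequence proposed in the first post.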
As for the error message you posted, that may be "normal", depending on how many SocketTimeoutExceptions you are seeing at the start of the run. It isn't unusual to see maybe a dozen or so of these messages during warmup, especially if your disk subsystem is not a very low-latency one (such as an SSD-based one). For the mail workload, are you running from a previously "warmed-up then restored" mail store? (See my response to your question from Dec 5th: https://www.spec.org/forums/index.php?topic=63.0 )
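To judge whether the count is in that "dozen or so" range, the warnings can be counted in the client log. The following is a minimal sketch; the log path is whatever file your client writes to (passed as an argument here), and the function name is only illustrative. It matches the warning line rather than the bare exception name, because each occurrence in the excerpt above prints two lines that both contain "SocketTimeoutException".

```shell
#!/bin/sh
# Count SocketTimeoutException warnings in the log file given as $1.
count_timeouts() {
    # grep -c prints the number of matching lines. It exits with status 1
    # when that number is 0, so "|| true" keeps the function's status clean.
    grep -c 'received an SocketTimeoutException' "$1" || true
}
```

Running `count_timeouts` against the log from a failed run and from a good run gives a quick way to compare them.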
Thanks,
Chris
tdeneau:
From someone not that familiar with RMI...
I am running each client in a separate VM. At this stage of "functionality only", all the VMs including the client VMs are running on the SUT.
Previously I had been running my SPECVIRT_HOST on the same VM as client1 and did not have any RMI problems.
I wanted to rearrange things so that the SPECVIRT_HOST is running on bare metal on the SUT itself (and eventually on a separate system).
In Control.config (on controller and on each client) I use
SPECVIRT_HOST=specvirt-controller
I make sure the name specvirt-controller with the correct IP is in the /etc/hosts file on all the VMs.
I then get the following error from the clients.
2017-01-03 15:37:26:491 Remote exception calling getHostName(). Exception was:
java.rmi.ConnectException: Connection refused to host: 192.168.122.1; nested exception is:
java.net.ConnectException: Connection timed out (Connection timed out)
I seem to be able to get rid of these errors by adding
-Djava.rmi.server.hostname=specvirt-controller
to the java invocations in Clientmgr.sh and runspecvirt.sh
But is there a cleaner way of handling this?
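One thing that may help narrow this down is confirming what each VM's hosts file actually maps specvirt-controller to. The sketch below does that; the function name is hypothetical, and it reads only the hosts file (it does not consult DNS or any other name service). Comparing its output with the address in the error message (192.168.122.1 above) can show whether the clients are being sent to a different address than the one listed in /etc/hosts.

```shell
#!/bin/sh
# Print the address(es) that a hosts-format file maps to the name $1.
# $2 is the file to read; it defaults to /etc/hosts.
hosts_lookup() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v n="$name" '
        { sub(/#.*/, "") }   # strip comments, including commented-out lines
        { for (i = 2; i <= NF; i++) if ($i == n) print $1 }
    ' "$file"
}
```

Running `hosts_lookup specvirt-controller` on the controller and on each client should print the same single address everywhere.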
lroderic:
--- Code: ---jAppInitRstr.sh for appserver1 (restores dbserver1)
jAppInit.sh for appserver2, 3 (without restore) (Can these all be run concurrently?)
webInit.sh for each of webserver 1,2,3 (assume all can be concurrent)
batchInit.sh for each of batchserver 1,2,3 (assume all can be concurrent)
mailInitRstr.sh for each of mailserver 1,2,3 (assume all can be concurrent)
--- End code ---
Yes, these are run across the different clients simultaneously. Since the dbserver restore only occurs on vclient1, it'll take longer than the other three workloads. But the harness waits until all INIT scripts are finished running before continuing.
--- Code: ---Do all VMs need to be rebooted before each run?
--- End code ---
This is not a requirement and is up to you - depends on your measurement requirements.
Regarding the problem with hostname=specvirt-controller, does this happen if you don't use a - in the hostname?
Lisa
ChrisFloyd:
Tom,
Do you have lines at the top of your client's /etc/hosts file containing entries for "localhost" and "::1", etc? If so, have you tried commenting those out and trying again? I recall the problem you are running into has to do with RMI naming lookup/resolution not matching what it thinks the local hostname is. It's been a while since I've seen this problem, but I think the resolution for me was to comment/remove the "localhost" related entries.
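To see which lines would be affected before commenting anything out, something like the following sketch lists the loopback entries in a hosts file. The function name is illustrative, and treating every 127.x.x.x and ::1 line as "localhost related" is an assumption on top of the description above.

```shell
#!/bin/sh
# List uncommented loopback entries (127.x.x.x or ::1) with line numbers.
# $1 is the file to inspect; it defaults to /etc/hosts.
loopback_entries() {
    file=${1:-/etc/hosts}
    # "|| true" because grep exits with status 1 when nothing matches.
    grep -nE '^[[:space:]]*(127\.[0-9]+\.[0-9]+\.[0-9]+|::1)[[:space:]]' "$file" || true
}
```

The output is read-only information; keep a backup of /etc/hosts before editing it, since other software on the VM may rely on those entries.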