Author Topic: appserver problems on tile 2

tdeneau

  • Jr. Member
  • **
  • Posts: 51
  • Karma: +1/-1
appserver problems on tile 2
« on: December 15, 2016, 04:27:54 PM »
Sending this test message because the forum software keeps telling me my message body is empty

AnoopGupta

  • Jr. Member
  • **
  • Posts: 60
  • Karma: +0/-0
Re: appserver problems on tile 2
« Reply #1 on: December 15, 2016, 04:28:50 PM »
We do see your post and comment

tdeneau

Re: appserver problems on tile 2
« Reply #2 on: December 15, 2016, 04:31:53 PM »
I have all 4 workloads running fine on tile 1 (and can set up Control.config with NUM_WORKLOADS=4 and drive all 4 workloads).

Now I am trying to make a second tile.  I think I have all the IP address and /etc/hosts stuff complete.  I am trying to run the helper script jAppInit.sh but modified to use appserver2 instead of appserver1.  I am having trouble at the step where the script does
      ssh appserver2 'cd /opt/SPECjAppServer2004; ./appsrv-ctrl.sh start'

Here are the first things logged in /opt/glassfish3/glassfish/domains/spec2004-1/logs/server.log
Do I need to make some change on appserver2 to remedy the "no free port" error message?  I confess I do not know much about jAppServer2004.  The appsrv-ctrl.sh start  command of course works fine when directed to appserver1

[#|2016-12-15T15:09:26.361-0600|SEVERE|glassfish3.1.2|grizzly|_ThreadID=29;_ThreadName=Thread-2;|doSelect IOException
java.net.BindException: No free port within range: 7676=com.sun.enterprise.v3.services.impl.ServiceInitializerHandler@75e10cd1
    at com.sun.grizzly.TCPSelectorHandler.initSelector(TCPSelectorHandler.java:432)
    at com.sun.grizzly.TCPSelectorHandler.preSelect(TCPSelectorHandler.java:378)
    at com.sun.grizzly.SelectorHandlerRunner.doSelect(SelectorHandlerRunner.java:188)
    at com.sun.grizzly.SelectorHandlerRunner.run(SelectorHandlerRunner.java:132)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
|#]


AnoopGupta

Re: appserver problems on tile 2
« Reply #3 on: December 16, 2016, 04:27:50 AM »
BindException could mean GlassFish is trying to start on an incorrect host IP address, or the port 7676 is actually already in use.

Before running " ./appsrv-ctrl.sh start", you can check "netstat -an|grep 7676" to see if the port is already in use. If it is, try rebooting the appserver2 VM and recheck.
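A minimal sketch of that port check as a reusable filter. The function name port_listening is hypothetical, and the sketch assumes Linux-style "netstat -an" output where the local address is the fourth column and the state is the last one:

```shell
# Hypothetical helper: reads `netstat -an` style output on stdin and
# succeeds if a socket is in LISTEN state on the given port.
port_listening() {
    # $1 = port number; the local address must end in ":PORT" or ".PORT"
    awk -v port="$1" '
        $NF ~ /LISTEN/ && $4 ~ ("[:.]" port "$") { found = 1 }
        END { exit !found }
    '
}

# Usage on the VM (not executed here):
#   netstat -an | port_listening 7676 && echo "port 7676 is already in use"
```

Anchoring the match on the end of the local-address column avoids false hits on ports such as 17676, which a plain grep for 7676 would also report.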

However, I suspect that /etc/hosts on the tile 2 appserver2 VM is either missing "appserver" or maps it to an incorrect IP. Please ensure both appserver2 and appserver point to the same IP.
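One way to verify the mapping without reading the file by eye is sketched below. The helper names hosts_ip and same_ip are made up for illustration, and the sketch assumes the usual hosts format (IP first, then names and aliases, "#" for comments):

```shell
# Print the IP that a hosts-format file assigns to a name (first match wins).
hosts_ip() {
    # $1 = hosts file, $2 = host name or alias to look up
    awk -v name="$2" '
        { sub(/#.*/, "") }    # drop comments; awk re-splits the fields
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
    ' "$1"
}

# Succeed only if both names are present and map to the same IP.
same_ip() {
    # $1 = hosts file, $2 and $3 = names to compare
    ip_a=$(hosts_ip "$1" "$2")
    ip_b=$(hosts_ip "$1" "$3")
    [ -n "$ip_a" ] && [ "$ip_a" = "$ip_b" ]
}

# Usage on appserver2 (not executed here):
#   same_ip /etc/hosts appserver2 appserver || echo "appserver mapping is wrong"
```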



tdeneau

Re: appserver problems on tile 2
« Reply #4 on: December 16, 2016, 10:47:28 AM »
Thanks, Anoop.  It was indeed true that appserver and appserver2 were not mapped to the same IP address on appserver2.  I corrected this and appsrv-ctrl.sh start seemed to work.

I ran the complete jAppInitRstr.sh with appserver, specemulator and specdelivery all pointing to appserver2

However, http://appserver2:8000/SPECjAppServer/app?action=atomicityTests is showing that all 3 atomicity tests failed.

http://appserver2:8080/Emulator/EmulatorServlet looks OK.

Also, if I look at the atomicity tests page for appserver1 (sharing the same DB server as appserver2), http://appserver1:8000/SPECjAppServer/app?action=atomicityTests shows that all 3 atomicity tests passed.



tdeneau

Re: appserver problems on tile 2
« Reply #5 on: December 16, 2016, 11:52:46 AM »
Ignore previous problem report about atomicity.  I did not have dbserver in /etc/hosts file on appserver2.
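A missing entry like this can be caught before starting the app server. The sketch below is only an illustration: the function name missing_names is hypothetical, and the list of names to check is whatever the tile's configuration actually refers to.

```shell
# Print every required name that does not appear as a host or alias
# in a hosts-format file. Prints nothing if all names are present.
missing_names() {
    # $1 = hosts file; remaining arguments = names that must be present
    hosts_file=$1
    shift
    for name in "$@"; do
        awk -v name="$name" '
            { sub(/#.*/, "") }    # ignore comments
            { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
            END { exit !found }
        ' "$hosts_file" || echo "$name"
    done
}

# Usage on appserver2 (not executed here):
#   missing_names /etc/hosts appserver dbserver specemulator specdelivery
```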


tdeneau

Re: appserver problems on tile 2
« Reply #6 on: December 19, 2016, 04:45:23 PM »
    I am still having some small problems with running 2 tiles.
    In my configuration, I have two client VMs, client1 (hitting tile1 ) and client2 (hitting tile 2) and I am running runspecvirt.sh itself from client1.
    I do not have separate IP addresses for the xxx-int names

    I want to make sure I understand how the /etc/hosts files should be set up on the clients and the server VMs.
    From Anoop's comment below I am assuming that the /etc/hosts for a server VM on tile 2 should look  something like this (although I am not sure whether the server VMs need only client2 or both client2 and client1):

    10.236.119.223  batchserver2 batchserver
    10.236.119.234  infraserver2 infraserver2-int infraserver infraserver-int
    10.236.10.31    webserver2 webserver2-int webserver webserver-int
    10.236.10.190   mailserver2 mailserver
    10.236.10.112   appserver2 appserver appserver-int specdelivery specemulator
    10.236.10.142   dbserver dbserver1  dbserver-int dbserver1-int specdb

    10.236.10.32     client2 specdriver client
    10.236.10.178   client1

    with a similar pattern for the server VMs on tile1 (i.e., they only need to know the other VMs in their own tile, and the generic names (without numbers) should all coincide with the tile 1 names).

    What about the /etc/hosts on client2 itself?  Does it need to be any different from the pattern shown above?

    My other questions are on Control.config. 
    • Does Control.config have to be copied to each client?
    • I have attached the Control.config I am using for 2 tiles.  Does it look OK?

    I am asking these questions because I did a 2-tile run and it looked like it mostly worked but I saw the following problems:
    • 2016-12-18 17:14:37:756 Connection: SocketTimeoutException waiting for end-of-header
      2016-12-18 17:14:37:756 SPECweb_Support: [ERROR] SocketTimeoutException encountered during run!

      webserver on tile 1 looked OK.
      http://webserver2/support looked OK in my browser.

    • started showing response sets at 5:22.  Seemed to run well until 6:49 when we got
      [ERROR] One or more workloads exceeded maximum allowed polling response delay!
      sending abortTest() to prime clients.

      and later
      [ERROR] exception occurred sending abortTest signal to specweb_Stub[UnicastRef [liveRef: [endpoint:[10.236.10.32:18970](remote),objID:[1993d3e4:15914296663:-7ffe, -4334599631892616363]]]]. Exception was:
       java.rmi.ConnectException: Connection refused to host: 10.236.10.32; nested exception is:
          java.net.ConnectException: Connection refused (Connection refused)

However, I am suspecting that the tile2 appserver2 /etc/hosts may be missing "appserver" or may be mapped to incorrect IP. Please ensure both appserver2 and appserver point to the same IP.
« Last Edit: December 20, 2016, 12:46:13 PM by tdeneau »

ChrisFloyd

  • Moderator
  • Jr. Member
  • *****
  • Posts: 52
  • Karma: +2/-0
Re: appserver problems on tile 2
« Reply #7 on: January 03, 2017, 03:30:37 PM »
Tom,

Your /etc/hosts file example looks correct to me (for Tile2).  If you are using "generic" names for the SPECjAppServer config (i.e., specemulator, specdriver, dbserver, appserver), then you have the correct idea. Technically, each Tile's client and SUT VMs need only know the IPs of their own Tile's VMs and the IP of the prime-controller (aka 'master').  I generally configure a large /etc/hosts file that lists all of the clients and SUT VMs with their IPs, copy that to all of the VMs, and then append the generic names ("webserver, appserver, emuserver, dbserver, mailserver, ...etc") to the correct entry in /etc/hosts on each client and SUT VM.  (This can be automated via a few "grep" and "sed" commands, assuming the Tile# is part of the VM's hostname and the naming scheme is consistent across tiles.)
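A rough sketch of what such a "sed" step could look like. This is not the script described above, only an illustration under these assumptions: every per-tile hostname is a lowercase role name immediately followed by the tile number (webserver2, appserver2, ...), the hostname is the first name on its line, and addresses are IPv4. The function name add_generic_aliases is hypothetical. Extra aliases such as the "-int" names, specdelivery or specemulator would still have to be added separately.

```shell
# Append the generic (un-numbered) role name to every entry whose first
# hostname is "<role><tile>". Writes the modified hosts file to stdout.
add_generic_aliases() {
    # $1 = tile number, $2 = master hosts file
    sed -E 's/^([0-9.]+[[:space:]]+)([a-z]+)'"$1"'([[:space:]].*)?$/& \2/' "$2"
}

# Usage when preparing a tile 2 VM (not executed here):
#   add_generic_aliases 2 hosts.master > hosts.tile2
```

Requiring whitespace or end of line right after the tile number keeps tile 1 from matching names like client12.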

The Control.config file is used by the prime-controller (a.k.a. "Master") only.  The application-specific initialization calls are made to the individual clients using this master Control.config file.  Each client will use its local workload (web, mail, app, batch) configuration files, however.  (This allows specifying unique hostnames in the workload configuration files, but it sounds like you are not doing this, and are using 'generic hostnames' for client<-->SUT VM communications, which is fine.)

As for the errors you are seeing, those look like they may be QoS related -- i.e., a performance bottleneck rather than a misconfiguration, perhaps.
You may want to check your network utilization (i.e., 10Gb NICs are needed if running more than 1 tile over the same SUT NIC port). Also check the CPU utilization of your client VMs, webserver, etc.  As for performance tuning, optimizations, and scaling up to many tiles, that is beyond the scope of SPEC's support capabilities. However, it sounds like you are close to a successfully running multi-tile configuration.



tdeneau

Re: appserver problems on tile 2
« Reply #8 on: January 06, 2017, 03:52:34 PM »
I was rebuilding some VM images.
Everything worked fine for tile 1.

I copied the tile1 appserver image to tile 2, changed the hostname, and made sure all the /etc/hosts files on the various tile 2 vms were correct.

I ran jAppInitRstr.sh on client1.
I ran just jAppInit.sh on client2

I noticed that while appserver2:8000 showed the glassfish info page, when I tried
http://appserver2:8000/Supplier/DeliveryServlet I got
"the requested resource is not available"

I saw the following in the glassfish server.log

exception while invoking class org.glassfish.ejb.startup.ejbdeployer load method


I was able to get around this by re-running
    /opt/appvm-scripts/makeme-appserver.sh
and
    /opt/appvm-scripts/setup-files.sh

but when I was experimenting with higher number tiles before, I never saw this problem, so I don't understand why I needed to rerun setup-files.sh.  Can anyone explain?


Note that all of this was just in preparation for the actual specvirt run.
-- Tom

AnoopGupta

Re: appserver problems on tile 2
« Reply #9 on: January 06, 2017, 04:01:41 PM »
If http://appserver2:8000/Supplier/DeliveryServlet did not work, then for some reason the specj application did not deploy on the glassfish server. If you run into it again and can share the glassfish server.log, we could take a look.
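When collecting the log to share, it can help to pull out the SEVERE records first. A small sketch, assuming the GlassFish record header format seen earlier in this thread ("[#|timestamp|SEVERE|..."); the function name severe_entries is made up:

```shell
# List the header line of every SEVERE record in a GlassFish server.log,
# prefixed with its line number so the full record is easy to find.
severe_entries() {
    # $1 = path to server.log
    grep -n '|SEVERE|' "$1"
}

# Usage on appserver2 (not executed here; domain path as quoted earlier in the thread):
#   severe_entries /opt/glassfish3/glassfish/domains/spec2004-1/logs/server.log
```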

Also, you may want to look at the makeme-appserver.sh and setup-files.sh scripts to see what they did to fix your VM; that might shed some light on whether you did anything different this time.