Topic: SPECvirt Database/Appserver Validation

aakel

  • Newbie
  • *
  • Posts: 17
  • Karma: +3/-0
SPECvirt Database/Appserver Validation
« on: October 27, 2016, 11:17:48 AM »
I continue to receive messages as part of the benchmark summary that my runs are failing validation:
Aggregate Audit for Shared Database
Database 1 Errors:
PO Transaction validation FAILED
POLine Transaction validation: WARNING: New POLine DB Count ~ Delivery Servlet Tx Count > 10%

I've gone through a number of iterations of tuning both the underlying database (and its VM) and the Appserver JVM (and its VM).  Are there any tricks for addressing this particular validation failure?  From reading the code, it looks as though a single process on each Appserver executes the PO Transactions and just isn't able to keep up with the number of deliveries.  Is that right?  Any thoughts as to how I could improve this?
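
Read literally, the POLine warning quoted above says that the count of new POLine rows in the database and the Delivery Servlet transaction count differ by more than 10%. A minimal sketch of that kind of check follows, assuming the gap is measured relative to the larger of the two counts. The auditor's actual formula is not shown in this thread, and the function names and sample counts here are hypothetical.

```python
def relative_gap(db_count: int, servlet_count: int) -> float:
    """Difference between the two counts as a fraction of the larger one."""
    larger = max(db_count, servlet_count)
    if larger == 0:
        return 0.0
    return abs(db_count - servlet_count) / larger


def poline_warning(db_count: int, servlet_count: int, threshold: float = 0.10) -> bool:
    """True when the counts differ by more than the threshold (assumed 10%)."""
    return relative_gap(db_count, servlet_count) > threshold


# Hypothetical counts, for illustration only.
for db, servlet in [(1000, 950), (1000, 800)]:
    gap = relative_gap(db, servlet)
    print(f"db={db} servlet={servlet} gap={gap:.0%} warning={poline_warning(db, servlet)}")
```

If the real counts are available from the audit report, plugging them in shows how far a run is from the threshold, which may help when judging whether a tuning change moved things in the right direction.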

Things I've tried:
* This is running on a fairly fast storage layer (a set of software-RAID NVMe drives), though it's not using storage-based SR-IOV.
* The DB VM data should basically fit in memory (~30GB allocated, which is much higher than the minimum).
* Increased the Appserver JVM memory footprint, as many of the most recent submissions do.

Thanks.

lroderic

  • Moderator
  • Full Member
  • *****
  • Posts: 152
  • Karma: +6/-0
Re: SPECvirt Database/Appserver Validation
« Reply #1 on: October 30, 2016, 03:15:16 PM »
You don't need SR-IOV to pass validation.

Are you using SAN or NFS for the DB store? How many spindles are in the DB store LUN or mount point? Are you tracking any VM stats using top or iostat? How many tiles are you running? If more than one, which tile is failing validation? Do you use 10 GbE for the network? Are you using a private network for traffic between the DB and appserver?

AnoopGupta

  • Moderator
  • Jr. Member
  • *****
  • Posts: 60
  • Karma: +0/-0
Re: SPECvirt Database/Appserver Validation
« Reply #2 on: October 31, 2016, 12:26:06 PM »
I suspect that the emulator server has some issue. Are these working?
http://specemulator:8080/Emulator/EmulatorServlet?cmd=switchlog
http://specdelivery:8000/Supplier/DeliveryServlet?cmd=switchlog

Check the emulator server log and the application server log for any exceptions.

aakel

  • Newbie
  • *
  • Posts: 17
  • Karma: +3/-0
Re: SPECvirt Database/Appserver Validation
« Reply #3 on: November 06, 2016, 02:31:56 AM »
I believe that I've fixed the issue above by tuning the database (or the error is transient enough that it appears fixed).
In case it helps with the later questions:
Quote
Are you using SAN or NFS for the DB store?  How many spindles are in the DB store LUN or mount point?
Neither in this setup.  I'm using local storage on the VM host: three 3.2TB NVMe SSDs (each well above 1.5GB/s) in a RAID5 configuration.

Quote
Are you tracking any VM stats using top or iostat?
Not for the moment.

Quote
How many tiles are you running?
8 total tiles with 2 DB VMs.  I've validated that each set of tiles is communicating with their appropriate DB VMs.

Quote
Do you use 10 GbE for the network?
Yes.  A single 10GbE card for the VM host and a single 10GbE card for the client VM host.

Quote
Are you using a private network for traffic between the DB and appserver?
Yes.  All of the internal traffic (basically every line drawn on the illustrated tile diagram in the instructions) traverses an internal network.

Quote
I suspect that the emulator server has some issue. Are these working?
http://specemulator:8080/Emulator/EmulatorServlet?cmd=switchlog
http://specdelivery:8000/Supplier/DeliveryServlet?cmd=switchlog
Both of these work.

A couple more questions:
1. I'm running tests that include anywhere from 4 to 8 tiles.  When I look through the Atomicity.html files generated, I'm seeing intermittent failures (some tiles fail test one, others three).  Is this expected?  Could something be misconfigured in my database setup?  I'm using Postgres 9.2 (and I've tested 9.5, which shows the same results).  I've tried debugging with the jApp-testappserver.sh script, and I see the same results when I run the script from multiple tiles at a time.

2. I get a number of errors loading the database using the Postgresql scripts in the example VM documentation.  Since the provided schema files define indices and primary keys, the db-tuning.sql file produces a ton of errors saying that the indices and primary keys already exist.  Is this expected?  This occurs even in the documentation-recommended version of Postgres (9.2).

aakel

  • Newbie
  • *
  • Posts: 17
  • Karma: +3/-0
Re: SPECvirt Database/Appserver Validation
« Reply #4 on: November 06, 2016, 02:43:26 AM »
Also, just to add: Each tile individually passes the atomicity tests before a full SPECvirt run; however, after a full SPECvirt run, I've seen the tiles encounter failures in re-running the atomicity tests.

ChrisFloyd

  • Moderator
  • Jr. Member
  • *****
  • Posts: 50
  • Karma: +2/-0
Re: SPECvirt Database/Appserver Validation
« Reply #5 on: November 09, 2016, 03:58:41 PM »
Aakel,

>When I look through the Atomicity.html files generated, I'm seeing intermittent failures (some tiles fail test one, others three).  Is this expected?

No, that is not expected. The ACI tests that are run at the beginning of the run as part of the SPECjAppServer start phase should always pass.  The benchmark does not re-run these after the ramp-down phase, but manually re-running them (as you are doing) should still pass.  If they do not, there is a problem with the database configuration.

>I get a number of errors loading the database using the Postgresql scripts in the example VM documentation

I recall the Postgresql db scripts had a duplicate "indices creation" section. I suspect that is what you are seeing - if so, the messages about "index already exists" can be ignored.

Are you using the exact settings from the SPECvirt_sc2013 publications from Q2, Q3, and Q4 of 2013 that were submitted by HP?  Those results used Glassfish and Postgresql 9.2.

 


aakel

  • Newbie
  • *
  • Posts: 17
  • Karma: +3/-0
Re: SPECvirt Database/Appserver Validation
« Reply #6 on: November 09, 2016, 07:37:42 PM »
Quote
>When I look through the Atomicity.html files generated, I'm seeing intermittent failures (some tiles fail test one, others three).  Is this expected?

No, that is not expected. The ACI tests that are run at the beginning of the run as part of the SPECjAppServer start phase should always pass.  The benchmark does not re-run these after the ramp-down phase, but manually re-running them (as you are doing) should still pass.  If they do not, there is a problem with the database configuration.

Are the atomicity tests intended to be run in parallel?  That's the condition that's failing.  The tests pass if they're run one appserver at a time.

Quote
>I get a number of errors loading the database using the Postgresql scripts in the example VM documentation

I recall the Postgresql db scripts had a duplicate "indices creation" section. I suspect that is what you are seeing - if so, the messages about "index already exists" can be ignored.

Are you using the exact settings from the SPECvirt_sc2013 publications from Q2, Q3, and Q4 of 2013 that were submitted by HP?  Those results used Glassfish and Postgresql 9.2.
I'm using PostgreSQL 9.5 (instead of 9.2) with different VM hardware provisioning, so some values are different.  Looking at the options, there's no reason why they should impact the atomicity, consistency, or isolation guarantees of PostgreSQL.

Here are the options that differ (old->new):
random_page_cost = 1.5 -> 1.0
effective_cache_size = 40GB -> 20GB
checkpoint_segments = 256 -> max_wal_size = 12GB (per https://www.postgresql.org/docs/9.6/static/release-9-5.html)

Just to be sure, I set the first two settings back to the values used in the runs you've referenced.  The results were exactly the same: some appservers passed tests 1 and 2, others 2 and 3, another only 2.  However, even the failing appservers pass without issue when run in isolation (only a single appserver at a time).
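
For reference, the checkpoint_segments -> max_wal_size conversion listed above follows the approximate formula given in the PostgreSQL 9.5 release notes, max_wal_size = (3 * checkpoint_segments) * 16MB. A short worked example, with a function name chosen here purely for illustration:

```python
WAL_SEGMENT_MB = 16  # default WAL segment size assumed by the release-notes formula


def max_wal_size_mb(checkpoint_segments: int) -> int:
    """Approximate max_wal_size (in MB) equivalent to an old checkpoint_segments value."""
    return 3 * checkpoint_segments * WAL_SEGMENT_MB


mb = max_wal_size_mb(256)
print(f"max_wal_size = {mb}MB ({mb // 1024}GB)")  # prints "max_wal_size = 12288MB (12GB)"
```

So checkpoint_segments = 256 maps to the 12GB used in the 9.5 configuration. The release notes describe the result as only approximately equivalent.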

AnoopGupta

  • Moderator
  • Jr. Member
  • *****
  • Posts: 60
  • Karma: +0/-0
Re: SPECvirt Database/Appserver Validation
« Reply #7 on: November 10, 2016, 03:55:19 AM »
Quote
Are the atomicity tests intended to be run in parallel?  That's the condition that's failing.  The tests pass if they're run one appserver at a time.

No, they are not intended to be run in parallel. Up to 4 AppServers share a Database, and during the run only the 1st AppServer of each group is expected to run the Atomicity test. So tiles 1, 5, 9, ... will be the ones running the atomicity tests.
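
The rule described above can be sketched as a small calculation. This assumes 1-based tile numbers and exactly 4 AppServers per database, as stated in the reply; the function names are hypothetical and not part of the SPECvirt harness.

```python
APPSERVERS_PER_DB = 4  # "Up to 4 AppServers share a Database"


def db_number(tile: int) -> int:
    """1-based DB VM number serving a 1-based tile number."""
    return (tile - 1) // APPSERVERS_PER_DB + 1


def runs_atomicity(tile: int) -> bool:
    """True for the first tile of each group of four (tiles 1, 5, 9, ...)."""
    return (tile - 1) % APPSERVERS_PER_DB == 0


for tile in range(1, 9):
    marker = "runs atomicity test" if runs_atomicity(tile) else "skipped"
    print(f"tile {tile} -> DB{db_number(tile)}: {marker}")
```

For an 8-tile run this marks tiles 1 and 5, one per DB VM.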

aakel

  • Newbie
  • *
  • Posts: 17
  • Karma: +3/-0
Re: SPECvirt Database/Appserver Validation
« Reply #8 on: November 10, 2016, 11:05:00 AM »
Quote
Are the atomicity tests intended to be run in parallel?  That's the condition that's failing.  The tests pass if they're run one appserver at a time.

No, they are not intended to be run in parallel. Up to 4 AppServers share a Database, and during the run only the 1st AppServer of each group is expected to run the Atomicity test. So tiles 1, 5, 9, ... will be the ones running the atomicity tests.

Perhaps there's an issue with how I've set up my run?  I'm running 8 tiles and have 8 different *-0_Atomicity.html files in my results folder:
1-1.0/0-0_Atomicity.html
1-1.0/1-0_Atomicity.html
1-1.0/2-0_Atomicity.html
1-1.0/3-0_Atomicity.html
1-1.0/4-0_Atomicity.html
1-1.0/5-0_Atomicity.html
1-1.0/6-0_Atomicity.html
1-1.0/7-0_Atomicity.html

Is it possible that these are run in a staggered fashion?  When things fail, they fail differently for each tile's appserver, regardless of which DB VM they're connected to.

AnoopGupta

  • Moderator
  • Jr. Member
  • *****
  • Posts: 60
  • Karma: +0/-0
Re: SPECvirt Database/Appserver Validation
« Reply #9 on: November 10, 2016, 11:35:14 AM »
Quote
Perhaps there's an issue with how I've set up my run?  I'm running 8 tiles and have 8 different *-0_Atomicity.html files in my results folder:
1-1.0/0-0_Atomicity.html
1-1.0/1-0_Atomicity.html
1-1.0/2-0_Atomicity.html
1-1.0/3-0_Atomicity.html
1-1.0/4-0_Atomicity.html
1-1.0/5-0_Atomicity.html
1-1.0/6-0_Atomicity.html
1-1.0/7-0_Atomicity.html

Is it possible that these are run in a staggered fashion?  When things fail, they fail differently for each tile's appserver, regardless of which DB VM they're connected to.

You should look at 0-0_Atomicity.html and 4-0_Atomicity.html only; the rest can be ignored. Your tiles 1-4 and 5-8 should be going against different DB VMs. How many DB VMs do you have?
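
To pick the relevant reports out of a results folder, something like the following could be used. It assumes the leading number in each file name is a 0-based tile index (so 0-0 and 4-0 correspond to tiles 1 and 5) and that 4 tiles share each DB VM. That naming rule is inferred from this thread, not taken from the SPECvirt documentation.

```python
import re

# Matches e.g. "1-1.0/4-0_Atomicity.html" and captures the leading tile index.
REPORT_NAME = re.compile(r"(\d+)-0_Atomicity\.html$")


def relevant_reports(paths, appservers_per_db=4):
    """Keep only reports whose 0-based tile index is the first of its DB group."""
    keep = []
    for path in paths:
        match = REPORT_NAME.search(path)
        if match and int(match.group(1)) % appservers_per_db == 0:
            keep.append(path)
    return keep


listing = [f"1-1.0/{i}-0_Atomicity.html" for i in range(8)]
print(relevant_reports(listing))
```

With the eight files listed in the quote, this keeps only the 0-0 and 4-0 reports.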

aakel

  • Newbie
  • *
  • Posts: 17
  • Karma: +3/-0
Re: SPECvirt Database/Appserver Validation
« Reply #10 on: November 10, 2016, 11:44:29 AM »
Quote
You should look at 0-0_Atomicity.html and 4-0_Atomicity.html only; the rest can be ignored. Your tiles 1-4 and 5-8 should be going against different DB VMs. How many DB VMs do you have?

Ok.  Thanks.
My tile->DB setup should be correct: Tile 1-4 -> DB1 and Tile 5-8 -> DB2.  So, I have a total of 2 DB VMs.

AnoopGupta

  • Moderator
  • Jr. Member
  • *****
  • Posts: 60
  • Karma: +0/-0
Re: SPECvirt Database/Appserver Validation
« Reply #11 on: November 10, 2016, 01:24:45 PM »
Quote
Ok.  Thanks.
My tile->DB setup should be correct: Tile 1-4 -> DB1 and Tile 5-8 -> DB2.  So, I have a total of 2 DB VMs.

Your tile to DB mapping is correct. Could you please attach your SPECvirt/Control.config? I was not expecting to see Atomicity.html files for tiles that do not need to run atomicity tests.
 

aakel

  • Newbie
  • *
  • Posts: 17
  • Karma: +3/-0
Re: SPECvirt Database/Appserver Validation
« Reply #12 on: November 10, 2016, 02:08:58 PM »
Quote
Your tile to DB mapping is correct. Could you please attach your SPECvirt/Control.config? I was not expecting to see Atomicity.html files for tiles that do not need to run atomicity tests.

Sure.  I've attached it to this message.  It should be very similar to the multi-tile example provided by the SPECvirt team; however, I've extended the 5-tile case to 8 tiles.

aakel

  • Newbie
  • *
  • Posts: 17
  • Karma: +3/-0
Re: SPECvirt Database/Appserver Validation
« Reply #13 on: November 15, 2016, 05:15:03 PM »
Any thoughts on whether I have errors in my Control.config file?  Thanks again for your help.

AnoopGupta

  • Moderator
  • Jr. Member
  • *****
  • Posts: 60
  • Karma: +0/-0
Re: SPECvirt Database/Appserver Validation
« Reply #14 on: November 15, 2016, 05:47:49 PM »
Quote
Any thoughts on whether I have errors in my Control.config file?  Thanks again for your help.

It looks fine to me. You can ignore Atomicity.html files for tiles that do not need to run atomicity tests. If you see any Validation errors in the benchmark report generated at the end of the run, then those issues will need to be addressed.