Author Topic: bach_interval[1]:FAILED batch interval was never started

scharel

  • Newbie
  • *
  • Posts: 3
  • Karma: +1/-0
bach_interval[1]:FAILED batch interval was never started
« on: November 09, 2015, 10:56:32 AM »
I get the above message when running a benchmark. Has anyone seen this message before, or does anyone know where the problem could be?
It looks like one batch interval runs successfully (high CPU load) and then no other interval starts (idle CPU for the rest of the benchmark).

lroderic

  • Moderator
  • Full Member
  • *****
  • Posts: 167
  • Karma: +6/-0
Re: bach_interval[1]:FAILED batch interval was never started
« Reply #1 on: November 09, 2015, 12:35:06 PM »
We'll need much more info. On the client, please zip up /opt/SPECvirt/*.out and Control.config, then post the archive here.

scharel

  • Newbie
  • *
  • Posts: 3
  • Karma: +1/-0
Re: bach_interval[1]:FAILED batch interval was never started
« Reply #2 on: November 12, 2015, 04:22:08 AM »
Thank you for the reply. This issue seems to be solved now. It was probably related to shortened intervals for testing.

We probably had a successful run overnight, but the output to the raw file suddenly stopped at a file size of exactly 64k.
Load on the system was visible for the whole test interval.
The OS is CentOS 6.4.

lroderic

  • Moderator
  • Full Member
  • *****
  • Posts: 167
  • Karma: +6/-0
Re: bach_interval[1]:FAILED batch interval was never started
« Reply #3 on: November 12, 2015, 09:39:22 AM »
Yes, a shortened measurement interval will mess with the batch workload. Glad to hear it's working.

tdeneau

  • Jr. Member
  • **
  • Posts: 51
  • Karma: +1/-1
Re: bach_interval[1]:FAILED batch interval was never started
« Reply #4 on: January 11, 2017, 12:52:38 PM »
I noticed this "FAILED batch interval was never started" message as well (I had shortened the run time to 1 hour instead of 2). Can you provide more detail on what is meant by "messing with the batch workload"? When I look at the results directory, I see CPU2006 .log files created for about the first 10 minutes of the run.

AnoopGupta

  • Jr. Member
  • **
  • Posts: 60
  • Karma: +0/-0
Re: bach_interval[1]:FAILED batch interval was never started
« Reply #5 on: January 18, 2017, 01:52:18 PM »
The batchserver runs its workload once per hour. With POLL_INTERVAL_SEC reduced to 3600, the batchserver would not have had a chance to start the 2nd iteration, which results in the error. However, the 1st iteration of the batchserver should have run fine. If the batchserver workload did not run at all, you will have to debug further.
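
To make the arithmetic concrete, here is a rough sketch in Python. It is not taken from the harness: the function name and the timing model are assumptions for illustration only. It assumes the first batch iteration starts at t = 0 and each later one starts exactly one hour after the previous, and it ignores ramp-up and any other harness timing.

```python
import math

# Assumption: one batch iteration per hour, as described in the reply above.
BATCH_PERIOD_SEC = 3600


def batch_iterations_started(poll_interval_sec, period_sec=BATCH_PERIOD_SEC):
    """Count iterations whose start time falls strictly before the interval ends.

    Hypothetical model: iteration k starts at k * period_sec, the first at t = 0.
    """
    if poll_interval_sec <= 0:
        return 0
    return math.ceil(poll_interval_sec / period_sec)


for interval in (3600, 7200):
    print(interval, batch_iterations_started(interval))
# prints:
# 3600 1
# 7200 2
```

Under this simplified model, a 3600-second interval ends at the moment the 2nd iteration would begin, so only the 1st iteration starts, which would be consistent with the batch interval [1] failure.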