Are you using 10GbE (or greater) network infrastructure on your client VMs and SUT VMs? (i.e., 10GbE NICs on client hosts, 10GbE network on SUT VMs, 10GbE network switch)
If 1 tile runs successfully and passes QoS, but QoS fails at higher tile counts, this indicates a tuning problem, not a functional problem with the benchmark harness or workload. Every tile drives the same load for a given workload type, so if higher tile counts fail webserver QoS, the likely culprit is a bottleneck on the SUT network or the client network. Each tile requires ~800 Mbit/s on both the SUT and client infrastructure, so aggregate bandwidth demand grows linearly with tile count.
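
As a rough sanity check, the bandwidth arithmetic can be sketched as below. This is only an illustration: the per-tile figure is assumed to be roughly 800 Mbit/s (substitute the value you actually measure), the function names `required_gbit` and `max_tiles` are hypothetical and not part of any benchmark harness, and protocol overhead and other traffic on the link are ignored.

```python
import math

# Assumed per-tile bandwidth in Mbit/s; replace with your measured value.
PER_TILE_MBIT = 800.0


def required_gbit(tiles: int, per_tile_mbit: float = PER_TILE_MBIT) -> float:
    """Aggregate bandwidth (Gbit/s) needed on each side (SUT and client)."""
    return tiles * per_tile_mbit / 1000.0


def max_tiles(link_gbit: float, per_tile_mbit: float = PER_TILE_MBIT) -> int:
    """Largest whole tile count a single link of the given speed can carry."""
    return math.floor(link_gbit * 1000.0 / per_tile_mbit)


if __name__ == "__main__":
    for link in (1, 10, 25):
        print(f"{link} GbE link: up to {max_tiles(link)} tile(s)")
    print(f"4 tiles need about {required_gbit(4):.1f} Gbit/s per side")
```

Under these assumptions a 1GbE link saturates after a single tile, which would match the symptom of 1 tile passing QoS while higher tile counts fail.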