Web server crashes on Saturdays due to high disk I/O
I have an odd problem on a ServerPilot VPS. It's fine through the week - responsive and not under stress, hosting about 60 low-traffic WordPress sites. Then on Saturdays, between 8:30am and 6pm, the server occasionally runs out of resources and crashes. Once I reboot it through the VPS control panel, the sites all come back and are fine for at least an hour. This happens maybe 3-5 times through the day. It has happened three weekends in a row after being stable for around 6 months. It also happened on one single Sunday morning at 8:30am.
I've not been able to watch top / iotop consistently during a Saturday, so I can't see what is happening second to second, but the VPS provider says the disk is seeing high read/write on and off on the days of the crashes. The one time I did manage to watch top, I could see the webserver's page requests in the process list, but they didn't seem to be getting served because they never dropped off. I think each request sits in the list consuming a small amount of CPU and memory, and these gradually build up until the server runs out of memory, the CPU is maxed, and all I can do is reboot through the VPS control panel. Nothing at the top of the list was taking large amounts of CPU, so I think some process was thrashing the disk but not the CPU.
I need a tool that will watch disk usage and record what is happening, so I can view a log file after the crash and see which process / webserver site pool was hammering the disk. I installed sar from sysstat, but I can't seem to get stats on which webserver site pool is thrashing the disk, just that the disk is under pressure.
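To illustrate the kind of logging I'm after, here is a rough sketch of what I imagine such a tool doing. It samples the per-process counters in /proc/<pid>/io and appends the busiest processes to a log file. The log path, the interval, and the function names are all made up for illustration. I'm also assuming it runs as root (otherwise other users' /proc/<pid>/io files aren't readable) and that the PHP pool name shows up in each worker's command line, which may not hold on every setup.

```python
#!/usr/bin/env python3
"""Log the processes doing the most disk I/O, sampled from /proc/<pid>/io."""
import os
import time

INTERVAL = 10                      # seconds between samples (assumed value)
TOP_N = 5                          # processes logged per sample (assumed value)
LOGFILE = "/var/log/io-watch.log"  # hypothetical log path


def parse_proc_io(text):
    """Return (read_bytes, write_bytes) from the contents of /proc/<pid>/io."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if value.strip().isdigit():
            fields[key.strip()] = int(value)
    return fields.get("read_bytes", 0), fields.get("write_bytes", 0)


def snapshot():
    """Map pid -> (read_bytes, write_bytes, command line) for readable processes."""
    procs = {}
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open(f"/proc/{pid}/io") as f:
                rd, wr = parse_proc_io(f.read())
            with open(f"/proc/{pid}/cmdline", "rb") as f:
                cmd = f.read().replace(b"\0", b" ").decode(errors="replace").strip()
        except OSError:
            continue  # process exited, or we lack permission (needs root)
        procs[pid] = (rd, wr, cmd)
    return procs


def top_deltas(prev, curr, n=TOP_N):
    """Return up to n (bytes_moved, pid, cmd) tuples, busiest first."""
    rows = []
    for pid, (rd, wr, cmd) in curr.items():
        # A process seen for the first time gets a delta of zero.
        old_rd, old_wr, _ = prev.get(pid, (rd, wr, cmd))
        moved = (rd - old_rd) + (wr - old_wr)
        if moved > 0:
            rows.append((moved, pid, cmd))
    return sorted(rows, reverse=True)[:n]


if __name__ == "__main__":
    prev = snapshot()
    while True:
        time.sleep(INTERVAL)
        curr = snapshot()
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        # Reopen the log each sample so entries are written out before a crash.
        with open(LOGFILE, "a") as log:
            for moved, pid, cmd in top_deltas(prev, curr):
                log.write(f"{stamp} pid={pid} bytes={moved} {cmd}\n")
        prev = curr
```

Ideally I could then open the log after a reboot and see which command line (and hopefully which site pool) was moving the most bytes in the minutes before the crash.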
I have checked cron and can't see anything scheduled specifically for Saturdays, and I'm fairly confident it isn't a traffic issue. It could well be some rogue WordPress plugin doing something, but with 60 sites I'm at a loss as to how to track that down easily.
Any help would be much appreciated, it's ruining my weekends!