Monday, January 25, 2016

Stuttering Sound Problems with KOTOR 2 in Linux

Sometimes I just have to shake my head in disbelief -- how do some things pass through QA?

I recently picked up Knights of the Old Republic 2 on Steam.  It's been on my list of games to play for a while, and the combination of it being on sale and being Linux compatible made me pull the trigger.

Pretty quickly I noticed that the sound was stuttering really badly. After putting up with it for way too long, some google fu led me to a Steam forum thread where Windows users were reporting the same problem on multicore CPUs.   Apparently, setting the CPU affinity resolves the issue.  I can confirm that this is easy to do on Linux and it does solve the problem.

Start up KOTOR2, then Alt-Tab to an xterm and run the following:


ps -ef | grep KOTOR | grep -v grep | grep -v 'bin/sh' | awk '{print $2}' | xargs taskset -p 03

This figures out the PID of the KOTOR2 process and locks it to a subset of CPUs -- the mask 03 pins it to the first two.  You can close the xterm at this point.  Now Alt-Tab back to KOTOR2 and the sound stuttering should be fixed.  The CPUs are counted from 0, so on a quad core the valid CPUs are 0 through 3.   Based on your rig, there might be a better CPU to lock to; one thing to look at is which CPU is handling the various IRQ requests.  I didn't have to dig into that even with an AMD FX-6300, so I suspect any recent chipset wouldn't require further tuning either.
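If your system has pgrep, a shorter variant that should do roughly the same thing (assuming the process name still matches 'KOTOR') is the pair of commands below.  It may also catch the launcher shell, but pinning that too is harmless, and the second line just prints the affinity back so you can confirm it took.

pgrep -f KOTOR | xargs -r -n1 taskset -cp 0,1   # -c takes a CPU list instead of a hex mask
pgrep -f KOTOR | xargs -r -n1 taskset -cp       # show the current affinity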



https://steamcommunity.com/app/208580/discussions/0/648814842278312559/

http://manpages.ubuntu.com/manpages/lucid/man1/taskset.1.html

Monday, January 18, 2016

SSD Tuning for Linux Servers




I got a call from a client that was having issues with really slow database queries.   Part of the solution was to upgrade to the latest version of Percona Server 5.6, but while we were in, we took the time to look into the tuning of the machine.

The machine was a beast -- over 100G of RAM and a few SSDs behind a RAID controller -- but its reported IO was dismal in production.   UPDATE queries that should have only taken microseconds were taking minutes.

We optimized tables and removed duplicate indexes -- pretty typical stuff.   During the table optimization, the SSD array was getting over 500Mb/s for writes, so I was reasonably sure the hardware wasn't failing.

We found that it was running EXT4 with barriers enabled.   I have found that MySQL performs poorly with EXT4's barrier feature enabled, so we disabled it in fstab and rebooted.
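For reference, an EXT4 mount line with barriers disabled looks something like the one below.  The device, mount point and other options here are just placeholders for illustration; the option itself can be spelled either 'nobarrier' or 'barrier=0'.

/dev/sda1   /var/lib/mysql   ext4   defaults,noatime,nobarrier   0   2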

After that was all done, I looked into what we could do with respect to kernel tuning for this workload. In theory, these settings should help in this type of situation:

echo 0 > /sys/block/sda/queue/rotational    # tell the kernel this is not spinning media
echo 0 > /sys/block/sda/queue/add_random    # stop feeding this device's IO timings into the entropy pool
echo 2 > /sys/block/sda/queue/rq_affinity   # complete IO on the CPU that issued the request
echo noop > /sys/block/sda/queue/scheduler  # let the RAID controller do the scheduling
echo 2 > /sys/block/sda/queue/nomerges      # don't try to merge requests
/sbin/blockdev --setra 0 /dev/sda           # no readahead on this logical drive

Given that it is an array of SSDs attached to a RAID controller card, I went with the idea that the kernel should just let MySQL and the RAID controller do as much of the work as possible and stop second-guessing them.  This is why the scheduler is set to 'noop' and the read ahead on this logical drive is set to 0.
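To double check that those two settings took effect, something like this should echo them back -- the bracketed entry in the scheduler file is the active one, and blockdev reports readahead in 512-byte sectors:

cat /sys/block/sda/queue/scheduler
/sbin/blockdev --getra /dev/sda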

I found that the OS was not able to identify the drive as an array of SSDs, so we had to set the 'rotational' setting to 0 ourselves, which is supposed to trigger certain SSD optimizations within the kernel.

Next, the 'add_random' setting is disabled so that we stop collecting entropy from this device.   Since it is an SSD, I would suspect the entropy provided would be poor, and it adds a small amount of overhead to each IO.  The node also doesn't do much crypto work, so this was an obvious choice.
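One caveat: these sysfs writes don't survive a reboot.  I haven't tested it on this particular box, but a udev rule along these lines should reapply them at boot (adjust the kernel name to match your device):

# /etc/udev/rules.d/60-ssd-tuning.rules (example only)
ACTION=="add|change", KERNEL=="sda", ATTR{queue/rotational}="0", ATTR{queue/add_random}="0", ATTR{queue/rq_affinity}="2", ATTR{queue/scheduler}="noop", ATTR{queue/nomerges}="2", ATTR{queue/read_ahead_kb}="0"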

Alas, the client was keen on getting this node back into production, so I was not able to do any real benchmarking on these configuration changes, but in the end we did improve the situation considerably and the client was happy about that.

Monday, January 11, 2016

An alternative to 'top' for MySQL

I think that one of my most used tools at work is 'top'.   It helps me figure out what is going on with a server very quickly.  While it's not perfect, it is a good all-purpose tool. Unfortunately, I haven't yet found a viable prepackaged alternative for MySQL, so I resort to running the following from the shell:

for i in `seq 1 1000`; do clear; uptime; free -m; mysql -uxxx -pxxx information_schema -B -e "select * from PROCESSLIST where command != 'Sleep';" 2>/dev/null | grep -v 'from PROCESSLIST where '; sleep 4; done

Modify these commands to suit your needs -- I decided that updating every 4 seconds and displaying the memory info in MB was reasonable enough for my purposes.

This allows me to monitor what a particular MySQL server is doing and hopefully spot any long running queries at a glance, without relying on any application debugging support or MySQL's general_log.

It may not be as pretty or polished as iotop or top, but it can give some insight into what is going on with a MySQL server.



The problem with relying on any particular application's debugging logs is that any access done directly by other MySQL users will not be logged.  MySQL's general log does capture everything and definitely has its purpose, both during application testing and during fault resolution in the field, but it only reports on completed queries.

This solution does have some shortcomings: it doesn't capture every single query, nor does it show whether a user is hammering the database with thousands of tiny queries every second.  But as a rough equivalent to top for peering into MySQL, it gets the job done.
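For what it's worth, roughly the same loop can be wrapped in watch instead (same placeholder credentials and 4 second interval).  Just keep in mind that watch clips long lines to the terminal width, which is part of why I stick with the plain for loop:

watch -n 4 "uptime; free -m; mysql -uxxx -pxxx information_schema -B -e \"select * from PROCESSLIST where command != 'Sleep';\" 2>/dev/null | grep -v 'from PROCESSLIST where '"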