We are in the process of moving our MySQL database to Docker Swarm, and we have a performance issue:
We have a 3-node Portworx cluster with a 1 TB SSD drive on each node. The nodes are connected over a 10 Gbit network.
I created a volume for the MySQL database like this:
pxctl volume create cmtsdb_vl --sharedv4 -s 1024 -r 2 --io_priority high --io_profile=db_remote
The volume is mounted by Docker Swarm onto /var/lib/mysql.
The volume is replicated on two of the three available machines.
The container runs on one of the two machines that host the volume.
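For reference, the Swarm service mounts the volume roughly like this (a sketch, not my exact service definition; the service name and image tag are placeholders, and the Portworx Docker volume driver name is assumed to be pxd):
# illustrative only: mounts the Portworx volume at the MySQL data directory
docker service create --name cmtsdb \
  --mount type=volume,source=cmtsdb_vl,target=/var/lib/mysql,volume-driver=pxd \
  mysql:8.0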
Sustained writes run at an acceptable ~200 MB/s:
root@db-cmts_nabp:/var/lib/mysql# time dd if=/dev/zero bs=1G count=10 of=delete_me oflag=sync
10+0 records in
10+0 records out
10737418240 bytes (11 GB, 10 GiB) copied, 51.4847 s, 209 MB/s
real 0m51.582s
user 0m0.005s
sys 0m27.818s
But MySQL write performance is very poor: a miserable 2 MB/s.
I suspect that is because sync takes an unusually long time to complete; while MySQL is running, sync takes 600-1000 milliseconds:
root@db-cmts_nabp:/# time sync
real 0m0.660s
user 0m0.003s
sys 0m0.000s
I already tried to minimize the number of syncs issued by MySQL:
innodb_flush_log_at_trx_commit=2
I also remounted the volume's ext4 filesystem with the nobarrier option, but that did not help (I know it is unacceptable for production).
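The remount was along these lines (shown for completeness; nobarrier is deprecated on newer kernels):
mount -o remount,nobarrier /var/lib/mysql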
Do you have other suggestions for me to try?
Portworx Version: 2.6.x
Deployment Type: On Premise
Hi Arie,
Why are you creating a --sharedv4 volume? For a database you don't need sharedv4.
You can create a normal volume like this:
pxctl volume create cmtsdb_vl -s 1024 -r 2 --io_priority high --io_profile=db_remote
Also, you are provisioning a 1 TB volume, which is the same size as the backend disk. Can you verify whether it is exactly 1 TB or less than 1 TB? I would recommend creating a 900 or 950 GB volume and leaving some buffer space.
Can you share the output of pxctl status?
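You can also double-check the flags on the existing volume with pxctl volume inspect, which should show whether sharedv4 and the io_profile are actually set:
pxctl volume inspect cmtsdb_vl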
I rebuilt the volume without --sharedv4 and with 950 GB, but performance is still the same 1.5-2 MB/s of write I/O (and sync takes 0.2-0.5 seconds):
root@cmts03-nabp:~# pxctl v l
ID NAME SIZE HA SHARED ENCRYPTED PROXY-VOLUME IO_PRIORITY STATUS SNAP-ENABLED
418766255164877162 cmtsdb_vl 950 GiB 2 no no no HIGH up - attached on 10.3.8.193 no
877398645523791336 cmtsdb_vl_20210108 950 GiB 2 no no no HIGH up - detached no
pxctl status output:
root@cmts03-nabp:~# pxctl status
Status: PX is operational
License: PX-Developer
Node ID: 5def87aa-3c02-432a-b4c3-4bfdaa94dc8f
IP: 10.3.8.193
Local Storage Pool: 1 pool
POOL IO_PRIORITY RAID_LEVEL USABLE USED STATUS ZONE REGION
0 HIGH raid0 1.0 TiB 285 GiB Online default default
Local Storage Devices: 1 device
Device Path Media Type Size Last-Scan
0:1 /dev/sdd STORAGE_MEDIUM_SSD 1.0 TiB 06 Jan 21 10:13 UTC
total - 1.0 TiB
Cache Devices:
* No cache devices
Kvdb Device:
Device Path Size
/dev/sdb 100 GiB
* Internal kvdb on this node is using this dedicated kvdb device to store its data.
Metadata Device:
1 /dev/sdc STORAGE_MEDIUM_SSD
Cluster Summary
Cluster ID: pwx.cmts-db.nabp
Cluster UUID: c407d355-b559-4059-81b5-adc6ef796799
Scheduler: swarm
Nodes: 3 node(s) with storage (3 online)
IP ID SchedulerNodeName StorageNode Used Capacity Status StorageStatus Version Kernel OS
10.3.8.192 83280ada-8cc3-4c41-8c47-9568c6ca3c38 axblwfvg0f8fwb775sfupb5co Yes 285 GiB 1.0 TiB Online Up 2.6.2.0-f0dd370 5.4.0-60-generic Ubuntu 20.04.1 LTS
10.3.8.193 5def87aa-3c02-432a-b4c3-4bfdaa94dc8f ypm1g1hqv4r8y2jcl1r80ti0h Yes 285 GiB 1.0 TiB Online Up (This node) 2.6.2.0-f0dd370 5.4.0-58-generic Ubuntu 20.04.1 LTS
10.3.8.191 00774d33-e530-4416-8f3e-eab63e5f1df8 nsf8a7e4dvdeez2vcpfsvmmgm Yes 10 GiB 1.0 TiB Online Up 2.6.2.0-f0dd370 5.4.0-58-generic Ubuntu 20.04.1 LTS
Warnings:
WARNING: Swap is enabled on this node.
Global Storage Pool
Total Used : 580 GiB
Total Capacity : 3.0 TiB
root@cmts03-nabp:~#
Hi Arie,
Can you describe your platform in more detail? Are the nodes VMs, cloud instances, or bare metal?
You also mentioned you are moving your database to Docker Swarm, so where was it running previously?
How are you comparing the metrics/numbers? Do you have any comparison chart?
Yes, the three machines are Proxmox 6.2-15 VMs.
Each machine is allocated 64 GB RAM and 6 CPUs.
Eight physical SSD drives are combined into a 10 TB LVM setup, which is then used to provide virtual disks to the VMs.
The virtual disks have the Cache option set to “Default (No cache)”.
According to Performance Tweaks - Proxmox VE, it seems safe to enable writeback caching, since in the worst case of a single machine failure (which is what we want to protect against), the other two machines would continue to work normally. What do you think?
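If we do switch, the cache mode can be changed per virtual disk on the Proxmox host with something like the following (the VM ID, disk slot, and volume name here are placeholders; qm set rewrites the whole scsi0 option string, so any other options already set on that disk would need to be repeated):
qm set 101 --scsi0 local-lvm:vm-101-disk-0,cache=writeback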
I found the reason for the slowness. Posting it here, maybe it will help someone…
It turned out the default binlog setting is sync_binlog=1, meaning MySQL issues an fsync after every binlog write. And in my case, the binlogs were written to a separate Portworx volume created with the --sharedv4 flag.
After changing sync_binlog to 0, we get a solid 20 MB/s of writes.
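For anyone else hitting this, the relevant my.cnf settings ended up roughly as below (a sketch; both settings relax durability, so only use them if losing up to about a second of transactions or some binlog events on a crash is acceptable):
[mysqld]
# write the InnoDB redo log at each commit but flush it to disk only about once per second
innodb_flush_log_at_trx_commit = 2
# do not fsync the binary log on every write; let the OS decide when to flush it
sync_binlog = 0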
Thank you for your help!
Hi Arie,
Thank you for the update. Yes, it will surely help others if they run into the same issue.