Px-etcd: Kernel error /source/fs/fs-writeback.c:2337

My CoreOS VM guest (running Kubernetes) is throwing kernel errors all the time.
The system is running and Portworx status is all green and everything is working, but…

System:
Container Linux by CoreOS stable (2023.5.0)
4.19.25-coreos #1 SMP Sat Mar 9 01:05:06 -00 2019 x86_64 QEMU Virtual CPU version 2.5+ AuthenticAMD GNU/Linux

Running on Ubuntu 18.04 KVM Host:
qemu 1:2.11+dfsg-1ubuntu7.12

Portworx:
portworx/px-enterprise:2.0.3.3

Installed on Kubernetes 1.13.15 with:
https://install.portworx.com/?mc=false&kbver=1.13.5&b=true&j=auto&c=flxc-staging-a5e251bf-a665-420e-85ed-9a2d320baca2&stork=true&lh=true&st=k8s

Kernel error (every few seconds):
[40302.956647] WARNING: CPU: 3 PID: 5370 at …/source/fs/fs-writeback.c:2337 __writeback_inodes_sb_nr+0xc1/0xd0
[40302.961390] Modules linked in: xt_set xt_multiport iptable_mangle iptable_raw ip_set_hash_ip ip_set_hash_net ipip tunnel4 ip_tunnel tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag veth xt_statistic xt_nat ipt_REJECT nf_reject_ipv4 ip_set ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs ip6table_nat nf_nat_ipv6 ip6_tables xt_comment xt_mark ipt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat_ipv4 xt_addrtype iptable_filter xt_conntrack nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge stp llc overlay dm_thin_pool dm_persistent_data dm_bio_prison xfs px(OE) nls_ascii nls_cp437 vfat fat mousedev kvm_amd kvm psmouse i2c_piix4 i2c_core button virtio_balloon irqbypass evdev sch_fq_codel ext4 crc16 mbcache jbd2 fscrypto btrfs xor zstd_decompress zstd_compress
[40302.980694] xxhash lzo_compress dm_verity dm_bufio raid6_pq libcrc32c crc32c_generic uhci_hcd ata_piix virtio_net ehci_pci libata net_failover ehci_hcd failover scsi_mod usbcore virtio_blk virtio_console usb_common qemu_fw_cfg dm_mirror dm_region_hash dm_log dm_mod
[40302.988725] CPU: 3 PID: 5370 Comm: px-etcd Tainted: G W OE 4.19.25-coreos #1
[40302.991160] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[40302.993805] RIP: 0010:__writeback_inodes_sb_nr+0xc1/0xd0
[40302.996508] Code: 0f b6 d1 e8 71 fc ff ff 48 89 ee 48 89 df e8 d6 e1 ff ff 48 8b 44 24 48 65 48 33 04 25 28 00 00 00 75 0b 48 83 c4 50 5b 5d c3 <0f> 0b eb ca e8 46 75 e0 ff 66 0f 1f 44 00 00 66 66 66 66 90 31 c9
[40303.004176] RSP: 0018:ffffb1e945493d98 EFLAGS: 00010246
[40303.005569] RAX: 0000000000000000 RBX: ffff95cddf7e5c00 RCX: 0000000000000000
[40303.007346] RDX: 0000000000000002 RSI: 000000000001106e RDI: ffff95cde32e3800
[40303.009169] RBP: ffffb1e945493d9c R08: 0000000000000000 R09: ffffb1e945493d98
[40303.010964] R10: 0000000000004000 R11: 0000000000003000 R12: ffff95cb56398200
[40303.012964] R13: ffff95cd97d0b1f0 R14: ffff95cd97d0b218 R15: ffff95cd6d116470
[40303.015409] FS: 00007ff00e7fc700(0000) GS:ffff95cdebac0000(0000) knlGS:0000000000000000
[40303.019795] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[40303.021453] CR2: 00007ff26f2ca038 CR3: 0000000522f40000 CR4: 00000000000006e0
[40303.023498] Call Trace:
[40303.024399] btrfs_commit_transaction+0x708/0xc00 [btrfs]
[40303.026000] ? dput+0x96/0x110
[40303.026982] ? btrfs_log_dentry_safe+0x54/0x70 [btrfs]
[40303.028462] btrfs_sync_file+0x31b/0x940 [btrfs]
[40303.029753] do_fsync+0x38/0x60
[40303.030730] __x64_sys_fdatasync+0x13/0x20
[40303.031885] do_syscall_64+0x4e/0x100
[40303.033008] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[40303.034381] RIP: 0033:0x4d15d0
[40303.035446] Code: 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 49 c7 c2 00 00 00 00 49 c7 c0 00 00 00 00 49 c7 c1 00 00 00 00 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
[40303.040313] RSP: 002b:000000c42030d318 EFLAGS: 00000212 ORIG_RAX: 000000000000004b
[40303.042674] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004d15d0
[40303.044657] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000000000b
[40303.047740] RBP: 000000c42030d360 R08: 0000000000000000 R09: 0000000000000000
[40303.052720] R10: 0000000000000000 R11: 0000000000000212 R12: ffffffffffffffff
[40303.055805] R13: 0000000000000002 R14: 0000000000000001 R15: 0000000000000017
[40303.057740] ---[ end trace 34b85f625346855d ]---
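
To quantify "every few seconds" and to check whether all warnings come from the same source line, the WARNING header lines can be tallied. This is only a rough sketch: the regular expression is an assumption based on the header format shown in the log above, and `count_warning_sites` is a hypothetical helper name, not part of any Portworx or kernel tooling.

```python
import re
from collections import Counter

# Assumed header format, taken from the log above:
# "WARNING: CPU: <n> PID: <n> at <path>:<line> <symbol>+0x<off>/0x<size>"
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) "
    r"at (?P<path>\S+):(?P<line>\d+) (?P<symbol>[^\s+]+)\+"
)


def count_warning_sites(log_lines):
    """Count kernel WARNINGs per (source file name, line number, symbol)."""
    counts = Counter()
    for line in log_lines:
        match = WARNING_RE.search(line)
        if match:
            # Keep only the file name; build paths differ between distros.
            filename = match.group("path").rsplit("/", 1)[-1]
            site = (filename, int(match.group("line")), match.group("symbol"))
            counts[site] += 1
    return counts


sample = [
    "[40302.956647] WARNING: CPU: 3 PID: 5370 at …/source/fs/fs-writeback.c:2337 __writeback_inodes_sb_nr+0xc1/0xd0",
    "[40302.988725] CPU: 3 PID: 5370 Comm: px-etcd Tainted: G W OE 4.19.25-coreos #1",
]
for site, count in count_warning_sites(sample).items():
    print(site, count)
```

Feeding it the lines of a saved dmesg dump instead of `sample` would show whether the warning always fires at the same place.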

Exact same problem here on Ubuntu 16.04, on 2 machines. I can provide a log with the same trace in Ubuntu flavor; the error itself is the same.

What is the configuration of the guest VM? @Ignacio_J_Ortega_Lop @nwild

I just installed it again to reproduce the problem and can contribute some more weirdness. It seems to hit the same code, but the stack trace is not identical. That may just be down to Linux differences, and I still think it is the same issue. By the way, I’m using bare-metal machines, with Docker of course, and I don’t get the warning on a slightly older kernel.

uname -a for the problematic machine:

Linux pan 4.15.0-55-generic #60~16.04.2-Ubuntu SMP Thu Jul 4 09:03:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

For the non-problematic one:

Linux mercurio 4.10.0-37-generic #41~16.04.1-Ubuntu SMP Fri Oct 6 22:42:59 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Maybe it is something related to the kernel version?
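
As a rough way to compare the two machines, the kernel release can be pulled out of each uname -a line and compared as a tuple. This is a minimal sketch that assumes the Ubuntu-style "major.minor.patch-abi" release string shown above (a CoreOS release such as 4.19.25-coreos would not match); `kernel_release` is a hypothetical helper name.

```python
import re


def kernel_release(uname_line):
    """Extract (major, minor, patch, abi) from an Ubuntu-style `uname -a` line."""
    match = re.search(r"\b(\d+)\.(\d+)\.(\d+)-(\d+)", uname_line)
    if match is None:
        raise ValueError("no Ubuntu-style kernel release found: " + uname_line)
    return tuple(int(part) for part in match.groups())


# The two uname -a lines quoted in this post.
problematic = (
    "Linux pan 4.15.0-55-generic #60~16.04.2-Ubuntu SMP "
    "Thu Jul 4 09:03:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux"
)
healthy = (
    "Linux mercurio 4.10.0-37-generic #41~16.04.1-Ubuntu SMP "
    "Fri Oct 6 22:42:59 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux"
)

print(kernel_release(problematic))
print(kernel_release(healthy))
print(kernel_release(healthy) < kernel_release(problematic))
```

Tuple comparison orders the releases field by field, so it only tells which kernel is newer, not whether either one is on a supported list.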


[Fri Aug 23 12:07:18 2019] WARNING: CPU: 12 PID: 21596 at /build/linux-hwe-O570pP/linux-hwe-4.15.0/fs/fs-writeback.c:2340 __writeback_inodes_sb_nr+0xa6/0xb0
[Fri Aug 23 12:07:18 2019] Modules linked in: af_packet_diag netlink_diag dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio px(OE) target_core_user uio target_core_pscsi target_core_iblock target_core_file tcm_loop target_core_mod xt_REDIRECT nf_nat_redirect cpuid dccp_diag dccp tcp_diag udp_diag inet_diag unix_diag ip_vs_rr xt_ipvs ip_vs xt_nat veth vxlan ip6_udp_tunnel udp_tunnel xt_mark ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype br_netfilter bridge stp llc aufs overlay xt_tcpudp xt_conntrack iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables x_tables xfs intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
[Fri Aug 23 12:07:18 2019] pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd intel_cstate intel_rapl_perf wmi_bmof i2c_i801 intel_wmi_thunderbolt intel_pch_thermal shpchp acpi_pad mac_hid autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear raid1 e1000e nvme ptp ahci nvme_core libahci pps_core wmi video
[Fri Aug 23 12:07:18 2019] CPU: 12 PID: 21596 Comm: btrfs-transacti Tainted: G W OE 4.15.0-43-generic #46~16.04.1-Ubuntu
[Fri Aug 23 12:07:18 2019] Hardware name: Gigabyte Technology Co., Ltd. B360 HD3P-LM/B360HD3PLM-CF, BIOS F2 HZ 02/14/2019
[Fri Aug 23 12:07:18 2019] RIP: 0010:__writeback_inodes_sb_nr+0xa6/0xb0
[Fri Aug 23 12:07:18 2019] RSP: 0018:ffff9e3a8848bdb0 EFLAGS: 00010246
[Fri Aug 23 12:07:18 2019] RAX: 0000000000000000 RBX: ffff90348b2a0800 RCX: 0000000000000000
[Fri Aug 23 12:07:18 2019] RDX: 0000000000000002 RSI: 000000000004d39c RDI: ffff9e3a8848bdf8
[Fri Aug 23 12:07:18 2019] RBP: ffff9e3a8848be10 R08: ffff9e3a8848bdb8 R09: ffff902c4a790800
[Fri Aug 23 12:07:18 2019] R10: 0000000000000000 R11: 0000000000001000 R12: ffff9e3a8848bdb4
[Fri Aug 23 12:07:18 2019] R13: ffff90373ce7c8c8 R14: ffff90373ce7c8f0 R15: ffff90338bdfcd98
[Fri Aug 23 12:07:18 2019] FS: 0000000000000000(0000) GS:ffff903bbed00000(0000) knlGS:0000000000000000
[Fri Aug 23 12:07:18 2019] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Fri Aug 23 12:07:18 2019] CR2: 00007fd1a76839a8 CR3: 00000009e2e0a006 CR4: 00000000003606e0
[Fri Aug 23 12:07:18 2019] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[Fri Aug 23 12:07:18 2019] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[Fri Aug 23 12:07:18 2019] Call Trace:
[Fri Aug 23 12:07:18 2019] writeback_inodes_sb+0x27/0x30
[Fri Aug 23 12:07:18 2019] btrfs_commit_transaction+0x7bd/0x930 [btrfs]
[Fri Aug 23 12:07:18 2019] ? start_transaction+0x9b/0x440 [btrfs]
[Fri Aug 23 12:07:18 2019] transaction_kthread+0x1a6/0x1c0 [btrfs]
[Fri Aug 23 12:07:18 2019] kthread+0x105/0x140
[Fri Aug 23 12:07:18 2019] ? btrfs_cleanup_transaction+0x550/0x550 [btrfs]
[Fri Aug 23 12:07:18 2019] ? kthread_destroy_worker+0x50/0x50
[Fri Aug 23 12:07:18 2019] ? do_syscall_64+0x73/0x130
[Fri Aug 23 12:07:18 2019] ? SyS_exit_group+0x14/0x20
[Fri Aug 23 12:07:18 2019] ret_from_fork+0x35/0x40
[Fri Aug 23 12:07:18 2019] Code: 0f b6 d2 e8 cd fc ff ff 4c 89 e6 48 89 df e8 d2 e1 ff ff 48 8b 45 e8 65 48 33 04 25 28 00 00 00 75 0d 48 83 c4 50 5b 41 5c 5d c3 <0f> 0b eb ca e8 b1 33 de ff 90 0f 1f 44 00 00 55 31 c9 48 89 e5
[Fri Aug 23 12:07:18 2019] ---[ end trace 1b788201da953ffd ]---

@Ignacio_J_Ortega_Lop: Yes, it could be. This is the list of supported kernels:

@sanjay.naikwadi I don’t fully understand that page. I use kernel 4.15.0-43-generic and Portworx 2.0.3.7-8d3c0dc. Is there any problem with this combination? Which should I use?

@Ignacio_J_Ortega_Lop - Yes, the list I shared with you contains the kernels we have fully tested. I would recommend using one of the listed/supported kernel versions and checking whether you still see any errors.

@sanjay.naikwadi

I’m using Ubuntu 16.04, and none of the kernels listed on https://2.1.docs.portworx.com/reference/knowledge-base/supported-kernels/ is available to install on my Ubuntu version; at least I cannot find them. Which version of Ubuntu is this list for?

Any idea how to install one of those specific kernel versions on my beloved Ubuntu 16.04?

All my problems were solved by upgrading to the latest version, 2.1.4. I was using 2.0.3.