* What does "Tx Unit Hang" mean?
We check that transmits are occurring after a set period of time. If no
transmit has occurred during that period and we still have transmits pending,
we set a bit saying we are concerned. This bit is cleared if we receive a
pause frame. If the same conditions hold at the next check and the bit is
still set, we say we have a Tx hang.
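The two-step check described above can be sketched as follows. This is only an illustration of the logic as stated in the answer, not the driver's actual code; `TxQueueState`, its fields, and `check_for_tx_hang` are hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass
class TxQueueState:
    """Hypothetical per-queue bookkeeping between two periodic checks."""
    completed: int = 0              # transmits completed since the last check
    pending: int = 0                # transmits still waiting to complete
    pause_frame_seen: bool = False  # a pause frame arrived since the last check
    armed: bool = False             # the "we are concerned" bit


def check_for_tx_hang(q: TxQueueState) -> bool:
    """Run one periodic check; return True when a Tx hang is declared."""
    if q.pause_frame_seen:
        # A pause frame clears the bit: the link partner asked us to stop.
        q.armed = False

    stalled = q.completed == 0 and q.pending > 0
    if not stalled:
        q.armed = False      # transmits are moving (or nothing is queued)
        hang = False
    elif q.armed:
        hang = True          # second consecutive stalled check
    else:
        q.armed = True       # first stalled check: only raise the concern bit
        hang = False

    # Start a fresh observation interval.
    q.completed = 0
    q.pause_frame_seen = False
    return hang
```

With two pending transmits and no completions, the first call only arms the bit and the second call reports the hang. A pause frame in between pushes the report back by one check.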
* What is the state of the driver after this kind of message?
After entering this state we should schedule a reset, the idea being that it
will restart transmits. It sounds like you are not seeing this.
This can also occur when a 10G NIC negotiates down to 1G mode.
ethtool -A ethX autoneg off rx off tx off
NOTE: For 82598 backplane cards entering 1 gig mode, flow control default
behavior is changed to off. Flow control in 1 gig mode on these devices can
lead to Tx hangs.
http://downloadmirror.intel.com/14687/eng/readme.txt
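For scripting the flow-control change on several interfaces, a minimal sketch is shown below. It only builds the argument list for the `ethtool -A` command quoted above; `pause_off_command` is a hypothetical helper name, and actually running the command requires root and a driver that supports pause configuration.

```python
from typing import List


def pause_off_command(iface: str) -> List[str]:
    """Build the argv for disabling pause autoneg, RX pause, and TX pause."""
    if not iface:
        raise ValueError("interface name must not be empty")
    return ["ethtool", "-A", iface, "autoneg", "off", "rx", "off", "tx", "off"]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Afterwards, `ethtool -a ethX` shows the current pause settings.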
[ 1762.997753] igb 0000:01:00.0: Detected Tx Unit Hang
[ 1762.997753] Tx Queue <0>
[ 1762.997753] TDH <f4>
[ 1762.997753] TDT <f4>
[ 1762.997753] next_to_use <f6>
[ 1762.997753] next_to_clean <f4>
[ 1762.997753] buffer_info[next_to_clean]
[ 1762.997753] time_stamp <23981>
[ 1762.997753] next_to_watch <c08edf50>
[ 1762.997753] jiffies <239ec>
[ 1762.997753] desc.status <150200>
[ 1764.997784] igb 0000:01:00.0: Detected Tx Unit Hang
[ 1764.997784] Tx Queue <0>
[ 1764.997784] TDH <f4>
[ 1764.997784] TDT <f4>
[ 1764.997784] next_to_use <f6>
[ 1764.997784] next_to_clean <f4>
[ 1764.997784] buffer_info[next_to_clean]
[ 1764.997784] time_stamp <23981>
[ 1764.997784] next_to_watch <c08edf50>
[ 1764.997784] jiffies <23ab4>
[ 1764.997784] desc.status <150200>
[ 1766.997761] igb 0000:01:00.0: Detected Tx Unit Hang
[ 1766.997761] Tx Queue <0>
[ 1766.997761] TDH <f4>
[ 1766.997761] TDT <f4>
[ 1766.997761] next_to_use <f6>
[ 1766.997761] next_to_clean <f4>
[ 1766.997761] buffer_info[next_to_clean]
[ 1766.997761] time_stamp <23981>
[ 1766.997761] next_to_watch <c08edf50>
[ 1766.997761] jiffies <23b7c>
[ 1766.997761] desc.status <150200>
[ 1768.997849] igb 0000:01:00.0: Detected Tx Unit Hang
[ 1768.997849] Tx Queue <0>
[ 1768.997849] TDH <f4>
[ 1768.997849] TDT <f4>
[ 1768.997849] next_to_use <f6>
[ 1768.997849] next_to_clean <f4>
[ 1768.997849] buffer_info[next_to_clean]
[ 1768.997849] time_stamp <23981>
[ 1768.997849] next_to_watch <c08edf50>
[ 1768.997849] jiffies <23c44>
[ 1768.997849] desc.status <150200>
[ 1770.997818] igb 0000:01:00.0: Detected Tx Unit Hang
[ 1770.997818] Tx Queue <0>
[ 1770.997818] TDH <f4>
[ 1770.997818] TDT <f4>
[ 1770.997818] next_to_use <f6>
[ 1770.997818] next_to_clean <f4>
[ 1770.997818] buffer_info[next_to_clean]
[ 1770.997818] time_stamp <23981>
[ 1770.997818] next_to_watch <c08edf50>
[ 1770.997818] jiffies <23d0c>
[ 1770.997818] desc.status <150200>
[ 1771.041677] ------------[ cut here ]------------
[ 1771.046311] WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0x258/0x278)
[ 1771.053611] NETDEV WATCHDOG: eth0 (igb): transmit queue 0 timed out
[ 1771.059943] Modules linked in:
[ 1771.063070] [<80015d28>] (unwind_backtrace+0x0/0xf8) from [<8001e1ec>] (warn)
[ 1771.072536] [<8001e1ec>] (warn_slowpath_common+0x4c/0x6c) from [<8001e2a0>] )
[ 1771.082170] [<8001e2a0>] (warn_slowpath_fmt+0x30/0x40) from [<80437b2c>] (de)
[ 1771.091288] [<80437b2c>] (dev_watchdog+0x258/0x278) from [<8002a52c>] (call_)
[ 1771.100752] [<8002a52c>] (call_timer_fn.isra.31+0x24/0x84) from [<8002a6fc>])
[ 1771.110641] [<8002a6fc>] (run_timer_softirq+0x170/0x1f0) from [<80024e98>] ()
[ 1771.119846] [<80024e98>] (__do_softirq+0xe0/0x1b8) from [<80024fb0>] (run_ks)
[ 1771.128521] [<80024fb0>] (run_ksoftirqd+0x40/0x5c) from [<800405e4>] (smpboo)
[ 1771.137657] [<800405e4>] (smpboot_thread_fn+0xf0/0x178) from [<800394b4>] (k)
[ 1771.146216] [<800394b4>] (kthread+0xa4/0xb0) from [<8000ec98>] (ret_from_for)
[ 1771.154388] ---[ end trace 4107bd53718c6753 ]---
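The hexadecimal fields in the five dumps above are easier to compare as integers. The sketch below copies the values from the log and derives three things: whether head and tail match, how many descriptors sit between `next_to_clean` and `next_to_use`, and how long the oldest buffer has been waiting. The tick rate is not in the log, so it is estimated from the log timestamps, and that estimate is an assumption. `parse_hex`, `estimate_hz`, and the constant names are made up for this example.

```python
def parse_hex(field: str) -> int:
    """Convert a dump field such as '<f4>' to an integer."""
    return int(field.strip("<>"), 16)


# (log timestamp in seconds, jiffies field) for each of the five dumps.
DUMPS = [
    (1762.997753, "<239ec>"),
    (1764.997784, "<23ab4>"),
    (1766.997761, "<23b7c>"),
    (1768.997849, "<23c44>"),
    (1770.997818, "<23d0c>"),
]

# These fields are identical in all five dumps.
TDH = parse_hex("<f4>")
TDT = parse_hex("<f4>")
NEXT_TO_USE = parse_hex("<f6>")
NEXT_TO_CLEAN = parse_hex("<f4>")
TIME_STAMP = parse_hex("<23981>")


def estimate_hz(dumps) -> int:
    """Estimate ticks per second from the first and last dump (an assumption)."""
    (t0, j0), (t1, j1) = dumps[0], dumps[-1]
    return round((parse_hex(j1) - parse_hex(j0)) / (t1 - t0))


# Head equals tail in every dump.
head_equals_tail = TDH == TDT

# Descriptors between next_to_clean and next_to_use. No ring wraparound is
# needed here because next_to_use is the larger index in this log.
uncleaned = NEXT_TO_USE - NEXT_TO_CLEAN

# Age of the buffer at next_to_clean, in jiffies, at each dump.
stall_ages = [parse_hex(j) - TIME_STAMP for _, j in DUMPS]

hz = estimate_hz(DUMPS)
stall_seconds = [age / hz for age in stall_ages]
```

On these values, `time_stamp` never changes while `jiffies` advances by 200 between dumps, so the same buffer keeps aging until the `NETDEV WATCHDOG` warning fires.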