2009年4月20日星期一

Deadlock stack trace

最近遇见了一个死锁问题,由于除了处理死锁的经验不足,不知道利用堆栈轨迹(thread stack trace)来调试问题,花费了不少时间比较不同版本的不同和插入debug信息来跟踪deadlock发生的位置,最后客户提醒了一下lock stack trace才意识到自己之前的方式有问题,最后google了一下,发现很多很不错的相关信息(不过,暂时没时间好好总结实践一下)。

1. Java thread dump
http://www.myloadtest.com/java-thread-dump/
该文简单介绍了用Thread Dump Analyzer分析thread dump,以Dining Philosophers 为例。

2. Of Thread dumps and stack traces
http://www.0xcafefeed.com/2004/06/of-thread-dumps-and-stack-traces/

该文包括下面几方面内容:
  • Anatomy of a stack trace
  • Innovative usage of stack traces
  • Thread dump 101
  • Debugging run away CPU
  • Debugging performance issues
  • Debugging “hang” problems

3. An Introduction to Java Stack Traces
http://java.sun.com/developer/technicalArticles/Programming/Stacktrace/
该文比较详细介绍了Java Stack Traces 和如何分析stack traces.

"What is a Java stack trace? A Java stack trace is a user-friendly snapshot of the threads and monitors in a Java1 Virtual Machine (JVM). Depending on how complex your application or applet is, a stack trace can range from fifty lines to thousands of lines of diagnostics. .."


Sun 也提供了一个产生死锁了例子:http://java.sun.com/docs/books/faq/src/thr/DeadlockExample.html

下面是我的运行结果:

D:\java DeadlockExample

threadA.getLockCounter(1) = 0
threadA.getLockCounter(2) = 0
threadB.getLockCounter(1) = 0
threadB.getLockCounter(2) = 0
Started DeadlockThread(a, b)
Started DeadlockThread(b, a)

threadA.getLockCounter(1) = 440
threadA.getLockCounter(2) = 439
threadB.getLockCounter(1) = 1
threadB.getLockCounter(2) = 0

threadA.getLockCounter(1) = 440
threadA.getLockCounter(2) = 439
threadB.getLockCounter(1) = 1
threadB.getLockCounter(2) = 0

(enter the key sequence ctrl+break)

2009-04-20 21:41:57
Full thread dump Java HotSpot(TM) Client VM (11.3-b02 mixed mode, sharing):

"DestroyJavaVM" prio=6 tid=0x002f6400 nid=0x1f8 waiting on condition [0x00000000
..0x0091fd50]
java.lang.Thread.State: RUNNABLE

"Thread-1" prio=6 tid=0x02b0f000 nid=0xa64 waiting for monitor entry [0x02eaf000
..0x02eafd18]
java.lang.Thread.State: BLOCKED (on object monitor)
at DeadlockThread.grabSecondLock(DeadlockExample.java:137)
- waiting to lock <0x2b77f0a0> (a java.lang.String)
at DeadlockThread.grabFirstLock(DeadlockExample.java:131)
- locked <0x2b77f0b8> (a java.lang.String)
at DeadlockThread.grabLocks(DeadlockExample.java:125)
at DeadlockThread.run(DeadlockExample.java:105)

"Thread-0" prio=6 tid=0x02b0e800 nid=0x614 waiting for monitor entry [0x02e5f000
..0x02e5fd98]
java.lang.Thread.State: BLOCKED (on object monitor)
at DeadlockThread.grabSecondLock(DeadlockExample.java:137)
- waiting to lock <0x2b77f0b8> (a java.lang.String)
at DeadlockThread.grabFirstLock(DeadlockExample.java:131)
- locked <0x2b77f0a0> (a java.lang.String)
at DeadlockThread.grabLocks(DeadlockExample.java:125)
at DeadlockThread.run(DeadlockExample.java:105)

"Low Memory Detector" daemon prio=6 tid=0x02b01800 nid=0xe90 runnable [0x0000000
0..0x00000000]
java.lang.Thread.State: RUNNABLE

"CompilerThread0" daemon prio=10 tid=0x02af8400 nid=0xbf4 waiting on condition [
0x00000000..0x02d6f844]
java.lang.Thread.State: RUNNABLE

"Attach Listener" daemon prio=10 tid=0x02af6c00 nid=0xbec runnable [0x00000000..
0x00000000]
java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x02b13c00 nid=0x608 waiting on condition
[0x00000000..0x00000000]
java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=8 tid=0x02ab0000 nid=0xd48 in Object.wait() [0x02c7f000.
.0x02c7fc98]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x22940b28> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(Unknown Source)
- locked <0x22940b28> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(Unknown Source)
at java.lang.ref.Finalizer$FinalizerThread.run(Unknown Source)

"Reference Handler" daemon prio=10 tid=0x02aaec00 nid=0x1f0 in Object.wait() [0x
02c2f000..0x02c2fd18]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x22940a30> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:485)
at java.lang.ref.Reference$ReferenceHandler.run(Unknown Source)
- locked <0x22940a30> (a java.lang.ref.Reference$Lock)

"VM Thread" prio=10 tid=0x02aad400 nid=0x9ec runnable

"VM Periodic Task Thread" prio=10 tid=0x02b03000 nid=0x90c waiting on condition


JNI global references: 594


Found one Java-level deadlock:
=============================
"Thread-1":
waiting to lock monitor 0x02b20ca4 (object 0x2b77f0a0, a java.lang.String),
which is held by "Thread-0"
"Thread-0":
waiting to lock monitor 0x02ab492c (object 0x2b77f0b8, a java.lang.String),
which is held by "Thread-1"

Java stack information for the threads listed above:
===================================================
"Thread-1":
at DeadlockThread.grabSecondLock(DeadlockExample.java:137)
- waiting to lock <0x2b77f0a0> (a java.lang.String)
at DeadlockThread.grabFirstLock(DeadlockExample.java:131)
- locked <0x2b77f0b8> (a java.lang.String)
at DeadlockThread.grabLocks(DeadlockExample.java:125)
at DeadlockThread.run(DeadlockExample.java:105)
"Thread-0":
at DeadlockThread.grabSecondLock(DeadlockExample.java:137)
- waiting to lock <0x2b77f0b8> (a java.lang.String)
at DeadlockThread.grabFirstLock(DeadlockExample.java:131)
- locked <0x2b77f0a0> (a java.lang.String)
at DeadlockThread.grabLocks(DeadlockExample.java:125)
at DeadlockThread.run(DeadlockExample.java:105)

Found 1 deadlock.

Heap
def new generation total 960K, used 280K [0x22940000, 0x22a40000, 0x22e20000)

eden space 896K, 31% used [0x22940000, 0x229861a8, 0x22a20000)
from space 64K, 0% used [0x22a20000, 0x22a20000, 0x22a30000)
to space 64K, 0% used [0x22a30000, 0x22a30000, 0x22a40000)
tenured generation total 4096K, used 0K [0x22e20000, 0x23220000, 0x26940000)
the space 4096K, 0% used [0x22e20000, 0x22e20000, 0x22e20200, 0x23220000)
compacting perm gen total 12288K, used 365K [0x26940000, 0x27540000, 0x2a94000
0)
the space 12288K, 2% used [0x26940000, 0x2699b768, 0x2699b800, 0x27540000)
ro space 8192K, 63% used [0x2a940000, 0x2ae58810, 0x2ae58a00, 0x2b140000)
rw space 12288K, 53% used [0x2b140000, 0x2b7ad300, 0x2b7ad400, 0x2bd40000)

没有评论:

发表评论