Hands-On Java Memory Leak Analysis -- A Memory Leak When Using Hazelcast 2.0.3 -- Part 2
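The first way is in the constructor of ConcurrentMapManager, which uses the ScheduledExecutorService in the node's executorManager to create a thread that runs the cleanup operation once per second.
Because this code sits in the ConcurrentMapManager constructor, this call to startCleanup is always present by default.
The second way is triggered through the configuration file, by configuring an overcapacity policy for PutOperationHandler. Our configuration file does not configure any such policy, so this way is not used in our system.
The third way is to write code that calls the startCleanup function directly (it is a public, thread-safe method). Our system does not do this.
So I focused my investigation on the first way. In Hazelcast, the ScheduledExecutorService is implemented by java.util.concurrent.ScheduledThreadPoolExecutor.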
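The first way can be pictured with a minimal sketch. This is not Hazelcast's code: the class `CleanupManagerSketch`, its fields, and the `awaitRuns` helper are hypothetical names chosen for illustration, and the period is a parameter so the example does not have to wait whole seconds. The only assumption carried over from the article is that a constructor hands a periodic cleanup task to a ScheduledThreadPoolExecutor and that each run updates a `lastCleanup` timestamp.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a manager that schedules its own cleanup; not Hazelcast code.
public class CleanupManagerSketch {
    private final ScheduledThreadPoolExecutor esScheduled = new ScheduledThreadPoolExecutor(1);
    private final AtomicLong lastCleanup = new AtomicLong(0);
    private final CountDownLatch awaitedRuns;

    // Scheduling happens in the constructor, so the periodic cleanup exists by default.
    public CleanupManagerSketch(long periodMillis, int runsToAwait) {
        this.awaitedRuns = new CountDownLatch(runsToAwait);
        esScheduled.scheduleAtFixedRate(this::startCleanup,
                periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    // Public so it could also be called directly (the "third way" in the text).
    public void startCleanup() {
        lastCleanup.set(System.currentTimeMillis());
        awaitedRuns.countDown();
    }

    // Blocks until the cleanup has run the requested number of times, or the timeout expires.
    public boolean awaitRuns(long timeoutMillis) {
        try {
            return awaitedRuns.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public long getLastCleanup() {
        return lastCleanup.get();
    }

    public void shutdown() {
        esScheduled.shutdownNow();
    }

    public static void main(String[] args) {
        CleanupManagerSketch manager = new CleanupManagerSketch(1000, 2);
        System.out.println("cleanup ran twice: " + manager.awaitRuns(5000));
        System.out.println("lastCleanup = " + manager.getLastCleanup());
        manager.shutdown();
    }
}
```

A `lastCleanup` value that stops advancing while the process keeps running is the symptom the rest of this analysis tries to explain.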
We submit the task through scheduleAtFixedRate, which first wraps it in a ScheduledFutureTask that runs repeatedly:
<pre name="code" class="java">
public ScheduledFuture<?> scheduleAtFixedRate(Runnable command,
                                              long initialDelay,
                                              long period,
                                              TimeUnit unit) {
    if (command == null || unit == null)
        throw new NullPointerException();
    if (period <= 0)
        throw new IllegalArgumentException();
    // Wrap the command in a periodic ScheduledFutureTask
    RunnableScheduledFuture<?> t = decorateTask(command,
        new ScheduledFutureTask<Object>(command,
                                        null,
                                        triggerTime(initialDelay, unit),
                                        unit.toNanos(period)));
    delayedExecute(t);
    return t;
}
</pre>
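The wrapping can be observed from the outside with public API only. The sketch below assumes the standard JDK ScheduledThreadPoolExecutor (where the returned future is the wrapped task itself); the class name `PeriodicTaskInspection` and the one-hour delay are illustrative choices, the long delay simply keeps the task parked in the queue while it is inspected.

```java
import java.util.concurrent.RunnableScheduledFuture;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative helper, not part of Hazelcast or the JDK.
public class PeriodicTaskInspection {

    // Submits a no-op periodic task and reports how the executor holds it.
    public static String inspect() {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(2);
        try {
            ScheduledFuture<?> future =
                    executor.scheduleAtFixedRate(() -> { }, 1, 1, TimeUnit.HOURS);

            // The returned future is the periodic wrapper around our Runnable.
            boolean periodic = future instanceof RunnableScheduledFuture
                    && ((RunnableScheduledFuture<?>) future).isPeriodic();

            // The wrapper waits in the executor's delay queue until it is due.
            int queued = executor.getQueue().size();
            boolean inQueue = executor.getQueue().contains(future);

            return "periodic=" + periodic + " queued=" + queued + " inQueue=" + inQueue;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(inspect());
    }
}
```

In a healthy executor a periodic task is therefore always either running or sitting in this queue, which is why an empty queue later in the analysis is significant.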
The run method of ScheduledFutureTask is what implements the rescheduling:
<pre name="code" class="java">
public void run() {
    boolean periodic = isPeriodic();
    if (!canRunInCurrentRunState(periodic))
        cancel(false);
    else if (!periodic)
        ScheduledFutureTask.super.run();
    else if (ScheduledFutureTask.super.runAndReset()) {
        setNextRunTime();
        reExecutePeriodic(outerTask);   // put the task back on the queue for its next run
    }
}
</pre>

Inside delayedExecute, if the current number of workers is smaller than the configured core pool size, a new worker thread is created; the task is then put on the queue.

<pre name="code" class="java">
private void delayedExecute(Runnable command) {
    if (isShutdown()) {
        reject(command);
        return;
    }
    // Prestart a thread if necessary. We cannot prestart it
    // running the task because the task (probably) shouldn't be
    // run yet, so thread will just idle until delay elapses.
    if (getPoolSize() < getCorePoolSize())
        prestartCoreThread();

    super.getQueue().add(command);      // the task only ever enters the pool through the queue
}

public boolean prestartCoreThread() {
    return addIfUnderCorePoolSize(null);
}

private boolean addIfUnderCorePoolSize(Runnable firstTask) {
    Thread t = null;
    final ReentrantLock mainLock = this.mainLock;
    mainLock.lock();
    try {
        if (poolSize < corePoolSize && runState == RUNNING)
            t = addThread(firstTask);
    } finally {
        mainLock.unlock();
    }
    return t != null;
}

private Thread addThread(Runnable firstTask) {
    Worker w = new Worker(firstTask);
    Thread t = threadFactory.newThread(w);
    boolean workerStarted = false;
    if (t != null) {
        if (t.isAlive()) // precheck that t is startable
            throw new IllegalThreadStateException();
        w.thread = t;
        workers.add(w);
        int nt = ++poolSize;
        if (nt > largestPoolSize)
            largestPoolSize = nt;
        try {
            t.start();
            workerStarted = true;
        }
        finally {
            if (!workerStarted)
                workers.remove(w);
        }
    }
    return t;
}
</pre>

Every started worker does just one thing: it takes tasks from the queue and runs them.

<pre name="code" class="java">
// Body of Worker.run()
try {
    hasRun = true;
    Runnable task = firstTask;
    firstTask = null;
    while (task != null || (task = getTask()) != null) {
        runTask(task);
        task = null;
    }
} finally {
    workerDone(this);
}

Runnable getTask() {
    for (;;) {
        try {
            int state = runState;
            if (state > SHUTDOWN)
                return null;
            Runnable r;
            if (state == SHUTDOWN)  // Help drain queue
                r = workQueue.poll();
            else if (poolSize > corePoolSize || allowCoreThreadTimeOut)
                r = workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS);
            else
                r = workQueue.take();   // core workers block here until a task is available
            if (r != null)
                return r;
            if (workerCanExit()) {
                if (runState >= SHUTDOWN) // Wake up others
                    interruptIdleWorkers();
                return null;
            }
            // Else retry
        } catch (InterruptedException ie) {
            // On interruption, re-check runState
        }
    }
}

private void runTask(Runnable task) {
    final ReentrantLock runLock = this.runLock;
    runLock.lock();
    try {
        if ((runState >= STOP ||
            (Thread.interrupted() && runState >= STOP)) &&
            hasRun)
            thread.interrupt();
        boolean ran = false;
        beforeExecute(thread, task);
        try {
            task.run();
            ran = true;
            afterExecute(task, null);
            ++completedTasks;
        } catch (RuntimeException ex) {
            if (!ran)
                afterExecute(task, ex);
            throw ex;
        }
    } finally {
        runLock.unlock();
    }
}
</pre>

With this understanding of how the Java thread pool works, we can see that startCleanup is the Runnable task that our code passes to the ScheduledThreadPoolExecutor. The possible reasons for it no longer being run are:
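1. The ScheduledThreadPoolExecutor failed during initialization and the task was never submitted successfully. However, lastCleanup is not the application's start time; the application had already been running for several months. So esScheduled (the ScheduledThreadPoolExecutor) was clearly working normally when the system was initialized and only stopped working suddenly on February 4. This possibility can be ruled out.
2. The workers are not working properly and no longer take tasks from the ScheduledThreadPoolExecutor's queue. I ruled this out quickly:
First, the heap dump shows 5 pending workers in esScheduled (0/2/3/5/9):
Second, the thread dump shows that these five threads are all waiting to take a task from the queue: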
<pre>
……
    at java/util/concurrent/locks/AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)[optimized]
    at java/util/concurrent/DelayQueue.take(DelayQueue.java:164)[optimized]
    at java/util/concurrent/ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:609)[inlined]
    at java/util/concurrent/ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:602)[optimized]
    at java/util/concurrent/ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)[optimized]
    at java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
    at java/lang/Thread.run(Thread.java:662)
    at jrockit/vm/RNI.c2java(JJJJJ)V(Native Method)
    -- end of trace
"hz._hzInstance_1_com.ericsson.ngin.session.ra.hazelcast.scheduled.thread-2" id=51 idx=0xd8 tid=32639 prio=5 alive, parked, native_blocked
"hz._hzInstance_1_com.ericsson.ngin.session.ra.hazelcast.scheduled.thread-3" id=52 idx=0xdc tid=32640 prio=5 alive, parked, native_blocked
"hz._hzInstance_1_com.ericsson.ngin.session.ra.hazelcast.scheduled.thread-4" id=53 idx=0xe0 tid=32641 prio=5 alive, parked, native_blocked
"hz._hzInstance_1_com.ericsson.ngin.session.ra.hazelcast.scheduled.thread-5" id=75590 idx=0x3cc tid=3308 prio=5 alive, parked, native_blocked
</pre>
So misbehaving workers can be ruled out as well.
3. The Runnable task we submitted disappeared from the queue on its own. The memory dump does indeed show that the queue no longer contains any tasks.
The reason the queue is empty is clearly that the task was not rescheduled after one of its runs finished. As for why that happened, the ScheduledFutureTask no longer exists, so neither the memory dump nor the thread dump can tell us, and it remains a mystery.
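The source does not establish why the task stopped being rescheduled, so the following is only one documented mechanism that produces the same picture, not a diagnosis. The Javadoc of `ScheduledExecutorService.scheduleAtFixedRate` states that if any execution of the task encounters an exception, subsequent executions are suppressed; this matches the run method quoted earlier, where reExecutePeriodic is only reached when runAndReset() returns true. The sketch uses a hypothetical class name `SuppressedPeriodicTask` and a simulated failure on a chosen run; the pool size of 5 merely mirrors the five workers seen in the dumps.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative demo; the failure is simulated and is an assumption, not a finding.
public class SuppressedPeriodicTask {

    // Runs a periodic task that throws on run number failOnRun, then reports executor state.
    public static String demo(int failOnRun) {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(5);
        AtomicInteger runs = new AtomicInteger();
        try {
            ScheduledFuture<?> future = executor.scheduleAtFixedRate(() -> {
                if (runs.incrementAndGet() == failOnRun) {
                    throw new IllegalStateException("simulated cleanup failure");
                }
            }, 0, 10, TimeUnit.MILLISECONDS);

            String cause = "none";
            try {
                // A periodic future never completes normally; get() returns
                // only by throwing once the task has failed or been cancelled.
                future.get(5, TimeUnit.SECONDS);
            } catch (ExecutionException e) {
                cause = e.getCause().getClass().getSimpleName();
            } catch (TimeoutException e) {
                cause = "timeout";
            }

            // Wait several periods to confirm that no further runs happen.
            Thread.sleep(100);

            return "runs=" + runs.get()
                    + " queued=" + executor.getQueue().size()
                    + " done=" + future.isDone()
                    + " cause=" + cause;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // prints "runs=3 queued=0 done=true cause=IllegalStateException"
        System.out.println(demo(3));
    }
}
```

After the failing run the workers stay alive and idle and the queue is empty, which is the same state the dumps show. Keeping the ScheduledFuture returned by scheduleAtFixedRate, or wrapping the task body in a try/catch that logs the Throwable, would preserve the evidence that the dumps could not provide.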
Reposted from: https://www.cnblogs.com/liguangsunls/p/6898371.html