Phantom References: PhantomReference


Reposted from: https://www.jianshu.com/p/30f4fff2249c

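Yesterday I covered WeakHashMap, and someone messaged me privately to say that everything else made sense, but they could not see what PhantomReference is for. Many books say that "the sole purpose of associating a phantom reference with an object is to receive a system notification when that object is reclaimed by the collector." That sentence is even harder to digest. Don't we already have WeakHashMap, whose entries can be reclaimed at GC time? Why add yet another ghostly "phantom" Reference?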
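The "system notification" in that quote can be made concrete with a small sketch. It uses only java.lang.ref classes from the JDK; the class and method names (PhantomBasics and so on) are made up for illustration. Because the timing of a real GC cannot be controlled, the sketch calls Reference.enqueue() by hand as a stand-in for what the collector does once the referent has been reclaimed.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;

// Illustrative sketch only; the names are hypothetical, not from any library.
final class PhantomBasics {

    /** A phantom reference never hands back its referent: get() is always null. */
    static boolean getAlwaysReturnsNull() {
        Object marker = new Object();
        PhantomReference<Object> ref = new PhantomReference<>(marker, new ReferenceQueue<>());
        return ref.get() == null;
    }

    /** The "notification" is simply the reference object showing up on its queue. */
    static boolean notificationArrivesOnQueue() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object marker = new Object();
        PhantomReference<Object> ref = new PhantomReference<>(marker, queue);

        // Assumption for the demo: enqueue manually instead of waiting for a
        // real GC cycle to discover that the marker is unreachable.
        ref.enqueue();

        // poll() is non-blocking; it returns the enqueued reference or null.
        return queue.poll() == ref;
    }

    public static void main(String[] args) {
        System.out.println("get() is null: " + getAlwaysReturnsNull());
        System.out.println("reference seen on queue: " + notificationArrivesOnQueue());
    }
}
```

The point of the design is that the code reacting to the notification never touches the dead object itself. It only sees the reference object, so any data needed for cleanup has to be stored on the reference (usually in a subclass).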

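WeakHashMap is purely an in-memory affair: once the object behind a key has been garbage collected, the map also drops the corresponding key-value entry. But consider a scenario that involves more than in-memory objects. Say you hold an image object in memory that is backed by a file on disk, and when the in-memory image object is garbage collected, the file on disk has to be deleted as well. What do you do then?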

This is where PhantomReference takes the stage. A good example of it in practice is org.apache.commons.io.FileCleaningTracker:

/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.commons.io;
 
import java.io.File;
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
 
/**
 * Keeps track of files awaiting deletion, and deletes them when an associated
 * marker object is reclaimed by the garbage collector.
 * <p>
 * This utility creates a background thread to handle file deletion.
 * Each file to be deleted is registered with a handler object.
 * When the handler object is garbage collected, the file is deleted.
 * <p>
 * In an environment with multiple class loaders (a servlet container, for
 * example), you should consider stopping the background thread if it is no
 * longer needed. This is done by invoking the method
 * {@link #exitWhenFinished}, typically in
 * {@code javax.servlet.ServletContextListener.contextDestroyed(javax.servlet.ServletContextEvent)} or similar.
 *
 * @version $Id: FileCleaningTracker.java 1686747 2015-06-21 18:44:49Z krosenvold $
 */
public class FileCleaningTracker {
 
    // Note: fields are package protected to allow use by test cases
 
    /**
     * Queue of <code>Tracker</code> instances being watched.
     */
    ReferenceQueue<Object> q = new ReferenceQueue<Object>();
    /**
     * Collection of <code>Tracker</code> instances in existence.
     */
    final Collection<Tracker> trackers = Collections.synchronizedSet(new HashSet<Tracker>()); // synchronized
    /**
     * Collection of File paths that failed to delete.
     */
    final List<String> deleteFailures = Collections.synchronizedList(new ArrayList<String>());
    /**
     * Whether to terminate the thread when the tracking is complete.
     */
    volatile boolean exitWhenFinished = false;
    /**
     * The thread that will clean up registered files.
     */
    Thread reaper;
 
    //-----------------------------------------------------------------------
    /**
     * Track the specified file, using the provided marker, deleting the file
     * when the marker instance is garbage collected.
     * The {@link FileDeleteStrategy#NORMAL normal} deletion strategy will be used.
     *
     * @param file  the file to be tracked, not null
     * @param marker  the marker object used to track the file, not null
     * @throws NullPointerException if the file is null
     */
    public void track(final File file, final Object marker) {
        track(file, marker, null);
    }
 
    /**
     * Track the specified file, using the provided marker, deleting the file
     * when the marker instance is garbage collected.
     * The specified deletion strategy is used.
     *
     * @param file  the file to be tracked, not null
     * @param marker  the marker object used to track the file, not null
     * @param deleteStrategy  the strategy to delete the file, null means normal
     * @throws NullPointerException if the file is null
     */
    public void track(final File file, final Object marker, final FileDeleteStrategy deleteStrategy) {
        if (file == null) {
            throw new NullPointerException("The file must not be null");
        }
        addTracker(file.getPath(), marker, deleteStrategy);
    }
 
    /**
     * Track the specified file, using the provided marker, deleting the file
     * when the marker instance is garbage collected.
     * The {@link FileDeleteStrategy#NORMAL normal} deletion strategy will be used.
     *
     * @param path  the full path to the file to be tracked, not null
     * @param marker  the marker object used to track the file, not null
     * @throws NullPointerException if the path is null
     */
    public void track(final String path, final Object marker) {
        track(path, marker, null);
    }
 
    /**
     * Track the specified file, using the provided marker, deleting the file
     * when the marker instance is garbage collected.
     * The specified deletion strategy is used.
     *
     * @param path  the full path to the file to be tracked, not null
     * @param marker  the marker object used to track the file, not null
     * @param deleteStrategy  the strategy to delete the file, null means normal
     * @throws NullPointerException if the path is null
     */
    public void track(final String path, final Object marker, final FileDeleteStrategy deleteStrategy) {
        if (path == null) {
            throw new NullPointerException("The path must not be null");
        }
        addTracker(path, marker, deleteStrategy);
    }
 
    /**
     * Adds a tracker to the list of trackers.
     *
     * @param path  the full path to the file to be tracked, not null
     * @param marker  the marker object used to track the file, not null
     * @param deleteStrategy  the strategy to delete the file, null means normal
     */
    private synchronized void addTracker(final String path, final Object marker, final FileDeleteStrategy
            deleteStrategy) {
        // synchronized block protects reaper
        if (exitWhenFinished) {
            throw new IllegalStateException("No new trackers can be added once exitWhenFinished() is called");
        }
        if (reaper == null) {
            reaper = new Reaper();
            reaper.start();
        }
        trackers.add(new Tracker(path, deleteStrategy, marker, q));
    }
 
    //-----------------------------------------------------------------------
    /**
     * Retrieve the number of files currently being tracked, and therefore
     * awaiting deletion.
     *
     * @return the number of files being tracked
     */
    public int getTrackCount() {
        return trackers.size();
    }
 
    /**
     * Return the file paths that failed to delete.
     *
     * @return the file paths that failed to delete
     * @since 2.0
     */
    public List<String> getDeleteFailures() {
        return deleteFailures;
    }
 
    /**
     * Call this method to cause the file cleaner thread to terminate when
     * there are no more objects being tracked for deletion.
     * <p>
     * In a simple environment, you don't need this method as the file cleaner
     * thread will simply exit when the JVM exits. In a more complex environment,
     * with multiple class loaders (such as an application server), you should be
     * aware that the file cleaner thread will continue running even if the class
     * loader it was started from terminates. This can constitute a memory leak.
     * <p>
     * For example, suppose that you have developed a web application, which
     * contains the commons-io jar file in your WEB-INF/lib directory. In other
     * words, the FileCleaner class is loaded through the class loader of your
     * web application. If the web application is terminated, but the servlet
     * container is still running, then the file cleaner thread will still exist,
     * posing a memory leak.
     * <p>
     * This method allows the thread to be terminated. Simply call this method
     * in the resource cleanup code, such as
     * {@code javax.servlet.ServletContextListener.contextDestroyed(javax.servlet.ServletContextEvent)}.
     * Once called, no new objects can be tracked by the file cleaner.
     */
    public synchronized void exitWhenFinished() {
        // synchronized block protects reaper
        exitWhenFinished = true;
        if (reaper != null) {
            synchronized (reaper) {
                reaper.interrupt();
            }
        }
    }
 
    //-----------------------------------------------------------------------
    /**
     * The reaper thread.
     */
    private final class Reaper extends Thread {
        /** Construct a new Reaper */
        Reaper() {
            super("File Reaper");
            setPriority(Thread.MAX_PRIORITY);
            setDaemon(true);
        }
 
        /**
         * Run the reaper thread that will delete files as their associated
         * marker objects are reclaimed by the garbage collector.
         */
        @Override
        public void run() {
            // thread exits when exitWhenFinished is true and there are no more tracked objects
            while (exitWhenFinished == false || trackers.size() > 0) {
                try {
                    // Wait for a tracker to remove.
                    final Tracker tracker = (Tracker) q.remove(); // cannot return null
                    trackers.remove(tracker);
                    if (!tracker.delete()) {
                        deleteFailures.add(tracker.getPath());
                    }
                    tracker.clear();
                } catch (final InterruptedException e) {
                    continue;
                }
            }
        }
    }
 
    //-----------------------------------------------------------------------
    /**
     * Inner class which acts as the reference for a file pending deletion.
     */
    private static final class Tracker extends PhantomReference<Object> {
 
        /**
         * The full path to the file being tracked.
         */
        private final String path;
        /**
         * The strategy for deleting files.
         */
        private final FileDeleteStrategy deleteStrategy;
 
        /**
         * Constructs an instance of this class from the supplied parameters.
         *
         * @param path  the full path to the file to be tracked, not null
         * @param deleteStrategy  the strategy to delete the file, null means normal
         * @param marker  the marker object used to track the file, not null
         * @param queue  the queue on to which the tracker will be pushed, not null
         */
        Tracker(final String path, final FileDeleteStrategy deleteStrategy, final Object marker,
                final ReferenceQueue<? super Object> queue) {
            super(marker, queue);
            this.path = path;
            this.deleteStrategy = deleteStrategy == null ? FileDeleteStrategy.NORMAL : deleteStrategy;
        }
 
        /**
         * Return the path.
         *
         * @return the path
         */
        public String getPath() {
            return path;
        }
 
        /**
         * Deletes the file associated with this tracker instance.
         *
         * @return {@code true} if the file was deleted successfully;
         *         {@code false} otherwise.
         */
        public boolean delete() {
            return deleteStrategy.deleteQuietly(new File(path));
        }
    }
 
}

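A daemon thread monitors the PhantomReference queue.

Trackers whose marker objects have been collected are taken from the ReferenceQueue, and the corresponding files on disk are deleted.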
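Stripped of its threading and delete strategies, the pattern above fits in a few dozen lines. The following is a simplified sketch, not the Commons IO implementation: MiniFileTracker and its methods are hypothetical names, the queue is drained synchronously with poll() instead of a daemon thread blocking on remove(), and enqueue() is called by hand to simulate the GC reclaiming the marker.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Simplified, hypothetical sketch of the tracker pattern shown above.
final class MiniFileTracker {

    /** Phantom reference that remembers which file belongs to the marker. */
    static final class Tracker extends PhantomReference<Object> {
        final String path;

        Tracker(String path, Object marker, ReferenceQueue<Object> queue) {
            super(marker, queue);
            this.path = path;
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // The trackers must stay strongly reachable, otherwise they could be
    // collected themselves and would never be enqueued.
    private final Set<Tracker> trackers = ConcurrentHashMap.newKeySet();

    Tracker track(File file, Object marker) {
        Tracker tracker = new Tracker(file.getPath(), marker, queue);
        trackers.add(tracker);
        return tracker;
    }

    int trackCount() {
        return trackers.size();
    }

    /** Deletes the files of all trackers currently on the queue; returns the number deleted. */
    int drain() {
        int deleted = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            Tracker tracker = (Tracker) ref;
            trackers.remove(tracker);
            if (new File(tracker.path).delete()) {
                deleted++;
            }
            tracker.clear();
        }
        return deleted;
    }

    /** Creates a temp file, tracks it, simulates the GC notification, and checks the result. */
    static boolean demo() {
        try {
            File tmp = File.createTempFile("phantom-demo", ".tmp");
            MiniFileTracker fileTracker = new MiniFileTracker();
            Object marker = new Object();
            Tracker tracker = fileTracker.track(tmp, marker);

            tracker.enqueue(); // assumption: stands in for the GC reclaiming the marker
            int deleted = fileTracker.drain();

            return deleted == 1 && !tmp.exists() && fileTracker.trackCount() == 0;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("file deleted after notification: " + demo());
    }
}
```

Note that the path is stored on the Tracker, not looked up through the marker. By the time the tracker reaches the queue, the marker is gone and get() returns null anyway.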

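Another classic use of PhantomReference is connection handling. You can find the source in com.mysql.jdbc.NonRegisteringDriver in mysql-connector-java-5.1.47.jar, where an inner class releases the network resources when the connection object is reclaimed.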

static class ConnectionPhantomReference extends PhantomReference<ConnectionImpl> {
    private NetworkResources io;
 
    ConnectionPhantomReference(ConnectionImpl connectionImpl, ReferenceQueue<ConnectionImpl> q) {
        super(connectionImpl, q);
 
        try {
            this.io = connectionImpl.getIO().getNetworkResources();
        } catch (SQLException e) {
            // if we somehow got here and there's really no i/o, we deal with it later
        }
    }
 
    void cleanup() {
        if (this.io != null) {
            try {
                this.io.forceClose();
            } finally {
                this.io = null;
            }
        }
    }
}

So how does the reclamation of connection resources actually happen?

public class AbandonedConnectionCleanupThread extends Thread {
    private static boolean running = true;
    private static Thread threadRef = null;
 
    public AbandonedConnectionCleanupThread() {
        super("Abandoned connection cleanup thread");
    }
 
    @Override
    public void run() {
        threadRef = this;
        while (running) {
            try {
                Reference<? extends ConnectionImpl> ref = NonRegisteringDriver.refQueue.remove(100);
                if (ref != null) {
                    try {
                        ((ConnectionPhantomReference) ref).cleanup();
                    } finally {
                        NonRegisteringDriver.connectionPhantomRefs.remove(ref);
                    }
                }
 
            } catch (Exception ex) {
                // no where to really log this if we're static
            }
        }
    }
 
    public static void shutdown() throws InterruptedException {
        running = false;
        if (threadRef != null) {
            threadRef.interrupt();
            threadRef.join();
            threadRef = null;
        }
    }
 
}

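The AbandonedConnectionCleanupThread takes a ConnectionPhantomReference from NonRegisteringDriver.refQueue, calls its cleanup method, and finally removes that ConnectionPhantomReference from the connectionPhantomRefs ConcurrentHashMap, which completes the release of the connection's resources. The PhantomReference objects in NonRegisteringDriver.refQueue are the ConnectionPhantomReference instances that the JVM's ReferenceHandler thread puts there. By now it should be clear that the JDBC driver creates a ConnectionPhantomReference for every connection, so that the related resources are released when the connection object is reclaimed. This is a safety net, in case the connection pool above the driver or the caller forgets to close the connection and leaks resources.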

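The weakness of PhantomReference: after the MySQL connector had been in use for a while, performance degraded. Old-generation GC pauses became severe, at roughly 1 second.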


The GC log shows that CMS-remark took about 0.89 s, which is expensive. Of that, weak refs processing alone accounted for more than 0.7 s.

[GC [1 CMS-initial-mark: 2097444K(4194304K)] 2143492K(6081792K), 0.2197240 secs] [Times: user=0.01 sys=0.17, real=0.22 secs] 
[CMS-concurrent-mark-start]
[CMS-concurrent-mark: 0.180/0.180 secs] [Times: user=0.65 sys=0.07, real=0.18 secs] 
[CMS-concurrent-preclean-start]
[CMS-concurrent-preclean: 0.045/0.045 secs] [Times: user=0.05 sys=0.01, real=0.04 secs] 
[CMS-concurrent-abortable-preclean-start]
 CMS: abort preclean due to time 2018-01-13T19:21:41.765: [CMS-concurrent-abortable-preclean: 5.051/5.065 secs] [Times: user=7.08 sys=0.51, real=5.07 secs] 
[GC[YG occupancy: 503315 K (1887488 K)]2018-01-13T19:21:41.768: [Rescan (parallel) , 0.0975470 secs]
[weak refs processing, 0.7034340 secs]
[class unloading, 0.0152410 secs]
[scrub symbol table, 0.0118450 secs]
[scrub string table, 0.0021360 secs] 
[1 CMS-remark: 2097444K(4194304K)] 2600759K(6081792K), 0.8884580 secs] [Times: user=1.17 sys=0.00, real=0.89 secs] 
[CMS-concurrent-sweep-start]
[CMS-concurrent-sweep: 1.571/1.626 secs] [Times: user=2.77 sys=0.15, real=1.62 secs] 
[CMS-concurrent-reset-start]
[CMS-concurrent-reset: 0.079/0.079 secs] [Times: user=0.04 sys=0.08, real=0.08 secs]

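The next step is to find out which kind of Reference is slow and then optimize for it. So I added the flag -XX:+PrintReferenceGC, which prints the count and processing time for each Reference type.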

[GC[YG occupancy: 254029 K (1887488 K)]
[Rescan (parallel) , 0.0503640 secs]
[weak refs processing
[SoftReference, 4468 refs, 0.0006040 secs]
[WeakReference, 286808 refs, 0.0336870 secs]
[FinalReference, 35456 refs, 0.0271650 secs]
[PhantomReference, 8041 refs, 3 refs, 0.4335280 secs]
[JNI Weak Reference, 0.0000250 secs], 0.4951020 secs]
[class unloading, 0.0143290 secs]
[scrub symbol table, 0.0110140 secs]
[scrub string table, 0.0015380 secs] [1 CMS-remark: 2098695K(4194304K)] 2352725K(6081792K), 0.6112680 secs] [Times: user=0.76 sys=0.00, real=0.61 secs]

The log above makes it obvious that PhantomReference processing takes the longest. A heap dump showed a very large number of ConnectionPhantomReference objects, more than 8,000.

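The GC log shows roughly 8,000 of them, which means more than 8,000 database connection resources. Why are they all in the old generation? Because connections usually live long enough to survive several young GCs and get promoted to the old generation. 8,000 connections is certainly a lot. Far fewer are actually alive, but because old GCs are far apart, many abandoned connections are not reclaimed promptly.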

The fix is simple: remove the entries from the connectionPhantomRefs ConcurrentHashMap yourself in code, instead of waiting for the GC to clear them.
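The idea of deregistering eagerly can be sketched with a toy registry. This is not the MySQL driver's code: Conn, ConnRef, Resource, and REFS are hypothetical stand-ins for the connection, its phantom reference, its network resources, and the connectionPhantomRefs map. The sketch assumes you control close(); how to reach the driver's own map depends on the driver version and is not shown here.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a registry of phantom references that is cleaned
// eagerly on close() instead of waiting for an old-generation GC.
final class EagerDeregistration {

    /** Stand-in for the network resources held by a connection. */
    static final class Resource {
        volatile boolean closed;

        void forceClose() {
            closed = true;
        }
    }

    /** Phantom reference that carries what the GC-driven fallback would need. */
    static final class ConnRef extends PhantomReference<Conn> {
        final Resource io;

        ConnRef(Conn conn, ReferenceQueue<Conn> queue) {
            super(conn, queue);
            this.io = conn.io;
        }
    }

    static final ReferenceQueue<Conn> QUEUE = new ReferenceQueue<>();
    // Plays the role of connectionPhantomRefs: keeps the references reachable.
    static final ConcurrentHashMap<ConnRef, ConnRef> REFS = new ConcurrentHashMap<>();

    static final class Conn implements AutoCloseable {
        final Resource io = new Resource();
        private final ConnRef ref;

        Conn() {
            ref = new ConnRef(this, QUEUE);
            REFS.put(ref, ref);
        }

        @Override
        public void close() {
            io.forceClose();  // release the resource right away
            REFS.remove(ref); // drop the registry entry now, not at the next old GC
            ref.clear();      // a cleared reference is no longer tied to the connection
        }
    }

    /** Opens and closes n connections, then reports how many registry entries remain. */
    static int liveRefsAfterOpenAndClose(int n) {
        for (int i = 0; i < n; i++) {
            try (Conn conn = new Conn()) {
                if (conn.io.closed) {
                    throw new IllegalStateException("resource closed too early");
                }
            }
        }
        return REFS.size();
    }

    public static void main(String[] args) {
        System.out.println("entries left in registry: " + liveRefsAfterOpenAndClose(1000));
    }
}
```

With this arrangement the phantom reference only does real work for connections that were never closed, so the number of references the collector has to process during remark stays proportional to the leaks rather than to the total connection churn.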

The important part, in big letters:

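So when the objects tracked by a PhantomReference are long-lived, you cannot rely on GC alone to clean them up, because that degrades system performance. Pay special attention to this in your code: set references to null as soon as they are no longer needed, so that too many instances do not pile up in the old gen.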