Java process hangs in IOUtils. Suspected deadlock

I have a Java process that hangs in a call to IOUtils.toString, using the following code:

String html = "";
try {
    html = IOUtils.toString(someUrl.openStream(), "utf-8"); // process hangs on this line
} catch (Exception e) {
    return null;
}

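I can't reproduce this reliably. The code is part of a web crawler, so this line executes successfully thousands of times, but after a few days the process eventually stalls on it.

Output from jstack: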

2013-09-25 09:09:36
Full thread dump OpenJDK 64-Bit Server VM (20.0-b12 mixed mode):

"Attach Listener" daemon prio=10 tid=0x00007f2b1c001000 nid=0x225a waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Thread-0" prio=10 tid=0x00007f2b34122000 nid=0x187b runnable [0x00007f2b30970000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:146)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
        - locked <0x00000000e3d2d160> (a java.io.BufferedInputStream)
        at sun.net.www.http.ChunkedInputStream.readAheadBlocking(ChunkedInputStream.java:552)
        at sun.net.www.http.ChunkedInputStream.readAhead(ChunkedInputStream.java:609)
        at sun.net.www.http.ChunkedInputStream.read(ChunkedInputStream.java:696)
        - locked <0x00000000e3d30558> (a sun.net.www.http.ChunkedInputStream)
        at java.io.FilterInputStream.read(FilterInputStream.java:133)
        at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:2582)
        at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:282)
        at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:324)
        at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:176)
        - locked <0x00000000e3d317d0> (a java.io.InputStreamReader)
        at java.io.InputStreamReader.read(InputStreamReader.java:184)
        at java.io.Reader.read(Reader.java:140)
        at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1364)
        at org.apache.commons.io.IOUtils.copy(IOUtils.java:1340)
        at org.apache.commons.io.IOUtils.copy(IOUtils.java:1315)
        at org.apache.commons.io.IOUtils.toString(IOUtils.java:525)

I can't see any way to set a timeout on the toString method. Any suggestions? Is this a bug in Apache Commons? Or in my OpenJDK?

Answer:

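Your call to toString() ends up being forwarded to copyLarge(). There you can see that it keeps reading from the stream until InputStream.read() signals end of file (EOF) by returning -1. According to this post, read() can read 0 bytes; that is, if the URLConnection you are reading from never delivers EOF, the method may keep reading forever. Your thread dump is consistent with this: the thread is sitting in SocketInputStream.socketRead0, waiting for data from the server that never arrives.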
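To make the loop concrete, here is a minimal sketch of a copy-until-EOF loop. It is a simplified illustration, not the actual Commons IO source; the class name CopyLoopSketch and the method copyUntilEof are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class CopyLoopSketch {
    // Hypothetical helper: copies characters until the reader reports end of stream.
    static long copyUntilEof(Reader in, Writer out) throws IOException {
        char[] buffer = new char[4096];
        long total = 0;
        int n;
        // read() blocks until some data is available and returns -1 only at EOF.
        // If the peer sends nothing and never closes the connection, this loop never exits.
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // An in-memory stream always reaches EOF, so this terminates.
        Reader in = new InputStreamReader(
                new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8)),
                StandardCharsets.UTF_8);
        StringWriter out = new StringWriter();
        long copied = copyUntilEof(in, out);
        System.out.println(copied + " chars copied: " + out);
    }
}
```

Nothing in a loop of this shape bounds how long a single read() may wait, so any timeout has to come from the stream or connection underneath it.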

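Perhaps you can find out which URL is causing the problem?

In any case, to implement a timeout, you could run each read in a separate thread and terminate that thread once a certain amount of time has passed.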
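A watchdog thread is one route. A lighter-weight option, swapped in here as an alternative, is to open the connection yourself and set the timeouts that java.net.URLConnection already offers (setConnectTimeout and setReadTimeout), then read the stream with an overall deadline. The following is only a sketch under those assumptions: the class TimedFetch, its methods, and all parameter names are hypothetical, and it reads the stream with a plain loop instead of IOUtils so that it has no external dependencies.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

public class TimedFetch {
    // Hypothetical helper: reads the whole stream as UTF-8 and fails once the
    // wall-clock deadline (epoch milliseconds) has passed.
    static String readAll(InputStream in, long deadlineMillis) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            // The read timeout bounds each read; this check bounds the total duration,
            // which matters when a server trickles data slowly.
            if (System.currentTimeMillis() > deadlineMillis) {
                throw new IOException("overall download deadline exceeded");
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Hypothetical helper: fetches a URL with connect, per-read, and overall limits.
    static String fetch(URL url, int connectTimeoutMs, int readTimeoutMs,
                        long totalTimeoutMs) throws IOException {
        URLConnection conn = url.openConnection();
        conn.setConnectTimeout(connectTimeoutMs);
        // With a read timeout set, a read that blocks too long throws
        // java.net.SocketTimeoutException instead of waiting indefinitely.
        conn.setReadTimeout(readTimeoutMs);
        long deadline = System.currentTimeMillis() + totalTimeoutMs;
        try (InputStream in = conn.getInputStream()) {
            return readAll(in, deadline);
        }
    }
}
```

In the crawler, a call such as TimedFetch.fetch(someUrl, 5000, 10000, 60000) would take the place of IOUtils.toString(someUrl.openStream(), "utf-8"); the numbers are placeholders, not recommendations. SocketTimeoutException is a subclass of IOException, so the existing catch block would still handle it.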
