Snippet from RS1 around the problem time:
2011-05-18 12:04:34,227 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=33.23 MB, total=235.57 MB, single=59.72 MB, multi=206.43 MB, memory=0 KB
2011-05-18 12:04:39,518 WARN org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block blk_-2450368273351152067_836091java.io.IOException: Bad response 1 for block blk_-2450368273351152067_836091 from datanode 194.109.159.67:50010
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2497)
2011-05-18 12:04:39,520 INFO org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-2450368273351152067_836091 waiting for responder to exit.
2011-05-18 12:04:40,521 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-2450368273351152067_836091 bad datanode[1] 194.109.159.67:50010
2011-05-18 12:04:40,522 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_-2450368273351152067_836091 in pipeline 194.109.159.60:50010, 194.109.159.67:50010: bad datanode 194.109.159.67:50010
2011-05-18 12:04:40,764 WARN org.apache.hadoop.hbase.regionserver.wal.HLog: HDFS pipeline error detected. Found 1 replicas but expecting 2 replicas. Requesting close of hlog.
2011-05-18 12:04:40,923 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction started; Attempting to free 31.57 MB of total=267.1 MB
2011-05-18 12:04:40,927 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Block cache LRU eviction completed; freed=32.06 MB, total=235.04 MB, single=73.87 MB, mul

Snippet from DN at 194.109.159.67:
2011-05-18 12:04:40,953 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /194.109.159.67:50010, dest: /194.109.159.67:39072, bytes: 12900, op: HDFS_READ, cliID: DFSClient_hb_rs_ia200121,60020,1305711529274_1305711534462, offset: 323584, srvID: DS-1529689956-194.109.159.67-50010-1294156443216, blockid: blk_-3745720375296353115_619809, duration: 21376431
2011-05-18 12:04:40,966 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for block blk_-2450368273351152067_836091 java.io.EOFException: while trying to read 153 bytes
2011-05-18 12:04:40,966 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_-2450368273351152067_836091 Interrupted.
2011-05-18 12:04:40,966 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 0 for block blk_-2450368273351152067_836091 terminating
2011-05-18 12:04:40,968 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /194.109.159.67:50010, dest: /194.109.159.67:39073, bytes: 6192, op: HDFS_READ, cliID: DFSClient_hb_rs_ia200121,60020,1305711529274_1305711534462, offset: 104427008, srvID: DS-1529689956-194.109.159.67-50010-1294156443216, blockid: blk_4868596848067919046_626740, duration: 3599803
2011-05-18 12:04:40,990 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: writeBlock blk_-2450368273351152067_836091 received exception java.io.EOFException: while trying to read 153 bytes
2011-05-18 12:04:40,990 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(194.109.159.67:50010, storageID=DS-1529689956-194.109.159.67-50010-1294156443216, infoPort=50075, ipcPort=50020):DataXceiver
java.io.EOFException: while trying to read 153 bytes
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:268)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:312)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:376)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:532)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:377)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:118)
2011-05-18 12:04:41,018 INFO org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace: src: /194.109.159.67:50010, dest: /194.109.159.67:39075, bytes: 66564, op: HDFS_READ, cliID: DFSClient_hb_rs_ia200121,60020,1305711529274_1305711534462, offset: 161104896, srvID: DS-1529689956-194.109.159.67-50010-1294156443216, blockid: blk_-5655743078160046479_626737, duration: 26511604

RS2 snippet:
2011-05-18 12:04:45,377 WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /194.109.159.60:50010 for file /hbase-0.90/arc_contents/597391443/content/3379373961652576545 for block -5945244681390741817:java.net.ConnectException: Connection timed out
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:408)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.fetchBlockByteRange(DFSClient.java:1889)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1959)
at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:46)
at org.apache.hadoop.hbase.io.hfile.BoundedRangeFileInputStream.read(BoundedRangeFileInputStream.java:101)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
at org.apache.hadoop.io.compress.BlockDecompressorStream.rawReadInt(BlockDecompressorStream.java:122)
at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:68)
at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:75)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:102)
at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:1094)
at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:1036)
at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.loadBlock(HFile.java:1442)
at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.seekTo(HFile.java:1299)
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:136)
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:96)
at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:77)
at org.apache.hadoop.hbase.regionserver.Store.getScanner(Store.java:1345)
at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.<init>(HRegion.java:2274)
at org.apache.hadoop.hbase.regionserver.HRegion.instantiateInternalScanner(HRegion.java:1131)
at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1123)
at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1107)
at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:2996)
at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:2898)

Snippet from DN at 194.109.159.60:
2011-05-18 12:04:39,516 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_-2450368273351152067_836091 1 Exception java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at java.io.DataInputStream.readFully(DataInputStream.java:178)
at java.io.DataInputStream.readLong(DataInputStream.java:399)
at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:119)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:892)
at java.lang.Thread.run(Thread.java:662)
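
To line the four snippets up, it may help to merge them into one timeline for the failing block. Below is a minimal Python sketch, assuming each node's log is available as a list of lines in the log4j layout shown above. `parse_line` and `block_timeline` are hypothetical helper names, not part of HBase or Hadoop, and the sample lines are copied from the snippets in this message.

```python
import re
from datetime import datetime

# Matches the prefix used in the snippets above, e.g.
# "2011-05-18 12:04:39,518 WARN org.apache.hadoop.hdfs.DFSClient: message"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (\w+) (\S+): (.*)$"
)


def parse_line(line):
    """Return (timestamp, level, logger, message), or None for lines
    without a timestamp prefix (stack frames, wrapped text)."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    ts, level, logger, msg = m.groups()
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S,%f"), level, logger, msg


def block_timeline(sources, block_id):
    """Collect every timestamped line mentioning block_id from all
    sources (a dict of node name -> list of lines), sorted by time."""
    events = []
    for name, lines in sources.items():
        for line in lines:
            parsed = parse_line(line)
            if parsed is not None and block_id in parsed[3]:
                ts, level, _logger, msg = parsed
                events.append((ts, name, level, msg))
    return sorted(events)


if __name__ == "__main__":
    sources = {
        "RS1": [
            "2011-05-18 12:04:39,518 WARN org.apache.hadoop.hdfs.DFSClient: DFSOutputStream ResponseProcessor exception for block blk_-2450368273351152067_836091java.io.IOException: Bad response 1 for block blk_-2450368273351152067_836091 from datanode 194.109.159.67:50010",
        ],
        "DN .67": [
            "2011-05-18 12:04:40,966 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for block blk_-2450368273351152067_836091 java.io.EOFException: while trying to read 153 bytes",
        ],
        "DN .60": [
            "2011-05-18 12:04:39,516 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder blk_-2450368273351152067_836091 1 Exception java.net.SocketTimeoutException: Read timed out",
        ],
    }
    for ts, name, level, msg in block_timeline(sources, "blk_-2450368273351152067_836091"):
        print(ts.strftime("%H:%M:%S.%f")[:-3], name, level, msg[:60])
```

This only orders events by each node's own clock, so any clock skew between the machines would shift the apparent sequence.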