It turns out that the Name Node problems we've been having were leaving 'temporary' files behind in HDFS, and for whatever reason restarting the Name Node did not clean them up.
I found them under: /log/hadoop/tmp/mapred/staging/<
After confirming that the affected users weren't running active jobs, I removed these directories via the command line. That reduced the number of blocks in the report, and eventually all of them were cleared.
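For anyone repeating this cleanup, here is a rough sketch of how the removal commands could be generated safely. It is not the exact procedure we ran. It assumes the `hadoop fs -rmr` shell command from the Hadoop version that CDH3 ships, and it only builds the command strings so they can be reviewed before anything is deleted. The function name `build_cleanup_commands` and the directory name `exampleuser` are made up for illustration; the real directory names come from listing the staging path yourself (for example with `hadoop fs -ls`).

```python
import posixpath
import shlex

# Prefix taken from the post; the rest of the path is cut off there,
# so the actual directory names must be supplied by the operator.
STAGING_ROOT = "/log/hadoop/tmp/mapred/staging"


def build_cleanup_commands(stale_dirs, staging_root=STAGING_ROOT):
    """Build (but do not run) one 'hadoop fs -rmr' command per stale directory.

    Raises ValueError for any path that is not strictly below staging_root,
    so a typo cannot turn into a delete elsewhere in HDFS.
    """
    commands = []
    for path in stale_dirs:
        # Collapse '..' and duplicate slashes before checking the prefix.
        normalized = posixpath.normpath(path)
        if not normalized.startswith(staging_root + "/"):
            raise ValueError("refusing path outside staging root: %s" % path)
        commands.append("hadoop fs -rmr " + shlex.quote(normalized))
    return commands


if __name__ == "__main__":
    # 'exampleuser' is a hypothetical directory name, not one from our cluster.
    for cmd in build_cleanup_commands([STAGING_ROOT + "/exampleuser"]):
        print(cmd)
```

Printing the commands instead of running them is deliberate: a recursive delete in HDFS is hard to undo, so it is worth reading the list and checking again that the users have no active jobs before pasting anything into a shell.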
FYI, our Name Node problems APPEAR to have been resolved in Cloudera CDH3 u3. The Name Node has been up for 3 days now. Previously we were lucky if it lasted 48 hours.