Prevent File Loss on the HPCC

From time to time, we receive requests from users to restore their deleted files. In this article, we would like to share with our users some tips and tricks for preventing file loss.

First, users need to understand the file systems on HPCC so that they can store their files in the right location. Each user has access to several storage spaces on HPCC, which fall into two categories: those with automatic system backup (the standard portion of Home and research group space) and those with automatic system purge (local space, scratch space, and the node portion of Home or research group space). For more information on file systems and policy, see our File Systems wiki page. Guidelines for choosing file storage locations are listed on the wiki page Guidelines for Choosing File Storage and I/O. The tips that follow cover each of the two types of storage in turn.

The spaces with automatic purge are good for the production phase because they offer either faster access, such as LOCAL space, or more capacity, such as SCRATCH. Users are responsible for protecting their files from being purged. Here is one tip to keep in mind:

  • Some commands used to download or copy files from one server to another may carry over the file's original timestamps, such as “wget”, or “cp” when it is run with a preserving option like “-p”. It is good practice to check the timestamps of the files right after the operation. If the newly downloaded or copied files do not have new timestamps, check the options of the command, for example “--no-use-server-timestamps” in “wget” or “--no-preserve=timestamps” in “cp”. You want the timestamps to be as new as possible, since a file that arrives with an old timestamp may be purged sooner than you expect.
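The timestamp check described above can be sketched as follows. This is a minimal illustration that assumes GNU coreutils (as found on most Linux clusters); the file names and the temporary directory are made up for the demonstration and are not actual HPCC paths.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)

# Create a source file and give it an old modification time (GNU touch syntax).
echo "data" > "$workdir/source.dat"
touch -d "2020-01-01 00:00:00" "$workdir/source.dat"

# -p keeps the source's timestamps, so the copy looks years old.
cp -p "$workdir/source.dat" "$workdir/preserved.dat"

# --no-preserve=timestamps stamps the copy with the time of the copy itself.
cp --no-preserve=timestamps "$workdir/source.dat" "$workdir/fresh.dat"

# Inspect the modification times right after the operation.
ls -l --time-style=long-iso "$workdir"

# Modification times in seconds since the epoch, handy for scripted checks.
src_mtime=$(stat -c %Y "$workdir/source.dat")
preserved_mtime=$(stat -c %Y "$workdir/preserved.dat")
fresh_mtime=$(stat -c %Y "$workdir/fresh.dat")

# For downloads, the wget option named above plays the same role (not run here):
#   wget --no-use-server-timestamps <URL>
```

In the listing, preserved.dat should show the 2020 date while fresh.dat shows the current date. Which timestamp the purge actually examines is defined by the HPCC policy, so treat the modification time used here as an assumption.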

The space with automatic system backup is good for the development and testing phases of your research computing, because these phases usually involve more file manipulation and carry a higher chance of accidentally removing or destroying files. Here is a list of some operations that users should use cautiously:

  • Command “rm”. The Linux command line has no “trash bin” from which you could restore files after removing them. Once the “rm” command is executed, the file is removed immediately.
  • Command “mv”. This command renames a file or moves it to another location. Before you hit the return key for the “mv sourceFile destFile” command, make sure that destFile either does not already exist or is safe to be overwritten by sourceFile.
  • Wildcards (such as “*”, “?”, “[ ]”). In Linux, a wildcard stands in for one or more characters when specifying file names. Wildcards are convenient for removing a large number of files with one command, but they are also a common cause of removing files by mistake. Run the “ls” command with the same wildcard pattern first to see the list of matching files before running “rm” with it.
  • Output file names. There are several ways to direct your program’s output to a named file. Make sure that the filename is unique. If multiple runs or instances of your program write to the same output filename, they overwrite one another's output and the earlier results are lost.
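The precautions in this list can be sketched in a few commands. The example below is illustrative only: it runs in a temporary directory, all file names are hypothetical, and the output naming scheme (date, time, and shell process ID) is one possible choice rather than an HPCC convention. The interactive options “rm -i” and “mv -i”, which ask for confirmation, are an alternative to the non-interactive “mv -n” shown here.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)

# Sample files: three logs to remove and one result file to keep.
touch "$workdir/run1.log" "$workdir/run2.log" "$workdir/run3.log"
echo "important" > "$workdir/results.txt"

# 1. Preview what the wildcard matches before removing anything.
ls "$workdir"/run*.log
# Only after checking the list, run rm with the same pattern.
rm "$workdir"/run*.log

# 2. mv -n (no-clobber) refuses to overwrite an existing destination.
#    Depending on the mv version, a skipped move may return a nonzero
#    status, hence the "|| true".
echo "new draft" > "$workdir/draft.txt"
mv -n "$workdir/draft.txt" "$workdir/results.txt" || true
kept=$(cat "$workdir/results.txt")

# 3. Build an output name from the date, time, and shell process ID so
#    that separate runs do not write to the same file.
outfile="$workdir/output_$(date +%Y%m%d_%H%M%S)_$$.txt"
echo "simulation output" > "$outfile"
echo "Output written to $outfile"
```

After this runs, results.txt still holds its original content and draft.txt is left in place, because “mv -n” skipped the move rather than overwriting the existing file.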

If you accidentally delete files, or find that some files were destroyed or have disappeared, report the incident to the system administrators as soon as you can and ask for the files to be restored. The report should state, as accurately as possible, both the time when the files were deleted (or last seen or accessed) and the location of the files. Because storage space is limited, only a limited number of hourly and daily backups are kept. Older backups are eventually purged, and once the backup containing your files has been removed, the files cannot be recovered.

Last but not least, it is always good practice to:

  • Closely monitor your storage usage after you finish working;
  • Remove files that are no longer needed;
  • Make sure that the timestamps of the files of interest are recent enough that the files will not be purged while you still need them;
  • Save final result files to the space with automatic system backups. 
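These habits can be turned into a short routine. The sketch below makes several assumptions: the two temporary directories merely stand in for a purged space and a backed-up space, the 30-day threshold is an illustrative number and not the HPCC purge policy, and modification time is used as the timestamp of interest. Check the File Systems wiki page for the real values.

```shell
# Hypothetical stand-ins for a purged space and a backed-up space.
scratch_dir=$(mktemp -d)
backup_dir=$(mktemp -d)

# ASSUMPTION: illustrative threshold only, not the actual purge policy.
threshold_days=30

# Sample files: one recent result and one input last modified 45 days ago.
echo "final" > "$scratch_dir/final_result.dat"
echo "old" > "$scratch_dir/old_input.dat"
touch -d "45 days ago" "$scratch_dir/old_input.dat"

# Monitor usage: total size of the directory.
du -sh "$scratch_dir"

# List files whose modification time is older than the threshold.
old_files=$(find "$scratch_dir" -type f -mtime +"$threshold_days")
echo "Files older than $threshold_days days:"
echo "$old_files"

# Save the final result to the backed-up space.
cp "$scratch_dir/final_result.dat" "$backup_dir/"
```

Files reported by “find” are candidates either for removal, if they are no longer needed, or for copying to a backed-up space before they are purged.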

Xiaoge Wang
Research Consultant