Managing Research Data
At CHILI, we often generate large experimental datasets (videos, eye-tracking logs, etc.). Depending on the nature of these datasets, you need to handle them differently.
There are two main categories:
- transient datasets: data that have temporary value, either for further processing or for sharing, but do not need to be kept over long periods of time. You can store those files on the "chili-research" (aka "icchilisrv1") development server or, if you want to share the datasets only internally at EPFL, on the Windows share "scxcraft.epfl.ch".
- archived datasets: if you produce valuable datasets that you want to make public (typically, datasets you refer to in scientific publications or in your PhD thesis), then you must use the Zenodo platform.
Often, the datasets you produce should also be presented on the CHILI lab's webpage. Do not hesitate to extract short samples (for example, 30 seconds of video) and add them to your project's page.
At your desk
In the common case, the data on your workstation (laptop or desktop) is not considered critical. If you believe it is, you should find an ad hoc solution to back up the data safely, for instance Time Machine on Mac OS X or an external hard drive.
If you want to talk about this, please get in touch with your IT staff to discuss what kind of solution could be suggested.
Keep in mind that if your equipment is stolen while travelling, or someone breaks into your office, your backup medium might disappear along with everything else.
Know and manage the risks regarding your workstation.
Unfortunately, as of now, we cannot back up everyone's hard disk to the IC faculty's storage servers.
Home directory
In the IC faculty, research-related data processing is mostly done on UNIX systems. The "home directory" idiom is very much a UNIX concept.
The home directory is a private storage space meant to help you get your work done on UNIX computers. It is a safe and private storage space.
Safety is achieved by setting up such a workspace on enterprise-class storage servers providing advanced features, such as redundancy and snapshots.
Privacy is implemented using standard UNIX access control features such as user and group permissions and/or access control lists (ACLs).
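To illustrate what user and group permissions mean in practice, here is a minimal Python sketch that restricts a directory to its owner and checks the result. The function names are hypothetical and chosen for this example only. It shows the general UNIX mechanism, not the actual configuration of the IC storage servers, and it does not cover ACLs.

```python
import os
import stat


def make_private(path):
    """Restrict a directory to its owner (mode rwx------) and return the new mode bits."""
    os.chmod(path, stat.S_IRWXU)  # owner may read, write and traverse; nobody else
    return stat.S_IMODE(os.stat(path).st_mode)


def is_private(path):
    """Return True if neither the group nor other users have any permission on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

For instance, calling make_private on a project folder inside your home directory removes all group and world access to it, and is_private lets you verify that afterwards.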
On data processing servers, the home directories are automatically made available using a feature called "automounting". This is where you land by default when you log in to such servers.
Creating local user accounts with local home directories on data processing servers is strongly discouraged.
The only way to access your home directory is from a data processing server, using NFS. Access from any other location is not allowed. If you need to move files around, use an ad hoc method such as remote copy or file transfer.
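As one possible form of remote copy, the sketch below builds and runs an rsync command from Python. It assumes that rsync and SSH access are available on the data processing server, which this page does not confirm. The user name, host name and paths in the usage comment are placeholders, and the function names are hypothetical.

```python
import subprocess


def build_rsync_command(local_path, user, host, remote_path):
    """Build an rsync command that copies local_path to remote_path on host over SSH."""
    return [
        "rsync",
        "-av",        # archive mode (keeps permissions and timestamps), verbose output
        "--partial",  # keep partially transferred files so a large copy can resume
        local_path,
        f"{user}@{host}:{remote_path}",
    ]


def push(local_path, user, host, remote_path):
    """Run the copy; raises subprocess.CalledProcessError if rsync fails."""
    subprocess.run(build_rsync_command(local_path, user, host, remote_path), check=True)


# Hypothetical usage (placeholder user, host and paths):
# push("recordings/", "alice", "server.example.org", "experiment1/recordings/")
```

Passing the command as a list rather than a single string avoids shell quoting problems with file names that contain spaces.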
How to access your home directory and more information
MyNAS
See "mynas.epfl.ch".
Very safe, with snapshots. 10 GB per collaborator, 3 GB per guest.
Group storage
Very safe. Capacity = number of collaborators x 10 GB.
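The capacity rule above is simple arithmetic. A short sketch, assuming the quota is expressed in gigabytes and using a hypothetical function name, makes it easy to check whether a planned dataset fits in your group's storage:

```python
QUOTA_PER_COLLABORATOR_GB = 10  # per-collaborator figure stated on this page


def group_capacity_gb(n_collaborators):
    """Return the group storage capacity in GB for a given number of collaborators."""
    if n_collaborators < 0:
        raise ValueError("number of collaborators cannot be negative")
    return n_collaborators * QUOTA_PER_COLLABORATOR_GB


def fits_in_group_storage(dataset_size_gb, n_collaborators):
    """Return True if a dataset of the given size fits within the group capacity."""
    return dataset_size_gb <= group_capacity_gb(n_collaborators)
```

With 12 collaborators, for example, the group capacity is 120 GB, so a 150 GB video corpus would not fit and belongs elsewhere.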