How to store my design data safely?

Whatever your working structure – big company, small non-profit organization, group of friends or even you alone – your design data are precious and must be kept in a safe storage place. Why should you bother? Because those data are the synthesis of hours of design work: you can’t afford to lose them!

In the current post I am going to discuss several ways of storing your data with their respective pros and cons:

  1. The old-fashioned way: Sheets of paper
  2. Personal computer with local storage
  3. Local computer storage with external back-up
  4. Network storage as the principal files location
  5. Version-controlled storage
  6. Specific, work-oriented proprietary services

Edit: A synthesis table of this post is now available!

Zero risk does not exist. Ever.

Even though I try to list “safer” ways of storing data, you must be aware that none of them is perfect. Even if it’s provided by a professional service, even if it’s installed in your own facilities by a certified company, even if you have a contract with Superman to keep your data safe! (Another superhero won’t do the job either.)
So what’s the point in spending time to “store safer”? The idea is to reduce the probability of a data loss to as close to zero as your organization can afford.

The old-fashioned way: Sheets of paper

Apologies to anyone hurt by the term “old”: I don’t mean that working with paper is over. Working on a sheet of paper is handy and fast, perfect for personal or group reflection (whiteboard included). Beyond the fact that paper work is not necessarily easy to edit, copy or share, the risks reside in the storage medium. Paper itself is fragile, ink does not last forever, and the more data you have, the more sheets you accumulate, which makes it easier to lose part of the whole work.
Of course paper can be used as an output without much risk. It is well suited for a project deliverable for example, as long as you keep a copy in your own archives. For all other uses, manipulating design data on paper should be done with much organization and rigor. You should also be particularly careful if the sheets of paper need to be moved from one location to another.

Risks:
  • Storage in a single location: theft, fire, natural catastrophe
  • Storage based on single copies: tearing, volatility of each sheet (easy to lose when traveling)
How to improve:
  • Make copies and distribute them in several locations
  • Make the transition to a backed-up digital solution, but don’t forget to stay organized even with digital files

Personal computer with local storage

That’s probably the starting point for any small organization, or for anyone who buys a computer to work with. This is handy: everything is right there and ready to be edited, presented, sent, etc. In my mind, this approach is NOT BETTER than the sheets of paper. The reason is that your data are stored in a single location: if a physical breakdown occurs, the chance that everything is lost is quite high. Even if something is recoverable, you will spend several hours or hundreds of euros/dollars to get a small portion of your files back.
The danger is even bigger if your computer is a laptop. If you travel, it is exposed to physical damage, theft, or other threats like a “USB killer”.

Risks:
  • Storage on a single disk: physical or logical breakdown
  • Storage in a single location: theft, fire, natural catastrophe
  • Moving storage (in case of a laptop): physical damage, theft, other malicious actions
How to improve:

Local computer storage with external back-up

When you have a computer and care about your data, you most probably plan to back them up on a second device. I have found 2 categories of external back-ups:

  • Removable media:
    • Optical: CDs, DVDs
    • Mass storage: memory sticks, external HDDs
  • More advanced solutions: tape backup, NAS and other file servers -> this is the way to go if there are several people in your organization

Whatever the back-up media, you need to define a strategy:

  • With overwrite (previous versions of the files are not kept):
    • manual raw copy
    • intelligent copy (do not copy a file that already exists and is identical, for performance improvements)
    • scheduled intelligent copy
  • Without or with limited overwrite
    • scheduled rotating copies (always keep at least 2 back-ups at a time)
    • incremental copies (copy only what needs to be copied, based on the modifications since the last back-up)
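
To make the “intelligent copy” idea from the list above more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: the function name `smart_copy` and the folder layout are hypothetical, and a real backup tool does much more (logging, error handling, deletions, scheduling). It overwrites changed files, so it belongs to the “with overwrite” family.

```python
import filecmp
import shutil
from pathlib import Path


def smart_copy(source_dir: Path, backup_dir: Path) -> list[Path]:
    """Copy files from source_dir to backup_dir, skipping identical ones.

    Returns the list of destination files that were actually written.
    """
    copied = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        dst = backup_dir / src.relative_to(source_dir)
        # shallow=False compares the file contents, not only size and date
        if dst.is_file() and filecmp.cmp(src, dst, shallow=False):
            continue  # identical file already backed up: nothing to do
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also keeps the modification time
        copied.append(dst)
    return copied
```

Calling it twice in a row on an unchanged folder copies nothing the second time, which is where the performance gain comes from.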

This strategy would probably be based on reliable and powerful backup software. Be careful about software that does not support bidirectional copies, which makes it unsuitable for group work. And be aware that even with bidirectional support, managing conflicts is a nightmare. That’s why I highly recommend using a centralized shared storage + backup, which would make your life easier for both sharing and data safety. You could even use a dedicated box to share your USB disk over the network.

To be considered “backed-up”, the data must exist on the original storage AND the back-up storage at the same time! A move or “cut and paste” is not a back-up in my opinion. Please also note that an “external back-up” is not suited for sharing files in any way. For a better way to share files, have a look at the network storage solution.
Be aware: storage space may be expensive, and so are back-ups. Always tell yourself that this is the price to pay for more safety; prevention is better than cure! You must also know that storage media have their own lifetime. For example, optical disks burnt at home are known to have very limited robustness over time. Conversely, using an external RAID1 USB disk will decrease the probability of a data loss. You should research user experiences and take the lifetime into account in the overall budget calculation.

Risks:
  • The back-up may not be done if not scheduled automatically
  • Doing a back-up of several computers may be tedious
  • The lifetime of your back-up medium may be worse than the main storage itself!
How to improve:
  • Schedule the back-ups
  • Never overwrite the previous back-up (prefer rotating or incremental copies)
  • Do not attempt to use the back-up storage as a way to share some files, use a network storage instead
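
As an illustration of “never overwrite the previous back-up”, here is a small Python sketch of rotating copies. The function name `rotating_backup`, the `backup-` folder prefix and the timestamp format are my own assumptions for the example; scheduling itself would be left to the operating system (cron, Task Scheduler, etc.).

```python
import shutil
from datetime import datetime
from pathlib import Path


def rotating_backup(source_dir: Path, backup_root: Path, keep: int = 2) -> Path:
    """Create a new timestamped copy of source_dir, then prune the oldest."""
    if keep < 2:
        raise ValueError("always keep at least 2 back-ups at a time")
    backup_root.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S-%f")
    target = backup_root / f"backup-{stamp}"
    shutil.copytree(source_dir, target)  # full copy; fails if target exists
    # Timestamped names sort chronologically, so the oldest come first
    backups = sorted(
        p for p in backup_root.iterdir()
        if p.is_dir() and p.name.startswith("backup-")
    )
    for old in backups[:-keep]:
        shutil.rmtree(old)  # remove only what exceeds the 'keep' newest
    return target
```

The new copy is created before anything is deleted, so a failure during the copy never leaves you with fewer back-ups than before.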

Network storage as the principal files location

In a world of “Internet of everything” you may wish to store your data out there in the so-called “Cloud”. This is fair and could give you a high level of service, but what about data safety? Have you read the Terms and Conditions? Do you plan to use a free or a professional service?
Here is an incomplete list of some services that offer data sharing features:

An alternative is to “be your own cloud” by managing some shared network space(s), accessible either locally or over the world-wide Internet (WAN). This means that you need to run a computer hosting one or several storage media. It could be:

To keep it easy for the user, a relevant solution should offer the possibility to access the network files directly through your computer’s file explorer. Some services require proprietary software to be installed; it will do the job but will be harder to manage than the built-in tools. And of course a web interface is cool but might not be enough to work on the files, i.e. open, edit and save them.
Whatever solution you choose or have already chosen, the data safety issue remains. You are strongly advised to:

  • Monitor the health of the storage space, with live notifications and means of recovery if available,
  • Back it up to another (safe) place.

“A backup, again?” you say. Yes, again! What we are talking about in this paragraph is a place to store and share data, not necessarily a “safe” place. You must find a way to make back-ups, like you would do with local storage. Fortunately it is simpler to back up a single “shared” drive than dozens of user computers!
Please note that a mirror RAID is not a back-up; it is only a high availability feature. For example a user can corrupt some data, and those data will automatically be replicated as is, i.e. corrupted. And what if your hardware RAID controller breaks down?

You might say that, with network storage solutions, working offline is a problem. Yes, it probably is, but I consider this topic out of scope because it is not about data safety.

Risks:

Same risks as if it were a local storage, except for some integrated failover and back-up mechanisms

How to improve:

Same solutions as if it were a local storage:

  • Schedule the back-ups
  • Never overwrite the previous back-up (prefer rotating or incremental copies)

Version-controlled storage

Version or revision control systems are absolutely priceless for designers and engineers. That’s a way to keep track of what has been done, even by other teammates, always knowing that we can go back to an earlier version very easily.
Revision control systems exist for documents, file content and databases. Example: a blog article may be “versioned” by the blog system.
The first version control systems (VCS) for files were designed for software developers (source code files), but to my mind today’s systems can satisfy almost everyone. They are best used for design files but they are not limited to them.

You can get some free or paid storage space from a hosted VCS service. Here is a non-exhaustive list:

Or you can install an on-premises system on your own server. Check out SVN, Git, Mercurial, or some non-free systems.

As with RAID mirroring, a VCS server cannot be considered a back-up system because all the data are stored in a single place.
That is why you are strongly advised to:

  • Monitor the health of the VCS server, in particular its storage
  • Back up the VCS data and configuration to another (safe) place
  • Try the restoration procedure at least once to understand how it works

Hosted VCS services often have integrated back-up features but you may need to pay for that. In any case, always check how simple it is to restore the backed-up data before going to a “live” system.
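
One simple way to try a restoration is to restore into a scratch folder and compare it with the original, file by file. The Python sketch below does this with SHA-256 checksums. The names `file_digest` and `verify_restore` are hypothetical helpers written for this example, and it assumes the restored data are plain files on disk (for a VCS, that would be a checkout or export of the restored repository).

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_restore(original_dir: Path, restored_dir: Path) -> list[str]:
    """Return the relative paths that are missing or differ after a restore."""
    problems = []
    for src in sorted(original_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(original_dir)
        restored = restored_dir / rel
        if not restored.is_file() or file_digest(src) != file_digest(restored):
            problems.append(rel.as_posix())
    return problems
```

An empty result means every original file was found, byte for byte, in the restored copy.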

Risks:

Same risks as if it were a local storage

How to improve:

Same solutions as if it were a local storage:

  • Schedule the back-ups
  • Never overwrite the previous back-up (prefer rotating or incremental copies)

Specific, work-oriented proprietary software and services

For a number of professional software packages, the vendor offers a solution to manage the associated data. I don’t know those tools well; the only things I can say are that they might be very expensive and that some of them, but not all, include a form of version control.
Some examples:

I will let you look for more information about them. If they seem interesting for your business, always ask to test before buying!

Be careful: most of those software packages do not deal with data safety. Again, you need to back up the data to another location. Because they are proprietary software, identifying the exact location(s) of what needs to be backed up may not be obvious…

Risks:
  • Proprietary software which may be opaque in the way it saves data
  • May use their own “cloud” servers to back-up your local data -> should you trust them?
How to improve:

This can’t be answered because it is software dependent.

Conclusion

Now that you have read all this, you think: “does he really apply all these rules for himself?”

Well, I must admit: not really. As said at the beginning, you should adapt your solutions to your time and financial resources, and to the criticality of your data!
In my personal context, I own a NAS with 2 disks in RAID1 that I use as my main storage. As a result my files are shared between my computers, and their local storage is used as little as possible. The NAS also runs a VCS server (Subversion a.k.a. SVN). I have no external backup plan for the moment, which is not good because RAID1 is not a way to back up files. But I want to install a scheduled incremental backup on a 3rd disk.
In a professional environment, the disk management/aggregation would probably be a RAID10 (4 disks) instead of RAID1 (I personally prefer to avoid RAID5). I might also add a second NAS in parallel to build a high availability cluster which would allow a failover mechanism. Ideally the “master” NAS would also be backed-up off site (Cloud? Other site of the company?).

Be careful about…

Last but not least, some traps you must not fall into:

  • RAID0: This type of disk aggregation does not work as a mirror and does not offer any fault tolerance. It is very attractive for the space it offers but if one disk fails, all the data of all the disks of the array are lost!
  • Restoration: You must test your backup system by restoring some backup files before going “live”. What’s the point of doing backups if they can’t properly be restored?
  • Viruses: Any file can contain viruses. That’s why you must regularly check your main AND backup storage for viruses. The antivirus might even be included in your storage solution! (e.g. advanced NAS)
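
To get a feel for the RAID0 trap, here is a rough back-of-the-envelope estimate. It rests on a simplifying assumption of mine, not on measured figures: each disk fails independently with the same probability p over a given period. The RAID1 figure also ignores the time needed to replace and rebuild a failed disk.

```latex
% Probability of losing all data over the period, for an array of n disks
P_{\text{RAID0}}(n) = 1 - (1 - p)^{n}
% Two-disk mirror: data are lost only if both disks fail
P_{\text{RAID1}}(2) = p^{2}
```

With an illustrative p = 0.05, a two-disk RAID0 array is lost with probability 1 − 0.95² = 0.0975, almost twice the risk of a single disk, whereas a two-disk RAID1 mirror is lost with probability 0.05² = 0.0025.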

Keep cool and good luck!

