
Author Topic: Data Storage on HDD's - Longevity Question  (Read 16035 times)

OldRoy

« Reply #20 on: December 27, 2008, 07:07:41 am »

As ever, these discussions come down to semantics to a certain extent. For most non-commercial users a secondary drive with a copy of the data is a "backup". A mirror RAID array does almost the same job. And before anyone says it, the issue of mirror copies of corrupted data seems to me to be nearly identical to a "backup" of corrupted (or accidentally destroyed) data: finger trouble is a greater hazard than hardware failure. A fully redundant RAID array (I've only had experience of RAID 5, and not much of it hands-on) offers the equivalent of backup, since any single disk failure in the array is operationally recoverable. So a RAID system offers both storage and the equivalent of "backup", for most amateurs at least.

Personally I use a RAID 1 array as primary data storage on one of my PCs, backed up manually by a Robocopy script to (1) an independent internal drive and (2) an external USB HD. Where "2" is located depends a bit on where I am. On another PC, which is now my primary box, I use one drive for the OS and applications, plus non-critical data or ad hoc copies; an internal primary data drive plus a backup drive, Robocopied as before; and an external eSATA drive as secondary backup. The latter is only powered up at backup time. I'm as nervous about hardware failures as anyone, but in my experience HDs are amazingly reliable, and a drive that's not running continually for years is likely to last a very long time; of course the point about validation is well made. As the per-GB cost of storage comes down I see no reason not to treat HDs as write-seldom, read-seldom devices.
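
On the validation point, here is a rough sketch of how a copy on a backup drive could be checked against the source by comparing SHA-256 checksums. It is only an illustration, not part of the Robocopy script mentioned above: the function names `file_sha256` and `verify_backup` and the example paths are made up for this post.

```python
import hashlib
from pathlib import Path
from typing import List


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large raw files never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source_root: Path, backup_root: Path) -> List[str]:
    """Return a list of problems: files missing from the backup or differing in content."""
    problems = []
    for source_file in sorted(source_root.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_root)
        backup_file = backup_root / relative
        if not backup_file.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif file_sha256(source_file) != file_sha256(backup_file):
            problems.append(f"mismatch: {relative.as_posix()}")
    return problems
```

Calling something like `verify_backup(Path("D:/photos"), Path("E:/photos"))` (hypothetical paths) after each backup run would return an empty list when every source file has an identical copy. Note that it reads every byte on both drives, so it is slow, and it only proves the two copies agree, not that the source was good to begin with.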
Roy

Jack Flesher

« Reply #21 on: December 27, 2008, 12:14:53 pm »

Quote from: Roy
For my needs, backup should protect against equipment failure, human error, fire, theft, and natural disaster. The key is a frequently refreshed, off-site, off-line copy, and a workflow that ensures there are always two or three copies of new data on-site until a copy goes off-site.

Well we're getting a bit OT for the OP, but you raise a good point.  My setup is similar to yours:

Striped array on main box for read/write performance on current working files and the last 18 months or so of historical images.  That array is automatically backed up using "Carbon Copy Cloner" to an easy RAID-5 device (in my case a Drobo with four 1TB drives), which also contains all my historical data.  The Drobo is then backed up monthly, and after every large shoot, to individual drives stored off-site, also using CCC.

Carbon Copy Cloner runs on a regular schedule, can be run manually whenever desired, or will run when it detects a specific drive being plugged in, a convenient feature for keeping the off-site set in sync.  In addition, CCC lets you choose not to propagate file deletions from the source to the target, which addresses the accidental-deletion issue.
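
To illustrate the "don't copy erasures" idea in general terms, here is a minimal sketch of an additive one-way copy: new and changed files go from source to target, but nothing on the target is ever deleted. This is not how CCC works internally, just an assumed simple policy for illustration; `additive_copy` is a made-up name.

```python
import shutil
from pathlib import Path
from typing import List


def additive_copy(source_root: Path, target_root: Path) -> List[str]:
    """Copy new or changed files to the target; never delete anything there."""
    copied = []
    for source_file in sorted(source_root.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_root)
        target_file = target_root / relative
        if target_file.exists():
            source_stat = source_file.stat()
            target_stat = target_file.stat()
            same_size = source_stat.st_size == target_stat.st_size
            not_newer = source_stat.st_mtime <= target_stat.st_mtime
            if same_size and not_newer:
                continue  # target already holds this version
        target_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source_file, target_file)  # copy2 keeps the modification time
        copied.append(relative.as_posix())
    return copied
```

Because the loop only walks the source tree, a file deleted from the source by mistake simply stays on the target. The trade-off is that the target grows over time and a changed source file still overwrites its older copy, so this protects against accidental deletion but not against accidental edits.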
Jack

Farmer

« Reply #22 on: December 27, 2008, 03:32:55 pm »

Quote from: OldRoy
As ever, these discussions come down to semantics to a certain extent. For most non-commercial users a secondary drive with a copy of the data is a "backup". A mirror RAID array does almost the same job. And before anyone says it, the issue of mirror copies of corrupted data seems to me to be nearly identical to a "backup" of corrupted (or accidentally destroyed) data: finger trouble is a greater hazard than hardware failure. A fully redundant RAID array (I've only had experience of RAID 5, and not much of it hands-on) offers the equivalent of backup, since any single disk failure in the array is operationally recoverable. So a RAID system offers both storage and the equivalent of "backup", for most amateurs at least.

Personally I use a RAID 1 array as primary data storage on one of my PCs, backed up manually by a Robocopy script to (1) an independent internal drive and (2) an external USB HD. Where "2" is located depends a bit on where I am. On another PC, which is now my primary box, I use one drive for the OS and applications, plus non-critical data or ad hoc copies; an internal primary data drive plus a backup drive, Robocopied as before; and an external eSATA drive as secondary backup. The latter is only powered up at backup time. I'm as nervous about hardware failures as anyone, but in my experience HDs are amazingly reliable, and a drive that's not running continually for years is likely to last a very long time; of course the point about validation is well made. As the per-GB cost of storage comes down I see no reason not to treat HDs as write-seldom, read-seldom devices.
Roy

Sorry, I disagree.  It's not a matter of semantics - it's a matter of being precise.  RAID, in and of itself, is not a backup.  Only a detached copy is a backup.  It's a dangerous, dangerous trap to think that "amateurs" will get by with just RAID.  The people here, by and large, have more data than most small businesses that aren't in the imaging industry.  For a digital photographer, the entire capital value of your business or hobby is almost completely represented by that data.  Given how little it costs to maintain a real backup, we should never allow a single RAID to be considered a backup.  It's merely a means of increasing the MTBF or a means of increasing performance, or both.  Even if you had a series of striped drives handled in RAID 6 and then mirrored, you would still only be increasing the MTBF and the performance.  It's not a detached copy, so it's not a backup.  Any single system is exactly that - a single system.  A backup is another system providing another copy (whatever that system may be - single drive, tape, optical, etc.).

The reason I harp on about this is because it's critical that people understand that RAID won't magically protect you from data loss.  No backup scheme is perfect, of course, but having copies on other detached devices is far, far better than a single RAID.

Robocopy is an excellent suggestion for the PC side.  SyncToy 2.0 (also available as a 64-bit binary) is another very useful tool on the PC side.
Phil Brown

dalethorn

« Reply #23 on: December 27, 2008, 10:22:21 pm »

At least as important as the hardware are the decisions that the software MAY or may not allow you.  For example, do you always assume that backup means creating all new files?  Or that new (newer date/time) files always write over old files?  I don't auto-assume that.  I sometimes write old files over new ones.  And can every user assume that the date/time of existing files on an external HDD is always in sync with the computer they're backing up from?  I don't, since most of my backup drives are FAT32 format and my computers are NTFS, so the timestamps drift apart by one hour twice a year (NTFS stores times in UTC, FAT32 in local time).

I cannot imagine software that configures these things and then runs automatically and flawlessly.  Of course, it's a lot easier if you work on one project at a time and then archive it forever.
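
For what it's worth, the timestamp problem above can at least be made explicit in a script. The sketch below treats two modification times as "the same" if they differ by no more than FAT's two-second resolution, or by almost exactly one hour (the daylight-saving shift), and it refuses to silently overwrite a newer target. All names here (`timestamps_match`, `decide`, `Action`) are invented for illustration, and the policy itself is just one assumption about what a user might want.

```python
from enum import Enum

FAT_RESOLUTION_SECONDS = 2.0   # FAT stores modification times in 2-second steps
DST_SHIFT_SECONDS = 3600.0     # one-hour daylight-saving offset


def timestamps_match(source_mtime: float, target_mtime: float,
                     tolerance: float = FAT_RESOLUTION_SECONDS,
                     allow_dst_shift: bool = True) -> bool:
    """True if the times agree within tolerance, optionally ignoring a 1-hour shift."""
    difference = abs(source_mtime - target_mtime)
    if difference <= tolerance:
        return True
    return allow_dst_shift and abs(difference - DST_SHIFT_SECONDS) <= tolerance


class Action(Enum):
    SKIP = "skip"   # target is already up to date
    COPY = "copy"   # overwrite target with source
    ASK = "ask"     # target looks newer: leave the decision to the user


def decide(source_mtime: float, target_mtime: float,
           overwrite_newer_target: bool = False) -> Action:
    """Pick an action for one file pair under an explicit overwrite policy."""
    if timestamps_match(source_mtime, target_mtime):
        return Action.SKIP
    if source_mtime > target_mtime:
        return Action.COPY
    return Action.COPY if overwrite_newer_target else Action.ASK
```

The obvious weakness is that a file genuinely edited exactly one hour after the last backup would be skipped, so a size or checksum comparison is a sensible second test. On the Robocopy side, the `/FFT` and `/DST` switches are meant for the same two issues, as far as I know.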

Jack Flesher

« Reply #24 on: December 28, 2008, 10:34:00 am »

Quote from: dalethorn
At least as important as the hardware are the decisions that the software MAY or may not allow you.  For example, do you always assume that backup means creating all new files?  Or that new (newer date/time) files always write over old files?  I don't auto-assume that.  I sometimes write old files over new ones.  And can every user assume that the date/time of existing files on an external HDD is always in sync with the computer they're backing up from?  I don't, since most of my backup drives are FAT32 format and my computers are NTFS, so the timestamps drift apart by one hour twice a year (NTFS stores times in UTC, FAT32 in local time).

I cannot imagine software that configures these things and then runs automatically and flawlessly.  Of course, it's a lot easier if you work on one project at a time and then archive it forever.

I agree with you in theory, but in practice, if one has successfully saved all of the original raw files and uses a standardized workflow, regenerating a very similar-looking final should not be too difficult...  (In actuality, I often find myself reworking older files anyway, since raw converters have improved so much.)
« Last Edit: December 28, 2008, 10:35:20 am by Jack Flesher »
Jack

Chris_Brown

« Reply #25 on: December 28, 2008, 07:50:22 pm »

Although disk drives are not "archival" in the strict sense of the word, they are the cheapest medium available. The current technique amongst archivists, for example this project, is to use large RAID 1 arrays and simply swap out drives as they go bad.
~ CB