Luminous Landscape Forum

Equipment & Techniques => Digital Asset Management => Topic started by: indusphoto on November 09, 2016, 03:08:06 am

Title: Uploading 3TB to Online Backup
Post by: indusphoto on November 09, 2016, 03:08:06 am
I have signed up for Amazon Drive's unlimited plan to keep an off-site copy of my local backup.

Most of my backup consists of original raw files plus Lightroom and Capture One catalogs. I started uploading with the backup software GoodSync. However, I am only getting about 1MB/sec, which, when I think about it, is to be expected, as my upload speed is capped at around 10Mb/s (about 1.25MB/sec).

The backup software's automatic ETA calculator is showing about 800 hours remaining (over a month of continuous upload). That does not look feasible to me: not only will the initial upload take a month (assuming things go smoothly the whole time), but new uploads and edits from a shoot lasting a couple of hours will take days.
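As a rough check on that ETA, here is a minimal Python sketch of the arithmetic. It assumes decimal units (1 TB = 1,000,000 MB) and a constant sustained rate; the function names are illustrative only and have nothing to do with GoodSync or Amazon Drive.

```python
def mbit_to_mbyte(rate_mbit_per_s: float) -> float:
    """Convert a line speed in megabits/second to megabytes/second."""
    return rate_mbit_per_s / 8


def upload_hours(size_tb: float, rate_mb_per_s: float) -> float:
    """Hours needed to upload size_tb terabytes at a constant rate.

    Assumes decimal units: 1 TB = 1,000,000 MB.
    """
    size_mb = size_tb * 1_000_000
    return size_mb / rate_mb_per_s / 3600


if __name__ == "__main__":
    cap = mbit_to_mbyte(10)  # theoretical ceiling of a 10Mb/s uplink
    print(f"Line cap: {cap} MB/s")
    print(f"3 TB at 1.00 MB/s: {upload_hours(3, 1.0):.0f} hours")
    print(f"3 TB at {cap} MB/s: {upload_hours(3, cap):.0f} hours")
```

At the observed 1MB/sec this works out to roughly 833 hours, which is in line with the ETA the software reports.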

I can't imagine that my data size is out of the normal for photographers out there, and neither is my upload speed. So I am curious to know how other photographers have managed to get their data onto Amazon Drive (or other online backups) initially.
Title: Re: Uploading 3TB to Online Backup
Post by: drmike on November 09, 2016, 03:57:24 am
I am in the same position, although my data is not photographs. It's all well and good if you live in a city with fast speeds, but I have slow uploads, as do many people I know, so I'll be interested in the replies.

As a total aside I was astonished at how good India is for cell coverage and data speeds. Way better than my rural home in the UK.
Title: Re: Uploading 3TB to Online Backup
Post by: elliot_n on November 09, 2016, 06:08:34 am
One month to upload a 3TB archive seems ok. What's the problem?
Title: Re: Uploading 3TB to Online Backup
Post by: Bart_van_der_Wolf on November 09, 2016, 06:41:35 am
I can't imagine that my data size is out of the normal for photographers out there, and neither is my upload speed. So I am curious to know how other photographers have managed to get their data onto Amazon Drive (or other online backups) initially.

Hi,

Have you checked whether your subscription includes the possibility to send them a hard disk with the data?

Cheers,
Bart
Title: Re: Uploading 3TB to Online Backup
Post by: bassman51 on November 09, 2016, 09:05:59 pm
I don't think there's any way around the first upload timeframe. I would make a local copy on a hard drive and take it offsite until the upload completes. Thereafter, you'll complete the upload even on a busy day in hours or less - a 30MB raw file would take you 25 sec or so; 100 such files ~40 minutes. I don't format my SD cards until I've received confirmation from my backup vendor (CrashPlan) that I'm at 100%. I also use Time Machine for a local copy, which obviously goes much more quickly.

You might want to look into a higher speed service, if only for the initial load.   

My service is 25Mbit/s and I feel like I almost always get backed up quickly enough. The only time it lags is when I return from a trip with 100GB of pictures; then it's days before I'm caught up. But of course, I've already been exposed for a couple of weeks while I'm away.
Title: Re: Uploading 3TB to Online Backup
Post by: BobShaw on November 10, 2016, 12:36:54 am
Online backup is basically impractical for that reason.
Title: Re: Uploading 3TB to Online Backup
Post by: davidgp on November 10, 2016, 01:40:35 am
I have uploaded 2.5TB to Backblaze... now I have a fast connection, I can upload about 80GB per day... which is more than OK for the amount of work I do... but when I did the first upload I had a worse connection and it took several months...


http://dgpfotografia.com
Title: Re: Uploading 3TB to Online Backup
Post by: davidgp on November 10, 2016, 01:42:33 am
If, once the initial backup is finished, your upload speed is enough to upload data faster than you add it to your hard drives, it should be OK.


Title: Re: Uploading 3TB to Online Backup
Post by: BobShaw on November 10, 2016, 04:53:21 am
Having your hard drives chug away for several months, 24 hrs a day, is going to make you need that backup. Then, if you can afford to have no computer for several months while it restores, great. If you do waste your time on this, then don't make it your only backup. At best it should be your third option.
Title: Re: Uploading 3TB to Online Backup
Post by: Bart_van_der_Wolf on November 10, 2016, 06:18:50 am
If, once the initial backup is finished, your upload speed is enough to upload data faster than you add it to your hard drives, it should be OK.

That's the whole issue. Uploading a whole collection of data that was collected over a long period will be slow due to the sheer volume. After that, maintaining a regular upload stream for new work should be much less of an issue. Especially if the backup software is intelligent (e.g. transferring compressed data, and clever hashing techniques for data integrity verification). A catalog system that only needs to backup the metadata changes (sidecar files) instead of the full image data + metadata as a single file (like DNGs) also makes for lower data volumes during backup.

Another issue might present itself if a less reputable cloud storage provider is used. If they terminate their service, one usually/hopefully gets a limited time to download all data. If the volume is too large, then one may run out of time. So never use it as the only backup. The convenience of cloud storage is mainly in that it is accessible from multiple physical locations, and one may hope for decent server backups to account for drive failures. So using a reputable provider is the safe way to go. It also stands a better chance for protection against DDOS attacks which may cripple access to whole servers, because they probably use higher quality personnel.

Cheers,
Bart
Title: Re: Uploading 3TB to Online Backup
Post by: davidgp on November 10, 2016, 06:50:47 am
That's the whole issue. Uploading a whole collection of data that was collected over a long period will be slow due to the sheer volume. After that, maintaining a regular upload stream for new work should be much less of an issue. Especially if the backup software is intelligent (e.g. transferring compressed data, and clever hashing techniques for data integrity verification). A catalog system that only needs to backup the metadata changes (sidecar files) instead of the full image data + metadata as a single file (like DNGs) also makes for lower data volumes during

Not sure about the Amazon software the original poster mentions... Backblaze compresses and encrypts data before sending it... but don't expect to shrink a losslessly compressed raw file very much... or a zip-compressed TIFF file.



Title: Re: Uploading 3TB to Online Backup
Post by: Joe Towner on November 17, 2016, 10:47:25 pm
So the way you are using the backup is slightly out of order. Don't worry about stuff from last year; point the backup at your current working directory. Once that's uploaded, point it at slightly older photos, and keep adding progressively older directories. If you point it at the base of your entire photo library, it'll start with the 1997 photos, which, while important, aren't anywhere near as important as last week's photos.

Initial backups will always suck, focus on backing up your current work product and add in older stuff as you have time and bandwidth.
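To illustrate the newest-first idea, here is a minimal Python sketch that lists the subfolders of a photo library ordered by their most recently modified file, so the newest work can be added to the backup first. It assumes one folder per shoot or year directly under the library root; the function names and the path in the usage lines are made up for illustration, and no particular backup tool is assumed to work this way.

```python
from pathlib import Path


def newest_mtime(folder: Path) -> float:
    """Modification time of the most recently changed file under folder."""
    times = [f.stat().st_mtime for f in folder.rglob("*") if f.is_file()]
    # Fall back to the folder's own timestamp if it contains no files.
    return max(times, default=folder.stat().st_mtime)


def folders_newest_first(library_root):
    """Immediate subfolders of library_root, most recent work first."""
    root = Path(library_root)
    subfolders = [p for p in root.iterdir() if p.is_dir()]
    return sorted(subfolders, key=newest_mtime, reverse=True)


if __name__ == "__main__":
    # Hypothetical library location; replace with your own.
    for folder in folders_newest_first("/Volumes/Photos"):
        print(folder)
```

The printed order is the order in which you would add folders to the backup job, one batch at a time as bandwidth allows.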
Title: Re: Uploading 3TB to Online Backup
Post by: Farmer on November 18, 2016, 04:38:49 am
A few comments:

Some providers (such as CrashPlan) allow you to order a hard drive from them; you connect it to the software, fill it, and send it back - this becomes your seed. When I did it, the limit was only 1TB - I'm not sure what the current limit is, but it made for a good head start.

After the initial effort, it's not that hard to maintain, even on a relatively slow connection.

Your hard drives won't have any issues being on and being read for a few months - the data access/transfer rates are tiny compared to what they're capable of when you have such a relatively slow transfer rate.  It will have practically zero effect on their life.

Restoring is easy - download speeds are always much higher (if the server allows for it) and, again, some providers such as Crashplan let you order a hard drive returned up to a certain size.
Title: Re: Uploading 3TB to Online Backup
Post by: Hans Kruse on November 18, 2016, 05:37:44 am
I also use Backblaze for online backup and have about 3TB backed up. I have a 100 Mbps symmetric connection and I only get the full 100 Mbps bandwidth if I select the maximum of 10 threads in the performance preferences. The online backup occurs without any intervention. The client installed on the system just does its work. If files are moved from one drive to another, Backblaze will identify them as already backed up and will not transfer them again. The same applied when I upgraded from one machine to a new machine.

I have used this service since 2012 and it has worked really well. Highly recommended.
Title: Re: Uploading 3TB to Online Backup
Post by: JoeKitchen on November 25, 2016, 09:55:31 am
Not sure if it is still the case, but I read a while back that Amazon (or maybe another backup site) will slow the upload speed as more data is uploaded.  This is to help keep tabs on how data is uploaded. 

I currently keep all of my images backed up on (a set of) three different hard drives, one (set) of which is located at the house of my parents, who live near me.

In addition to this, I use DropBox Pro to send finished files to clients and just never remove those files.  These are finished files, so technically not a full backup, but still another way to ensure my files are backed up. 

I was going to look at a full online back up of my files, finished, working and raw, but I have over 8 TB of data.  I can not imagine how long that would take to back up. 
Title: Re: Uploading 3TB to Online Backup
Post by: fdisilvestro on November 25, 2016, 06:48:55 pm
Unless you have a high-speed upload connection (there are many options, usually expensive), online backup is a slow process, with long upload and download times. It should be used as a "last resort" resource in case of major disasters. You need to have additional local backup & redundancy options.

A good exercise is to think in terms of business/operations continuity before deciding on a solution. Think about the severity of possible events, the likelihood of each happening, the best solution and the cost (which should include how much time is acceptable to you before restoring to full capability).

Example:
- Major: fire, flood, earthquake; likelihood: low (well, depending on where you live or the location of your facilities); best solution: online/offsite backup; cost: many options, but you have to balance the cost of operation vs the cost of waiting for weeks or months before restoring your data

- Medium: disk failure; likelihood: high (in my experience); best solution: redundant data arrays (I don't recommend RAID 5 BTW); cost: low
or data corruption by virus/other issues; likelihood: medium; best solution: local backup on media that is not permanently connected, to ensure the data corruption did not reach it.

For major events, consider also how you are going to recover your software, especially if you are running discontinued/unsupported software, which you may not be able to reinstall. For this, a clone of the O/S disk, separate from the other backups, will help too.

In addition, you should separate backup sets, as typically part of the data does not change and does not have to be updated frequently, in contrast to work-in-progress files that you need to back up each time they change. It is very likely that you'll need to restore the work-in-progress data as fast as possible, while you can wait a few weeks before completing the restore of your archive data.

One note about online backups and some specialized tools: in addition to compression, they also perform data deduplication (put very simply: looking for identical blocks of information and storing them only once), notably increasing performance and reducing the storage space required. It is recommended to also have a local non-deduplicated backup set, just in case.
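To make the deduplication idea concrete, here is a minimal Python sketch of fixed-size block deduplication: each block is hashed, identical blocks are stored once, and an ordered list of hashes is kept so the data can be rebuilt. This is a simplification for illustration only; the block size and function names are assumptions, and real backup tools may use different (for example variable-size) chunking.

```python
import hashlib


def dedup_blocks(data: bytes, block_size: int = 4096):
    """Split data into fixed-size blocks and store each unique block once.

    Returns (store, recipe): store maps a SHA-256 digest to the block bytes,
    recipe is the ordered list of digests needed to rebuild the data.
    """
    store = {}
    recipe = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # keep only the first copy of a block
        recipe.append(digest)
    return store, recipe


def rebuild(store, recipe) -> bytes:
    """Reassemble the original data from the block store and the recipe."""
    return b"".join(store[digest] for digest in recipe)


if __name__ == "__main__":
    data = b"A" * 4096 * 3 + b"B" * 4096  # three identical blocks plus one other
    store, recipe = dedup_blocks(data)
    print(len(recipe), "blocks referenced,", len(store), "blocks stored")
    assert rebuild(store, recipe) == data
```

The sketch also shows why a non-deduplicated copy is a sensible safeguard: every file that references a block depends on the single stored copy of that block.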

Remember: There is no such thing as too many backups
Title: Re: Uploading 3TB to Online Backup
Post by: rdonson on November 25, 2016, 07:34:48 pm
Remember also that an untested backup is a false sense of security.  IT shops have to routinely download, install and test backups to make sure they're fine. 
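One way to test a restore is to compare checksums of the restored files against the originals. Below is a minimal Python sketch, assuming the backup has been restored to a separate folder with the same layout; the function names and the paths in the usage lines are hypothetical.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size: int = 1024 * 1024) -> str:
    """SHA-256 of a file, read in chunks so large raw files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(original_dir, restored_dir):
    """Return relative paths that are missing or differ in restored_dir."""
    original_dir, restored_dir = Path(original_dir), Path(restored_dir)
    problems = []
    for src in sorted(original_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(original_dir)
        dst = restored_dir / rel
        if not dst.is_file() or file_sha256(src) != file_sha256(dst):
            problems.append(str(rel))
    return problems


if __name__ == "__main__":
    # Hypothetical locations; replace with your own.
    bad = verify_restore("/Volumes/Photos/2016", "/Volumes/RestoreTest/2016")
    print("All files match" if not bad else f"Problems: {bad}")
```

An empty result means every original file was found in the restored copy with identical contents.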
Title: Re: Uploading 3TB to Online Backup
Post by: fdisilvestro on November 25, 2016, 09:19:49 pm
Remember also that an untested backup is a false sense of security.  IT shops have to routinely download, install and test backups to make sure they're fine.

Absolutely, it is essential to do so
Title: Re: Uploading 3TB to Online Backup
Post by: Farmer on November 26, 2016, 01:51:01 am
Also, since RAID was just mentioned, remember that RAID is not a backup. It provides continuity of function in the event of hardware failure, but since everything sits in the same device/controller/logical drive/etc., it's not a backup.
Title: Re: Uploading 3TB to Online Backup
Post by: TonyVentourisPhotography on January 11, 2017, 05:21:17 pm
And if a RAID controller fails... no fun. I had a data failure where even an identical replacement controller for the RAID could not read the data. I stopped using RAIDs that day. And it was the week I was out of town, so my latest project was the only one not backed up. It was that project I needed to deliver to a client. The last-minute data recovery was horrendously expensive.

I use large drives in duplicate. I always buy in pairs now. One stays on site, one stays off site. Nothing critical is on the operating system drives. I use 6TB drives at the moment. It was cheaper, easier to work with, and quicker to recover from a failure compared to online storage. Total usage is somewhere around 16TB, I think. I use Syncovery software to keep everything backed up correctly.
Title: Re: Uploading 3TB to Online Backup
Post by: Bart_van_der_Wolf on January 12, 2017, 06:52:06 am
And if a RAID controller fails... no fun. I had a data failure where even an identical replacement controller for the RAID could not read the data. I stopped using RAIDs that day. And it was the week I was out of town, so my latest project was the only one not backed up. It was that project I needed to deliver to a client. The last-minute data recovery was horrendously expensive.

Hi Tony,

I agree. Most RAID configurations offer some protection against hardware failure, like one drive suddenly going bad. But RAID is not a backup solution in the way real redundancy is. The controller of a RAID is an additional component that can (and ultimately will) fail, and when it goes bad a lot of costly issues can result.

While less efficient for storage space, and a bit more work to maintain, IMHO it's hard to beat multiple copies of the same files on different drives. The only thing is that the drives should be periodically refreshed if not used for a long period of time (the level of magnetism slowly decays).

The benefit of cloud backups is their location-independent availability, but they are relatively slow for large volumes of data.

Cheers,
Bart