Luminous Landscape Forum

Equipment & Techniques => Computers & Peripherals => Topic started by: jonathan.lipkin on April 08, 2012, 08:30:08 pm

Title: Drive capacity/fullness vs. performance
Post by: jonathan.lipkin on April 08, 2012, 08:30:08 pm
About a decade ago, I took a backup server course, and was told that for optimal performance drives should be no more than 80% full. Does that still hold today? I have an eSATA RAID 5 with 4x1T drives, for a total capacity of 3T. It's currently at 2.86T, meaning it's roughly 95% full. Should I migrate some data to another drive?

Couldn't find a more descriptive word than 'fullness'.
Title: Re: Drive capacity/fullness vs. performance
Post by: Ken Bennett on April 08, 2012, 08:48:52 pm
Yes.

http://macperformanceguide.com/Storage-BiggerIsBetter.html
Title: Re: Drive capacity/fullness vs. performance
Post by: jonathan.lipkin on April 09, 2012, 02:22:38 pm
Fascinating. Thanks.
Title: Re: Drive capacity/fullness vs. performance
Post by: alain on April 09, 2012, 05:50:31 pm
Yes.

http://macperformanceguide.com/Storage-BiggerIsBetter.html

The link is not about the 80% figure as such, but about newer, "bigger" disks being faster. This is mainly a data-density issue (fewer rotations for the same amount of data).

The Seagate ST3000DM001 3TB drive seems the fastest at the moment, thanks to its three 1TB platters.
Title: Re: Drive capacity/fullness vs. performance
Post by: jonathan.lipkin on April 09, 2012, 07:29:22 pm
There is a section of this article that describes how drive performance degrades as the drive fills with data (to quote: "greater capacity means higher performance over the drive (for the same amount of storage used)"), though not quite as clearly as the previously linked one:

http://macperformanceguide.com/Storage-WhyYouNeedMoreThanYouNeed.html

Bigger drives are faster for a particular amount of data, because you are not filling the drives as much.

Quote

When setting up a striped RAID with speed as the goal, you should get large enough hard disks to “waste” as much as half of the available storage so that you can maintain peak speed for your data... A good rule of thumb is to make the total volume size be at least 50% larger than the amount of data you expect to store (you can backup and redo it if necessary).
...
This is why using the first half of the drive (or perhaps 2/3) offers much higher performance than filling a hard drive (of any capacity) to the 70%+ level.

As far as I can tell, this is because the outer tracks of a drive hold more sectors per revolution than the inner ones, so once the fast outer portion is used up, new data lands on progressively slower tracks.
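
To put the quoted "50% larger" rule of thumb into numbers, here is a minimal Python sketch; the function is made up and the figures are just the ones from this thread:

    def recommended_volume_tb(expected_data_tb, headroom=0.5):
        # "total volume size at least 50% larger than the data you expect to store"
        return expected_data_tb * (1 + headroom)

    print(recommended_volume_tb(2.86))   # ~4.3 TB of volume suggested for the data above
    print(2.86 / 3.0)                    # ~0.95 - the existing 3 TB array is already ~95% full

By that rule the existing 3 TB array is undersized for the data already on it, quite apart from future growth.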
Title: Re: Drive capacity/fullness vs. performance
Post by: Schewe on April 09, 2012, 08:04:07 pm
Couldn't find a more descriptive word than 'fullness'.

HDs start writing at the outer edge of the platters (that's where LBA 0 lives) and work inward. The outer tracks are longer and pack more sectors, so each rotation moves more data past the heads, and the heads travel less between neighboring tracks. As you fill the drive, data goes onto tracks closer to the spindle, which hold fewer sectors per rotation and are slower to read/write. Once you get past roughly 60%-75% of capacity, new data is only being written to and read from those slower inner tracks.

It doesn't matter a whole lot whether it's a single drive, a multi-platter drive, or a RAID system...the outer sectors of the drives are always faster to read/write.

Also note that sector/block errors only get worse over time. The fuller the drive, the more work the heads have to do to get at the data.

You will ALWAYS be better off using a smaller subset of the total drive capacity, because it's faster and throws fewer errors...with HDs, bigger is better and faster than smaller, cramped drives. Time to grow upwards...
Title: Re: Drive capacity/fullness vs. performance
Post by: dreed on April 16, 2012, 06:19:16 pm
About a decade ago, I took a backup server course, and was told that for optimal performance drives should be no more than 80% full. Does that still hold today? I have an eSATA RAID 5 with 4x1T drives, for a total capacity of 3T. It's currently at 2.86T, meaning it's roughly 95% full. Should I migrate some data to another drive?

The URL above is about single-drive performance, which is going to be quite different from what you see with RAID 5.

Nonetheless, the issue with drive performance over 80% has little to do with the drive mechanism and more to do with filesystem theory - the more data that is stored in a filesystem, the harder the system has to work to find new space, etc. Around 80% is the tipping point. At 90%, you should be planning to upgrade storage because by the time you actually do it, you'll likely be a lot closer to 100%.
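
For anyone who wants to keep an eye on this, a small sketch using only the Python standard library; the mount point is a placeholder and the thresholds are just the 80%/90% figures from this thread:

    import shutil

    usage = shutil.disk_usage("/Volumes/RAID")     # substitute your own mount point
    percent_full = usage.used / usage.total * 100
    print(f"{percent_full:.1f}% full")

    if percent_full >= 90:
        print("Plan the upgrade now; you'll be near 100% by the time it happens.")
    elif percent_full >= 80:
        print("Past the ~80% tipping point; free-space allocation starts getting harder.")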
Title: Re: Drive capacity/fullness vs. performance
Post by: Tom Frerichs on April 17, 2012, 10:57:34 am
... which is going to be quite different to what you see with RAID 5.

That's not quite true.  Regardless of the RAID level in use, RAID 0, RAID 1, RAID 5, or RAID 10, underlying drive performance will impact system performance. For example, in RAID 5, total performance will be limited by the slowest drive in the array. One advantage of RAID, however, is that it is commonly implemented with a hardware controller which offers substantial performance improvements through buffering, asynchronous reading/writing, etc.

The earlier point about the lower-numbered cylinders (out at the fast outer edge of the platters) having better performance is also true. Most UNIXes, if given half a chance, will locate the swap partition, which is used to augment physical memory with disk storage, as the first partition on the drive, precisely so it sits on those faster tracks. I don't own a Mac, but as the underlying OS is a UNIX, I'm willing to bet that this applies there, too.

Finally, all file systems use some sort of indexing both to locate files on the drives and to handle free space. As file systems fill up--with a large number of "small files" versus a few huge files--the indexing gets more involved. It also gets harder to allocate contiguous free space, which is worsened as files are added and deleted. NTFS, used by Windows, is particularly prone to this latter issue, which is why Windows has a defragmentation routine. Other file systems--UFS, ZFS, Reiser, etc.--suffer from this issue less.

Installing a solid state drive for your scratch disk and possibly your catalog--remembering to back up your catalog frequently--is the best way to improve performance. After all, most of your disk activity deals with reading previews and filtering on metadata rather than actually reading RAW files and writing output files.

Your mileage may vary.   ;D

Tom Frerichs
Title: Re: Drive capacity/fullness vs. performance
Post by: jonathan.lipkin on April 17, 2012, 11:11:39 am


Installing a solid state drive for your scratch disk and possibly your catalog--remembering to back up your catalog frequently--is the best way to improve performance. After all, most of your disk activity deals with reading previews and filtering on metadata rather than actually reading RAW files and writing output files.


I've thought of that as well. On my next system (whenever Apple gets around to refreshing the Mac Pro drive, if that ever even happens) I'm thinking of installing an internal striped RAID and keeping the LR catalog there.

Interestingly, the article mentioned above notes that there is no performance penalty when filling an SSD. The performance vs. fullness chart for a traditional HD starts to fall around 50% of capacity. The SSD chart is flat - performance is unchanged by fullness.
Title: Re: Drive capacity/fullness vs. performance
Post by: Tom Frerichs on April 17, 2012, 01:40:32 pm
I've thought of that as well. On my next system (whenever Apple gets around to refreshing the Mac Pro drive, if that ever even happens) I'm thinking of installing an internal striped RAID and keeping the LR catalog there.

Interestingly, the article mentioned above notes that there is no performance penalty when filling an SSD. The performance vs. fullness chart for a traditional HD starts to fall around 50% of capacity. The SSD chart is flat - performance is unchanged by fullness.

The reason SSD doesn't have that kind of penalty is that there is no "head seek time."

In a physical hard drive, the read/write head is on an arm that moves between the outer edge and the center of the platter (and back again -- grin). Think old-fashioned phonograph. Instead of one long spiral track, like on a phonograph record, the data is recorded in concentric circles--sort of like the rings of a tree or the layers of an onion. Each of these circles is a track; the set of tracks at the same radius across all the platters is a "cylinder".

When you read or write to a hard drive, you have to move the read/write head to the correct cylinder, and that takes time. With an SSD, all you do is flip some bits to access a different chunk of memory...no physical movement at all.

If you have a choice, don't put your catalog on a RAID 5 array. For redundancy, RAID 1 (mirrored drives) is a faster alternative. RAID 5, because of the way it writes parity blocks, can have write-performance problems. RAID 1, or the closely allied RAID 10 (RAID 1+0), is more expensive because you lose half of your total drive space to mirroring, versus only one drive's worth of capacity to parity with RAID 5.

So, RAID 5 for storing your RAW and output files, RAID 1 or even a bare drive with good backups or SSD for your catalog.
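
A rough sketch of the capacity trade-off described above, assuming equal-sized drives and ignoring formatting overhead:

    def usable_tb(n_drives, drive_tb, level):
        if level == "raid0":
            return n_drives * drive_tb          # striping only, no redundancy
        if level in ("raid1", "raid10"):
            return n_drives * drive_tb / 2      # half the space goes to mirror copies
        if level == "raid5":
            return (n_drives - 1) * drive_tb    # one drive's worth goes to parity
        raise ValueError(level)

    print(usable_tb(4, 1, "raid5"))    # 3 TB - the OP's array
    print(usable_tb(4, 1, "raid10"))   # 2 TB from the same four drives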


BTW, I deal with arrays with 60 TB capacities...but not for photographs.  I don't take anywhere near that number of pictures.  ;)

Tom Frerichs
Title: Re: Drive capacity/fullness vs. performance
Post by: Chris Kern on April 17, 2012, 08:10:32 pm
Quote from: Tom Frerichs
The reason SSD doesn't have that kind of penalty is that there is no "head seek time."

No rotational latency, either.  In my experience, that's a more serious performance drain than seek time.
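
Some rough arithmetic behind those two costs, as a sketch; both seek figures are typical spec-sheet values, not measurements from any particular drive:

    rpm = 7200
    avg_rotational_latency_ms = (60 / rpm) / 2 * 1000   # half a revolution, ~4.17 ms
    track_to_track_seek_ms = 1.0                         # assumed: short seek to a nearby track
    average_seek_ms = 8.5                                # assumed: average random seek

    # For mostly-sequential or nearby accesses the seek is short, so rotation dominates:
    print(track_to_track_seek_ms + avg_rotational_latency_ms)   # ~5.2 ms, mostly rotation
    # For fully random access the seek is the bigger share:
    print(average_seek_ms + avg_rotational_latency_ms)          # ~12.7 ms per access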

Quote from: Tom Frerichs
Finally, all file systems use some sort of indexing to both locate files on the drives and to handle free space. As file systems fill up--with a large number of "small files" versus a few huge files--the indexing gets more involved. It also gets harder to allocate contiguous free space, which is worsened when files are added and deleted. NTFS, used by Windows, is particularly prone to this latter issue, which is why Windows has a defragmentation routine. Other file systems--UFS, ZFS, Rieser, etc--suffer from this issue less.

I'd really like to see broader implementation of ZFS.  It's so easy to manage and the performance with cheap drives—and without a hardware RAID controller—is more than adequate for even a high-end desktop system.  And I couldn't agree more about NTFS.  Microsoft has made enormous improvements in its OS releases in recent years, but NTFS is still quite primitive in some respects.  If I recall correctly, we geezers stopped needing to defragment UFS back around the 7th Edition.

Chris
Title: Re: Drive capacity/fullness vs. performance
Post by: dreed on April 18, 2012, 01:24:57 pm
Quote
Nonetheless, the issue with drive performance over 80% has little to do with the drive mechanism and more to do with filesystem theory - the more data that is stored in a filesystem, the harder the system has to work to find new space, etc. Around 80% is the tipping point. At 90%, you should be planning to upgrade storage because by the time you actually do it, you'll likely be a lot closer to 100%.
...
Finally, all file systems use some sort of indexing to both locate files on the drives and to handle free space. As file systems fill up--with a large number of "small files" versus a few huge files--the indexing gets more involved. It also gets harder to allocate contiguous free space, which is worsened when files are added and deleted. NTFS, used by Windows, is particularly prone to this latter issue, which is why Windows has a defragmentation routine. Other file systems--UFS, ZFS, Rieser, etc--suffer from this issue less.

I don't know where you heard this myth (from vendors?) about other filesystems suffering less from this issue, because it simply isn't true. Most other filesystems (UFS, etc.) prevent you from completely filling the filesystem for a number of reasons, the performance drop-off being one of them.

SSD is good as long as you don't write to it very often. In light of that, it is good as a data store where you don't delete things. Using an SSD for the working store of your catalogue is a bad idea: although it will start out quick, performance will degrade.
Title: Re: Drive capacity/fullness vs. performance
Post by: Tom Frerichs on April 19, 2012, 12:28:41 am
This topic has become too technical already, and I don't want to spend the time to look up the resources regarding file systems. However, one point does need to be corrected.

SSD is good as long as you don't write to it very often. In light of that, it is good as a data store where you don't delete things. Using an SSD for the working store of your catalogue is a bad idea: although it will start out quick, performance will degrade.

I just decommissioned four servers running PostgreSQL databases. These servers were not as high-performance as the machines many readers of this forum may be running right now.

Each database received between 20 and 22 million updates/inserts every 24 hours. When PostgreSQL updates a record, it doesn't write the data over the same location on the drive. Instead it writes a new record in a different location and marks the old one as dead and available for later harvesting and reuse. I think you must agree that this represents a much higher level of disk read/write activity than even the most rabid LR user would cause.

Those servers had been running for over two years without failure and without any performance degradation.

And the database files were on inexpensive SSDs.

I'll stand by what I said. Other than a slightly worse MTTF on SSD drives--which is why I recommended being very scrupulous about backing up the catalog--there is no penalty storing the catalog and scratch files on an SSD and significant performance gains in doing so.
Title: Re: Drive capacity/fullness vs. performance
Post by: dreed on April 19, 2012, 03:48:20 am
Here's a comparison of "fresh" performance with "used" performance:

http://www.tomshardware.com/charts/ssd-charts-2010/Used-state-Access-Time-Read,2326.html
Title: Re: Drive capacity/fullness vs. performance
Post by: Rhossydd on April 19, 2012, 04:26:26 am
Using an SSD for the working store of your catalogue is a bad idea: although it will start out quick, performance will degrade.
From what I've read, although SSDs get slower with use, they never dip below normal HDD performance levels. There's just so much more speed there to begin with.
Title: Re: Drive capacity/fullness vs. performance
Post by: Walter Schulz on April 19, 2012, 03:34:29 pm
From what I've read, although SSDs get slower with use, they never dip below normal HDD performance levels. There's just so much more speed there to begin with.

c't (a German IT magazine) thinks differently. They have a long-running endurance stress test going. A "Solid 3" SSD showed serious degradation after 18 TBytes written and dropped below 20 percent of its initial write performance after roughly 22 TBytes. An "FM 25" held its maximum performance until 152 TBytes had been written and then dropped below 10 percent.
Issue 3/2012.

I suppose truth is somewhere between "quick" and "never".

Ciao, Walter
Title: Re: Drive capacity/fullness vs. performance
Post by: Rhossydd on April 19, 2012, 04:03:29 pm
c't (a German IT magazine) thinks differently. They have a long-running endurance stress test going. A "Solid 3" SSD showed serious degradation after 18 TBytes written and dropped below 20 percent of its initial write performance after roughly 22 TBytes. An "FM 25" held its maximum performance until 152 TBytes had been written and then dropped below 10 percent.
Sorry, but without some sort of credible detail and explanation that doesn't make a lot of sense.
Title: Re: Drive capacity/fullness vs. performance
Post by: Walter Schulz on April 19, 2012, 05:07:38 pm
Sorry, but without some sort of credible detail and explanation that doesn't make a lot of sense.

I agree, never trust data without background information about procedures and so on.
But I cannot copy the article for obvious reasons. I have to contact the author and ask if there is more information available. Tests are still running AFAIK.

Ciao, Walter
Title: Re: Drive capacity/fullness vs. performance
Post by: mephisto42 on April 20, 2012, 02:33:57 am
Hi,

Sorry, but without some sort of credible detail and explanation that doesn't make a lot of sense.

I'm the one who wrote that article in c't. What kind of detail would you be interested in? From everything we've seen with SSDs so far, the behaviour depends heavily on the controller chip used. Some controllers (especially the SandForce family) use compression algorithms, so performance also depends on the type of data (e.g. plain text is highly compressible and therefore written extremely fast, while JPEGs are written substantially slower). As the tests are still running, our preliminary conclusion is something like this: if and how much an SSD degrades over time depends on a) the controller chip, b) the amount of data written, and sometimes c) the type of data written. There is no rule of thumb for all the controllers out there, but there are SSDs available where all the degradation happens so late that you shouldn't worry about it. In our tests, we write at maximum speed 24 hours a day, 7 days a week, and with the better SSDs it still takes months to degrade performance. I'm not sure what your everyday use case is, but probably not this. ;-) On the other hand, there are also SSDs which I wouldn't use even if they were free.

Best Regards

Benjamin Benz
c't magazine
http://www.heise.de/ct
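
To illustrate the compressibility point with nothing but the Python standard library (this is not the SandForce algorithm, just a demonstration of why plain text and JPEG-like data behave so differently on a compressing controller):

    import os, zlib

    text = b"the quick brown fox jumps over the lazy dog " * 1000
    jpeg_like = os.urandom(len(text))   # random bytes stand in for already-compressed JPEG data

    print(len(zlib.compress(text)) / len(text))             # a tiny fraction - highly compressible
    print(len(zlib.compress(jpeg_like)) / len(jpeg_like))   # ~1.0 - nothing left to squeeze out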
Title: Re: Drive capacity/fullness vs. performance
Post by: Rhossydd on April 20, 2012, 03:47:36 am
I'm the one, who wrote that article in c't.
Thanks for taking the trouble to join in here and clarify things.
What kind of detail would you be interested in?
Well, any details would do really. I doubt the quote above ("A "Solid 3" SSD showed serious degradation after 18 TBytes written and dropped below 20 percent of its initial write performance after roughly 22 TBytes. An "FM 25" held its maximum performance until 152 TBytes had been written and then dropped below 10 percent.")
I'm sure you'd agree that in isolation that quote doesn't really tell us anything useful.

If you work back up the thread, the fundamental question is: will an SSD's performance ever degrade below that of a normal platter-based HDD? If so, when?
(In real-world use, after the sort of number of writes a photographer will make during the working life of the drive, not some 24-hour-a-day high-intensity benchmark.)
Title: Re: Drive capacity/fullness vs. performance
Post by: mephisto42 on April 20, 2012, 04:34:30 am
I doubt the quote above ("A "Solid 3" SSD showed serious degradation after 18 TBytes written and dropped below 20 percent of its initial write performance after roughly 22 TBytes.")

Basically that quote is correct, but I can give you some additional details: after 18 TByte written (continuously, 24/7, incompressible data) the write performance degraded. After 25 TByte it was as low as 10 MByte/s (the drive started at 140 MByte/s with incompressible data). This seems to be a self-protection mechanism of the SandForce controller called Life Time Throttling. As far as we know, the controller releases this brake again once the drive has reached a better data-written-per-hour value, which means: leave the drive powered on but without writes for a while. In a real-world use case this probably never happens at all.

A "FM 25" performed with its maximum performance until 152 TBytes were written and dropped below 10 percent after.
That is basically, what we have observed.

If you work back up the thread, the fundamental question is: will an SSD's performance ever degrade below that of a normal platter-based HDD?
Yes, you can construct such cases with theoretical workloads (or very stupid configurations). But it depends on the controller and the use case. With a real-world use case it doesn't seem very probable, as long as you don't try to use TrueCrypt in combination with some SandForce controllers. And it only ever affects write performance, not read performance.

To give you an idea of more realistic numbers: Intel guarantees that you can write 20 GByte each day for 5 years to their SSD 320. I don't know how many photos you take per day, but I don't do 20 GByte a day. I also think that what's typical for a photographer's use case is that a certain amount of pictures stay on the drive for a long period of time. Compared to this static data, the amount of permanently changing data will be relatively small, so you won't ever reach lots of "TBytes written". This might be different if you use an SSD as a Photoshop scratch disk.

Best regards Benjamin Benz
Title: Re: Drive capacity/fullness vs. performance
Post by: Rhossydd on April 20, 2012, 04:58:43 am
Basically that quote is correct,
Well, I'm sure it is, and I'm sure it makes sense to you, but I haven't a clue what a 'super 3' or 'FM25' is.
Quote
To give you an idea of more realistic numbers: Intel guarantees that you can write 20 GByte each day for 5 years to their SSD 320. I don't know how many photos you take per day, but I don't do 20 GByte a day. I also think that what's typical for a photographer's use case is that a certain amount of pictures stay on the drive for a long period of time. Compared to this static data, the amount of permanently changing data will be relatively small, so you won't ever reach lots of "TBytes written". This might be different if you use an SSD as a Photoshop scratch disk.
I think the users most likely to hit those sorts of numbers are press guys using laptops in the field for sports work. They may well be moving multiple gigabytes of photos onto the drive for sorting, labelling and transmission, then archiving. However, few work every day and not many will hit 20 GB either, so the drive's life could be approaching ten years, probably more than the working life of the machine in practice.

The question remains: when does SSD performance fall below HDD performance?

Title: Re: Drive capacity/fullness vs. performance
Post by: mephisto42 on April 20, 2012, 05:26:58 am
Hi,

I haven't a clue what a 'super 3' or 'FM25' is.

I guess that by "FM 25" the original posting meant our test drive, an FM-25S2S-100GBP1 from G.Skill, and by "Solid 3" a Solid 3 from OCZ. I'm sure you can find detailed data for these drives with Google.

I think the users most likely to hit those sorts of numbers are press guys using laptops in the field for sports work. They may well be moving multiple gigabytes of photos onto the drive for sorting, labelling and transmission, then archiving. However, few work every day and not many will hit 20 GB either, so the drive's life could be approaching ten years, probably more than the working life of the machine in practice.

And as far as we know this 20 GByte/day number is a very, very, very safe number, which Intel gives to keep their return rates extremely low. If you multiply it by 365 days and 5 years you get only 36.5 TByte. In our torture test the media-wearout indicator of the Intel SSD 320 dropped from 100 to 83 after 35 TByte written, with no performance loss at that point. So I guess that long before even a press photographer runs into this kind of problem, his notebook will have been stolen, damaged or outdated.

The question remains: when does SSD performance fall below HDD performance?
There is no simple answer to this question. I'll try it with three extreme approaches:
* If you don't buy a crappy SSD and don't have a very stupid setup (like encryption on a compressing SSD), probably never.
* For torture tests in the lab you might run into problems after some weeks.
* With a very stupid setup and a bad SSD you can observe performance problems at once.

Perhaps You might be interested in reading the original (german) articles:
http://www.heise.de/artikel-archiv/ct/2012/3/66_kiosk
http://www.heise.de/artikel-archiv/ct/2011/22/150_kiosk
http://www.heise.de/artikel-archiv/ct/2011/26/102_kiosk

best regards
Benjamin Benz
Title: Re: Drive capacity/fullness vs. performance
Post by: Rhossydd on April 20, 2012, 07:37:45 am
Thanks for the contribution Benjamin;

I think the quote below sums it up
Quote
The question remains: when does SSD performance fall below HDD performance?
There is no simple answer to this question. I'll try it with three extreme approaches:
* If you don't buy a crappy SSD and don't have a very stupid setup (like encryption on a compressing SSD), probably never.
* For torture tests in the lab you might run into problems after some weeks.
* With a very stupid setup and a bad SSD you can observe performance problems at once.
The first scenario is probably the most appropriate here. In routine, average use they'll always outperform HDDs.
Title: Re: Drive capacity/fullness vs. performance
Post by: PierreVandevenne on April 20, 2012, 12:55:42 pm
As far as conventional hard disk drives are concerned, the main cause of performance degradation for sequential access as the drive becomes "full" is simply that track length decreases as one gets closer to the center: there are fewer sectors travelling in front of the head per rotation on the inside than on the outside of the platter. A nice explanation of this and of HDD zones can be found here (http://hddscan.com/doc/HDD_Tracks_and_Zones.html). That's why we see those nicely decreasing curves towards the center in the tests. The difference is not the 3 or 4 to 1 you might expect "geometrically", but less than that, because in most tests the data being written or read is spread over many consecutive tracks, and track-to-track switching is a fairly constant factor.
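
A toy calculation of that effect; the sector counts per track are invented, but at constant rotational speed the throughput ratio follows directly from them:

    rpm = 7200
    revs_per_sec = rpm / 60
    sector_bytes = 512

    for zone, sectors_per_track in (("outer zone", 2400), ("inner zone", 1200)):
        mb_per_s = sectors_per_track * sector_bytes * revs_per_sec / 1e6
        print(zone, round(mb_per_s), "MB/s")   # outer ~147 MB/s, inner ~74 MB/s

That is roughly the sort of 2:1 fall-off visible in the benchmark curves linked earlier in the thread.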
Title: Re: Drive capacity/fullness vs. performance
Post by: chrismurphy on April 20, 2012, 02:05:54 pm
HDD Geometry:

Outer sectors of a hard disk drive outperform the inner sectors.

If the number of sectors per cylinder were the same from the inner to the outer cylinder, exactly the same number of sectors would pass under the heads in the same amount of time, regardless of position.

Modern drives aren't configured this way. They pack more sectors onto the outer tracks than onto the inner tracks. Because more sectors are read on each rotation, the outer tracks outperform the inner tracks.

The first sector of a hard drive is referred to as LBA 0, and it is located on the outermost cylinder of the disk. Drives fill from the outside in. This leads to the very real experience of a drive seeming to get slower as it fills up: as it fills, more of the inner tracks are used, which hold fewer sectors per cylinder and carry commensurately higher seek times.

A technique referred to as short stroking limits use of the disk to perhaps just 10-20% of its capacity, using only the outer cylinders, so that data is stored only on the fastest portion of the disk. This comes at the cost of not using the full available storage, but the performance benefit is significant; depending on the application it can double the effective performance of a drive.

RAID 1, 5, 6:

As for RAID 5 write performance, I think this is less of a concern than the write-hole problem and, with larger arrays, the long rebuild times during which a second disk could die, leading to the collapse of the entire array. If there is a single unrecoverable error in parity with RAID 5 during a single-disk rebuild, the array is toast. So RAID 5 is a tenuous case of high(er) data availability at best, perhaps suitable for smaller arrays and aggressive backups. It's better than nothing, I suppose, but I'd sooner go to the effort of a RAIDZ1-based array on a FreeNAS setup. While both RAID 5 and RAIDZ1 use single parity, ZFS doesn't suffer from the write-hole problem.
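
A minimal illustration of what single parity (RAID 5 and RAIDZ1 alike) relies on, and why a bad block discovered mid-rebuild is fatal; the byte values below are arbitrary:

    d0, d1, d2 = 0b10110010, 0b01101100, 0b11100001   # three data blocks on three disks
    parity = d0 ^ d1 ^ d2                              # what the fourth disk stores

    # Lose one block (a failed disk) and it can be rebuilt from the survivors:
    assert d0 ^ d2 ^ parity == d1

    # But if a surviving block or the parity is itself corrupt, the rebuild
    # silently produces the wrong data:
    assert d0 ^ d2 ^ (parity ^ 0b00000100) != d1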

ZFS:

I will try to set aside the appalling state of affairs with Windows and Mac OS file systems and volume management. NTFS and JHFS+/X lack features, some of which have been around for 15+ years in free software available to anyone, such as logical volume management.

http://arstechnica.com/apple/reviews/2011/07/mac-os-x-10-7.ars/12
http://research.cs.wisc.edu/adsl/Publications/corruption-fast08.html

Considering Microsoft has at least publicly announced (for Windows 8 server) a new resilient file system, which appears to have many of the features of modern file systems like ZFS and Btrfs, I'll consider that sufficient evidence of NTFS's present inadequacy. Apple was about to pull the trigger on ZFS, then pulled back for reasons people can only speculate about, and there is presently no publicly announced alternative or even an acknowledgement of JHFS+/X's inadequacies. Nevertheless, it's a problem. I'd consider them strictly local file systems only, and I'd keep the data on them limited (operating system, applications and short-term data such as scratch/working disks, including a smallish RAID 0 or 10). I'd put the bulk of the data I actually care about on another file system.

As for the wider implementation of ZFS, previously I mentioned FreeNAS. There is also a more commercialized version with support called TrueNAS. And there is a free and commercialized (based on array size, free up to ~17TB) ZFS based NAS called Nexenta.

For smaller scale implementations, the ZEVO product from http://tenscomplement.com/ is interesting. Presently it is a single disk, native ZFS on Mac OS X implementation. Soon, their RAIDZ capable versions are expected to ship. The dollar cost is reasonable. But changing your file system is something most people simply expect their operating system supplier to…well, supply. And I don't think that's unreasonable to expect. Yet here we are with inadequate, ancient file systems supplied by both of the major desktop operating system platforms.
Title: Re: Drive capacity/fullness vs. performance
Post by: dreed on April 20, 2012, 10:00:42 pm
...
RAID 1, 5, 6:

As for RAID 5 write performance, I think this is less of a concern than the write hole problem, and with larger arrays the long rebuild times during which a 2nd disk could die leading to the collapse of the entire array. If there is a single unrecoverable error in parity with RAID 5 during a single disk rebuild, the array is toast. So RAID 5 is a tenuous case of high(er) data availability at best, perhaps suitable for smaller arrays and aggressive backups. It's better than nothing, I supposed, but I'd sooner go to the effort of a RAIDZ1 based array, on a FreeNAS setup. While both RAID 5 and RAIDZ1 use single parity, ZFS doesn't suffer from the write hole problem.

ZFS:
...
For smaller scale implementations, the ZEVO product from http://tenscomplement.com/ is interesting. Presently it is a single disk, native ZFS on Mac OS X implementation. Soon, their RAIDZ capable versions are expected to ship. The dollar cost is reasonable. But changing your file system is something most people simply expect their operating system supplier to…well, supply. And I don't think that's unreasonable to expect. Yet here we are with inadequate, ancient file systems supplied by both of the major desktop operating system platforms.

If performance is important then you want to avoid RAIDZ, because RAIDZ effectively has all of your hard drives acting and moving as one rather than being driven independently. Great for data reliability, terrible for performance (RAIDZ is slower than non-mirrored/non-RAID ZFS and slower than mirrored ZFS).

Similarly, ZFS is not immune to the problems associated with RAID, where a disk dying during a rebuild can take out your entire data set.

The most significant feature of ZFS for photographers is that data is checksum'd by the operating system before being stored on disk. This means that if the disk or the path to the disk (including controller) causes corruption then you'll find out and hopefully be able to correct it (if using mirroring or raidz.) Prior to ZFS, this was only available on expensive NAS solutions. Hopefully more vendors will include data checksum'ing in their disk storage products/operating systems. But what you've got to ask yourself is how likely is this to happen?

The answer to that question is perhaps most easily found in this forum - how many posters here have posted complaining about the operating system corrupting their image or that when they went back to look at an image from 5 years ago that they couldn't read the file on their hard drive?

You're more likely to find people complaining about hard drives themselves failing than the data becoming corrupt - but that doesn't mean data corruption isn't a problem. One of the first USB-Compact Flash adapters I used randomly corrupted data during long transfers and this wasn't visible until I attempted to open up the image in Lightroom (at first I didn't realise that it was the USB adapter that was at fault - I thought the camera had written out bad data!)
Title: Re: Drive capacity/fullness vs. performance
Post by: John.Murray on April 21, 2012, 12:09:21 am
Windows ReFS and ZFS for Apple are available now.

Server 2012 public Beta is available to anyone who cares to sign up:
http://blogs.technet.com/b/windowsserver/archive/2012/03/01/windows-server-8-beta-available-now.aspx

ZFS is commercially available from Tens http://tenscomplement.com/

One nice thing about ReFS is that it is largely API-compatible with NTFS - this means most applications will immediately be able to take advantage of it. The downside is that it's only available on Server at this point; it will not be available on Windows 8:

http://blogs.msdn.com/b/b8/archive/2012/01/16/building-the-next-generation-file-system-for-windows-refs.aspx


Title: Re: Drive capacity/fullness vs. performance
Post by: chrismurphy on April 22, 2012, 02:27:21 pm
If performance is important then you want to avoid RAIDZ, because RAIDZ effectively has all of your hard drives acting and moving as one rather than being driven independently. Great for data reliability, terrible for performance (RAIDZ is slower than non-mirrored/non-RAID ZFS and slower than mirrored ZFS).

Solvable by striped RAIDZ groups. And given that a single disk saturates GigE anyway, it's unlikely this is a real performance problem, let alone one without a workaround.

Quote
Similarly, ZFS is not immune to the problems associated with RAID, where a disk dying during a rebuild can take out your entire data set.

That is expected with single parity RAID of any sort, however RAIDZ single parity is still more reliable than single parity conventional RAID due to the lack of the write hole.

Quote
The most significant feature of ZFS for photographers is that data is checksum'd by the operating system before being stored on disk. This means that if the disk or the path to the disk (including controller) causes corruption then you'll find out and hopefully be able to correct it (if using mirroring or raidz.)

In the case of redundant data, whether mirrored or RAIDZ, any detected error is automatically corrected and the corrupt copy automatically repaired. You'd need to check dmesg to even learn of it.


Quote
Hopefully more vendors will include data checksum'ing in their disk storage products/operating systems. But what you've got to ask yourself is how likely is this to happen?

Sun, Oracle, Microsoft, various Linux distros are all offering, or imminently offering, disk storage with file systems that include data checksumming. Missing is Apple.

Quote
The answer to that question is perhaps most easily found in this forum - how many posters here have posted complaining about the operating system corrupting their image or that when they went back to look at an image from 5 years ago that they couldn't read the file on their hard drive?

Flawed logic. The forum is not a scientific sample, and users have no reliable way of determining that a problem is the result of corrupt data. You assume that in every case the problem is a) restricted to an image, b) producing visible artifacts, and c) something they would post about as a "could not read the file" experience on a forum, rather than going to a backup copy and moving along with the rest of their day.

I've had perhaps a half-dozen image files, so far, turn up unreadable. I'm a certified storage and file system geek (perhaps secondary to color geek) and I cannot tell you whether this was bit rot, silent data corruption, or file system corruption. But the files, as read, were not considered recognizably encoded image data by any image viewer, or by any version of Photoshop going back to 5.5 (and I mean 5.5, not CS 5.5). The backups were also affected, presumably because all of the backups were copies of corrupted files. Fortunately, they were synthetic test images and were relatively easily recreated. Nevertheless, if you think this problem is anything like a mouse, you've got one mouse with this anecdote, and as they say, where there's one mouse there's bound to be more. We really have no idea how big a problem this is based on forums, so I reject that premise entirely.

Clearly some significant companies think it's a problem or there wouldn't be so much active development on Btrfs: Oracle, Red Hat, Fujitsu, IBM, HP, and others.

Here's some research data on the subject. Google has also done a study.
http://research.cs.wisc.edu/adsl/Publications/corruption-fast08.html

Quote
You're more likely to find people complaining about hard drives themselves failing than the data becoming corrupt - but that doesn't mean data corruption isn't a problem. One of the first USB-Compact Flash adapters I used randomly corrupted data during long transfers and this wasn't visible until I attempted to open up the image in Lightroom (at first I didn't realise that it was the USB adapter that was at fault - I thought the camera had written out bad data!)

It's an example of SDC, which the paper above addresses and attributes, rather significantly, to firmware induced corruption. Message boards, unscientific samples though they are, are littered with people having "hardware" RAID controller firmware induced corruption of their RAIDs, obliterating all data upon a single disk failing because data couldn't be reconstructed from (corrupted) parity.

So it's a bigger problem than we think it is, just because we're thinking that the corruption would be obvious. And that's probably untrue. How many people get a RAID 1, 5, or 6 up and running, and actually yank a drive, pop in a spare and see if the RAID correctly rebuilds? Professionals do this. Most people doing it themselves do not. They assume the reconstruction will work. And too many people consider RAID a backup solution rather than about increasing the availability of data.
Title: Re: Drive capacity/fullness vs. performance
Post by: alain on April 22, 2012, 03:02:50 pm
...
The answer to that question is perhaps most easily found in this forum - how many posters here have posted complaining about the operating system corrupting their image or that when they went back to look at an image from 5 years ago that they couldn't read the file on their hard drive?

...

I have. I run "global" md5 checks from time to time (read: not frequently enough) and have found errors. I was "lucky" to be able to recover those files from backups.
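
For anyone curious what such a "global" check can look like, here is a sketch along those lines using only the Python standard library; the paths and manifest name are invented, and hashdeep or any similar tool does the same job:

    import hashlib, json, pathlib

    def md5_of(path, chunk=1 << 20):
        h = hashlib.md5()
        with open(path, "rb") as f:
            while block := f.read(chunk):
                h.update(block)
        return h.hexdigest()

    root = pathlib.Path("/Volumes/Archive")    # the volume to fingerprint
    manifest = {str(p): md5_of(p) for p in root.rglob("*") if p.is_file()}
    pathlib.Path("manifest.json").write_text(json.dumps(manifest, indent=2))
    # On the next pass, load the saved manifest and flag any file whose hash has
    # changed even though you never intentionally touched it.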
Title: Re: Drive capacity/fullness vs. performance
Post by: alain on April 22, 2012, 03:11:38 pm
Solvable by striped RAIDZ groups. And given that a single disk saturates GigE anyway, it's unlikely this is a real performance problem, let alone one without a workaround.

That is expected with single parity RAID of any sort, however RAIDZ single parity is still more reliable than single parity conventional RAID due to the lack of the write hole.

In the case of redundant data, whether mirrored or RAIDZ, any detected error is automatically corrected and the corrupt copy automatically repaired. You'd need to check dmesg to even learn of it.


Sun, Oracle, Microsoft, various Linux distros are all offering, or imminently offering, disk storage with file systems that include data checksumming. Missing is Apple.

I suppose that running ZFS means setting up a dedicated Linux box with quite a few drives, and thus being limited by the LAN?
Is there a "simple" Windows solution, read: inside the same PC with local speeds?
Title: Re: Drive capacity/fullness vs. performance
Post by: chrismurphy on April 22, 2012, 03:22:32 pm
I suppose that running ZFS means setting up a dedicated Linux box with quite a few drives, and thus being limited by the LAN?

Yes.

Sort of a non-relevant technicality, but ZFS primarily runs on Solaris and BSD derivatives rather than Linux. FreeNAS, for example, is FreeBSD-based.

Quote
Is there a "simple" Windows solution, read: inside the same PC with local speeds?

No. If you want a better file system locally, you'll have to wait for ReFS to appear possibly circa Windows 9. If you want speed and a more resilient file system, you'll need a 10 GigE network. Actually, you just need the link between the NAS and the workstations to be 10 GigE.

And Mac users are in the same boat until Tens Complement ships their RAID- and multi-disk-pool-capable versions of ZEVO, or Apple gets us a new file system.

Realize that GigE NAS speeds are in the realm of Firewire 800 speeds.
Title: Re: Drive capacity/fullness vs. performance
Post by: chrismurphy on April 22, 2012, 03:41:40 pm
I run "global" md5 checks from time to time (read not frequent enough) and found errors.

Adobe DNG files, since spec 1.2.0.0, have a tag containing an MD5 digest of the raw image data. Any DNG reader can read the raw image data in the DNG, produce an MD5 digest of it, and compare to the MD5 digest embedded in the DNG to verify that no raw image data pixels have changed since the original (embedded) MD5 was computed.

I haven't done exhaustive testing, but my understanding is that Adobe Camera Raw, Lightroom and DNG converter, do this automatically.
Title: Re: Drive capacity/fullness vs. performance
Post by: alain on April 22, 2012, 04:34:34 pm
Yes.

Sort of a non-relevant technicality, but ZFS primarily runs on Solaris and BSD derivatives rather than Linux. FreeNAS, for example, is FreeBSD-based.

No. If you want a better file system locally, you'll have to wait for ReFS to appear possibly circa Windows 9. If you want speed and a more resilient file system, you'll need a 10 GigE network. Actually, you just need the link between the NAS and the workstations to be 10 GigE.

And Mac users are in the same boat until Tens Complement ships their RAID- and multi-disk-pool-capable versions of ZEVO, or Apple gets us a new file system.

Realize that GigE NAS speeds are in the realm of Firewire 800 speeds.
Well 10GigE is expensive and not that common.  As a pure backup solution GigE is probably more than fast enough.
Title: Re: Drive capacity/fullness vs. performance
Post by: alain on April 22, 2012, 04:36:15 pm
Adobe DNG files, since spec 1.2.0.0, have a tag containing an MD5 digest of the raw image data. Any DNG reader can read the raw image data in the DNG, produce an MD5 digest of it, and compare to the MD5 digest embedded in the DNG to verify that no raw image data pixels have changed since the original (embedded) MD5 was computed.

I haven't done exhaustive testing, but my understanding is that Adobe Camera Raw, Lightroom and DNG converter, do this automatically.

I use hashdeep and that does all files, also non image files...  Not perfect, because it was made for another reason. 
Title: Re: Drive capacity/fullness vs. performance
Post by: chrismurphy on April 22, 2012, 05:26:05 pm
I use hashdeep and that does all files, also non image files...  Not perfect, because it was made for another reason. 

The problem is that there's metadata inside the files that can legitimately change, which will cause the MD5 hash to change even though the data you actually care about (just the raw image data) may not have.
Title: Re: Drive capacity/fullness vs. performance
Post by: dreed on April 23, 2012, 01:34:28 am
Quote
If performance is important then you want to avoid RADIZ because RAIDZ effectively has all of your hard drives acting and moving as one, rather than being able to drive them all independently. Great for data reliability, terrible for performance (RAIDZ is slower than non-mirror/non-RAID ZFS and slower than mirrored ZFS.)
Solvable by striped RAIDZ groups. And the fact a single disk saturates GigE anyway, it's unlikely this is a real performance problem, let alone one without a work around.

This only works in configurations where you have more than 6 disks. Otherwise you do not have enough disks to create more than one RAIDZ group of a pair of data disks plus parity. I suspect that nearly everyone here is looking at something with around half that number of disks.
Title: Re: Drive capacity/fullness vs. performance
Post by: dreed on April 23, 2012, 01:47:16 am
I suppose that running ZFS means setting up a dedicated Linux box with quite a few drives, and thus being limited by the LAN?
Is there a "simple" Windows solution, read: inside the same PC with local speeds?

Unless you're a full on computer geek, ZFS should not be part of the equation. There are other aspects of the solution that will be far more important, such as what interfaces the product provides, how do you connect it, manage it, etc.

Probably the best solution is to find an external RAID thing that does USB 3.0 and get a USB 3.0 card for your computer if it doesn't have USB 3.0 built in. If you can't do USB 3.0, check whether your computer has eSATA ports and go that route. Both USB 3.0 and eSATA will give you performance with an external drive to match that of the drives inside. Failing both of those, you're left with plain USB 2.0.
Title: Re: Drive capacity/fullness vs. performance
Post by: Farmer on April 23, 2012, 05:39:24 am
http://www.lian-li.com/v2/en/product/product06.php?pr_index=549&cl_index=12&sc_index=42&ss_index=115&g=f
Title: Re: Drive capacity/fullness vs. performance
Post by: dreed on April 23, 2012, 05:53:47 am
http://www.lian-li.com/v2/en/product/product06.php?pr_index=549&cl_index=12&sc_index=42&ss_index=115&g=f

Looks great as it ticks all of the important boxes for connecting as a directly attached external storage device.
Title: Re: Drive capacity/fullness vs. performance
Post by: chrismurphy on April 23, 2012, 06:18:56 pm
This only works in configurations where you have more than 6 disks. Otherwise you do not have enough disks to create more than one RAIDZ group of a pair of data disks plus parity. I suspect that nearly everyone here is looking at something with around half that number of disks.

Now you're shifting the goal posts. If you're going to complain about performance, but then take away the whole reason for the problem in the first place, it obviates the problem. Let's leave it at, the problem you are talking about is not really a problem. People can get plenty of performance from ZFS over either GigE or 10 GigE. That's a configuration question. It's not a reason to not use ZFS.

Unless you're a full on computer geek, ZFS should not be part of the equation. There are other aspects of the solution that will be far more important, such as what interfaces the product provides, how do you connect it, manage it, etc. ...

Probably the best solution is to find an external RAID thing that does USB 3.0

It's interesting that you discount the file system in favor of something like USB, where the vast majority of chipsets out there do not pass through ATA commands, meaning no ATA Secure Erase, no hardware-based full-disk encryption, and no SMART monitoring. In other words, it is important to check the capabilities of your USB controller if such things matter to you.

If one is concerned primarily with performance, DAS with a file system native to your operating system is the way to go. But the combination of DAS with these file systems and no battery backup is a recipe for data loss: the file systems are not guaranteed to be consistent through such events; they weren't designed with that in mind. ZFS, Btrfs, and ReFS are.

In contrast, NAS is better suited for high(er) availability, better file systems for data integrity, UPS integration, and SMART support. If you also want speed, you can get it, but it'll cost you a 10 GigE network. And in reality, ZFS is no more difficult to set up in a NAS than any other file system, so it's hardly more "full-on computer geek" territory than a NAS itself.

Quote
http://www.lian-li.com/v2/en/product/product06.php?pr_index=549&cl_index=12&sc_index=42&ss_index=115&g=f

The external eSATA interface is 3 Gb/s, which is 300 MB/s. Adequate, but for something that holds five disks it's bandwidth-limited by the external interface. Each disk will push between 120 MB/s and 150 MB/s, so the interface saturates at two disks, sustained, not just burst.
Title: Re: Drive capacity/fullness vs. performance
Post by: dreed on April 24, 2012, 11:44:06 am
Now you're shifting the goal posts. If you're going to complain about performance, but then take away the whole reason for the problem in the first place, it obviates the problem. Let's leave it at, the problem you are talking about is not really a problem. People can get plenty of performance from ZFS over either GigE or 10 GigE. That's a configuration question. It's not a reason to not use ZFS.

No, I'm not shifting the goal posts; you did, by suggesting that RAIDZ groups be used while forgetting to mention that there are minimum requirements to make that work. There are very well known issues with ZFS and RAIDZ performance, and overcoming them isn't trivial because it requires specific configurations. RAIDZ just isn't suitable for small disk configurations, such as those most people here will use.

Quote
It's interesting you discount the file system, in favor of something like USB, which the vast majority of chipsets out there do not pass through ATA commands, meaning no ATA Secure Erase, no hardware-based full disk encryption, and no SMART monitoring. i.e. it is important to check the capabilities of your USB controller if such things are important to you.

Replacing USB with eSATA fixes all of the above.

Quote
...
The external eSATA interface is 3 Gb/s, which is 300 MB/s. Adequate, but for something that holds five disks it's bandwidth-limited by the external interface. Each disk will push between 120 MB/s and 150 MB/s, so the interface saturates at two disks, sustained, not just burst.

So what?

As mentioned, what people want is to access disk at the speeds they are used to. The USB and FireWire that most people have today are slower than that. Further, they're not likely to be running a huge DB or serving files to lots of clients, just their own PC. Thus eSATA is perfect, because it gives local-disk speed to something external. Whether or not the interface saturates is beside the point. They're not architecting enterprise storage solutions, just trying to move beyond USB because it is perceptibly slower than internal disk. Similarly, eSATA is going to be quicker to access than any NAS (Network Attached Storage) connected over Gigabit Ethernet (3 Gb/s > 1 Gb/s). And by the time 10GE is affordable, they'll want something new anyway.
Title: Re: Drive capacity/fullness vs. performance
Post by: chrismurphy on April 24, 2012, 03:28:55 pm
No, I'm not shifting the goal posts; you did, by suggesting that RAIDZ groups be used while forgetting to mention that there are minimum requirements to make that work.

The problem you referred to, from the outset, occurs with many drives.

Quote
There are very well known issues with ZFS and RAIDZ performance, and overcoming them isn't trivial because it requires specific configurations. RAIDZ just isn't suitable for small disk configurations, such as those most people here will use.

For someone claiming very well known issues with ZFS and RAIDZ performance, why did you forget to mention that this RAIDZ performance problem is one of IOPS, not bandwidth? In the case of large files and sequential reads or writes, RAIDZ parallelizes available bandwidth very well. I do not consider Raw/DNG or working PSD/TIFFs for most photographers to be small files. JPEGs might be, it depends.

Do you suppose scaling out IOPS is important for a photographer?

Do you likewise disqualify RAID 5? If not, why not?


Quote
So what?

What's the advantage of this enclosure over sticking disks inside the desktop computer, which would invariably provide better striping performance? By a lot. And would be cheaper.

Also, the specs are confusing: Is it 15TB, 10TB, or 6TB capacity? Is this built-in or host RAID?

Quote
Thus eSATA is perfect because it is the speed of the local disk except to something external.

Whether it's perfect or not depends on the other hardware in the workflow, which we either don't know, or I missed it even though I went back and looked. I think on the one hand this enclosure is overkill on the quantity of storage, but it's bottlenecked by interface. I would consider a different approach, but again it depends on other hardware.

DAS is for performance, it's for hot files. NAS is for availability, it's for cold files.

Hot files are: scratch space, preview/cache files, work-in-progress PSDs and TIFFs.

Cold files are: those pixels that haven't been touched in a month, let alone today.

It really does not make sense spending extra money on fast large DAS for cold files. At least not until we have more reliable file systems that can be both resilient and fast, by pooling (aggregating) those disks together.

So I would bias the budget for DAS to be small, but as fast as practical for the size I need for daily work: hot files.

And I'd bias the budget for NAS to be large, not as fast, but higher availability, for the cold files. Plus I get SMART and UPS monitoring built-in, a more reliable file system, and automated replication features (to either an on-site or off-site duplicate NAS or cloud storage).

I could even have a "sweep" script that moves all fast DAS files to the NAS once or twice a day. And then after 7 days of aging, deletes them off the NAS. This, in case I "forget" to move my hot files to more reliable storage.
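
One reading of that "sweep", as a hedged sketch: it copies rather than moves, so the originals stay on the DAS and the 7-day prune only clears the safety copies on the NAS. The mount points and the window are placeholders:

    import pathlib, shutil, time

    HOT = pathlib.Path("/Volumes/Hot")          # fast DAS holding the working files
    NAS = pathlib.Path("/Volumes/NAS/sweep")    # safety-net area on the NAS
    WEEK = 7 * 24 * 3600

    # Sweep: copy everything from the hot volume to the NAS, keeping the originals.
    # shutil.copy (not copy2) so the NAS copy's mtime records when it was swept,
    # which is what the prune below keys on.
    for f in HOT.rglob("*"):
        if f.is_file():
            dest = NAS / f.relative_to(HOT)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy(f, dest)

    # Prune: drop swept copies that have been sitting on the NAS for more than 7 days.
    for f in NAS.rglob("*"):
        if f.is_file() and time.time() - f.stat().st_mtime > WEEK:
            f.unlink()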


Quote
Similarly, eSATA is going to be quicker for them to access than any NAS (Network Attached Storage) that's connected to Gigabit ethernet (3Gb/s > 1Gb/s).

GigE NAS comes close to single local disk performance; ~90 MB/s is reasonable to expect, although I've seen them push 110 MB/s.
Title: Re: Drive capacity/fullness vs. performance
Post by: Farmer on April 24, 2012, 07:22:16 pm
The advantage of the Lian Li enclosure is that compared to a regular NAS, it's much, much faster (it's both eSATA and USB 3).  For many users, they don't have the knowledge or don't want to deal with setting up RAID on their own system or perhaps it's already saturated for local performance or storage and they just want more storage at a decent price that's easy to setup and, most importantly, doesn't give them a huge performance hit.

This particular enclosure is also suitable for people who want to increase storage but don't have room in their own computer (perhaps it's a laptop) but want more speed than a NAS.

If you really want high performance in your desktop you'd add something like a RevoDrive (one of the larger, workstation-class devices), but that would also empty a large portion of your bank account :-)
Title: Re: Drive capacity/fullness vs. performance
Post by: dreed on April 24, 2012, 11:50:28 pm
Quote from: chrismurphy
For someone claiming very well known issues with ZFS and RAIDZ performance, why did you forget to mention that this RAIDZ performance problem is one of IOPS, not bandwidth?

Go back and read my comments, specifically where I state that RAIDZ works all drives in sync rather than independently.

Quote
What's the advantage of this enclosure over sticking disks inside the desktop computer, which would invariably provide better striping performance? By a lot. And would be cheaper.

Ok, given your other comments, I'm going to call "troll" on a bunch of what you said here.

Quote
DAS is for performance, it's for hot files. NAS is for availability, it's for cold files.

... other comments deleted ...

I'd recommend that you go back and either buy or re-watch some of the Luminous Landscape videos that talk about this topic. One of the early ones by Seth, which recommends keeping all of your images on one device, comes to mind. He's a professional photographer who clearly has his workflow and IT issues sorted without all of this nonsense about DAS/NAS.

The complexity required to use shell scripts to do what you suggest is far beyond what I'd expect of any photographer, and on top of that, you're managing photographs outside of an application such as Lightroom, which then requires manual steps inside that application (or whatever is being used) too. If you feel comfortable doing that, fine, but be aware that not everyone is a technical expert in these areas, nor should they have to be.

The advantage of the Lian Li enclosure is that compared to a regular NAS, it's much, much faster (it's both eSATA and USB 3).  Many users don't have the knowledge or don't want to deal with setting up RAID on their own system, or perhaps their system is already saturated for local performance or storage; they just want more storage at a decent price that's easy to set up and, most importantly, doesn't give them a huge performance hit.

This particular enclosure is also suitable for people who want to increase storage but don't have room in their own computer (perhaps it's a laptop), yet want more speed than a NAS.

Exactly! The only question I had was whether there were any built-in limitations on the hard drive sizes it would work with, but once you've got 8TB of storage, filling that is going to take time, and by the time it is full you may want to replace the storage solution entirely with something newer.
Title: Re: Drive capacity/fullness vs. performance
Post by: chrismurphy on April 25, 2012, 03:51:00 pm
The advantage of the Lian Li enclosure is that compared to a regular NAS, it's much, much faster . . . easy to set up and, most importantly, doesn't give them a huge performance hit.

Scroll halfway down; there are USB 3.0 benchmarks putting it at 160MB/s.
http://homeservershow.com/lian-li-ex-503b.html

The 4th comment covers RAID 5 performance with eSATA.
http://forums.whirlpool.net.au/archive/1667234

You're saying ~160 MB/s is "much much" faster than ~100 MB/s, and ~100MB/s is a "huge performance hit" compared to ~160MB/s? OK.

One internal WD VelociRaptor is 200+MB/s sustained.

One internal Mercury Aura Pro is 511+MB/s sustained.

I still question what the RAID implementation is. If it's proprietary, the data on the array is locked behind that. If you plan to do RAID 5, depending on the enclosure's RAID implementation, some consumer disks may not work. WDC specifically says RAID 1 and 0 only for their Green, Blue and Black consumer drives. There are other combinations that can work, but it depends on the RAID implementation. The wrong combinations lead to array failures (not disk failures).

So what's the backup plan with the Lian Li? Get two? Or is the data expendable?

Go back and read my comments, specifically where I state that RAIDZ works all drives in sync rather than independently.

If you meant IOPS, you should have said so. But setting aside the vague and confusing phraseology, your blanket proscription of RAIDZ doesn't make sense. IOPS matter for small files and random access performance. Photographers working on Raw, DNG, PSD and TIFF files need bandwidth, and RAIDZ parallelizes bandwidth across disks quite well.

Quote
Ok, given your other comments, I'm going to call "troll" on a bunch of what you said here.

Please proceed with the name calling at your convenience.

Quote
I'd recommend that you go back and either buy or re-watch some of the Luminous Landscape videos that talk on this topic. One of the early ones by Seth that recommends keeping all of your images on one device comes to mind. He's a professional photographer that clearly has his work flow and IT issues sorted without all of this nonsense about DAS/NAS.

The March 2009 article? What were SSDs going for then, price/performance-wise?

The suggested storage solution is a compromise that gets you Jack of some trades, master of none. The aim is ostensibly speed, but while somewhat better than NAS, it's not a master of speed. I keep hearing the "NAS is slow" complaint, and the resulting myopia leads to a solution that is deficient in every other category.

My customers with high volume workflows, with the exception of video, use 1 GigE for designer/retoucher stations. The designer copies project files from the NAS, works on them locally, and then files are pushed back to the NAS. These folks have dedicated IT, but they do not support one, let alone each designer, having their own array attached. Why? Because it's a support nightmare; the data is not high availability; the data tends to not get backed up (in particular when the computer has been shut down); they get no notifications until it's too late.

And there is no contradiction at all with the suggestion you keep all images in one storage pool, while "checking out" files onto much faster media for work in progress, and then "checking in" those files back to storage.

Quote
The complexity required to use shell scripts to do what you suggest is far beyond what I'd expect for any photographer and on top of that,

Try to see this as a continuum rather than as binary. There are apps with a GUI that will script for you. Carbon Copy Cloner uses rsync and SuperDuper uses ditto. The web GUI on a NAS produces scripts to automate replication features.
Title: Re: Drive capacity/fullness vs. performance
Post by: Farmer on April 25, 2012, 06:41:15 pm
The backup plan?  Whatever you want it to be.  It might be part of your backup, or you might use a second one, or you might use a NAS, etc. etc.  Many options.

Comparing the Lian Li enclosure to a raptor?  Are you kidding?  Do you have an 8TB raptor?

There are other reviews showing the USB 3 connection performing a bit higher, but let's take 160.  That's 60% better and you're assuming that you'll actually get 100MB/s - the reality is you won't get full saturation, particularly if you're doing anything else over that network.  If you think 60% isn't significant, well, so be it (but you're wrong).

And for speed, a raptor is OK (I use some myself - scratch disk), but as I said, if you are really talking about speed there are much better options (that blow away your Mercury Aura Pro).  But we're not talking about the highest possible performance - we're talking about connected/additional storage, reasonable performance, reasonable cost and so on.
Title: Re: Drive capacity/fullness vs. performance
Post by: dreed on April 26, 2012, 05:14:59 pm
If you meant IOPS, you should have said it. But, setting aside the vague and confusing phraseology, your blanket proscription of RAIDZ doesn't make sense.

To anyone that is familiar with RAIDZ, what I said is neither vague nor confusing.

I'm happy for your customers and I hope that they're happy with your work, but it doesn't sound like they're individual photographers, each with their own library of photographs and video to work with.

Quote
Quote
I'd recommend that you go back and either buy or re-watch some of the Luminous Landscape videos that talk on this topic. One of the early ones by Seth that recommends keeping all of your images on one device comes to mind. He's a professional photographer that clearly has his work flow and IT issues sorted without all of this nonsense about DAS/NAS.
The March 2009 article?

Yes, but watch the entire video, not just the free preview.

Quote
Quote
The complexity required to use shell scripts to do what you suggest is far beyond what I'd expect for any photographer and on top of that,
Try to see this as a continuum rather than as binary. There are apps with a GUI that will script for you. Carbon Copy Cloner uses rsync and SuperDuper uses ditto. The web GUI on a NAS produces scripts to automate replication features.

Both of those solve a problem that is unrelated to the problem of managing your files outside of Lightroom and needing to synchronise Lightroom manually.
Title: Re: Drive capacity/fullness vs. performance
Post by: chrismurphy on April 27, 2012, 12:17:42 pm
The backup plan?  Whatever you want it to be.

It's easier to take shots at a plan if you're more specific. Better that the drawing-board idea has leaks than the actual implementation.

Quote
Comparing the Lian Li enclosure to a raptor?  Are you kidding?  Do you have an 8TB raptor?

Consider how you can take advantage of very fast storage to speed up the thing you do most, rather than worrying about the performance of things you don't do very often.

And there are open questions remaining on the enclosure. The enclosure incentivizes the use of RAID 5, and consumer disks mostly proscribe RAID 5. WDC Green, Blue and Black disks are expressly RAID 1 and 0 disks, not RAID 5. If the data is expendable, things are much easier and you don't need to know such things.

Quote
That's 60% better and you're assuming that you'll actually get 100MB/s - the reality is you won't get full saturation, particularly if you're doing anything else over that network.  If you think 60% isn't significant, well, so be it (but you're wrong).

I've pushed data at 100MB/s. I've seen it pushed to 115MB/s, although I'm pretty sure that involved jumbo frames. Gotta have a clean network though, this is true.

Depending on what you mean by "doing anything else over that network" the entire strategy may be altered. It might make sense to use LACP for a 2Gb/s connection NAS to the switch; or 10 GigE from NAS to the switch, for multiple workstations.

Now then, do you consider 60% slower transfers, occurring twice a day (check-out & check-in) with a GigE NAS, to be a real problem if you increase the performance of something you do 24 times a day (Photoshop save to a fast local disk) by 100%? Or 300%? I don't think that's an unreasonable trade off.
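To put rough numbers on that trade-off, here's a back-of-the-envelope sketch in Python. Every figure in it is an assumption for illustration only (how much you transfer, how often you save, and the MB/s rates), and the conclusion swings entirely with those assumptions:
---
# Every number here is an assumption for illustration, not a measurement.
MB_PER_GB = 1024

def seconds(gigabytes, mb_per_sec):
    return gigabytes * MB_PER_GB / mb_per_sec

transfers_per_day, transfer_gb = 2, 10    # check-out + check-in over the network
saves_per_day, save_gb = 24, 2            # Photoshop saves of a large layered file

# Option A: one fast DAS enclosure (~160 MB/s) used for everything.
option_a = transfers_per_day * seconds(transfer_gb, 160) + saves_per_day * seconds(save_gb, 160)

# Option B: GigE NAS (~100 MB/s) for check-out/check-in, plus a small fast
# local disk (~400 MB/s, e.g. SSD or RAID 0 scratch) for the repeated saves.
option_b = transfers_per_day * seconds(transfer_gb, 100) + saves_per_day * seconds(save_gb, 400)

print(f"A: {option_a/60:.1f} min/day   B: {option_b/60:.1f} min/day")
---
Under those made-up numbers the NAS-plus-fast-local-disk option comes out ahead; with tiny saves and huge check-outs it wouldn't.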

To anyone that is familiar with RAIDZ, what I said is neither vague nor confusing.

OK, so then you're either confused, or unfamiliar with RAIDZ, to have categorically admonished its usage due to the inapplicable performance reason you were referencing.

WHEN TO (AND NOT TO) USE RAID-Z
https://blogs.oracle.com/roch/entry/when_to_and_not_to

I quote from the article a relevant point:
Because of  the  ZFS Copy-On-Write (COW) design, we actually do expect this reduction in number of device level I/Os to work extremely well for just about any write intensive workloads. We also expect it to help streaming input loads significantly. The situation of random inputs is one that needs special attention when considering RAID-Z.

Quote
I'm happy for your customers and I hope that they're happy with your work, but it doesn't sound like they're individual photographers, each with their own library of photographs and video to work with.

My business remains primarily color management related, while my expertise extends substantially beyond it for historical, as well as interest, reasons. I come across a wide assortment of customers from enterprise to individual photographers. And I'm privy to their various success and fail scenarios regarding networking and storage.

Consider that even individual photographers are approaching, and many individual pros are already at, enterprise-level storage requirements. Yet they don't have enterprise-level IT staff on hand. The solutions I commonly see in non-enterprise environments work fine until they don't. And I don't just mean outright failures or data corruption, but also problems with performance, migration, expansion, and even backup restoration. It's as if data restoration hasn't been modeled, let alone practiced.

Quote
Both of those solve a problem that is unrelated to the problem of managing your files outside of Lightroom and needing to synchronise Lightroom manually.

The example task being automated is a unidirectional push of files/folders from fast local storage to a NAS: today's Photoshop files, as well as the Lightroom preview cache and catalog files.
Title: Re: Drive capacity/fullness vs. performance
Post by: Farmer on April 27, 2012, 06:18:37 pm
You still don't get it.  You're taking the worst possible usage for a particular thing and comparing that against the best possible usage of another (your talk of 115MB/s is a classic).

Personally, I think the Lian Li would be a very cost-effective way of adding external storage that's significantly faster than a NAS (at any comparable price point). My workstation is essentially full, physically. I will be upgrading to an Ivy Bridge system in the next few months, and at that time I may shuffle things around.

You're just looking to find ways to shoot things down (as you say yourself), but you're not actually looking at the same picture as everyone else, because you're apparently only interested in your own vision. Doing individual point comparisons while failing to comprehend the big picture is a real problem.
Title: Re: Drive capacity/fullness vs. performance
Post by: chrismurphy on April 27, 2012, 06:37:18 pm
You still don't get it.

Feel free to explain in more verbose and simple terms then, rather than just repeating yourself.

Quote
You're taking the worst possible usage for a particular thing and comparing that against the best possible usage of another (your talk of 115MB/s is a classic).

I cited two benchmarks for the product you mentioned, and used the better of the two. You said you found better benchmarks, but you didn't cite them, and you accepted 160MB/s. Conversely, the other cite I provided benchmarked 110MB/s reads and 40MB/s writes. If I were actually being fair, instead of giving the Lian Li the benefit of the doubt, I'd have averaged the two, bringing its performance to 135MB/s reads, and 100MB/s writes.

Quote
Personally, I think the Lian Li would be a very cost-effective way of adding external storage that's significantly faster than a NAS (at any comparable price point).

Speculation. You have no evidence.

The benchmark data submitted thus far doesn't support your contention. You persist in ignoring every deficiency of the product or its specs except this hypothetical performance difference.

Quote
You're just looking to find ways to shoot things down (as you say yourself), but you're not actually looking at the same picture as everyone else, because you're apparently only interested in your own vision. Doing individual point comparisons while failing to comprehend the big picture is a real problem.

And what is it you're doing when you ignore all unknowns and deficiencies of a product except the "individual point comparison" of speed, which really isn't even that significant?
Title: Re: Drive capacity/fullness vs. performance
Post by: dreed on April 28, 2012, 04:43:03 am
Consider how you can take advantage of very fast storage to speed up the thing you do most, rather than worrying about the performance of things you don't do very often.

And there are open questions remaining on the enclosure. The enclosure incentivizes the use of RAID 5, and consumer disks mostly proscribe RAID 5. WDC Green, Blue and Black disks are expressly RAID 1 and 0 disks, not RAID 5. If the data is expendable, things are much easier and you don't need to know such things.

I've pushed data at 100MB/s. I've seen it pushed to 115MB/s, although I'm pretty sure that involved jumbo frames. Gotta have a clean network though, this is true.

Depending on what you mean by "doing anything else over that network" the entire strategy may be altered. It might make sense to use LACP for a 2Gb/s connection NAS to the switch; or 10 GigE from NAS to the switch, for multiple workstations.

Now then, do you consider 60% slower transfers, occurring twice a day (check-out & check-in) with a GigE NAS, to be a real problem if you increase the performance of something you do 24 times a day (Photoshop save to a fast local disk) by 100%? Or 300%? I don't think that's an unreasonable trade off.

Are you trying to sell yourself or your services?

Quote
OK, so then you're either confused, or unfamiliar with RAIDZ, to have categorically admonished its usage due to the inapplicable performance reason you were referencing.

No, I've benchmarked it (run various tests) and found it to be the slowest way to use ZFS. I then asked those who wrote it why that was so. Case closed.

Quote
WHEN TO (AND NOT TO) USE RAID-Z
https://blogs.oracle.com/roch/entry/when_to_and_not_to

I quote from the article a relevant point:
Because of  the  ZFS Copy-On-Write (COW) design, we actually do expect this reduction in number of device level I/Os to work extremely well for just about any write intensive workloads. We also expect it to help streaming input loads significantly. The situation of random inputs is one that needs special attention when considering RAID-Z.

Why don't you find a blog entry where they actually measure and report on the performance difference between RAIDZ and other things rather than one that just hand waves about it?

Quote
The example task being automated is a unidirectional push of files/folders from fast local storage to a NAS: today's Photoshop files, as well as the Lightroom preview cache and catalog files.

Previously you were talking about using DAS (hot storage) as a local cache of things that you are working on, with the NAS (cold storage) being where everything was held. Now it seems like you want the NAS to be just a backup solution?

Anyway, I still highly recommend watching the entire video from the March 2009 article. At least then you'll have a common point to talk to with everyone else.
Title: Re: Drive capacity/fullness vs. performance
Post by: alain on April 28, 2012, 06:23:46 pm
...
The March 2009 article?


Yes, but watch the entire video, not just the free preview.
...
There are very good lessons inside the complete video:
- Take at least one backup offline, meaning physically disconnect it from the LAN and from power.
- Don't trust an online backup provider.
- Take separate backups of the system (make those bootable, maybe with a little bit of trouble) and of the data.
- Remember that everything in the current location can be lost --> keep a backup in a separate place "far" away.
  (Think about burglars, lightning, fire,...)

Title: Re: Drive capacity/fullness vs. performance
Post by: chrismurphy on April 28, 2012, 06:32:57 pm
Are you trying to sell yourself or your services?

Not in the paragraph you quoted. I already said what my primary business is. Storage and networking are presently subjects of R&D, because I find present practices are deficient. The challenge is how to apply enterprise best practices to photographers, without high cost or knowledge requirements. But even Linux/BSD hobbyists have storage technology (software) that's more robust than what most people are using. Even in small-businesses.

Quote
No, I've benchmarked it (run various tests) and found it to be the slowest way to use ZFS.

Present your methodology so others can reproduce your results. What platform (Solaris, OpenSolaris, FreeBSD, OpenIndiana)? What version of ZFS filesystem and pool? What controllers and drives, how many of each, and were port multipliers used? What benchmarking tools did you use? What's the breakdown of the individual read/write, random/sequential testing? Preferably this is already documented on the opensolaris ZFS discussion list.
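And to be clear about what I mean by a breakdown: even a crude test separates sequential bandwidth from random IOPS. A rough sketch in Python (single scratch file; the path and sizes are arbitrary, and a serious test would defeat the OS cache, which this does not):
---
import os, random, time

PATH = "/Volumes/Scratch/bench.tmp"   # hypothetical test location
FILE_MB = 1024                        # 1 GB test file
BLOCK = 1 << 20                       # 1 MiB blocks

# Sequential write: one streaming pass (this measures bandwidth).
buf = os.urandom(BLOCK)
t0 = time.time()
with open(PATH, "wb") as f:
    for _ in range(FILE_MB):
        f.write(buf)
    f.flush()
    os.fsync(f.fileno())
print("sequential write: %.0f MB/s" % (FILE_MB / (time.time() - t0)))

# Random 4 KiB reads: many small seeks (this measures IOPS). The OS page cache
# will flatter this number unless the file is much larger than RAM.
ops = 2000
t0 = time.time()
with open(PATH, "rb") as f:
    for _ in range(ops):
        f.seek(random.randrange(FILE_MB * BLOCK - 4096))
        f.read(4096)
print("random 4K read: %.0f IOPS" % (ops / (time.time() - t0)))

os.remove(PATH)
---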

Quote
I then asked those who wrote it why that was so. Case closed.

Who did you ask, what did you ask, what was their response? ZFS code is licensed under the CDDL, it's developed by a community who monitor the opensolaris ZFS discuss list. Your findings and methods should be asked there, and they would reply there. If you have a URL for this conversation, please provide it.

Quote
Why don't you find a blog entry where they actually measure and report on the performance difference between RAIDZ and other things rather than one that just hand waves about it?

That entry was written by Roch Bourbonnais, who is a principal engineer at Oracle working on ZFS. Oracle is the maintainer of ZFS, having acquired it via Solaris when they bought Sun. Do you care to be more specific about your complaint about what he wrote in the article? You've "asked those who wrote [ZFS]" and yet you're going to describe one of them as waving hands about it? Who are you?

Quote
Previously you were talking about using DAS (hot storage) as a local cache of things that you are working on, with the NAS (cold storage) being where everything was held. Now it seems like you want the NAS to be just a backup solution?

You're confused, again. For me to want NAS as just backup, the DAS would have to be large enough to be primary storage. I have not suggested that. Nor have I suggested any fault tolerance for DAS, so it can hardly be primary storage for important things.[1]

The purpose of scripting (directly or with a GUI wrapper interface) is merely to get work in progress files automatically "swept" (pushed) to primary storage featuring resilience, fault tolerance, and (ideally) data replication. It's not a requirement. Instead you could certainly copy your WIP files to the NAS yourself with a conventional drag-drop file/folder copy, whenever you like. Maybe you've had a bad day and don't even like the WIP files so you delete them instead.


[1] One can create a setup where DAS is fast, large, resilient, fault tolerant, replicated and primary, with NAS as both online secondary and backup. It would be expensive, and there are alternatives to consider, per usual.
Title: Re: Drive capacity/fullness vs. performance
Post by: alain on April 28, 2012, 06:37:22 pm
Personally, the Lian Li would be a very cost effective way of adding external storage that was significantly faster than NAS (at any comparable price point).  My workstation is essentially full, physically.  I will be upgraded in the next few months to an Ivy Bridge system and at that time I may suffle things around.
This is a valid reason, but I would stay away from RAID-5.
- The implementation is probably enclosure-specific.  You probably need the exact same enclosure to recover from an enclosure defect.
- Rebuilding a RAID-5 with large disks takes a very long time, and an error during the rebuild is probably fatal for your RAID-5 data.

Two 3TB disks in RAID 1 are fast, and the second drive protects you from some disk errors.  If you're using USB 3, I would think about two separate two-disk enclosures.  Those two will cost less than one bigger enclosure and can be faster.

It's simple and fast, but it won't have: snapshots, data verification (scrubbing), dual parity drives, easy use from several computers,... [EDIT added:] automatic error correction for "bit rot".
Title: Re: Drive capacity/fullness vs. performance
Post by: alain on April 28, 2012, 06:47:35 pm
...
Hot files are: scratch space, preview/cache files, work-in-progress PSDs and TIFFs.

Cold files are: those pixels that haven't been touched in a month, let alone today.

It really does not make sense spending extra money on fast large DAS for cold files. At least not until we have more reliable file systems that can be both resilient and fast, by pooling (aggregating) those disks together.

So I would bias the budget for DAS to be small, but as fast as practical for the size I need for daily work: hot files.

And I'd bias the budget for NAS to be large, not as fast, but higher availability, for the cold files. Plus I get SMART and UPS monitoring built-in, a more reliable file system, and automated replication features (to either an on-site or off-site duplicate NAS or cloud storage).

I could even have a "sweep" script that moves all fast DAS files to the NAS once or twice a day. And then after 7 days of aging, deletes them off the NAS. This, in case I "forget" to move my hot files to more reliable storage.

GigE NAS comes close to single local disk performance, ~ 90MB/s is reasonable to expect although I've seen them push 110MB/s.

This is indeed a good reason for a NAS.  The more I read on the web, the more ZFS seems to be getting better. 

Title: Re: Drive capacity/fullness vs. performance
Post by: chrismurphy on April 28, 2012, 07:00:55 pm
This is a valid reason, but I would stay away from RAID-5.

I agree. It can be used in small arrays, with not a lot of data, and good (frequent) backup plans. But TB drives can take days to restripe, as you point out. It might be better than no fault tolerance, but in enterprise situations it's borderline malfeasance to set up RAID 5 for important data these days. Especially with big data, many disks, or large disks.

Quote
Two 3TB disks in raid1 are fast and the parity protects you from some disk errors.

There is no parity used in RAID 1. Conventional RAID 1 involves making identical block copies between two block devices (physical or virtual).[1] There is no error correction offered by RAID 1 except what's offered by the disk firmware themselves. If the disks return what they each consider valid data, but actually conflict, neither the RAID implementation nor file system can resolve the ambiguity, to determine which block of data is correct.

Same with RAID 5, while there is parity, it's ambiguous whether the data is correct or the parity is. With RAID 6 dual parity, it's unambiguous.

The advantage of resilient file systems is they can resolve these ambiguities, and self-heal.
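A conceptual sketch of the idea in Python - not how ZFS or Btrfs actually implement it internally, just the principle: the file system records a checksum alongside each block, so when mirror copies disagree it knows which copy to trust and rewrites the bad one.
---
import hashlib

def checksum(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()

def read_with_self_heal(copies, expected):
    # copies:   the same logical block as stored on each mirror member (bytearrays)
    # expected: the checksum recorded in metadata when the block was written
    # Plain RAID 1 has no 'expected' value, so it cannot tell which copy is right.
    good = next((bytes(c) for c in copies if checksum(bytes(c)) == expected), None)
    if good is None:
        raise IOError("all copies corrupt - restore from backup")
    for c in copies:                          # self-heal: rewrite any bad copy
        if checksum(bytes(c)) != expected:
            c[:] = good
    return good

# Usage sketch: mirror copy 'b' has suffered a silent bit flip.
a = bytearray(b"pixel data")
b = bytearray(a); b[0] ^= 0x01
recorded = checksum(bytes(a))                 # stored alongside the block pointer
print(read_with_self_heal([a, b], recorded))  # returns good data, repairs 'b'
---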


[1] Btrfs implements RAID 1 that is chunk based, and doesn't require pairs. So it means "at least two copies of data and metadata on separate disks" rather than identical disk pairs.
Title: Re: Drive capacity/fullness vs. performance
Post by: chrismurphy on April 28, 2012, 07:28:55 pm
This is indeed a good reason for a NAS.  The more I read on the web, the more ZFS seems to be getting better. 

I'm actually substantially more familiar with Btrfs than ZFS, although they have similarities. I've been using Btrfs for about three years, and recently have committed one of my backups to it. But it is not yet considered production ready, as development is still heavy. Of the resilient copy-on-write file systems, ZFS is the most mature and is production stable.

Another plus of such file systems is very fast file system checking and repairing. With today's journaled file systems, the journal doesn't actually make the file system more reliable; it simply makes it much faster to check for consistency after a crash or power failure. If there is any inconsistency in the journal, a full traversal of the file system is required to check and repair it, and on the multi-TB file systems found on arrays this can take hours if you're lucky, or days if the file system is very large.

Copy-on-write file systems never overwrite live data or metadata in place, so the on-disk state is considered always consistent (except when it isn't). ZFS doesn't even have an fsck tool to this day. That consistency-by-design helps you get back online quickly in the event of a crash or power failure. The ZFS equivalent of a full traversal of the file system is scrubbing, which is an online background process - meaning your data is fully available while the file system is being checked and any corrupted files are repaired. Btrfs scrub works similarly. I'd expect ReFS to have similar features.
Title: Re: Drive capacity/fullness vs. performance
Post by: Farmer on April 28, 2012, 08:14:01 pm
FWIW, I have local drives, direct attached (eSATA) backups, a local NAS (RAID5, but I don't mind if it takes time to rebuild or has to be completely recopied - it's just for uptime convenience), offsite single disks (multiple, but meaning no RAID) and cloud (CrashPlan - via a hard disk upload and online maintenance).

I'm not a pro photog, but my images are important to me and this is a cost-effective system that works for me (and has for many years, excluding CrashPlan, which is new).

With a new PC on the near horizon, I'll likely indulge in a vanity build and get some technology that I don't need, and at the same time I'll have a look at my redundancy, availability/accessibility and backup options.

My data is in original raw, copies in DNG (which has a checksum) and derived works in varying formats over the years (TIFF, PSD, .xmp etc).  I use PS and LR (not being a pro, I can afford to dabble and test and experiment and play and <insert appropriate variation here> etc).

If you look at the majority of photographers, I dare say that asking them to become heavily involved in alternate file systems and other operating systems etc is folly.  Many don't implement currently available techniques, nor have any desire to spend more time in front of the computer than necessary (let alone spending it on non-photography-related tasks).

Simple products that are reasonably fast and require minimal maintenance and setup are ideal.  This is the big picture that often seems to get missed by technical folks in these discussions (and whilst I'm not at the level of some of the people here with the technology, I am able to follow it and could implement it reasonably easily; I have a long background with computers, including various *nix flavours etc).  Most photogs and most users generally don't have that level of knowledge nor do they want it.  Offering solutions that require an IT admin usually won't be accepted.  So, we're looking at compromises that will actually be used.
Title: Re: Drive capacity/fullness vs. performance
Post by: chrismurphy on April 28, 2012, 11:43:37 pm
I appreciate the conversation, and the sharing of your setup. Off-site backup (e.g. cloud) is missing from the significant majority of photographers' plans, and there's much to consider: cost, upload times, what to back up (if not everything), encryption and privacy control, and so on.

If you look at the majority of photographers, I dare say that asking them to become heavily involved in alternate file systems and other operating systems etc is folly.

I understand the concern. But I'm not asking what you must think I'm asking.

For example, most everyone has a wireless router. These routers have alternate file systems and operating systems on them (routers don't run Mac OS or Windows). Do users get heavily involved in either the router file or operating system? No. Why not? Default behavior and the web interface abstracts them from such things.

Quote
Many don't implement currently available techniques, nor have any desire to spend more time in front of the computer than necessary (let alone spending it on non-photography-related tasks).

I agree. But in some ways I don't blame people for not wanting to implement current ad hoc techniques that even the geeks and pros don't universally agree on.

Why not make your printed editions and throw the digital files away? All of them. Print files, Raws, DNGs. It's not a new concept. Artists destroyed strike plates well before digital came along. So I'll even reject the premise that backing up and archiving is an inherently good or required behavior. It's only good if you particularly value those photos beyond producing a printed edition, enough to protect them with a constant supply of cash rather than buying something else.

Quote
Simple products that are reasonably fast and require minimal maintenance and setup are ideal.  This is the big picture that often seems to get missed by technical folks in these discussions (and whilst I'm not at the level of some of the people here with the technology, I am able to follow it and could implement it reasonably easily; I have a long background with computers, including various *nix flavours etc).

Not every conversation is a recommendation, or one that's designed to call people to action rather than simply consideration of alternatives.

Technical folks come in lazy versions. I don't like fixing the same thing over and over again. I'm not looking for high maintenance and setup. I'm not eager to suggest overly complicated solutions, as they tend to be fragile. No one likes that.

Quote
Most photogs and most users generally don't have that level of knowledge nor do they want it.  Offering solutions that require an IT admin usually won't be accepted.  So, we're looking at compromises that will actually be used.

And that's sensible. I can't imagine it working any other way. The dilemma is really about choosing the familiar old versus the unfamiliar new, not about choosing simple versus complicated. Not many people intentionally (or rationally) choose complicated over simple if the outcomes are the same.
Title: Re: Drive capacity/fullness vs. performance
Post by: dreed on April 29, 2012, 03:07:13 am
Not in the paragraph you quoted. I already said what my primary business is. Storage and networking are presently subjects of R&D, because I find present practices are deficient. The challenge is how to apply enterprise best practices to photographers, without high cost or knowledge requirements. But even Linux/BSD hobbyists have storage technology (software) that's more robust than what most people are using. Even in small-businesses.

Present your methodology so others can reproduce your results. What platform (Solaris, OpenSolaris, FreeBSD, OpenIndiana)? What version of ZFS filesystem and pool? What controllers and drives, how many of each, and were port multipliers used? What benchmarking tools did you use? What's the breakdown of the individual read/write, random/sequential testing? Preferably this is already documented on the opensolaris ZFS discussion list.

Who did you ask, what did you ask, what was their response? ZFS code is licensed under the CDDL, it's developed by a community who monitor the opensolaris ZFS discuss list. Your findings and methods should be asked there, and they would reply there. If you have a URL for this conversation, please provide it.

That entry was written by Roch Bourbonnais, who is a principal engineer at Oracle working on ZFS. Oracle is the maintainer of ZFS, having acquired it via Solaris when they bought Sun. Do you care to be more specific about your complaint about what he wrote in the article? You've "asked those who wrote [ZFS]" and yet you're going to describe one of them as waving hands about it? Who are you?

Yes, I'm going to say "those who wrote ZFS" because I don't think it would be very cool to name drop. I still don't understand the ranting above unless you're chest beating. Mentioning the details of the hardware used for testing would just see this thread descend into debate over the pros and cons of various hardware, drivers, etc. Further, correcting you on various technical points above would not help anyone. But it seems like you've definitely had lots of ZFS Kool-Aid to drink.

Quote
You're confused, again. For me to want NAS as just backup, the DAS would have to be large enough to be primary storage. I have not suggested that. Nor have I suggested any fault tolerance for DAS, so it can hardly be primary storage for important things.[1]

The purpose of scripting (directly or with a GUI wrapper interface) is merely to get work in progress files automatically "swept" (pushed) to primary storage featuring resilience, fault tolerance, and (ideally) data replication. It's not a requirement. Instead you could certainly copy your WIP files to the NAS yourself with a conventional drag-drop file/folder copy, whenever you like. Maybe you've had a bad day and don't even like the WIP files so you delete them instead.


[1] One can create a setup where DAS is fast, large, resilient, fault tolerant, replicated and primary, with NAS as both online secondary and backup. It would be expensive, and there are alternatives to consider, per usual.

Ok, the way you're talking here makes it clear that you're completely unfamiliar with Lightroom and its work flow. Since you're an open source freak, I'll suggest that you look at using darktable (which I'm unfamiliar with) since I believe that it is similar in work flow and design to Lightroom. If I was only working with photoshop (or gimp) then what you describe would be relevant.
Title: Re: Drive capacity/fullness vs. performance
Post by: alain on April 29, 2012, 04:28:43 am
There is no parity used in RAID 1. Conventional RAID 1 involves making identical block copies between two block devices (physical or virtual).[1] There is no error correction offered by RAID 1 except what's offered by the disk firmware themselves. If the disks return what they each consider valid data, but actually conflict, neither the RAID implementation nor file system can resolve the ambiguity, to determine which block of data is correct.

Same with RAID 5, while there is parity, it's ambiguous whether the data is correct or the parity is. With RAID 6 dual parity, it's unambiguous.

The advantage of resilient file systems is they can resolve these ambiguities, and self-heal.


[1] Btrfs implements RAID 1 that is chunk based, and doesn't require pairs. So it means "at least two copies of data and metadata on separate disks" rather than identical disk pairs.

Indeed
Title: Re: Drive capacity/fullness vs. performance
Post by: alain on April 29, 2012, 05:10:19 am
If you look at the majority of photographers, I dare say that asking them to become heavily involved in alternate file systems and other operating systems etc is folly.  Many don't implement currently available techniques, nor have any desire to spend more time in front of the computer than necessary (let alone spending it on non-photography-related tasks).

Simple products that are reasonably fast and require minimal maintenance and setup are ideal.  This is the big picture that often seems to get missed by technical folks in these discussions (and whilst I'm not at the level of some of the people here with the technology, I am able to follow it and could implement it reasonably easily; I have a long background with computers, including various *nix flavours etc).  Most photogs and most users generally don't have that level of knowledge nor do they want it.  Offering solutions that require an IT admin usually won't be accepted.  So, we're looking at compromises that will actually be used.
Yes, I know too many photographers who (almost) never make backups on an external drive.  In the "best" case they have a single external USB drive.
They need simple solutions.

Unfortunately most DAM programs (including LR) don't have a simple way to check all the files they use.  Adding an MD5 checksum for each file to the DAM database and scrubbing all files isn't that difficult to do.
The only program I know of that is specific to photos is ImageVerifier.
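As a rough illustration of how little is involved, here is a sketch of a build-and-scrub pass in Python (the library path, manifest name and extension list are just placeholders, and it only detects rot, it doesn't repair anything):
---
import hashlib, json
from pathlib import Path

LIBRARY = Path("/Volumes/Photos")            # hypothetical image library
MANIFEST = LIBRARY / "checksums.json"
EXTENSIONS = {".dng", ".nef", ".cr2", ".tif", ".psd"}

def md5(path):
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def build_manifest():
    # Record an MD5 for every image file in the library.
    manifest = {str(p.relative_to(LIBRARY)): md5(p)
                for p in LIBRARY.rglob("*")
                if p.is_file() and p.suffix.lower() in EXTENSIONS}
    MANIFEST.write_text(json.dumps(manifest, indent=1))

def scrub():
    # Re-hash everything and report files that no longer match (detection only).
    for rel, expected in json.loads(MANIFEST.read_text()).items():
        p = LIBRARY / rel
        if not p.exists():
            print("missing:", rel)
        elif md5(p) != expected:
            print("changed (possible bit rot):", rel)
---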

Title: Re: Drive capacity/fullness vs. performance
Post by: dreed on April 29, 2012, 11:35:55 am
Unfortunately most DAM programs (including LR) don't have a simple way to check all the files they use.  Adding an MD5 checksum for each file to the DAM database and scrubbing all files isn't that difficult to do.
The only program I know of that is specific to photos is ImageVerifier.

This is not 100% true: DNG files have a built-in checksum of the image data that ACR and LR will validate when you work on an image.
Title: Re: Drive capacity/fullness vs. performance
Post by: alain on April 29, 2012, 01:07:08 pm
This is not 100% true: DNG files have a built-in checksum of the image data that ACR and LR will validate when you work on an image.
Are you suggesting that opening all DNG files one by one is a valid alternative to an automatic check of all files?

BTW, this is only for DNG files, not RAWs.
Title: Re: Drive capacity/fullness vs. performance
Post by: dreed on April 29, 2012, 01:17:37 pm
Are you suggesting that opening all DNG files one by one is a valid alternative to an automatic check of all files?

No, I'm not. I'm just saying that LR has a way to check one particular type of file that it supports - DNG. The checksum is present in the DNG file, so that part of the equation is solved. What it doesn't allow you to do is verify all image checksums, only the one that you're currently working with.

Quote
BTW, this is only for DNG files, not RAWs.

DNG are RAW files.

Adobe supply a converter so that you can convert your CR2 or NEF files to DNG, including allowing you to embed the original file if you so desire.

So if you're worried about bit-rot, moving to use DNG will allow you to detect when bit rot occurs and potentially (for example if you put the original file inside the DNG) correct it (assuming that only part of the DNG data and not the original data has rotted.)

This is one of those things that makes me wish every camera generated DNG files because that way the data is protected from the moment the camera generates it.
Title: Re: Drive capacity/fullness vs. performance
Post by: chrismurphy on April 29, 2012, 05:43:31 pm

Yes, I'm going to say "those who wrote ZFS" because I don't think it would be very cool to name drop.

Right because the ZFS developers are wildly popular famous people and by saying their names your credibility will actually go down as a result.

I don't think it's cool to appeal to a false (unnamed) authority. None of how ZFS works is some proprietary secret, it's an open source project.

For such "very well known issues with ZFS and RAIDZ performance" it's curious you had to go all the way to those who wrote it for clarification.

Quote
Mentioning the details of the hardware used for testing would just see this thread descend into debate over the pros and cons of various hardware, drivers, etc.

Peer review. It's actually just to understand the problem better. Most important is what platform and pool version, what benchmarking tools and what *relative* values between single disk and multidisk RAIDZ the various tests produce.

My strong suspicion is that you've overestimated the importance of small-file random IO in a photographer's context, and diminished the value of large-file sequential IO, which is where RAIDZ performs quite well. Your data might suggest a net neutral result for RAIDZ on Lightroom catalog performance, which entails small random IO.

But while I'm not suggesting the Lightroom catalog go on a RAIDZ array, it might be a valuable test to confirm or deny this, because ZFS is a reality on Mac OS X, and soon RAIDZ will be too, as a commercial product.

Quote
Further, correcting you on various technical points above would not help anyone. But it seems like you've definitely had lots of ZFS Kool-Aid to drink.

Except, I thoroughly enjoy being corrected on technical points because I have an innate affinity for being technically correct. The thing is, you haven't provided a single reference or explanation for any of your claims. But you appear to have an ample supply of memes and non-responses.

Quote
Ok, the way you're talking here makes it clear that you're completely unfamiliar with Lightroom and its work flow.

You're right, that's why I keep the 1.0 beta email list archive handy for easy reference. But did you have a question? Or do you think these pot shots you take are adequate distractions from not answering questions you've been asked?

Quote
Since you're an open source freak, I'll suggest that you look at using darktable (which I'm unfamiliar with) since I believe that it is similar in work flow and design to Lightroom.

Pass, I'm reasonably pleased with LR, but thanks for the suggestion. It's the most useful data you've provided so far, insofar as it's the only data you've provided so far. I'd never heard of darktable before.

Quote
If I was only working with photoshop (or gimp) then what you describe would be relevant.

Perhaps. But you're welcome to explain why you think so. I will submit the following:

a.) Lightroom benefits more than Photoshop from disks with strong small-file random IO performance. This includes high-RPM disks, RAID 0 arrays, and SSDs. Such a fast disk may not be fault tolerant.

b.) The LR Catalog contains image file metadata. In normal (default) operation, metadata is not automatically saved to image XMP (sidecar or in DNG).[1] So the catalog is an important file to backup, and Lightroom has a feature to do this. But if you want a more frequent backup than those options provide, it might be nice to automate pushing the file to more reliable storage.

c.) The LR preview data is expendable. If your fast disk dies or needs to be reformatted, you can rebuild the previews and wait it out.

However, Lightroom stores previews as many individual files inside the preview cache, so a syncing program (such as those mentioned) would only push new and changed previews from fast media to more reliable storage. Totally optional. It might save you a few hours one day, but it's certainly not the end of the world not to back these up.


DNG are RAW files.

I think it's confusing to conflate DNG and Raw. DNG can contain entirely non-Raw data: e.g. JPEG and TIFF can be converted into DNG. Thus DNG could contain output-referred data, or camera-referred linear-demosaiced data, or camera-referred mosaiced data.[2]

Quote
So if you're worried about bit-rot, moving to use DNG will allow you to detect when bit rot occurs and potentially (for example if you put the original file inside the DNG) correct it (assuming that only part of the DNG data and not the original data has rotted.)

There's a potential for ambiguity in that the GUI may not distinguish which data is corrupt, in which case you don't have an easy way to correct the problem. I think the error detection and correction needs to be more automated than this. Since we need duplicates anyway for fault tolerance, it makes sense for the file system to simply manage the error detection and correction, including removal/replacement of the corrupt image with a known good copy, and leave me alone.

Quote
This is one of those things that makes me wish every camera generated DNG files because that way the data is protected from the moment the camera generates it.

It would be nice. But it's important to distinguish between the ability to detect error and the ability to correct it. Camera generated DNG with checksum would allow for error detection but not correction (of that particular DNG). Detection may be better than nothing, but I think we should expect better than just being notified of a problem.


[1] This is why it's a good idea to periodically "Save Metadata to Files".
[2] Some call it scene-referred. I distinguish between the dynamic range of the scene vs the camera, but it's often a smallish distinction.
Title: Re: Drive capacity/fullness vs. performance
Post by: dreed on April 29, 2012, 08:24:17 pm
... Garbage deleted ...
... even more garbage deleted ...

Again, I strongly recommend buying and watching the entire video from March 2009 (or even one of the current tutorials) on managing your image files. What you're unfamiliar with here is what many consider to be the best practice method and work flow for Lightroom, documented in videos that are available on this website. If you don't want to make them available to yourself, then I can't help you further.

Quote
... more textbook comments ...

Rather than comment on Lightroom and how it works from reading about it, I would suggest that you get some operational experience, including when you've got a terabyte or two of images in your Lightroom library.

Oh, except for one thing:
Quote
a.) Lightroom benefits more than Photoshop from disks with strong small-file random IO performance. This includes high-RPM disks, RAID 0 arrays, and SSDs. Such a fast disk may not be fault tolerant.

Excuse me while I recover from my state of shock as you've recommended a RAID solution that wasn't RAIDZ.

Finally:
Quote
Except, I thoroughly enjoy being corrected on technical points because I have an innate affinity for being technically correct.

At least now I understand why your posts revolve around specific measurements you've made and minute detail of such.
Title: Re: Drive capacity/fullness vs. performance
Post by: chrismurphy on April 29, 2012, 08:48:14 pm
If you don't want to make them available to yourself, then I can't help you further.

You're unwilling or unable to provide cites for statements you've made purporting to be fact. This has absolutely nothing to do with Lightroom or Lightroom videos.

Quote
Rather than comment on Lightroom and how it works from reading about it, I would suggest that you get some operational experience, including when you've got a terabyte or two of images in your Lightroom library.

I've been using it since before it was made public, and I'm amused at your arbitrary metric for determining who is qualified to use or comment on it. Coming from someone who makes false statements, references unnamed authorities, cites no data to back up claims, this is laughable.

Quote
Oh, except for one thing:
Excuse me while I recover from my state of shock as you've recommended a RAID solution that wasn't RAIDZ.

You're clueless. I recommended RAID 0 or 10 in my very first post in this thread:
And I'd keep the data on them limited (operating system, applications and short term data such as scratch/working disks including a smallish RAID 0 or 10).

Quote
Finally:
At least now I understand why your posts revolve around specific measurements you've made and minute detail of such.

Better than being wrong.
Title: Re: Drive capacity/fullness vs. performance
Post by: dreed on April 30, 2012, 11:05:47 am
You're unwilling or unable to provide cites for statements you've made purporting to be fact.

That's because the information being relayed comes from face to face conversations for which there are no URLs.
Title: Re: Drive capacity/fullness vs. performance
Post by: chrismurphy on April 30, 2012, 12:34:02 pm
That's because the information being relayed comes from face to face conversations for which there are no URLs.

I'll propose your ears were clogged during a portion of the conversation, because your conclusion and total lack of explanation for the conclusion, are incongruent with established understanding of how RAIDZ works, what it does and does not do well. In the context you chose, the behavior is rather exhaustively explained in the Oracle blog post I cited, by a ZFS engineer.
Title: Re: Drive capacity/fullness vs. performance
Post by: chrismurphy on April 30, 2012, 12:44:53 pm
Will an SSD Improve Adobe Lightroom Performance? (http://www.computer-darkroom.com/blog/will-an-ssd-improve-adobe-lightroom-performance/)

Ian Lyons did some benchmarking a year ago, and updated it last month. It focuses on the generation of Library and Develop module previews, which are CPU and RAM bound processes. Since this is the time Raw/DNG images would be most aggressively read off disk, it's interesting to note the minimal performance difference of using SSD versus even FW800. It's suggestive that a GigE NAS containing the image library (the Raw/DNGs) would not negatively impact performance in a significant way. The raw performance numbers imply a differential that the application may not be fully utilizing anyway.

The last paragraph suggests more frequent day to day usage tests that are probably difficult to objectively design. But that's where I'd expect faster local storage for lrcat and lrdata to make a difference.

A great suggestion in the article relates to boosting the CR (Camera Raw) cache value, which is otherwise an easy-to-miss setting.
Title: Re: Drive capacity/fullness vs. performance
Post by: dreed on April 30, 2012, 02:07:50 pm
I'll propose your ears were clogged during a portion of the conversation, because your conclusion and total lack of explanation for the conclusion, are incongruent with established understanding of how RAIDZ works, what it does and does not do well. In the context you chose, the behavior is rather exhaustively explained in the Oracle blog post I cited, by a ZFS engineer.

Nope. Roch's blog confirms what I said - that RAIDZ is the slowest way to use ZFS.

What's more, it demonstrates that RAIDZ groups do not overcome the performance penalty associated with RAIDZ.

I really don't understand your argument as you've jumped around all over the place to try and say otherwise.

When I mentioned that RAIDZ was slow [#26] (and why), you then mentioned that RAIDZ groups should be used [#28] (even though this doesn't bring you up to par). When I mentioned that this required more disks [#36], you complained that I was changing things [#40], whereas the more appropriate comment is that, in mentioning RAIDZ groups, you'd forgotten to recognise that there is a minimum configuration, somewhat larger than normal, before they become relevant. It was rather bad of you to exclude that rather pertinent information.

Although your posts have made one thing clear - you've never actually had to deal with disk data corruption with ZFS (I'm also given to wonder if you've actually used ZFS at all.)
Title: Re: Drive capacity/fullness vs. performance
Post by: chrismurphy on April 30, 2012, 03:35:16 pm
Nope. Roch's blog confirms what I said - that RAIDZ is the slowest way to use ZFS.

Now you're just making things up.

From the article:
an N-disk RAID-Z group will behave as a single device in terms of delivered random input IOPS. Thus a 10-disk group of devices each capable of 200-IOPS, will globally act as a 200-IOPS capable RAID-Z group.

Since when is 200 less than 200? That's "the same" as in "performance neutral." Exactly where do you get "slow" let alone "slowest"?

Further, it's directly stated in the quote that this neutral performance when adding disks applies to random input. The whole article's premise is that random input IOPS don't scale by adding disks; to get more random IOPS you need to stripe RAIDZ groups. Even without striping, RAIDZ sequential reads and writes work "extremely well" and help "streaming input loads significantly".

What planet are you on?

Quote
What's more, it demonstrates that RAIDZ groups do not overcome the performance penalty associated with RAIDZ.

Where? Be exact please.

In the grid, about 1/3 of the way down, it shows how striped RAIDZ groups overcome the random IO performance NEUTRALITY of RAIDZ. Two groups, double the IOPS (including random input). Five groups, you get five times IOPS (including random input).

What are you smoking? No need to be exact, I don't want any. Somehow you don't understand how RAID 0 works, because that's all that striped RAIDZ groups employ to scale random IO.

Quote
I really don't understand your argument as you've jumped around all over the place to try and say otherwise.

Projection.

Here you are now citing the very article that a few posts ago you were trying to discredit as hand waving. And you're completely misrepresenting everything about the article on top of it. It's willful deception at this point.
Title: Re: Drive capacity/fullness vs. performance
Post by: dreed on April 30, 2012, 05:11:06 pm
Now you're just making things up.

From the article:
an N-disk RAID-Z group will behave as a single device in terms of delivered random input IOPS. Thus a 10-disk group of devices each capable of 200-IOPS, will globally act as a 200-IOPS capable RAID-Z group.

Since when is 200 less than 200? That's "the same" as in "performance neutral." Exactly where do you get "slow" let alone "slowest"?

Because RAIDZ is not the only way in which ZFS can be used.

Quote
In the grid, about 1/3 of the way down, it shows how striped RAIDZ groups overcome the random IO performance NEUTRALITY of RAIDZ. Two groups, double the IOPS (including random input). Five groups, you get five times IOPS (including random input).

Yes, and compare the results with the use of ZFS without RAIDZ.

To make it easy for you, I'll cut-n-paste the numbers here:
---
   Config           Blocks Available   FS Blocks /sec
   -------------    ----------------   --------------
   Z  1 x (99+1)        9900 GB              200
   Z  2 x (49+1)        9800 GB              400
   Z  5 x (19+1)        9500 GB             1000
   Z 10 x (9+1)         9000 GB             2000
   Z 20 x (4+1)         8000 GB             4000
   Z 33 x (2+1)         6600 GB             6600

   M  2 x (50)          5000 GB            20000
   S  1 x (100)        10000 GB            20000
---
Z# = ZFS RAIDZ with # groups
M = ZFS Mirror
S = ZFS Simple striping

All of the above results are with ZFS; the only difference is how the zpool is created for the filesystem. I don't know how there's any way to interpret the above results other than that RAIDZ is the slowest way to use ZFS.
Title: Re: Drive capacity/fullness vs. performance
Post by: alain on April 30, 2012, 07:09:41 pm
Because RAIDZ is not the only way in which ZFS can be used.

Yes, and compare the results with the use of ZFS without RAIDZ.

To make it easy for you, I'll cut-n-paste the numbers here:
---
   Config           Blocks Available   FS Blocks /sec
   -------------    ----------------   --------------
   Z  1 x (99+1)        9900 GB              200
   Z  2 x (49+1)        9800 GB              400
   Z  5 x (19+1)        9500 GB             1000
   Z 10 x (9+1)         9000 GB             2000
   Z 20 x (4+1)         8000 GB             4000
   Z 33 x (2+1)         6600 GB             6600

   M  2 x (50)          5000 GB            20000
   S  1 x (100)        10000 GB            20000
---
Z# = ZFS RAIDZ with # groups
M = ZFS Mirror
S = ZFS Simple striping

All of the above results are with ZFS; the only difference is how the zpool is created for the filesystem. I don't know how there's any way to interpret the above results other than that RAIDZ is the slowest way to use ZFS.
It's extremely unlikely that a ZFS system in use by a photographer will be in need of massive random input IOPS.  Certainly not when it's run over "current" LAN speeds.

But for input IOPS inside a ZFS zpool there's the possibility to add a -small- ZIL device (which can be a mirror, or RAIDZ, in itself), for example using an SSD made for caching (several thousand IOPS sustained) or even a battery-backed RAM drive (>100,000 IOPS).

 
Title: Re: Drive capacity/fullness vs. performance
Post by: chrismurphy on April 30, 2012, 07:14:31 pm
This section is RAIDZ only, all except the first are striped:

    Config           Blocks Available   Random FS Blocks /sec
    -------------    ----------------   ---------------------
    Z 1  x (99+1)     9900 GB               200
    Z 2  x (49+1)     9800 GB               400
    Z 5  x (19+1)     9500 GB              1000
    Z 10 x (9+1)      9000 GB              2000
    Z 20 x (4+1)      8000 GB              4000
    Z 33 x (2+1)      6600 GB              6600
    
You said:
it demonstrates that RAIDZ groups do not overcome the performance penalty associated with RAIDZ

I said on April 22:
Solvable by striped RAIDZ groups.

Anyone can see that as you add striped groups, performance goes up linearly, contrary to your assertion.

The "penalty" you keep referring to happens with random IO. These graphs are "Random FS Blocks" not sequential, and are not bandwidth.

In #42 I asked you to distinguish clearly your performance "penalty" claim. I asked if scaled out IOPS is important for a photographer, vs bandwidth. I asked you if you likewise disqualify RAID 5 which actually has a similar random IO scaling issue for the same basic reason as RAIDZ. You deleted all of those questions and refused to answer them.

Quote
All of the above results are with ZFS; the only difference is how the zpool is created for the filesystem.

The entire frigging blog post is expressly about *RANDOM* IO. It's at the top of the columns you copy-pasted (but conveniently forgot to paste the word random - nice touch by the way).

Quote
I don't see how to interpret the above results as meaning anything other than that RAIDZ is the slowest way to use ZFS.

That's because you refuse to distinguish between random and sequential IO. And because you refuse to acknowledge the distinction between IOPS and bandwidth.

And every time I've asked you to acknowledge these distinctions you delete the questions and requests, and don't respond. Yet you keep writing bunk that's simply not true and not relevant at all to a photographer even if it were true.
Title: Re: Drive capacity/fullness vs. performance
Post by: chrismurphy on April 30, 2012, 07:22:32 pm
It's extremely unlikely that a ZFS system used by a photographer will need massive random-input IOPS. Certainly not when it's run over "current" LAN speeds.

Right, exactly. Raw, DNG, and large PSD and TIFF files hardly qualify as random IO. These will be sequential IO, the IO requests will be aggregated by ZFS, and RAIDZ will parallelize the data streams across all disks in the RAIDZ group. It would even be fine for 10 GigE use, and the entire conversation is certainly moot for 1 GigE, where a single disk saturates the link.
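
To put numbers on "a single disk saturates GigE" (the throughput figures below are assumed, typical 2012-era values, not benchmarks):
---
# Gigabit Ethernet vs. one 7200 rpm disk, sequential transfer.
# All throughput numbers are assumed/typical, not measurements.
gige_raw_MBps       = 1_000_000_000 / 8 / 1_000_000   # 125 MB/s theoretical
gige_practical_MBps = 110    # after TCP/SMB/AFP overhead (assumed)
disk_seq_MBps       = 130    # outer tracks of a modern 7200 rpm disk (assumed)

print(f"GigE theoretical : {gige_raw_MBps:.0f} MB/s")
print(f"GigE practical   : {gige_practical_MBps} MB/s")
print(f"Single disk, seq : {disk_seq_MBps} MB/s  -> the network is the bottleneck")
---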

Quote
But for input IOPS inside a ZFS zpool, you can add a small ZIL device (which can itself be a mirror or RAIDZ), for example an SSD made for caching (several thousand IOPS sustained) or even a battery-backed RAM drive (>100,000 IOPS).

Yeah, for large-file writing there may not be much filesystem metadata needing to be cached, which is what the ZIL is for. For caching data reads and writes, that's the L2ARC, where you actually get a sort of automated "hot file" and "cold file" distinction at the file system level. Very cool.
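
Purely to illustrate the hot/cold idea, a toy read cache might look like the sketch below. This is NOT ZFS's actual ARC/L2ARC policy (which balances recency and frequency far more carefully); it's only a sketch of the concept.
---
# Toy illustration of the "hot vs. cold" idea behind an L2ARC-style read
# cache. Not ZFS's real algorithm; just the concept.
from collections import Counter

class ToyReadCache:
    def __init__(self, capacity_blocks, promote_after=2):
        self.capacity = capacity_blocks      # blocks that fit on the fast device
        self.promote_after = promote_after   # reads before a block counts as "hot"
        self.hits = Counter()                # read count per block
        self.cached = set()                  # blocks currently on the fast device

    def read(self, block):
        self.hits[block] += 1
        if block in self.cached:
            return "fast device (cache hit)"
        # Promote repeatedly-read ("hot") blocks, evicting the coldest if full.
        if self.hits[block] >= self.promote_after:
            if len(self.cached) >= self.capacity:
                coldest = min(self.cached, key=lambda b: self.hits[b])
                self.cached.remove(coldest)
            self.cached.add(block)
        return "spinning disks (cache miss)"

cache = ToyReadCache(capacity_blocks=2)
for blk in ["preview_A", "preview_A", "preview_A", "raw_B", "preview_C", "preview_A"]:
    print(blk, "->", cache.read(blk))
---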
Title: Re: Drive capacity/fullness vs. performance
Post by: dreed on April 30, 2012, 11:16:40 pm
This section is RAIDZ only, all except the first are striped:

    Config           Blocks Available   Random FS Blocks /sec
    -------------    ----------------   ---------------------
    Z 1  x (99+1)     9900 GB               200
    Z 2  x (49+1)     9800 GB               400
    Z 5  x (19+1)     9500 GB              1000
    Z 10 x (9+1)      9000 GB              2000
    Z 20 x (4+1)      8000 GB              4000
    Z 33 x (2+1)      6600 GB              6600

Anyone can see that as you add striped groups, performance goes up linearly, contrary to your assertion.

So what?

That's not contrary to what I'm saying, which is that RAIDZ is slower than non-RAIDZ.

How many times do I have to repeat this?

RAIDZ is the slowest method of using ZFS.

I'm not saying that some forms of RAIDZ aren't faster than others; I'm saying that RAIDZ (in any configuration) is slower than the other methods supported by ZFS.

None of the RAIDZ performance figures come even close to mirror or striping.

I don't know what more I can say here. That you deleted the lines from the performance results that mentioned mirroring and striping is probably significant; I don't know. If you think that RAIDZ is the only way to use ZFS then you're wrong.

For a NAS connected via Gigabit Ethernet, sure, RAIDZ vs non-RAIDZ may not be noticeable. But for locally hosted storage connected via eSATA or faster, it can be.

Quote
The "penalty" you keep referring to happens with random IO. These graphs are "Random FS Blocks" not sequential, and are not bandwidth.

No, it happens with all IO. For any given operation, using RAIDZ will be slower than any non-RAIDZ equivalent. Again, this is because it treats all of the disks within a given group as a single disk. I'm not interested in comparing RAIDZ with 2 RAIDZ groups; I'm interested in comparing RAIDZ with the non-RAIDZ equivalents (mirroring and striping).

Which is to say that RAIDZ will also be slower in terms of raw bandwidth than ZFS with mirroring or striping.

Quote
In #42 I asked you to distinguish clearly your performance "penalty" claim. I asked if scaled out IOPS is important for a photographer, vs bandwidth. I asked you if you likewise disqualify RAID 5 which actually has a similar random IO scaling issue for the same basic reason as RAIDZ. You deleted all of those questions and refused to answer them.

That's because RAID 5 isn't RAIDZ and thus discussion of it (RAID 5) is not relevant in a discussion thread on RAIDZ except to serve as a distraction and a destination for more rat-holing. As for what's important to the photographer, that's likely a subjective thing. The most pertinent message on that topic has been the desire for external disks to perform at the same speed as internal disks. That means both IOPS and bandwidth. What people are interested in is connecting an external disk to their system and getting performance similar to their internal disks. For me, that says they want eSATA, USB 3.0 or better.

Quote
You deleted all of those questions and refused to answer them.

Let he who has not deleted questions and thus refused to answer them cast the first stone.

Quote
The entire frigging blog post is expressly about *RANDOM* IO. It's at the top of the columns you copy-pasted (but conveniently forgot to paste the word random - nice touch by the way).

Ah, that exclusion wasn't deliberate, but having said that, you won't believe it.

But to take this a step further, unless you're streaming video content or working with really large files (100s of megabytes in size) then random IO is the correct model to use for determining whether something is good or bad. It's definitely the right model to use for any disk that has images on it.

Quote
That's because you refuse to distinguish between random and sequential IO. And because you refuse to acknowledge the distinction between IOPS and bandwidth.

See above.

Quote
And every time I've asked you to acknowledge these distinctions you delete the questions and requests, and don't respond. Yet you keep writing bunk that's simply not true and not relevant at all to a photographer even if it were true.

ok, so now you've resorted to insults. At this point I'm not going to engage any further after this post, because I don't want to be involved in a discussion that is clearly going downhill, and I don't want to encourage you further.
Title: Re: Drive capacity/fullness vs. performance
Post by: John.Murray on May 01, 2012, 02:03:09 am
http://youtu.be/tDacjrSCeq4

:D
Title: Re: Drive capacity/fullness vs. performance
Post by: dreed on May 01, 2012, 04:22:55 am
http://youtu.be/tDacjrSCeq4

:D

Yes, I've seen that before :D
Title: Re: Drive capacity/fullness vs. performance
Post by: Farmer on May 01, 2012, 05:29:46 am
Awesome, John :-)
Title: Re: Drive capacity/fullness vs. performance
Post by: chrismurphy on May 04, 2012, 06:11:45 pm
That's not contrary to what I'm saying, which is that RAIDZ is slower than non-RAIDZ.
How many times do I have to repeat this?

Insanity is doing the same thing, over and over again, but expecting different results.
— Albert Einstein

Everyone is entitled to his own opinion, but not his own facts.
— Daniel Moynihan

You could do with supplying some facts. It'd probably make a difference.

Which is to say that RAIDZ will also be slower in terms of raw bandwidth than ZFS with mirroring or striping.

Roch writes in the blog comments himself:
For Streaming purposes RAID-Z performance will be on par with anything else.

Further Jeff Bonwick, who was ZFS engineering lead, writes in the blog comments and also on the zfs-discuss@opensolaris list (http://comments.gmane.org/gmane.os.solaris.opensolaris.zfs/665):
It's only when you're doing small random reads that the difference between RAID-Z and mirroring becomes significant. For such workloads, everything that Roch said is spot on.

Aberdeen says the ideal configuration for its petabyte storage system is RAID-Z2 in each of the eight JBODs, with the octet pooled together.[1] Somehow I doubt they'd choose their ideal configuration to be "terrible for performance (http://www.luminous-landscape.com/forum/index.php?topic=65613.msg522579#msg522579)."

That's because RAID 5 isn't RAIDZ and thus discussion of it (RAID 5) is not relevant in a discussion thread on RAIDZ except to serve as a distraction and a destination for more rat-holing.

The original poster mentioned his RAID 5 array in post #1. In my post #25, I compared RAIDZ and RAID 5, making RAID 5 the original context in which RAIDZ was brought up.

RAID 5 is block level striping with single parity. RAIDZ is dynamic striping with single parity. RAIDZ is routinely compared to RAID 5, including the earlier Aberdeen example, and the authors of this book on ZFS:

In ZFS you can also create redundant vdevs similar to RAID-5, called RAID-Z.
Quote from Solaris 10 ZFS Essentials (http://my.safaribooksonline.com/book/operating-systems-and-server-administration/solaris/9780137049639/managing-storage-pools/ch02lev1sec2#X2ludGVybmFsX0ZsYXNoUmVhZGVyP3htbGlkPTk3ODAxMzcwNDk2MzkvY2gwMmxldjFzZWM0)

But to take this a step further, unless you're streaming video content or working with really large files (100s of megabytes in size) then random IO is the correct model to use for determining whether something is good or bad. It's definitely the right model to use for any disk that has images on it.

whiskey tango foxtrot, this must be a joke. Who wants to let me in on it?

Now, after 60 posts, you redefine what "small random IO" is in order to make your past ridiculous arguments somehow correct? Fine, but in so doing, you're essentially saying 99% of all data on the planet is small random IO. Which is equally absurd.

After reading dreed's post, I feel like I've just watched an episode of Charlie the Unicorn Goes to Candy Mountain. (http://www.youtube.com/watch?v=JPONTneuaF4)

All Tom's Hardware random IO tests use a 4KB block size. I found one test (http://www.tomshardware.com/reviews/intel-ssd-310-msata-mini-solid-state-drive,2854-8.html) with 4K and 512KB. No bigger. The default NTFS and JHFS+ allocation block (or cluster) size is 4K. For RAID 0 and 1, the stripe size is typically 128KB.

And the fragmentation for some randomly chosen files on my computer:

  7MB DNG            1 extent
 25MB DNG            1 extent
 62MB 8bpc TIFF      3 extents
126MB 16bpc TIFF     3 extents
788MB 16bpc TIFF     9 extents

Considering the worst-case scenario, if Apple and Microsoft weren't doing any optimizations to avoid fragmentation, that 7MB DNG could be made of 14,336 fragments, i.e. one per 512-byte disk sector, with zero contiguous sectors. Our file systems are, fortunately, engineered to be more efficient than this.
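
A quick sanity check on that worst-case arithmetic (assuming 512-byte sectors and 1 MB = 1024 x 1024 bytes):
---
# Worst case: one fragment per 512-byte sector, versus the actual extent
# counts listed above.
SECTOR_BYTES = 512
files = [(7, "DNG", 1), (25, "DNG", 1), (62, "8bpc TIFF", 3),
         (126, "16bpc TIFF", 3), (788, "16bpc TIFF", 9)]
for size_mb, kind, extents in files:
    worst_case = size_mb * 1024 * 1024 // SECTOR_BYTES
    print(f"{size_mb:>4} MB {kind:<10}: {worst_case:>9,} fragments worst case, "
          f"{extents} extent(s) in practice")
---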

For none of the example files, or any of the other dozens I've looked at, does random IO apply. The IO is overwhelmingly sequential, because disk seeks are not appreciable, let alone dominant.

Quote
The most pertinent message on that topic has been the desire for external disks to perform at the same speed as internal disks. That will mean both IOPS and bandwidth. What they are interested in is connecting an internal disk to their system and getting similar performance to their internal disks. For me, that says they want eSATA, USB 3.0 or better.

People get similar performance with images on FW800 external disks, compared to SSD. Ian called the difference "negligible". That bodes well for storing Raw/DNG on GigE.
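
As a rough illustration of why the difference feels negligible (the sustained throughputs below are assumed, typical figures, not benchmarks), the time to pull a single 25 MB raw file differs by only a fraction of a second across these links:
---
# Approximate time to read one 25 MB raw/DNG file over different links.
# Sustained throughputs are assumed/typical, ignoring latency and overhead.
file_mb = 25
links = {
    "FireWire 800 (~80 MB/s practical)":       80,
    "Gigabit Ethernet (~110 MB/s practical)": 110,
    "2012-era SATA SSD (~250 MB/s)":          250,
}
for name, mb_per_s in links.items():
    print(f"{name}: {file_mb / mb_per_s * 1000:.0f} ms")
---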

Working through already-rendered images quickly in Library pulls JPEG previews from whatever storage you have the catalog (lrcat and lrdata) on. And Develop previews come from the Camera Raw cache. Both of those should go on your fastest disk, regardless of whether it's internal or external.

Quote
ok, so now you've resorted to insults.

Bunk means nonsense. It was an observation.

I'm not thinking you're a moron. I'm thinking that you think everyone else is a moron. Or, again, it's a joke. Either way, in my opinion, it is you who are being insulting.


[1] What Does One Petabyte Of Storage (And $500K) Look Like? (http://www.tomshardware.com/picturestory/582-petarack-petabyte-sas.html), page 6.