
Author Topic: Drive capacity/fullness vs. performance  (Read 37015 times)

jonathan.lipkin

  • Full Member
  • ***
  • Offline Offline
  • Posts: 158
Drive capacity/fullness vs. performance
« on: April 08, 2012, 08:30:08 pm »

About a decade ago, I took a backup server course, and was told that for optimal performance drives should be no more than 80% full. Does that still hold today? I have an eSATA RAID 5 with 4x1T drives, for a total capacity of 3T. It's currently at 2.86, meaning it's about 90% full. Should I migrate some data to another drive?

Couldn't find a more descriptive word than 'fullness'.
Logged

Ken Bennett

  • Sr. Member
  • ****
  • Offline Offline
  • Posts: 1797
    • http://www.kenbennettphoto.com
Logged
Equipment: a camera and some lenses. https://www.instagram.com/wakeforestphoto/

jonathan.lipkin

  • Full Member
  • ***
  • Offline Offline
  • Posts: 158
Re: Drive capacity/fullness vs. performance
« Reply #2 on: April 09, 2012, 02:22:38 pm »

Fascinating. Thanks.
Logged

alain

  • Sr. Member
  • ****
  • Offline Offline
  • Posts: 465
Re: Drive capacity/fullness vs. performance
« Reply #3 on: April 09, 2012, 05:50:31 pm »

Yes.

http://macperformanceguide.com/Storage-BiggerIsBetter.html

The link isn't about the 80% full rule; it shows that newer, "bigger" disks are faster. This is mainly a data density issue (fewer rotations for the same amount of data).

The Seagate ST3000DM001 3TB drive seems to be the fastest at the moment, thanks to its three 1TB platters.
Logged

jonathan.lipkin

  • Full Member
  • ***
  • Offline Offline
  • Posts: 158
Re: Drive capacity/fullness vs. performance
« Reply #4 on: April 09, 2012, 07:29:22 pm »

There is a section of the article which describes how drive performance degrades as the drive becomes filled with data (to quote the article "greater capacity means higher performance over the drive (for the same amount of storage used)."), though not quite as clearly as the previous article:

http://macperformanceguide.com/Storage-WhyYouNeedMoreThanYouNeed.html

Bigger drives are faster for a particular amount of data, because you are not filling the drives as much.

Quote:

When setting up a striped RAID with speed as the goal, you should get large enough hard disks to “waste” as much as half of the available storage so that you can maintain peak speed for your data...A good rule of thumb is to make the total volume size be at least 50% larger than the amount of data you expect to store (you can backup and redo it if necessary).
...
This is why using the first half of the drive (or perhaps 2/3) offers much higher performance than filling a hard drive (of any capacity) to the 70%+ level.

/quote

As far as I can tell, this is because
Logged

Schewe

  • Sr. Member
  • ****
  • Offline Offline
  • Posts: 6229
    • http://www.schewephoto.com
Re: Drive capacity/fullness vs. performance
« Reply #5 on: April 09, 2012, 08:04:07 pm »

Couldn't find a more descriptive word than 'fullness'.

HDs start writing on the inner portions (sectors) of a drive. Closer to the middle. The distance between sectors is much closer. The heads that read/write the data move in a much smaller physical space/distance. As you fill the drive, the sectors at the outer positions of the platter are used. That makes the head travel a lot longer to read/write the sectors. When you get past 60%-75% or so capacity, subsequent data is only written/read at the outer sectors which are slower to read/write.

Doesn't matter a whole lot whether or not it's a single, multi-platter drive or a RAID system...the inner sectors of the drives are always faster to read/write.

Also note that sector/block errors only get worse over time. The fuller the drive, the more work the heads have to do to get the data.

You will ALWAYS be better off using a smaller subset of the total drive capacity because it's faster with fewer errors...in the case of HDs, bigger is better and faster than smaller, cramped HDs. Time to grow upwards...
Logged

dreed

  • Sr. Member
  • ****
  • Offline Offline
  • Posts: 1715
Re: Drive capacity/fullness vs. performance
« Reply #6 on: April 16, 2012, 06:19:16 pm »

About a decade ago, I took a backup server course, and was told that for optimal performance drives should be no more than 80% full. Does that still hold today? I have an eSATA RAID 5 with 4x1T drives, for a total capacity of 3T. It's currently at 2.86, meaning it's about 90% full. Should I migrate some data to another drive?

The URL above is for single drive performance, which is going to be quite different to what you see with RAID 5.

Nonetheless, the issue with drive performance over 80% has little to do with the drive mechanism and more to do with filesystem theory - the more data that is stored in a filesystem, the harder the system has to work to find new space, etc. Around 80% is the tipping point. At 90%, you should be planning to upgrade storage because by the time you actually do it, you'll likely be a lot closer to 100%.
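
A minimal sketch of checking a volume against the thresholds mentioned above, assuming Python 3 is available. The names `classify` and `fullness_report` are made up for illustration, and the 80% and 90% cutoffs are just this thread's rules of thumb, not hard limits of any filesystem.

```python
import shutil

WARN_AT = 0.80   # rule-of-thumb tipping point quoted in this thread
PLAN_AT = 0.90   # point at which to start planning a storage upgrade


def classify(used_fraction, warn_at=WARN_AT, plan_at=PLAN_AT):
    """Map a used fraction (0.0 to 1.0) to a short status string."""
    if used_fraction >= plan_at:
        return "plan upgrade"
    if used_fraction >= warn_at:
        return "past tipping point"
    return "ok"


def fullness_report(path="/"):
    """Return (used fraction, status) for the volume that holds `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    used_fraction = usage.used / usage.total
    return used_fraction, classify(used_fraction)


if __name__ == "__main__":
    fraction, status = fullness_report("/")
    print(f"{fraction:.1%} used: {status}")
```

Pointing `fullness_report` at the mount point of the array (rather than `/`) would report on that volume instead.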
Logged

Tom Frerichs

  • Guest
Re: Drive capacity/fullness vs. performance
« Reply #7 on: April 17, 2012, 10:57:34 am »

... which is going to be quite different to what you see with RAID 5.

That's not quite true.  Regardless of the RAID level in use, RAID 0, RAID 1, RAID 5, or RAID 10, underlying drive performance will impact system performance. For example, in RAID 5, total performance will be limited by the slowest drive in the array. One advantage of RAID, however, is that it is commonly implemented with a hardware controller which offers substantial performance improvements through buffering, asynchronous reading/writing, etc.

The earlier point about lower-numbered cylinders (closer to the center of the drive) having faster performance is also true. Most UNIXes, if given half a chance, will locate the swap partition, which is used to augment physical memory with disk storage, as the first partition on the drive. I don't own a Mac, but as the underlying OS is a UNIX I'm willing to bet that this applies there, too.

Finally, all file systems use some sort of indexing to both locate files on the drives and to handle free space. As file systems fill up--with a large number of "small files" versus a few huge files--the indexing gets more involved. It also gets harder to allocate contiguous free space, which is worsened when files are added and deleted. NTFS, used by Windows, is particularly prone to this latter issue, which is why Windows has a defragmentation routine. Other file systems--UFS, ZFS, Reiser, etc.--suffer from this issue less.

Installing a solid state drive for your scratch disk and possibly your catalog--remembering to back up your catalog frequently--is the best way to improve performance. After all, most of your disk activity deals with reading previews and filtering on metadata rather than actually reading RAW files and writing output files.

Your mileage may vary.   ;D

Tom Frerichs
Logged

jonathan.lipkin

  • Full Member
  • ***
  • Offline Offline
  • Posts: 158
Re: Drive capacity/fullness vs. performance
« Reply #8 on: April 17, 2012, 11:11:39 am »



Installing a solid state drive for your scratch disk and possibly your catalog--remembering to back up your catalog frequently--is the best way to improve performance. After all, most of your disk activity deals with reading previews and filtering on metadata rather than actually reading RAW files and writing output files.


I've thought of that as well. On my next system (whenever Apple gets around to refreshing the Mac Pro drive, if that ever even happens) I'm thinking of installing an internal striped RAID and keeping the LR catalog there.

Interestingly, the article mentioned above notes that there is no performance penalty when filling an SSD. The performance vs. fullness chart for a traditional HD starts to fall around 50% of capacity. The SSD chart is flat - performance is unchanged by fullness.
Logged

Tom Frerichs

  • Guest
Re: Drive capacity/fullness vs. performance
« Reply #9 on: April 17, 2012, 01:40:32 pm »

I've thought of that as well. On my next system (whenever Apple gets around to refreshing the Mac Pro drive, if that ever even happens) I'm thinking of installing an internal striped RAID and keeping the LR catalog there.

Interestingly, the article mentioned above notes that there is no performance penalty when filling an SSD. The performance vs. fullness chart for a traditional HD starts to fall around 50% of capacity. The SSD chart is flat - performance is unchanged by fullness.

The reason SSD doesn't have that kind of penalty is that there is no "head seek time."

In a physical hard drive, the read/write head is on an arm that moves from the center to the outside (and of course back again -- grin).  Think old-fashioned phonograph. Instead of a long spiral track, like you had on a phonograph record, the data is recorded in concentric circles--sort of like the rings of a tree or the layers of an onion. Each one of these circles is a "cylinder."

When you read or write to a hard drive, you have to move the read/write head to the correct cylinder, and that takes time. With an SSD, all you do is flip some bits to access a different chunk of memory...no physical movement at all.

If you have a choice, don't put your catalog on a RAID 5 array. For redundancy, RAID 1 (mirrored drives) is a faster alternative. RAID 5, because of the way it writes parity blocks, can have write performance problems.  RAID 1, or the closely allied RAID 10, which is RAID 1+0, is more expensive because you lose 1/2 of your total drive space versus one drive out of the array for RAID 5.
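
The space trade-off described above can be sketched as simple arithmetic. This is an illustrative Python helper, not a tool from any RAID vendor; the function name `usable_capacity` is made up, and it assumes equal-sized drives and ignores filesystem overhead.

```python
def usable_capacity(level, drives, size_tb):
    """Usable capacity in TB for an array of equal-sized drives.

    level: one of "raid0", "raid1", "raid5", "raid10".
    """
    if level == "raid0":
        return drives * size_tb            # striping only, no redundancy
    if level == "raid1":
        if drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return size_tb                     # every drive holds the same data
    if level == "raid5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * size_tb      # one drive's worth goes to parity
    if level == "raid10":
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, 4 or more")
        return drives // 2 * size_tb       # half the drives are mirrors
    raise ValueError(f"unknown RAID level: {level}")


if __name__ == "__main__":
    # The original poster's array: four 1 TB drives.
    for level in ("raid0", "raid5", "raid10"):
        print(level, usable_capacity(level, 4, 1.0), "TB usable")
```

For the four 1T drives in the original question this gives 3T for RAID 5, matching the figure quoted there, versus 2T for RAID 10.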

So, RAID 5 for storing your RAW and output files, RAID 1 or even a bare drive with good backups or SSD for your catalog.


BTW, I deal with arrays with 60 TB capacities...but not for photographs.  I don't take anywhere near that number of pictures.  ;)

Tom Frerichs
Logged

Chris Kern

  • Sr. Member
  • ****
  • Offline Offline
  • Posts: 2034
    • Chris Kern's Eponymous Website
Re: Drive capacity/fullness vs. performance
« Reply #10 on: April 17, 2012, 08:10:32 pm »

Quote from: Tom Frerichs
The reason SSD doesn't have that kind of penalty is that there is no "head seek time."

No rotational latency, either.  In my experience, that's a more serious performance drain than seek time.
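
To put a number on rotational latency: on average the platter has to turn half a revolution before the wanted sector passes under the head. A small Python sketch of that arithmetic follows; the spindle speeds listed are common examples chosen for illustration, not drives mentioned in this thread.

```python
def avg_rotational_latency_ms(rpm):
    """Average rotational latency in milliseconds: half of one revolution."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2


if __name__ == "__main__":
    for rpm in (5400, 7200, 15000):
        print(f"{rpm:>5} rpm: {avg_rotational_latency_ms(rpm):.2f} ms")
```

At 7200 rpm that works out to roughly 4.2 ms per access on average, before any seek time is added.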

Quote from: Tom Frerichs
Finally, all file systems use some sort of indexing to both locate files on the drives and to handle free space. As file systems fill up--with a large number of "small files" versus a few huge files--the indexing gets more involved. It also gets harder to allocate contiguous free space, which is worsened when files are added and deleted. NTFS, used by Windows, is particularly prone to this latter issue, which is why Windows has a defragmentation routine. Other file systems--UFS, ZFS, Reiser, etc.--suffer from this issue less.

I'd really like to see broader implementation of ZFS.  It's so easy to manage and the performance with cheap drives—and without a hardware RAID controller—is more than adequate for even a high-end desktop system.  And I couldn't agree more about NTFS.  Microsoft has made enormous improvements in its OS releases in recent years, but NTFS is still quite primitive in some respects.  If I recall correctly, we geezers stopped needing to defragment UFS back around the 7th Edition.

Chris

dreed

  • Sr. Member
  • ****
  • Offline Offline
  • Posts: 1715
Re: Drive capacity/fullness vs. performance
« Reply #11 on: April 18, 2012, 01:24:57 pm »

Quote
Nonetheless, the issue with drive performance over 80% has little to do with the drive mechanism and more to do with filesystem theory - the more data that is stored in a filesystem, the harder the system has to work to find new space, etc. Around 80% is the tipping point. At 90%, you should be planning to upgrade storage because by the time you actually do it, you'll likely be a lot closer to 100%.
...
Finally, all file systems use some sort of indexing to both locate files on the drives and to handle free space. As file systems fill up--with a large number of "small files" versus a few huge files--the indexing gets more involved. It also gets harder to allocate contiguous free space, which is worsened when files are added and deleted. NTFS, used by Windows, is particularly prone to this latter issue, which is why Windows has a defragmentation routine. Other file systems--UFS, ZFS, Reiser, etc.--suffer from this issue less.

I don't know where you heard this myth (from vendors?) about other filesystems suffering less from this issue because it simply isn't true. Most other filesystems (UFS, etc) prevent you from completely filling the filesystem for a number of reasons. The performance drop off being one of them.

SSD is good as long as you don't write to it very often. So in light of that, it is good as a data store where you don't delete things. Using SSD for the working store of your catalogue is a bad idea: although it will start out quick, performance will degrade.
Logged

Tom Frerichs

  • Guest
Re: Drive capacity/fullness vs. performance
« Reply #12 on: April 19, 2012, 12:28:41 am »

This topic has become too technical already, and I don't want to spend the time to look up the resources regarding file systems. However, one point does need to be corrected.

SSD is good as long as you don't write to it very often. So in light of that, it is good as a data store where you don't delete things. Using SSD for the working store of your catalogue is a bad idea: although it will start out quick, performance will degrade.

I just decommissioned four servers running PostgreSQL databases. These servers were not as high-performance as many of the readers of this forum may be running right now.

Each database received between 20 and 22 million updates/inserts every 24 hours. When PostgreSQL updates a record, it doesn't write the data over the same location on the drive. Instead it writes a new record in a different location and marks the old one as dead and available for later harvesting and reuse. I think you must agree that this represents a much higher level of disk read/write activity than even the most rabid LR user would cause.
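
For a sense of scale, the daily figures above can be converted to a sustained per-second rate. This is only back-of-the-envelope Python arithmetic; it assumes the load was spread evenly across the day, which the post does not state.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(updates_per_day):
    """Average updates per second, assuming an even spread over 24 hours."""
    return updates_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    for daily in (20_000_000, 22_000_000):
        print(f"{daily:,} per day is about {per_second(daily):.0f} per second")
```

That is roughly 230 to 255 updates or inserts every second, around the clock, per database.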

Those servers had been running for over two years without failure and without any performance degradation.

And the database files were on inexpensive SSD drives.

I'll stand by what I said. Other than a slightly worse MTTF on SSD drives--which is why I recommended being very scrupulous about backing up the catalog--there is no penalty storing the catalog and scratch files on an SSD and significant performance gains in doing so.
« Last Edit: April 19, 2012, 12:46:36 am by Tom Frerichs »
Logged

dreed

  • Sr. Member
  • ****
  • Offline Offline
  • Posts: 1715
Re: Drive capacity/fullness vs. performance
« Reply #13 on: April 19, 2012, 03:48:20 am »

Logged

Rhossydd

  • Sr. Member
  • ****
  • Offline Offline
  • Posts: 3369
    • http://www.paulholman.com
Re: Drive capacity/fullness vs. performance
« Reply #14 on: April 19, 2012, 04:26:26 am »

Using SSD for the working store of your catalogue is a bad idea: although it will start out quick, performance will degrade.
From what I've read although SSDs get slower with use, they never dip below normal HDD performance levels. There's just so much more speed there to begin with.
Logged

Walter Schulz

  • Full Member
  • ***
  • Offline Offline
  • Posts: 105
Re: Drive capacity/fullness vs. performance
« Reply #15 on: April 19, 2012, 03:34:29 pm »

From what I've read although SSDs get slower with use, they never dip below normal HDD performance levels. There's just so much more speed there to begin with.

c't (a German IT magazine) thinks differently. It has a long-running endurance stress test under way. A "Solid 3" SSD showed serious degradation after 18 TBytes written and dropped below 20 percent of its initial write performance after ca. 22 TBytes. An "FM 25" kept its maximum performance until 152 TBytes were written and dropped below 10 percent after that.
Issue 3/2012.

I suppose truth is somewhere between "quick" and "never".

Ciao, Walter
« Last Edit: April 19, 2012, 03:37:09 pm by Walter Schulz »
Logged

Rhossydd

  • Sr. Member
  • ****
  • Offline Offline
  • Posts: 3369
    • http://www.paulholman.com
Re: Drive capacity/fullness vs. performance
« Reply #16 on: April 19, 2012, 04:03:29 pm »

c't (a German IT magazine) thinks differently. It has a long-running endurance stress test under way. A "Solid 3" SSD showed serious degradation after 18 TBytes written and dropped below 20 percent of its initial write performance after ca. 22 TBytes. An "FM 25" kept its maximum performance until 152 TBytes were written and dropped below 10 percent after that.
Sorry, but without some sort of credible detail and explanation that doesn't make a lot of sense.
Logged

Walter Schulz

  • Full Member
  • ***
  • Offline Offline
  • Posts: 105
Re: Drive capacity/fullness vs. performance
« Reply #17 on: April 19, 2012, 05:07:38 pm »

Sorry, but without some sort of credible detail and explanation that doesn't make a lot of sense.

I agree, never trust data without background information about procedures and so on.
But I cannot copy the article for obvious reasons. I have to contact the author and ask if there is more information available. Tests are still running AFAIK.

Ciao, Walter
Logged

mephisto42

  • Newbie
  • *
  • Offline Offline
  • Posts: 3
Re: Drive capacity/fullness vs. performance
« Reply #18 on: April 20, 2012, 02:33:57 am »

Hi,

Sorry, but without some sort of credible detail and explanation that doesn't make a lot of sense.

I'm the one who wrote that article in c't. What kind of detail would you be interested in? From everything we've seen with SSDs so far, the behaviour depends heavily on the controller chip used. Some controllers (especially the Sandforce family) use compression algorithms, so performance also depends on the type of data (e.g. plain text is highly compressible and therefore written extremely fast, while JPEGs will be written substantially slower). As the tests are still running, our preliminary conclusion is something like this: whether and how much an SSD degrades over time depends on a) the controller chip, b) the amount of data written and sometimes c) the type of data written. There is no rule of thumb for all the controllers out there, but there are SSDs available where the degradation happens so late that you shouldn't worry about it. In our tests, we write at maximum speed 24 hours a day, 7 days a week. And with the better SSDs it still takes months for performance to degrade. I'm not sure what your everyday use case is, but probably not this. ;-) On the other hand, there are also SSDs which I wouldn't use even if they were free.
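
The point about data type can be illustrated with a rough Python sketch. It uses the standard `zlib` module purely as a stand-in: the compression a drive controller actually performs is not documented here and is certainly different. Random bytes are used to approximate already-compressed data such as JPEGs, which is an assumption made for the sake of the example.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower = more compressible)."""
    return len(zlib.compress(data)) / len(data)


# Highly repetitive plain text, standing in for "plain text" in the post.
text = b"The quick brown fox jumps over the lazy dog. " * 20000

# Random bytes, standing in for already-compressed data such as JPEGs.
random_like = os.urandom(len(text))

if __name__ == "__main__":
    print(f"text:        {compression_ratio(text):.3f}")
    print(f"random-like: {compression_ratio(random_like):.3f}")
```

The text shrinks to a small fraction of its size while the random data does not shrink at all, so a compressing controller has far fewer bytes to physically write in the first case.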

Best Regards

Benjamin Benz
c't magazine
http://www.heise.de/ct
Logged

Rhossydd

  • Sr. Member
  • ****
  • Offline Offline
  • Posts: 3369
    • http://www.paulholman.com
Re: Drive capacity/fullness vs. performance
« Reply #19 on: April 20, 2012, 03:47:36 am »

I'm the one who wrote that article in c't.
Thanks for taking the trouble to join in here and clarify things.
What kind of detail would you be interested in?
Well, any details would do, really. I doubt the quote above (A "Solid 3" SSD showed serious degradation after 18 TBytes written and dropped below 20 percent of its initial write performance after ca. 22 TBytes. An "FM 25" kept its maximum performance until 152 TBytes were written and dropped below 10 percent after that.)
I'm sure you'd agree that in isolation that 'quote' doesn't really tell us anything useful.

If you work back up the thread the fundamental question is: Will an SSD's performance ever degrade below a normally plattered HDD ? If so when ?
(in real world use, after the sort of number of writes a photographer will make during the working life of the drive, not some 24hr a day high intensity benchmark)
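
One rough way to frame that question is to ask how long it would take to reach the write volumes quoted from c't. The Python sketch below does that arithmetic; the 20 GB per day figure is a purely hypothetical workload picked for illustration, not a number from this thread, and decimal units (1 TB = 1000 GB) are assumed.

```python
GB_PER_TB = 1000  # decimal units, as drive capacities are usually quoted


def years_to_write(total_tb, gb_per_day):
    """Years needed to write `total_tb` terabytes at `gb_per_day` gigabytes a day."""
    days = total_tb * GB_PER_TB / gb_per_day
    return days / 365


ASSUMED_GB_PER_DAY = 20  # hypothetical daily write volume, adjust to taste

if __name__ == "__main__":
    for label, tb in (("18 TBytes", 18), ("152 TBytes", 152)):
        years = years_to_write(tb, ASSUMED_GB_PER_DAY)
        print(f"{label} at {ASSUMED_GB_PER_DAY} GB/day: about {years:.1f} years")
```

Under that assumption the 18 TByte mark is a couple of years away and the 152 TByte mark about two decades, so the answer depends heavily on which drive and how much is really written per day.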
Logged