date: Tue, 13 Mar 2007 09:43:47 -0700,
group: microsoft.public.exchange.design
IOPS and megacycles calculations
Hello
I am trying to capacity plan for a migration of 30,000 users to Exchange
2003. Currently the users are based on a Sun iPlanet environment using only
POP and IMAP connections.
In the Exchange environment they will initially use POP/IMAP and MAPI; over
time we forecast a MAXIMUM of about 20,000 users configured for MAPI.
Realistically, we cannot see more than 15,000 ever being moved to MAPI.
Currently we also have a four-node Exchange 2003 cluster (3 active, 1 passive)
hosting 20,000 MAPI users with a back-end SAN environment.
Our clustered environment is hosted at our Data Centre and all our offices
across the city connect back to us via fibre. All users connect via MAPI and
there are a few using RPC over HTTPS. There is also a front-end NLB cluster
of 3 servers, but this is mainly used for OWA and OMA connections.
I am using the Microsoft white paper “Optimizing Storage for Microsoft
Exchange Server 2003” to work out IOPS and Megacycles per mailbox.
I have monitored the current cluster for a 2-week period and I seem to have
no consistent 2-hour busy period (such as Monday mornings); it seems to
change from day to day and hour to hour. My users are in Spain and
working habits here are not the same!
I intend to collect data for about one month, then analyse it and use
statistical averages etc. to come up with a figure. I will work out the
IOPS and megacycles for the 2-hour busy period of each day, then analyse
that set of figures to come up with an average IOPS and megacycles across
all the days collected.
My intention is to come up with IOPS and megacycles figures for the current
20,000 users, then use them to size hardware for the 30,000 users I
intend to migrate.
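As a rough sketch of that scaling (Python, not taken from the white paper): divide the measured totals by the current mailbox count, then multiply by the target count. The function names and sample figures are made up for illustration. The megacycles calculation is the usual CPU utilisation x clock speed x processor count approximation. The linear scaling assumes the new users behave like the current MAPI users, which may not hold for POP/IMAP clients.

```python
def per_mailbox(total, mailboxes):
    """Average load per mailbox for a measured total (IOPS or megacycles)."""
    if mailboxes <= 0:
        raise ValueError("mailboxes must be positive")
    return total / mailboxes


def megacycles_used(cpu_percent, cpu_mhz, processors):
    """Approximate megacycles consumed: CPU utilisation x clock speed x CPUs."""
    return (cpu_percent / 100.0) * cpu_mhz * processors


def project(total, current_mailboxes, target_mailboxes):
    """Scale a measured total linearly to a different mailbox count."""
    return per_mailbox(total, current_mailboxes) * target_mailboxes


if __name__ == "__main__":
    # Hypothetical busy-period measurements for the existing 20,000 users.
    measured_iops = 12000.0
    measured_mc = megacycles_used(cpu_percent=40, cpu_mhz=3000, processors=12)
    print("IOPS per mailbox:", per_mailbox(measured_iops, 20000))
    print("Megacycles per mailbox:", per_mailbox(measured_mc, 20000))
    print("Projected IOPS for 30,000:", project(measured_iops, 20000, 30000))
```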
I wanted to check on two things:
1) Can the IOPS and megacycles figures from the 20,000 users be used to
plan for the 30,000 users?
2) MORE importantly, looking at tonnes of documentation on the web, I am
seeing some people suggest monitoring logical disk transfers and others
suggest physical disk transfers in Event Viewer. Which do I use? The
four-node cluster is on HP hardware and the back end is a StorageTek SAN.
Thanks
T
date: Tue, 13 Mar 2007 09:43:47 -0700
author: TonyP
Re: IOPS and megacycles calculations
1. In some cases, I have found it necessary to take a statistical approach
to sizing. In this case, you would collect data points over a period of
time (weeks to months) at a lower sampling frequency (5 minute interval or
so). Once the data is collected and saved to CSV format, you can work with
the numbers in your favorite statistics package to fit a normal (bell)
curve, then find the mean and the standard deviation. From that point, you
would add enough standard deviations to the mean to reach the desired level
of confidence and use that figure for your sizing. I would go the extra mile
and do this. If you use the average alone, then roughly 50% of the time you
will be undersized, ensuring an undesirable end user experience and the
failure of your project. If the data really is normally distributed, adding
3 standard deviations to the mean covers about 99.87% of samples, and
adding 4 covers about 99.997%. With 5 or 6 standard deviations you are
beyond five nines.
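A minimal Python sketch of the mean-plus-k-standard-deviations calculation described above. The column name, the sample values, and the helper names are hypothetical, and the coverage figure is only meaningful if the samples are close to normally distributed.

```python
import csv
import io
from statistics import NormalDist, mean, stdev


def sizing_target(samples, k):
    """Return the mean of the samples plus k sample standard deviations."""
    return mean(samples) + k * stdev(samples)


def normal_coverage(k):
    """Fraction of a normal distribution lying below mean + k*sigma."""
    return NormalDist().cdf(k)


def load_column(csv_text, column):
    """Read one numeric column from CSV text exported from the counter log."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [float(row[column]) for row in reader if row[column].strip()]


if __name__ == "__main__":
    # Hypothetical five-minute samples of total disk transfers/sec.
    csv_text = (
        "timestamp,iops\n"
        "09:00,1200\n09:05,1350\n09:10,1500\n09:15,1275\n09:20,1425\n"
    )
    iops = load_column(csv_text, "iops")
    for k in (3, 4, 5):
        print(f"k={k}: size for {sizing_target(iops, k):.0f} IOPS, "
              f"normal coverage {normal_coverage(k):.7f}")
```

In practice you would read the exported counter log from a file rather than an inline string, and check a histogram of the samples before trusting the coverage figure.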
2. From a disk perspective, the important numbers are:
reads/sec
writes/sec
sec/write
sec/read
user count
active user count
I use the physical disk counters to obtain these numbers. These counters
are collected using perfmon. The sum of reads and writes per second (or
transfers/sec) tells you the total IOPS, while the read and write
numbers are used to generate the read/write ratio. Seconds per read and
write tell you about I/O latency. You want latency to average less than
20 ms, with no spikes over 50 ms lasting more than a few seconds.
The MSExchangeIS counters for user count and active user count can be used to
derive your concurrency rate. Again, don't use averages or you'll be doomed
to failure. Unless there is a clear case (an organization such as a
hospital that works in clear shifts, or an airline where a large percentage
of employees are in the air and don't have access to mail), you should err
toward the side of caution. I use 100% unless there is a clear case to do
otherwise.
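To make the counter arithmetic concrete, here is a small Python sketch. The function names are made up, and the 5-second limit standing in for "a few seconds" is an assumption. The spike check only makes sense on a log sampled every second or so, not on 5-minute samples.

```python
def summarize_disk(reads_per_sec, writes_per_sec, sec_per_read, sec_per_write):
    """Derive total IOPS, read fraction and latencies (ms) from the counters."""
    total_iops = reads_per_sec + writes_per_sec
    return {
        "total_iops": total_iops,
        "read_fraction": reads_per_sec / total_iops if total_iops else 0.0,
        "read_latency_ms": sec_per_read * 1000,
        "write_latency_ms": sec_per_write * 1000,
    }


def latency_ok(samples_ms, interval_s, avg_limit_ms=20.0,
               spike_limit_ms=50.0, max_spike_s=5.0):
    """Check the average and the length of any run above the spike limit."""
    if not samples_ms:
        return True
    if sum(samples_ms) / len(samples_ms) >= avg_limit_ms:
        return False
    run = 0  # consecutive samples above the spike limit
    for value in samples_ms:
        run = run + 1 if value > spike_limit_ms else 0
        if run * interval_s > max_spike_s:
            return False
    return True


if __name__ == "__main__":
    # Hypothetical counter values for one sample.
    print(summarize_disk(300, 100, 0.012, 0.008))
    print(latency_ok([10, 12, 60, 11, 8, 9], interval_s=1))
```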
3. You'll need to take the read numbers, plus the write numbers (with the
write penalty for your RAID type factored in) and determine the IOPS that
your storage subsystem must be capable of supporting (the math is in
"Optimizing Storage for Exchange Server 2003", so I won't repeat it here).
Your primary concern will be adequate spindle count to support the
performance requirement. Do be aware of the response time requirement.
Many folks make the mistake of using the maximum IOPS a spindle can support
when determining spindle count. This number is not the same as the number
of IOPS a spindle can support at a 20 ms response time.
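For illustration only, the usual form of that calculation looks something like the Python sketch below. The write penalties are the commonly quoted values (2 for RAID 1/10, 4 for RAID 5) and may not match the white paper exactly. The per-spindle figure is hypothetical and should come from your storage vendor at the target response time, not from the drive's maximum.

```python
import math

# Commonly quoted back-end I/Os generated per host write, by RAID level.
RAID_WRITE_PENALTY = {"raid0": 1, "raid1": 2, "raid10": 2, "raid5": 4}


def backend_iops(reads_per_sec, writes_per_sec, raid_level):
    """Reads plus writes multiplied by the RAID write penalty."""
    return reads_per_sec + writes_per_sec * RAID_WRITE_PENALTY[raid_level]


def spindles_needed(required_backend_iops, iops_per_spindle_at_target_latency):
    """Round up to the number of spindles that meets the IOPS requirement."""
    return math.ceil(required_backend_iops / iops_per_spindle_at_target_latency)


if __name__ == "__main__":
    # Hypothetical host workload: 3,000 reads/sec and 1,000 writes/sec.
    for level in ("raid10", "raid5"):
        need = backend_iops(3000, 1000, level)
        # 130 IOPS per spindle at 20 ms is an assumed figure, not a spec.
        print(level, need, "IOPS ->", spindles_needed(need, 130), "spindles")
```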
4. Change delta will be a significant factor with a sizable POP/IMAP user
base. Typically these users pull the mail from the server to the client
each time they check their mail. If users check their mail each day, this
translates to a 100% change delta without deleted items retention or a 12.5%
change delta with 7 days deleted items retention. As you migrate to MAPI
clients, the size of the databases will grow as more mail is retained on the
server; however, the change deltas will shrink. Change deltas impact space
and time for incremental disk or tape backup, or space for snapshots.
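One way to read the 100% and 12.5% figures, sketched in Python: if every user pulls and deletes all mail daily, the server holds one day of current mail plus however many days of retained deleted items, so one day's change is 1 / (1 + retention days) of what is stored. This is an interpretation of the numbers above rather than a formula from the white paper, and the function names and the 400 GB database are hypothetical.

```python
def daily_change_delta(retention_days):
    """Fraction of stored data that changes per day under full daily turnover."""
    if retention_days < 0:
        raise ValueError("retention_days cannot be negative")
    return 1 / (1 + retention_days)


def incremental_size_gb(database_gb, retention_days):
    """Rough size of one day's incremental backup or snapshot delta."""
    return database_gb * daily_change_delta(retention_days)


if __name__ == "__main__":
    print(daily_change_delta(0))        # no deleted items retention
    print(daily_change_delta(7))        # 7 days deleted items retention
    print(incremental_size_gb(400, 7))  # hypothetical 400 GB database
```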
John
date: Tue, 13 Mar 2007 15:52:17 -0700
author: John Fullbright [MVP] fjohn@donotspamnetappdotcom
Re: IOPS and megacycles calculations
OK, a little more complicated than I had envisaged, but like you said it is
worth the extra mile.
So how do I obtain a normalized bell curve, and then find the mean and
standard deviation? I found this link:
http://support.microsoft.com/kb/213930
Also, I don't understand change delta.
Tony
date: Tue, 13 Mar 2007 17:27:00 -0700
author: TonyP
Re: IOPS and megacycles calculations
An example of bell curves that are skewed:
http://pirate.shu.edu/~wachsmut/Teaching/MATH1101/Descriptives/box.html
If the data is skewed you would likely apply a non-linear transform to
correct for the skew. It would be worth picking up a statistics package if
you go this route.
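As one example of such a transform, here is a Python sketch that takes logarithms before computing the mean and standard deviation, then converts the result back. A log transform is only one common choice for right-skewed data and is an assumption here, as are the function name and the sample values.

```python
import math
from statistics import mean, stdev


def sizing_target_log(samples, k):
    """Mean + k standard deviations computed on log(samples), then inverted."""
    logs = [math.log(x) for x in samples if x > 0]  # log needs positive values
    return math.exp(mean(logs) + k * stdev(logs))


if __name__ == "__main__":
    # Hypothetical right-skewed IOPS samples: mostly quiet, a few busy spikes.
    iops = [400, 420, 450, 480, 500, 520, 600, 900, 1500, 2400]
    for k in (0, 2, 3):
        print(f"k={k}: {sizing_target_log(iops, k):.0f} IOPS")
```

With k=0 this returns the geometric mean of the samples; whether the transformed data is close enough to normal still needs checking in your statistics package.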
date: Tue, 13 Mar 2007 18:56:47 -0700
author: John Fullbright [MVP] fjohn@donotspamnetappdotcom