Ureader.com  
Microsoft software help and Community
date: Tue, 13 Mar 2007 09:43:47 -0700,    group: microsoft.public.exchange.design


IOPS and megacycles calculations   
Hello


I am trying to capacity plan for a migration of 30,000 users to Exchange 
2003. Currently the users are based on a Sun iPlanet environment using only 
POP and IMAP connections. 

In the Exchange environment they will initially use POP/IMAP and MAPI; over 
time we forecast a MAXIMUM of about 20,000 users configured for MAPI. 
In practice we cannot see more than 15,000 ever being moved to MAPI.

Currently we also have a four-node Exchange 2003 cluster (3 active, 1 passive) 
hosting 20,000 MAPI users with a back-end SAN environment.  

Our clustered environment is hosted at our Data Centre and all our offices 
across the city connect back to us via fibre.  All users connect via MAPI and 
there are a few using RPC over HTTPS.  There is also a front-end NLB cluster 
of 3 servers, which is mainly used for OWA and OMA connections. 

I am using the Microsoft white paper “Optimizing Storage for Microsoft 
Exchange Server 2003” to work out IOPS and Megacycles per mailbox. 

I have monitored the current cluster for a 2-week period and I seem to have 
no consistent 2-hour busy period (such as Monday mornings); it seems to 
change from day to day and hour to hour. My users are in Spain and 
working habits here are not the same!! 

I intend to collect data for about one month, then analyse it and use 
statistical averages etc. to come up with a figure. I will work out an 
IOPS and Megacycles figure for a 2-hour busy period every day, then analyse 
that set of figures to come up with an average IOPS and Megacycles value 
based on the data collected every day.

My intention is to come up with an IOPS and Megacycles figure for the current 
20,000 users, then use this calculation to size hardware for the 30,000 users 
I intend to migrate. 

I wanted to check on two things:

1) Can the IOPS and Megacycles figures from the 20,000 users be used for 
planning for the 30,000 users?

2) MORE importantly, looking at tonnes of documentation on the web, I am 
seeing some people suggesting to monitor logical disk transfers and others 
suggesting physical disk transfers in Event viewer. Which do I use? This is 
a four-node cluster on HP hardware; the back end is a StorageTek SAN.

Thanks

T
date: Tue, 13 Mar 2007 09:43:47 -0700   author:   TonyP

Re: IOPS and megacycles calculations   
1.  In some cases, I have found it necessary to take a statistical approach 
to sizing.  In this case, you would collect data points over a period of 
time (weeks to months) at a lower sampling frequency (5 minute interval or 
so).  Once the data is collected and saved to CSV format, you can work with 
the numbers in your favorite statistics package to create a normalized bell 
curve, then find the mean and the standard deviation.  From that point, you 
would add enough standard deviations to the mean to reach the desired level 
of accuracy and use that figure for your sizing.  I would go the extra mile 
and do this.  If you use the average alone then about 50% of the time you 
will be undersized, ensuring an undesirable end user experience and the 
failure of your project.  If the data really is normally distributed, adding 
3 standard deviations to the mean meets your performance objective about 
99.87% of the time, and a little over 4 standard deviations reaches 5 nines.
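
A minimal Python sketch of the step above, assuming the perfmon log has 
already been exported to CSV. The column name transfers_per_sec, the sample 
values, and the default k=3 are illustrative assumptions, not figures from 
this thread.

```python
import csv
import io
import statistics


def read_column(csv_text, column):
    """Pull one numeric column out of CSV text, skipping blank cells."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [float(row[column]) for row in reader if row[column].strip()]


def sizing_target(samples, k=3.0):
    """Return mean + k * sample standard deviation."""
    return statistics.mean(samples) + k * statistics.stdev(samples)


# Hypothetical 5-minute samples; a real month-long log has thousands of rows.
CSV_TEXT = """time,transfers_per_sec
09:00,1200
09:05,1500
09:10,1350
09:15,1800
09:20,1650
"""

samples = read_column(CSV_TEXT, "transfers_per_sec")
print("mean IOPS:", statistics.mean(samples))
print("sizing target (k=3):", round(sizing_target(samples, k=3), 1))
```

The target is only as good as the assumption that the samples are roughly 
normal, so it is worth plotting a histogram before trusting it.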

2.  From a disk perspective, the important numbers are:

reads/sec
writes/sec
sec/write
sec/read
user count
active user count

I use the PhysicalDisk counters to obtain these numbers.  These counters 
are collected using perfmon. The sum of reads and writes per second (or 
disk transfers per second) tells you the total IOPS while the read and write 
numbers are used to generate the read/write ratio.  Seconds per read and 
write tell you about IO latency.  You want latency to average less than 20ms 
with no spikes lasting more than a few seconds over 50ms.
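
As a rough illustration of how those counters combine, here is a Python 
sketch. The field names and sample values are made up, and counting samples 
over 50ms is only a proxy: a log taken at 5 minute intervals cannot show 
whether a spike lasted "a few seconds".

```python
from dataclasses import dataclass


@dataclass
class DiskSample:
    reads_per_sec: float
    writes_per_sec: float
    sec_per_read: float   # seconds, as perfmon reports it
    sec_per_write: float  # seconds, as perfmon reports it


def summarize(samples):
    """Derive total IOPS, read fraction and latency figures from the samples."""
    n = len(samples)
    reads = sum(s.reads_per_sec for s in samples)
    writes = sum(s.writes_per_sec for s in samples)
    return {
        "peak_iops": max(s.reads_per_sec + s.writes_per_sec for s in samples),
        "read_fraction": reads / (reads + writes),
        "avg_read_ms": 1000 * sum(s.sec_per_read for s in samples) / n,
        "avg_write_ms": 1000 * sum(s.sec_per_write for s in samples) / n,
        "samples_over_50ms": sum(
            1 for s in samples if max(s.sec_per_read, s.sec_per_write) > 0.050
        ),
    }


# Hypothetical samples, not measurements from the cluster discussed here.
SAMPLES = [
    DiskSample(900, 300, 0.012, 0.008),
    DiskSample(1200, 400, 0.018, 0.010),
    DiskSample(1500, 500, 0.055, 0.020),
]

summary = summarize(SAMPLES)
for name, value in summary.items():
    print(name, value)
```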

The MSExchangeIS counters for user count and active user count can be used to 
derive your concurrency rate.  Again, don't use averages or you'll be doomed 
to failure.  Unless there is a clear case (an organization such as a 
hospital that works in clear shifts, or an airline where a large percentage 
of employees are in the air and don't have access to mail), you should err 
toward the side of caution.  I use 100% unless there is a clear case to do 
otherwise.
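
If you do want a measured concurrency rate, a sketch like the following 
takes the peak rather than the average. The (user count, active user count) 
pairs are invented for illustration.

```python
def concurrency_rate(user_count, active_user_count):
    """Fraction of users that were active in one sample."""
    if user_count <= 0:
        raise ValueError("user_count must be positive")
    return active_user_count / user_count


# Hypothetical (user count, active user count) samples.
PAIRS = [(20000, 9000), (20000, 14000), (20000, 12500)]

peak_rate = max(concurrency_rate(users, active) for users, active in PAIRS)
print("peak concurrency:", peak_rate)
```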

3.  You'll need to take the read numbers, plus the write numbers (with the 
write penalty for your RAID type factored in) and determine the IOPS that 
your storage subsystem must be capable of supporting (the math is in 
"Optimizing Storage for Exchange Server 2003", so I won't repeat it here). 
Your primary concern will be adequate spindle count to support the 
performance requirement.  Do be aware of the response time requirement. 
Many folks make the mistake of using the maximum IOPS a spindle can support 
when determining spindle count.  This number is not the same as the number 
of IOPS a spindle can support at a 20ms response time.
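
A sketch of the usual form of this calculation in Python. It is not copied 
from the white paper: the penalty table holds the commonly quoted values, 
and the 130 IOPS per spindle figure is a placeholder for whatever your 
drives actually deliver at a 20ms response time.

```python
import math

# Commonly quoted back-end I/Os per host write; confirm for your array.
RAID_WRITE_PENALTY = {"raid0": 1, "raid1": 2, "raid10": 2, "raid5": 4, "raid6": 6}


def backend_iops(reads_per_sec, writes_per_sec, raid_level):
    """Host reads plus host writes multiplied by the RAID write penalty."""
    return reads_per_sec + writes_per_sec * RAID_WRITE_PENALTY[raid_level]


def spindles_needed(reads_per_sec, writes_per_sec, raid_level, iops_per_spindle):
    """Spindle count for the load, rounded up.

    iops_per_spindle should be the rate a drive sustains at the target
    response time, not its maximum rate.
    """
    required = backend_iops(reads_per_sec, writes_per_sec, raid_level)
    return math.ceil(required / iops_per_spindle)


# Hypothetical load: 1500 reads/sec and 500 writes/sec.
for level in ("raid10", "raid5"):
    print(level, spindles_needed(1500, 500, level, iops_per_spindle=130))
```

The same host load needs noticeably more spindles on RAID 5 than on 
RAID 10 because each write costs more back-end I/O.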

4.  Change delta will be a significant factor with a sizable POP/IMAP user 
base.  Typically these users pull the mail from the server to the client 
each time they check their mail.  If users check their mail each day, this 
translates to a 100% change delta without deleted items retention or a 12.5% 
change delta with 7 days deleted items retention.  As you migrate to MAPI 
clients, the size of the databases will grow as more mail is retained on the 
server, but the change deltas will shrink.  Change deltas affect the space 
and time needed for incremental disk or tape backup and the space for snapshots.
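
One way to read those two figures, offered as an assumption rather than a 
formula from the post: if every user downloads and removes one day's mail 
each day, the store holds that day plus the retained deleted days, so the 
daily change is 1 / (1 + retention days). A Python sketch:

```python
def daily_change_delta(retention_days):
    """Fraction of the store that changes per day under full daily turnover."""
    if retention_days < 0:
        raise ValueError("retention_days cannot be negative")
    return 1.0 / (1 + retention_days)


for days in (0, 7):
    print(days, "days retention:", daily_change_delta(days))
```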

John


date: Tue, 13 Mar 2007 15:52:17 -0700   author:   John Fullbright [MVP] fjohn@donotspamnetappdotcom

Re: IOPS and megacycles calculations   
OK, a little more complicated than I had envisaged, but like you said it is 
worth the extra mile.

So how do I obtain a normalized bell curve, and then find the mean and 
average? I found this link: 

http://support.microsoft.com/kb/213930

Also, I don't understand change delta.

Tony



date: Tue, 13 Mar 2007 17:27:00 -0700   author:   TonyP

Re: IOPS and megacycles calculations   
An example of bell curves that are skewed:

http://pirate.shu.edu/~wachsmut/Teaching/MATH1101/Descriptives/box.html

If the data is skewed you would likely apply a non-linear transform to 
correct for the skew.  It would be worth picking up a statistics package if 
you go this route.
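
As one possible example of such a transform, the Python sketch below works 
in log space and converts the result back. The log transform is my choice 
for illustration; it only suits strictly positive, right-skewed data, and 
the sample values are invented.

```python
import math
import statistics


def log_space_sizing_target(samples, k=3.0):
    """Mean + k standard deviations computed on log(samples), mapped back."""
    logs = [math.log(x) for x in samples]  # raises ValueError if any x <= 0
    return math.exp(statistics.mean(logs) + k * statistics.stdev(logs))


# Hypothetical right-skewed IOPS samples.
skewed = [100, 1000, 10000]
print("geometric mean:", log_space_sizing_target(skewed, k=0))
print("one deviation up:", log_space_sizing_target(skewed, k=1))
```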


date: Tue, 13 Mar 2007 18:56:47 -0700   author:   John Fullbright [MVP] fjohn@donotspamnetappdotcom
