|
|
|
date: Tue, 30 Jun 2009 12:52:02 -0500,
group: microsoft.public.sqlserver.tools
back
RE: Performance problem
> beta_lockinfo is available on my website:
> http://www.sommarskog.se/sqlutil/beta_lockinfo.html
Excellent! Just a comment on your comments on the performance impact of the
tool itself. What I found is that if you can get to the server in time to
collect data when the problem is already there, you are probably okay to do
all these joins on the DMVs. But often (1) you don't know when a performance
problem may appear and (2) when it occurs, it may disappear before you get
there, so you may have to collect data continuously or at a fairly high
frequency (say once every few seconds). In that case, a monitoring tool that
does heavy joins every time it collects perf data can easily become a top
load contributor, kind of defeating the purpose.
One solution is to collect perf data lightly (in terms of its perf impact)
but regularly, and shift heavy processing to the reporting/query time. And if
you store the data somewher else, you can afford to run fancy joins to look
for the info you want.
Linchi
"Erland Sommarskog" wrote:
> Linchi Shea (LinchiShea@discussions.microsoft.com) writes:
> > If you take quick snapshots of sysprocesses when the CPU usage is high,
> > and find the incremental cpu usage between the consecutive snapshots for
> > each spid for the same session, you can find the sessions that are the
> > largest CPU consumers at the time, and that can usually help you
> > determine what SQL statements are contributing to the the high CPU
> > usage.
>
> And if you instead use beta_lockinfo, which includes a CPU column, you
> can also see what that process is up to.
>
> beta_lockinfo is available on my website:
> http://www.sommarskog.se/sqlutil/beta_lockinfo.html
>
>
> --
> Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
>
> Links for SQL Server Books Online:
> SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
> SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
> SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
>
>
date: Tue, 30 Jun 2009 19:30:01 -0700
author: Linchi Shea
Re: Performance problem
> In my experience, performance problems that are shorter than your tolerance
> for querying the DMVs (depending on how hard you are querying them), are not
> really problems.
That really depends on the nature of the app in question. Frankly,
performance problems that are long running (or relatively long running) are
easy to troubleshoot, precisely because they can easily caught.
The tough performance problems are those that come and go, but they are
really problems. For instance, I have been called to find out why there are
10-20 second gaps in certain appl flow (like why don't we see trades coming
in between 10:31:10 and 10:32:25?).
If you have to query a bunch of DMVs and do fancy joins once every few
seconds, you'd quickly find your monitoring query consuming a lot of CPU
cycles.
Linchi
"Aaron Bertrand [SQL Server MVP]" wrote:
> > all these joins on the DMVs. But often (1) you don't know when a performance
> > problem may appear and (2) when it occurs, it may disappear before you get
> > there, so you may have to collect data continuously or at a fairly high
> > frequency (say once every few seconds).
>
> In my experience, performance problems that are shorter than your tolerance
> for querying the DMVs (depending on how hard you are querying them), are not
> really problems. The Resource Governor works in the same way... It only
> checks for long-running queries so often; anything shorter is ignored.
>
> > One solution is to collect perf data lightly (in terms of its perf impact)
> > but regularly, and shift heavy processing to the reporting/query time. And if
> > you store the data somewher else, you can afford to run fancy joins to look
> > for the info you want.
>
> I do agree with this. The key is to collect the bare minimum on a routine
> schedule. An extension would be to add code that dynamically determines
> that a problem is creeping up and at that point take the more detailed info
> for a few cycles.
>
>
date: Wed, 1 Jul 2009 10:16:00 -0700
author: Linchi Shea
Re: Performance problem
The values in that column is cumulative. And cumulative values usually don't
mean anything in terms of their magnitude unless you have a feel for teh time
span in which the value is cumulated. You need to take a difference to find
how many CPU milliseconds are consumed by that spid in that time period. In
addition, when you do take that diff or delta, make sure it is still the same
session because spid can be reused and doing a cpu diff for the same spid but
different sessions is meaningless.
Linchi
"Mark Goldin" wrote:
> How do I properly read a column "cpu"? I see some very big numbers there.
>
> Thanks
>
> "Erland Sommarskog" wrote in message
> news:Xns9C3AF35BB3B90Yazorman@127.0.0.1...
> > Linchi Shea (LinchiShea@discussions.microsoft.com) writes:
> >> If you take quick snapshots of sysprocesses when the CPU usage is high,
> >> and find the incremental cpu usage between the consecutive snapshots for
> >> each spid for the same session, you can find the sessions that are the
> >> largest CPU consumers at the time, and that can usually help you
> >> determine what SQL statements are contributing to the the high CPU
> >> usage.
> >
> > And if you instead use beta_lockinfo, which includes a CPU column, you
> > can also see what that process is up to.
> >
> > beta_lockinfo is available on my website:
> > http://www.sommarskog.se/sqlutil/beta_lockinfo.html
> >
> >
> > --
> > Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
> >
> > Links for SQL Server Books Online:
> > SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
> > SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
> > SQL 2000:
> > http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
> >
>
>
date: Wed, 1 Jul 2009 10:25:01 -0700
author: Linchi Shea
Re: Performance problem
I understand about the same spid but I am not sure about:
<how many CPU milliseconds are consumed by that spid in that time period
What is 'that time period'?
Thanks
"Linchi Shea" wrote in message
news:BB36606B-4C07-48E3-AF26-44FD28C16A48@microsoft.com...
> The values in that column is cumulative. And cumulative values usually
> don't
> mean anything in terms of their magnitude unless you have a feel for teh
> time
> span in which the value is cumulated. You need to take a difference to
> find
> how many CPU milliseconds are consumed by that spid in that time period.
> In
> addition, when you do take that diff or delta, make sure it is still the
> same
> session because spid can be reused and doing a cpu diff for the same spid
> but
> different sessions is meaningless.
>
> Linchi
>
> "Mark Goldin" wrote:
>
>> How do I properly read a column "cpu"? I see some very big numbers there.
>>
>> Thanks
>>
>> "Erland Sommarskog" wrote in message
>> news:Xns9C3AF35BB3B90Yazorman@127.0.0.1...
>> > Linchi Shea (LinchiShea@discussions.microsoft.com) writes:
>> >> If you take quick snapshots of sysprocesses when the CPU usage is
>> >> high,
>> >> and find the incremental cpu usage between the consecutive snapshots
>> >> for
>> >> each spid for the same session, you can find the sessions that are the
>> >> largest CPU consumers at the time, and that can usually help you
>> >> determine what SQL statements are contributing to the the high CPU
>> >> usage.
>> >
>> > And if you instead use beta_lockinfo, which includes a CPU column, you
>> > can also see what that process is up to.
>> >
>> > beta_lockinfo is available on my website:
>> > http://www.sommarskog.se/sqlutil/beta_lockinfo.html
>> >
>> >
>> > --
>> > Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
>> >
>> > Links for SQL Server Books Online:
>> > SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
>> > SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
>> > SQL 2000:
>> > http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
>> >
>>
>>
date: Wed, 1 Jul 2009 13:25:23 -0500
author: Mark Goldin L
Re: Performance problem
Whatever time period you make it to be. So, if you are sampling these DMVs
once every 10 seconds, it would be the incremental for that 10 seconds. You
need to determine how often you sample them. Ultimately, it depends on the
nature of the app and what you want to use the sampled data for.
For instance, if you are collecting the perf data for long term trending
and/or capacity planning purposes, you probably cna live with sample
intervals that are in minutes.
Linchi
"Mark Goldin" wrote:
> I understand about the same spid but I am not sure about:
> <how many CPU milliseconds are consumed by that spid in that time period
> What is 'that time period'?
>
> Thanks
>
> "Linchi Shea" wrote in message
> news:BB36606B-4C07-48E3-AF26-44FD28C16A48@microsoft.com...
> > The values in that column is cumulative. And cumulative values usually
> > don't
> > mean anything in terms of their magnitude unless you have a feel for teh
> > time
> > span in which the value is cumulated. You need to take a difference to
> > find
> > how many CPU milliseconds are consumed by that spid in that time period.
> > In
> > addition, when you do take that diff or delta, make sure it is still the
> > same
> > session because spid can be reused and doing a cpu diff for the same spid
> > but
> > different sessions is meaningless.
> >
> > Linchi
> >
> > "Mark Goldin" wrote:
> >
> >> How do I properly read a column "cpu"? I see some very big numbers there.
> >>
> >> Thanks
> >>
> >> "Erland Sommarskog" wrote in message
> >> news:Xns9C3AF35BB3B90Yazorman@127.0.0.1...
> >> > Linchi Shea (LinchiShea@discussions.microsoft.com) writes:
> >> >> If you take quick snapshots of sysprocesses when the CPU usage is
> >> >> high,
> >> >> and find the incremental cpu usage between the consecutive snapshots
> >> >> for
> >> >> each spid for the same session, you can find the sessions that are the
> >> >> largest CPU consumers at the time, and that can usually help you
> >> >> determine what SQL statements are contributing to the the high CPU
> >> >> usage.
> >> >
> >> > And if you instead use beta_lockinfo, which includes a CPU column, you
> >> > can also see what that process is up to.
> >> >
> >> > beta_lockinfo is available on my website:
> >> > http://www.sommarskog.se/sqlutil/beta_lockinfo.html
> >> >
> >> >
> >> > --
> >> > Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
> >> >
> >> > Links for SQL Server Books Online:
> >> > SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
> >> > SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
> >> > SQL 2000:
> >> > http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
> >> >
> >>
> >>
>
>
date: Wed, 1 Jul 2009 12:03:01 -0700
author: Linchi Shea
RE: Performance problem
Linchi Shea (LinchiShea@discussions.microsoft.com) writes:
> Excellent! Just a comment on your comments on the performance impact of
> the tool itself. What I found is that if you can get to the server in
> time to collect data when the problem is already there, you are probably
> okay to do all these joins on the DMVs. But often (1) you don't know
> when a performance problem may appear and (2) when it occurs, it may
> disappear before you get there, so you may have to collect data
> continuously or at a fairly high frequency (say once every few seconds).
> In that case, a monitoring tool that does heavy joins every time it
> collects perf data can easily become a top load contributor, kind of
> defeating the purpose.
Agreed. Beta_lockinfo is nothing you should run every ten seconds or so.
It is not intended to be a proactive monitoring tool, but one you use
when you have trouble.
If you have issues like you describe where things clog up for 10 seconds,
you are better off with a trace.
As for the DMV joins, my experience is that the most expensive part is
accessing sys.dm_tran_locks. There is a GROUP BY, but it is really the
DMV itself which is slow.
--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
date: Wed, 01 Jul 2009 15:17:23 -0700
author: Erland Sommarskog
RE: Performance problem
> If you have issues like you describe where things clog up for 10 seconds,
> you are better off with a trace.
Well for this type of issues in this particular environment, SQL tracing is
not a good solution because (1) we don't know what we are looking for, (2)
the system is very busy with a lot of traffic, and (3) the system uses a lot
of scalar function calls. Doing a trace on a specific spid is fine on this
system. But doing a system-wide trace can be too expensive. And the key
problem is that SQL tracing just doesn't give quick enough feedback.
I found that taking snapshots of sysprocesses plus DBCC INPUTBUFFER and
fn_get_sql() for SQL2000 or a cross join with sys.dm_exec_sql_text for
SQL2005/2008 for only those spids that are consuming cpu/io is both simple
and good enough. And this can give us fast feedback to catch rogue spids.
Linchi
"Erland Sommarskog" wrote:
> Linchi Shea (LinchiShea@discussions.microsoft.com) writes:
> > Excellent! Just a comment on your comments on the performance impact of
> > the tool itself. What I found is that if you can get to the server in
> > time to collect data when the problem is already there, you are probably
> > okay to do all these joins on the DMVs. But often (1) you don't know
> > when a performance problem may appear and (2) when it occurs, it may
> > disappear before you get there, so you may have to collect data
> > continuously or at a fairly high frequency (say once every few seconds).
> > In that case, a monitoring tool that does heavy joins every time it
> > collects perf data can easily become a top load contributor, kind of
> > defeating the purpose.
>
> Agreed. Beta_lockinfo is nothing you should run every ten seconds or so.
> It is not intended to be a proactive monitoring tool, but one you use
> when you have trouble.
>
> If you have issues like you describe where things clog up for 10 seconds,
> you are better off with a trace.
>
> As for the DMV joins, my experience is that the most expensive part is
> accessing sys.dm_tran_locks. There is a GROUP BY, but it is really the
> DMV itself which is slow.
>
>
>
>
> --
> Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
>
> Links for SQL Server Books Online:
> SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
> SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx
> SQL 2000: http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
>
>
date: Wed, 1 Jul 2009 23:20:03 -0700
author: Linchi Shea
|
|