Ureader.com  
Microsoft software help and Community
   home   |   control panel login   |   archive   |  
 
SQL
ce
clients
clustering
connect
datamining
datawarehouse
dts
fulltext
jdbcdriver
msde
mseq
newusers
notificationsvcs
odbc
olap
programming
replication
reportingsvcs
security
securitytools
server
setup
sqlxml.viewmapper
tools
xml
  
 
date: Sat, 03 Nov 2007 10:15:06 -0700,    group: microsoft.public.sqlserver.datawarehouse        back       


Re: Relational Views as Transformation Method in Data Warehouse   
On Sep 26, 4:14 pm, Marco Russo  wrote:
> On Sep 20, 9:04 pm, DanUp <dupton.@.DecisionLab.Dot.Net.> wrote:
>
>
>
>
>
> > Open Question:
> > A recent post-reply, one expert (Marco Russo) indicated that he doesn't like
> > to use relational views as a quasi-ETL method, but rather prefers real
> > data-moving ETL to de-couple OLTP data from DW Data.  I do agree with the
> > importance of the classic de-coupling approach, but would like to hear more,
> > from Marco or others, about whether relational views, per se, should always
> > be avoided as a component to ETL with OLAP downstream.
>
> > Specific Scenario:  
> > For a SQL 2005 Data Mart / SSAS BI Application, suppose that, rather than
> > heavily transforming our large (~500g) operational data (which needs some,
> > but not tons of transformation) during it's transfer to a Data Mart box,
> > suppose we bulk-copy it (for speed of job completion), and then use
> > relational views to transform it (cleanse, filter, perform calcs), in
> > preparation for relational views' usage in SSAS '05?  
>
> > Questions:  
> > (1) Does anybody see value in that we now have more flexibility in terms of
> > change-management during project lifecycle, to manipulate actual table data
> > without disturbing (if we're careful) the views used for SSAS?
> > (2) How else might this hurt us, either now or later?
> > (3) By the way, will Proactive Caching work fully against relational views?
>
> > Thanks.
>
> > Daniel Upton
>
> I read other replies to this thread and I agree that this is more an
> implementation issue than a functional issue.
> That said, having consistent guidelines helps a lot if you want to
> build a maintainable solution. I don't remember what was my post-reply
> (if you can give me a link...) - I read your scenario and your
> solution works. You leverages on SQL engine to transformate data. If
> you would use SSIS to do data move and/or transformation, then it
> would be better having all transformations inside SSIS package.
> However, still today I find myself writing JOINs into source query
> (that acts on a copy of OLTP data, as in your case) just because
> otherwise SSIS would consume too much memory. As we said, it's an
> implementation issue.
> This topic is very large. In a perfect world, all ETL transformations
> are described through consistent meta-data that allows an easy
> analysis of field dependencies between source and destination data. In
> the real world, I still haven't seen a perfect solution like that :-)
>
> I'm very interesting in this kind of discussions - if there is someone
> interested, we could open a separate discussion on that (eventually
> moving out from the general newsgroup).
>
> Marco Russohttp://www.sqlbi.euhttp://sqlblog.com/blogs/marco_russo- Hide quoted text -
>
> - Show quoted text -

Hi Daniel/All,
this is a broad topic.....but let me put my 2cents in....

With faster machines and cheaper database licenses (relative to what
you can do) we moved to using the RDBMS as the staging area over files
in the 1997 time frame.

Of course, over the last 10 years machines have become faster
still.....so nowadays we see a number of benefits to heavily using
views in the ETL processing.

Specifically the track that we have now gone down is to create an
Excel workbook into which virtually ALL ETL transformations that are
needed in a typical mid-tier company can be coded.  We typically need
to have very, very little ETL outside of the definitions in the
workbook....we use a utility to generate views out of the workbook
that are placed over both the source and the target and our ETL
software pumps data from source to target with some specific
processing defined for each program.

The advantage we see in views is the complete disconnection of the
physical database from the ETL subsystem. This means the DBA can make
changes to the physical database and be certain that nothing on top of
that will be damaged. This is a very handy thing to be able to do.

We also load all the definitions in the spreadsheet into a small
reporting database and expose them via Report Services reports......

using these tools we can write ETL subsystems about 10x faster
(development time not run times) than writing them in other tools
including SSIS. So we think this approach has signicant merit and we
now sell the tools we developed.

We see the future of ETL Subsystem devlopment as defining all ETL
directives into Excel workbooks and then generating whatever needs to
be generated to create a runnable ETL subsystem. By doing this the
development of the ETL subsystm becomes encoding the mappings into the
workbook and testing.......there will be no faster way....ever...of
building ETL subsystems.....

Best Regards

Peter Nolan
www.peternolan.com
date: Sat, 03 Nov 2007 10:15:06 -0700   author:   Peter Nolan

Google
 
Web ureader.com


    COPYRIGHT 2007, YARDI TECHNOLOGY LIMITED, ALL RIGHT RESERVE  |   contact us