This post originated from an RSS feed registered with .NET Buzz
by Udi Dahan.
Original Post: "Data, Report." - Picard
Feed Title: Udi Dahan - The Software Simplist
Feed URL: http://feeds.feedburner.com/UdiDahan-TheSoftwareSimplist
Feed Description: I am a software simplist. I make this beast of architecting, analysing, designing, developing, testing, managing, deploying software systems simple.
This blog is about how I do it.
A thought occured to me earlier today. Whenever I develop systems, I get into reporting just like any other task. I architect for it on the level of making sure that I have all the data, and defining the necessary views etc, but I guess that I've never really dwelled on it. Any time I talk to other developers about reporting, we usually get into the thing that takes the data and generates the report, and never ( not hardly ever - never as in never ) truly consider the source of the data.
When I talk about reporting I mean showing data, usually aggregated and grouped in some manner, in a read-only "format" called a report. Often, data is pulled from all over the DB. Almost always, the report takes a while to "run" or "complete".
This is definitely a well known phenomena, and I for one have known about OLAP from quite some time now, but I've never really integrated it with my architectural view of systems.
The fact of the matter is, reporting should not be done on the same DB as the system is running on. In other words, reporting should not be done on OLTP systems. Rather, the data should be "shipped" to a separate "reporting DB" and the reporting code should use it as its source of data. The structure of the reporting DB need not ( and 99% probably should not ) be the same as the regular system DB. It should be optimized for reporting - just as the regular system DB is optimized for transaction processing.
For some background on the differences between OLTP and OLAP, with a bit of reporting in the background, take a look at this sample chapter from the MS SQL 2000 resource kit: http://www.microsoft.com/mspress/books/sampchap/4939a.asp
There really is no reason to run reporting on the same DB ( unless 100% timeliness is a requirement, and even then there are alternatives ) and quite a lot of good reasons to split them up.
Agree ? Disagree ? Is my epiphany dead on, or should it just be shot dead ? Let me know.