Data Warehousing | DWH

Complete portal for DWH and BI

Star Schema in Data Warehouse

Posted on | February 2, 2010 | No Comments

Star schema contain a single or mutiple fact table surrounded by dimension table.The dimension table in star schema will be denormalized.Most of the datawarehouse are designed according to star schema.Accordingly star schema as following features

1. Single(Simple Star Schema) or multiple(Complex Star Schema) fact table
2. There are fewer number of dimension table hence lesser joines and queries execute faster
3. Considered a special case of snowflake schema.
4. Dimensional tables are generally in 2nd normal form while fact table are in third normal form

grep command in unix with example

Posted on | January 15, 2010 | No Comments

Grep command is very useful in UNIX.Grep searches for similar pattern on file and return the result.Example of grep command is
grep ‘word’ filename – Will return the line which contain word.
grep -i ‘nocase’ filename – This option will make grep case insensitive.
grep – r ‘sear’ /etc/ – This will search in the whole directory recursively
grep “boo” filename – This will display line which contain exact match of the word boo.
egrep -w ‘word1|word2′ filename – This will search for both the word1 and word2 in the filename.
grep – c word1 filename – Will return the number of lines where word1 is found.
grep -v word1 filename – Will return the line which doesn’t contain the word1.
grep word1 * – Will search for occurence of word1 in whole directory
grep ^a.ad fruitlist.txt – Print line that contain the letter followed by a and in last contain letter ad.

File Compression and archival in UNIX

Posted on | January 15, 2010 | No Comments

There are lot of times when you need to compress and archive files in UNIX in data warehousing project.The reason for this is files in DWH are huge.In this article I will explain how to compress files and than archive them.Files compression is not useful for small files.There are few command and utilities you can use to compress files.They are as follows

1. compress
2. gzip
3. tar(with c option) – Arcchives files without compression
Archive may contain one or more files.Generally archival is done for files which are not used much.
Read more

Snowflake Schema in data warehouse

Posted on | January 11, 2010 | No Comments

A Snowflake schema has a centralized fact table surrounded by dimension tables.The dimension table in snow flake schema are normalized.There will be single fact table joined to multiple dimension table which in turn may be joined to other dimension tables.So snowflake schema has following features

1. Large number of dimension tables.
2. Since dimension table are normalized the size of dimension table is small
3. For query purpose we need to retrieve data from multiple dimension table resulting in more joins and slower performance.

ETL tools in data warehousing?

Posted on | January 8, 2010 | 3 Comments

Once you have decided to build a data warehouse for the company.The challenges lies in deciding which ETL tool to use.There are plenty of ETL tools like Pentaho Data Integration,Abnitio.The most used ETL tool is Informatica.ETL stands for Extract,Transform and Load.Although most of ETL tool can perform all the three function.Generally one or more process of extraction,transformation and load may be performed by some external software.Lets us understand all the three process one by one
Read more

What is Slowly changing Dimension?

Posted on | January 3, 2010 | No Comments

A slowly changing dimension is dimension table in which attribute changes with time.Suppose we have a customer who has bank account in Chicago.Now he changes his location and moves to new york.A dimension table for this customer will typically contain the following information

NameAgeCityGender
Jack25ChicagoMale

Now the challenge is to how to store this changing information.In a datawarehouse this is generally done in three ways.Accordingly such type of dimension are called type 1,type 2 and type 3

Read more

What is Dimension Table?

Posted on | January 3, 2010 | No Comments

Dimension table contain attribute which describe fact table.In my previous article I explained fact is nothing but  measures to value business like sales.Dimension on the other hand are used to analyze fact.A example of dimension table will be

Region
CountryStateCity

Read more

Fact Table in Datawarehouse

Posted on | December 31, 2009 | No Comments

Fact table is the centre table in snow flake scheme.A fact is measure of business for example sales.Generally facts are numbers.Dimension on other help us to analyze facts.Example of Dimension will be Region.Apart from these measures the fact table contain foreign key for dimension table.Fact table is generally large in size.Fact table are highly normalized because of that reason.
Read more

Free web directory General Business Web Directory