The star schema architecture is the simplest data warehouse schema. It is called a star schema because the diagram resembles a star, with points radiating from a center. The center of the star consists of fact table and the points of the star are the dimension tables.
In Star Schema, usually the fact tables are in third normal form(3NF) whereas dimensional tables are de-normalized. Despite the fact that the star schema is the simplest architecture, it is most commonly used nowadays and is recommended by Oracle.
A fact table typically has two types of columns: foreign keys to dimension tables and measures those that contain numeric facts. A fact table can contain fact’s data on detail or aggregated level.
A dimension is a structure usually composed of one or more hierarchies that categorizes data. If a dimension hasn’t got a hierarchies and levels it is called flat dimension or list. The primary keys of each of the dimension tables are part of the composite primary key of the fact table. Dimensional attributes help to describe the dimensional value. They are normally descriptive, textual values. Dimension tables are generally small in size then fact table.
Typical fact tables store data about numerical facts like Revenue, countable facts while dimension tables data about geographic region(markets, cities) , clients, products, times, channels.
The main characteristics of star schema:
- Simple structure -> easy to understand schema
- Great query effectives -> small number of tables to join
- Relatively long time of loading data into dimension tables -> de-normalization, redundancy data caused that size of the table could be large.
- The most commonly used in the data warehouse implementations -> widely supported by a large number of business intelligence tools
“Students attend online courses from various geographies. Online courses are authored by teachers. The DW should allow the analysis of the attendance , and the factors that affect the popularity of courses .
You do not need to create SQL statements in this problem. The answer should describe in words a data model for the problem , names of the main tables , description of what they will contain, and relationships. Describe any assumptions you may need to make.”
Things to consider are Attendance of student (can be a measure), Geographies (Location dimension), Time to attend online course (dimension), Teachers (can be a dimension), Courses (can be a dimension)
Fact table contains :
1) Foreign key reference to
Location, Time, Teachers, Courses Dimensions
2) Fact / Measure : Attendance
Dimension tables: Location, Time, Teachers and Courses