Contact US

Log In

Come Join Us!

Are you a
Computer / IT professional?
Join Tek-Tips Forums!
  • Talk With Other Members
  • Be Notified Of Responses
    To Your Posts
  • Keyword Search
  • One-Click Access To Your
    Favorite Forums
  • Automated Signatures
    On Your Posts
  • Best Of All, It's Free!

*Tek-Tips's functionality depends on members receiving e-mail. By joining you are opting in to receive e-mail.

Posting Guidelines

Promoting, selling, recruiting, coursework and thesis posting is forbidden.

Students Click Here

schema design question

schema design question

schema design question


I'm designing a monthly periodic snapshot fact table that will have every homeowner's insurance policy that is active at the end of each month.

Each policy has many attributes including policy number, effective date, expiration date, agent, territory, distance from coast, number of family, usage type, New or Renewal Code, and others. The measures in the fact table will include written premium, earned premium, coverage A/B/C/D amounts, deductible amount, and total exposure.

My question is should I have the Policy Number as a Degenerate Dimension in the fact table and create dimensions for all of the policy fields listed above, or should I create a policy dimension that will store certain attributes that may not need their own dimension?

I'm not sure what the best approach would be given the grain of the fact table. Looking for some advice.


RE: schema design question

It depends.

Topic 1: Do you have a higher dimension than policy? Like "Household", or "Customer"? As the customer/household could have multiple insurance policies.
Topic 2: Do policy numbers change? If I leave your company for a year and come back, is it a new policy number? Does the business even care about returning customers/policies?
Topic 3: You mentioned that this would be a monthly snapshot (summary) table. Is the table recreated every month? Or are more rows added each month? This could affect how to distribute the dimensions or whether to denormalize some of them in the fact records. If the table is recreated anew each month, I would denormalize most of the dimensions.

Hope this helps, or at least gets you thinking.

Sometimes the grass is greener on the other side because there is more manure there - original.

RE: schema design question

Hi Johnherman,

thanks for your input.

1. We do have a Customer Number for each policy and yes, a customer could have multiple policies with our company. I was thinking of building a Customer dimension to hold each customer number, address, and name information. This could be an SCD type 2 if the address or name changes, but probably very minimally. The business might not even care and just want to see the most recent data. I'll have to find that out on my end.

2. Policy numbers can change if a policy is cancelled and then rewritten. When the policy is rewritten, a new number is given. I'm not sure if the business needs to know if the policy number changed, but it may be beneficial in the future. Also, we are currently running 2 systems right now, with system 1 going away by the end of the year. We are manually migrated all policies from system 1 to system 2 at renewal. At this point, the policy will get a new number issued by system 2. The data warehouse still retains the "Old" number from system 1 in a field called InceptionPolicyNumber. This way we can tell what the policy number was before is was migrated to system 2.

3. I was going to create the snapshot table by appending the policies that were active each month, so we would be adding more rows to the table every month. There will be a Snapshot Date field that will hold the last day of each month so that we can tell for which month the data is for. So in Jan 2013, we may have 100K active policies, so 100K rows will be added with snapshot date of 01-31-2013. In Feb 2013, we may have 110K active policies, so 110K rows will be added to the table with a Snapshot Date of 2-28-2013. At this point the table will contain a total of 210K rows.

Let me know if the above answers help with the decision of creating a separate Policy dimension to hold some of the policy attributes or to just have the Policy Number as a Degenerate Dimension and separate dimension tables for all other fields.


RE: schema design question

Scott, your answers were well thought out.....I hope you can get your business folks to be as accurate as you when they tell you their plans for the future.

Based on your description, I would denormalize Customer into the degenerate policy dimension. Without address of course, although zip code - maybe.

If your company isn't real concerned with the cancellation aspect, then the policy number changes are probably not important.

You are wise to use a full date for the snapshot date (rather than month/year). This allows you to use the same structure to do weekly (or even daily) snapshots if your customer moves in that direction.

At some point, you may want to consider an archive process to take rows from the "active" snapshot table to an "archive" snapshot table. Maybe a weekend batch process. This will help preserve the good performance of the summary table.

Sometimes the grass is greener on the other side because there is more manure there - original.

Red Flag This Post

Please let us know here why this post is inappropriate. Reasons such as off-topic, duplicates, flames, illegal, vulgar, or students posting their homework.

Red Flag Submitted

Thank you for helping keep Tek-Tips Forums free from inappropriate posts.
The Tek-Tips staff will check this out and take appropriate action.

Reply To This Thread

Posting in the Tek-Tips forums is a member-only feature.

Click Here to join Tek-Tips and talk with other members! Already a Member? Login

Close Box

Join Tek-Tips® Today!

Join your peers on the Internet's largest technical computer professional community.
It's easy to join and it's free.

Here's Why Members Love Tek-Tips Forums:

Register now while it's still free!

Already a member? Close this window and log in.

Join Us             Close