Hello,
I am still and again implementing my small sample data warehouse and I am now facing an interesting challenge, that of an accumulating snapshot. However, there is a trick to id...
Say I have data in my staging table such as:
ID ItemId Action TimeStamp
1 1 1 Jan 1
2 1 2 Jan 3
3 2 1 Jan 4
4 1 3 Jan 6
5 1 1 Jan 8
Basically, it represents the occupation of a parking space. Obviously the space can be occupied more than once and several actions can be performed on it. My pipeline is starts when something gets to the space and ends when it leaves that space. Granted that same space could be taken again later. So I am trying to derive a fact from this table that would basically represent the following:
ItemId TimeAction1 TimeAction2 TimeAction3
1 Jan1 Jan3 Jan6
1 Jan8 NULL NULL
2 Jan1 NULL NULL
Also, if there were two Action #2 for ItemId #1, I'm only concerned with keeping the latest timestamp. And as you can see, after action 3 my "cycle/pipeline" is completed and I need to start another one for the same item.
So my question is whether there's a "preferred" way to approach these kinds of problems in SSIS. I'm wondering if I should get all my staging data and then use a ForEach container to do my processing before I insert my data. Or should I "pre-build" my full pipelines and then process them separately, but if so, what kinds of ways are there to do this?
Anyways, I can think of a couple of ideas around this but I'm just wondering conceptually, how you would approach it, so any hints/suggestions/advice/help is more than welcome!
Thanks,
Greg
P.S: I apologize for the ugly code pasting
I am still and again implementing my small sample data warehouse and I am now facing an interesting challenge, that of an accumulating snapshot. However, there is a trick to id...
Say I have data in my staging table such as:
ID ItemId Action TimeStamp
1 1 1 Jan 1
2 1 2 Jan 3
3 2 1 Jan 4
4 1 3 Jan 6
5 1 1 Jan 8
Basically, it represents the occupation of a parking space. Obviously the space can be occupied more than once and several actions can be performed on it. My pipeline is starts when something gets to the space and ends when it leaves that space. Granted that same space could be taken again later. So I am trying to derive a fact from this table that would basically represent the following:
ItemId TimeAction1 TimeAction2 TimeAction3
1 Jan1 Jan3 Jan6
1 Jan8 NULL NULL
2 Jan1 NULL NULL
Also, if there were two Action #2 for ItemId #1, I'm only concerned with keeping the latest timestamp. And as you can see, after action 3 my "cycle/pipeline" is completed and I need to start another one for the same item.
So my question is whether there's a "preferred" way to approach these kinds of problems in SSIS. I'm wondering if I should get all my staging data and then use a ForEach container to do my processing before I insert my data. Or should I "pre-build" my full pipelines and then process them separately, but if so, what kinds of ways are there to do this?
Anyways, I can think of a couple of ideas around this but I'm just wondering conceptually, how you would approach it, so any hints/suggestions/advice/help is more than welcome!
Thanks,
Greg
P.S: I apologize for the ugly code pasting