Skip to playerSkip to main content
  • 2 days ago
Transcript
00:00Hey friends, welcome back.
00:07Now we have learned how we can extend a model, a semantic model, from fabric with external data here,
00:13which is stored locally, for instance with, in this case, dates tables.
00:16But more often, actually it's not related to dates tables.
00:20Those are rather often already available inside, for instance, a semantic model.
00:24But what is often not available is something coming from the business side.
00:27Meaning business, for instance, has one additional local Excel file,
00:31and in this Excel file there is data stored.
00:33And this data is often times not available, for instance, in the default semantic model,
00:38or in the lake house, simply because it only relates to one specific department.
00:42So those people are more often, well, relying on exactly this storage model which we have covered.
00:48So converting it to a local model into direct query and then extend it.
00:53And one additional example I'd like to show you here.
00:56And also dive into one problem which often arises in developing Power BI reports,
01:01which is the many-to-many relationship.
01:03So let's see the problem, and then let's fix it.
01:05So at first, I have an additional Excel file you also find in the resource section,
01:09which looks like that.
01:11So in this file, this is coming now from, let's say, our local department.
01:15And they have the requirements that they also have here this table,
01:19which contains regions, and then also the management here,
01:22which are certain persons you probably have never heard before.
01:25Now, how do we get this data into Power BI?
01:29Now, of course, to get this in, it's quite simple.
01:31After converting it, let me just close this for now, after converting this into a local model,
01:37so stored in direct query first and now in mixed mode, as you can see down there,
01:40we can now, of course, simply get to get data or directly to Excel workbook
01:45and select the specific workbook I've shown you, this management.
01:48So let me just select it for now.
01:50And now let's click on Open.
01:52And then let's just see what we get.
01:54We got one table. Let's select this table in here.
01:57And we see that currently the, well, the headers are not correct
02:00because it was not formatted as a table inside the Excel file.
02:04But that's fine. We just select it, and then we can adjust this inside the editor.
02:08So we go to Transform.
02:10And then we just wait until the editor opens. Here we go.
02:13And this is now our local power query.
02:15So not what we've seen in the data flow chain 2, but now locally.
02:18And in here, I'd like to actually now just promote the headers.
02:21So let's say, go to Transform here and then use first row as headers.
02:25So, so far, so good. Now we have that.
02:27By the way, that would also be available under the Home ribbon.
02:30Just here, a little bit more hidden, but there would also be the same operation available.
02:34Now, the last thing I'd like to do here is I simply go there and name this, for instance, my management.
02:39Or let's call it actually DIM Management.
02:42Like this.
02:44So, okay. Here we got our management table.
02:46And also, always check best practice.
02:48I like to mention this several times.
02:50Always check the data types.
02:51That's important that you map them to the correct data types.
02:53In this case, that's text, and that's already fine.
02:55But always make sure that there's not ABC123.
02:58Because if you see ABC123, data type is clearly not mapped.
03:01So make sure that this is also correct.
03:03So next, let's actually, oh, there's a space too much.
03:07Let me get rid of that.
03:08Okay.
03:09So next, let's actually click on Close and Apply.
03:11And see whether we can already use this.
03:13So now we have imported this.
03:15And now we go inside the model view.
03:17And here, you can see that our star schema already looks quite interesting.
03:21But now we have here one additional table.
03:24And this time, it's not that easy to connect this table directly to the fact transactions.
03:29Because remember, we replaced actually the region already with a region key.
03:33Meaning, we do not have the option to simply use the region and connect it to any region here.
03:38Because inside here, we only have the idea related to the product info, for instance.
03:43Which has a product ID here and a product ID here.
03:46But the region itself is not available anymore in this table.
03:49Now, how do we fix this now?
03:51Well, the simple solution is that we need to extend our star schema a little bit.
03:56And convert it into a snowflake schema.
03:58It's called snowflake.
03:59But the idea is that we are not able to connect this dim dimension table directly to the fact table.
04:04So what we have to do instead is we rely on this table here, which is dim product.
04:09Let me just maybe drag this up here and this one here.
04:12And now we need to connect this dim table here to this dim product info.
04:16And then that goes to the fact transactions.
04:19Because here we have regions and here we have regions.
04:21So this would be now one option to solve it.
04:24There might be other ways and better ways, but this way is the way we go now.
04:27Because here we also see the next problem, which I mentioned at the beginning.
04:31And we'll see it just now.
04:33So let me drag the regions for now from here to the regions here.
04:36And let's drop this.
04:38And now we wait.
04:39And we see we got here a warning issue.
04:42So there's a red, big red sign and a warning issue that currently we creating a many to many relationship here.
04:48So we have a many regions.
04:51So the regions appear more than once in the management table and also more than once in this product info table.
04:58Because simply we have here several sales managers which are working in the same region as you can see here.
05:03So that's already a problem.
05:04So we cannot simply get rid of the duplicates here.
05:07And the same is true in this dim management table.
05:10And I'm going to show you why.
05:11Let's click on cancel for now.
05:12And then let's go back to the table view and select this dim management.
05:17And we can see that here we have several people which are working in the management.
05:22Maybe the CEO, the CFO and so on.
05:24But all of them, of course, are belonging to a specific region.
05:28So we can see region central for instance appearing here.
05:30It's appearing here.
05:31And then it's also appearing here.
05:33So there's also no option for us to get rid of the duplicates here.
05:37So we have duplicates in both tables.
05:39And this is why currently if we try to connect those tables we get this many to many issue.
05:44So now we would have to fix this somehow.
05:46Because it's never a good idea to create a many to many relationship between two tables.
05:51So we should really avoid this at any cost if possible.
05:54And here it is possible.
05:56What we do is we need a kind of helper table.
06:00And this helper table is called bridge table.
06:03And so let's do this.
06:04Let's create this and then see that this works.
06:06So let's go back to in this case report view for now.
06:09And let's go to transform data one more time.
06:11Because the bridge table currently does not exist we need to create it.
06:15And the bridge table is quite simple.
06:17Because think of it.
06:18We have the regions with duplicates here.
06:20And also the regions with duplicates in the product table.
06:24Which is the table we connect to.
06:26In order to get to the facts table.
06:28Right?
06:29In our snowflake collection.
06:30So what we need to do is we need to create a table.
06:33Which contains the regions but doesn't contain any duplicates.
06:36So what we do is we right click on the dimdome management.
06:39And then we create a reference.
06:41You could create a duplicate as well.
06:43But the better idea is always to work with references here.
06:45Because reference does not require much more space.
06:48Because it's only a reference.
06:50And also if the dimdates table management is updated.
06:53The reference always is also updated.
06:54So that's why I prefer the reference here.
06:56So let's do that.
06:57Click on reference here.
06:58And then let's just right click on the region.
07:01And at first get rid of the other columns.
07:03Because we only need the region column.
07:04So let's say remove other columns.
07:06And secondly we need to get rid of the duplicates.
07:09So one more time.
07:10Right click here.
07:11And say in this case remove duplicates.
07:14There it is.
07:15The other option by the way would be to click on the table icon.
07:18There's also an option to remove the duplicates down there.
07:21And the third option.
07:23Because I will often ask.
07:24Where can I find to remove duplicates inside this ribbon up here.
07:27It is available.
07:28If you go to remove rows.
07:30And under remove rows.
07:31There's remove duplicates.
07:32It's a little bit hidden.
07:33But it's also there available.
07:34That's why I always say if I'm asked.
07:37Inside the query editor.
07:38Many ways lead to wrong.
07:39Because you can find the same transaction in multiple areas.
07:43So let's do that.
07:44That's right.
07:45Click here for instance.
07:46And say I want to get rid of the duplicates.
07:48So remove duplicates.
07:49And now here we have the regions.
07:52Yeah.
07:53And also now we have dimension 2.
07:55And I'd like to of course name this properly.
07:57And this will be our bridge.
07:58So let's call this bridge.
08:00And sometimes also I can call it TBL.
08:04It's an abbreviation.
08:05But as I said.
08:06You might have a naming convention.
08:07How you work with.
08:08And I encourage you to stick to this naming convention.
08:10So I call it just bridge for now.
08:12And here is now the table without duplicates.
08:15Which we're going to use.
08:16In order to combine the two tables.
08:18So this is the link for the two tables.
08:20And let me click on close and apply.
08:23And now I just take a little bit.
08:25And now we have a bridge table here as well.
08:27And if I go back to the table view.
08:29Just check that.
08:30The bridge is exactly just one column.
08:33With all the regions which are available without duplicates.
08:36And now let's go back to the model view.
08:39And let's scroll in here.
08:41And you see that the bridge already has connected to the management.
08:44So that's already fine.
08:45And the relation is here one to many.
08:48That's also fine between the regions.
08:50And now we need to connect the bridge table here to the product info.
08:53So instead of connecting the product info directly to the management.
08:56We use this bridge.
08:57And now connect also the region to the region here.
09:00Drop it.
09:01And this is also fine.
09:03It's also a many to one.
09:04That is fine.
09:05Click on save.
09:06And now this of course you see that because of different storage modes.
09:09Again this is a weak or a limited relationship.
09:11But now we have a one to many here.
09:13So one to star.
09:14And also one to many here.
09:15One to star.
09:16And this is exactly what we want.
09:17So instead of connecting this via many to many.
09:19We would avoid this.
09:21Instead of using a bridge table.
09:22Which helps us then to have this one to many relationship.
09:25And we would like to see the same of course for the region.
09:27So the same of course for the region.
09:28Are we already finished?
09:29Well let's find out.
09:30So let's first go inside the report view.
09:32And let's create a new page for now.
09:34And now let's actually try that.
09:35Let's say we're gonna use now.
09:37Here from the management table.
09:38I'd like to see the region.
09:40And next to the region.
09:41I'd also like to see the management people.
09:43Which are actually managing these regions.
09:45Right?
09:46You can see that.
09:47There they are.
09:48So there are three people for each of the regions.
09:50And now I would like to know.
09:52What is the sales amount?
09:53Which should be the same of course.
09:55For all of the three people.
09:56Working in the same region.
09:57But still I'd like to see the data.
09:59So let me put in revenue in US dollars.
10:02Click on that.
10:03And this looks not correct.
10:06Because I can see the same number everywhere.
10:08Which is exactly the total amount of revenue.
10:11This 235 million and so on.
10:12We've seen this before.
10:14In various tables in here.
10:16Here for instance with the dates table.
10:18This is exactly this number.
10:19So obviously something went wrong here.
10:22And what is this?
10:23Well.
10:24Let's go back to the model view.
10:26Because there's one thing we need to adjust here.
10:29So there you can see.
10:30We tried to filter.
10:31The transaction table here.
10:32This one.
10:33Via the management table.
10:34And the problem.
10:35Is that here.
10:37The filter.
10:38Coming from the bridge.
10:39Is filtering the product info.
10:40That is fine.
10:41The product info on the other hand.
10:43Is filtering the transactions.
10:44So it's also fine.
10:45Because you can see the arrows.
10:46Are going from the one.
10:47To the many side.
10:48And from the one.
10:49To the many side.
10:50But here we have a problem.
10:51Because currently.
10:52This is actually.
10:53Coming from the bridge.
10:54So the bridge filter.
10:55Is able to filter.
10:56The management.
10:57But the management.
10:58Is not able to filter.
10:59The bridge table.
11:00Which in turn.
11:01Then will filter the product.
11:02And then in turn.
11:03Will filter the transactions.
11:04So we need to modify.
11:05This relationship here.
11:06And the way we do it.
11:07Is simply.
11:08Either you double click on it.
11:09Or you right click on it.
11:10That would also work.
11:11And then go to the property section.
11:13And here we want to make sure.
11:14That this cross filter direction.
11:16Is not single.
11:17But both.
11:18Meaning that.
11:19The many side.
11:20Is also able to filter.
11:21The one side.
11:22And by the way.
11:23You can also.
11:24Select it in here.
11:25In the property section.
11:26There's also an option.
11:27To switch this.
11:28Here.
11:29To cross filter direction.
11:30To both.
11:31That could also be done.
11:32But for now.
11:33Let me just save this.
11:34And you see that.
11:35Now we've changed this.
11:36And the filter.
11:37Is now going in both directions.
11:38We could go back.
11:39To the model.
11:40Report view.
11:41And we see here.
11:42The numbers.
11:43So.
11:44Those look correct.
11:45We can see the same number.
11:46Appearing three times.
11:47Which is fine.
11:48Because you can see that.
11:49The relation is via the region.
11:51So that means that.
11:52For each of the management people.
11:53Each of those three.
11:54Of course.
11:55The revenue is exactly the same.
11:56Because the revenue is based on the region.
11:58But this.
11:59Allows us to.
12:00Well.
12:01Create.
12:02This many to many relationship.
12:03But avoid it.
12:04Directly in the model.
12:05And instead.
12:06Using the bridge table.
12:07To.
12:08Well.
12:09To the filtering.
12:10And then.
12:11To display the data.
12:12In our model.
12:13So.
12:14That was just.
12:15A little excourse.
12:16An example.
12:17Because.
12:18I think that's quite important.
12:19Because this.
12:20Topic comes up.
12:21Again and again.
12:22And this many to many relationship.
12:23Should really be avoided.
12:24And the bridge table is the way to go.
12:25So.
12:26This.
12:27As you've seen.
12:28Has now.
12:29Extended our model further.
12:30And.
12:31Well.
12:32We derived.
12:33Or.
12:34Changed.
12:35The original structure.
12:36Schema.
12:37If we have.
12:38One or more facts table.
12:39In the middle.
12:40And then.
12:41Only dimension tables here.
12:42To filter the facts.
12:43But this time.
12:44We also have.
12:45This bridge table.
12:46And the third table.
12:47Down here.
12:48So this makes.
12:49This relation.
12:50Actually.
12:51A snowflake schema.
12:52Is the case.
12:53That you have this.
12:54In the model.
12:55So.
12:56That's it.
12:57For the.
12:58Many to many part.
12:59And extending.
13:00Our.
13:01Local model.
13:02So.
13:03Hopefully you enjoyed it.
13:04If you got questions.
13:05Related to that.
13:06Let me know.
13:07Otherwise.
Be the first to comment
Add your comment

Recommended