Learn the concept of Gini impurity, a key splitting criterion used in Decision Tree algorithms, with this easy-to-follow tutorial in Tamil!

Our Website:
Visit 🔗 http://www.skillfloor.com

Our Blogs:
Visit 🔗 https://skillfloor.com/blog/

DEVELOPMENT TRAINING IN CHENNAI
https://skillfloor.com/development-training-in-chennai

DEVELOPMENT TRAINING IN COIMBATORE
https://skillfloor.com/development-training-in-coimbatore

Our Development Courses:
Certified Python Developer
Visit 🔗 https://skillfloor.com/certified-python-developer
Certified Database Developer
Visit 🔗 https://skillfloor.com/certified-data-base-developer
Certified Android App Developer
Visit 🔗 https://skillfloor.com/certified-android-app-developer
Certified iOS App Developer
Visit 🔗 https://skillfloor.com/certified-ios-app-developer
Certified Flutter Developer
Visit 🔗 https://skillfloor.com/certified-flutter-developer
Certified Full Stack Developer
Visit 🔗 https://skillfloor.com/certified-full-stack-developer
Certified Front End Developer
Visit 🔗 https://skillfloor.com/certified-front-end-developer

Our Classroom Locations:
Bangalore - https://maps.app.goo.gl/ZKTSJNCKTihQqfgx6
Chennai - https://maps.app.goo.gl/36gvPAnwqVWWoWD47
Coimbatore - https://maps.app.goo.gl/BvEpAWtdbDUuTf1G6
Hyderabad - https://maps.app.goo.gl/NyPwrN35b3EoUDHCA
Ahmedabad - https://maps.app.goo.gl/uSizg8qngBMyLhC76
Pune - https://maps.app.goo.gl/JbGVtDgNQA7hpJYj9

Our Additional Courses:
Analytics Course
https://skillfloor.com/analytics-courses
https://skillfloor.com/analytics-training-in-bangalore
Artificial Intelligence Course
https://skillfloor.com/artificial-intelligence-courses
https://skillfloor.com/artificial-intelligence-training-in-bangalore
Data Science Course
https://skillfloor.com/data-science-courses
https://skillfloor.com/data-science-course-in-bangalore
Digital Marketing
https://skillfloor.com/digital-marketing-courses
https://skillfloor.com/digital-marketing-courses-in-bangalore
Ethical Hacking
https://skillfloor.com/ethical-hacking-courses
https://skillfloor.com/cyber-security-training-in-bangalore

#giniimpurity #decisiontree #pythontutorial #machinelearning #tamilcoding #skillfloor #datascience #pythonintamil #mlalgorithm #splittingcriteria #classification #datasciencetamil #machinelearningtamil #pythonprogramming #mlmodels #techeducation #codingintamil #featureselection #pythoncourse #tamiltech
Transcript
00:00Hello everyone. In this video, we will talk about the splitting criterion called Gini.
00:07We will see what Gini means in a decision tree and how it is used.
00:18Among the splitting criteria, first let us look at the Gini index.
00:22The Gini index measures the randomness, or disorder, in the entire dataset.
00:30If the Gini index is 0, the data is properly ordered (pure).
00:34If the class labels are mixed up, there is disorder,
00:38that is, randomness.
00:40If the Gini index is high, there is more randomness, so such a split is not preferred.
00:45In a decision tree, when the splitting is proper,
00:50the final decisions
00:52provide better accuracy.
00:55That is why we use the Gini index
00:58to build the tree.
00:59The Gini index formula is basically:
01:021 minus the sum of the squared probabilities of each class label,
01:06computed on the target column.
01:08First, let us take a simple example.
01:11The target column is whether it is an apple or not:
01:14the decision is yes or no.
01:16The feature columns are color and size.
01:21If we look at the Gini index for the entire data,
01:24the yes and no labels are mixed up,
01:27so the target column has disorder.
01:32Now we find the probability of each target value, yes and no.
01:37So, first, the probability of yes.
01:39How many yes rows are there?
01:403 rows,
01:41out of 6 rows,
01:42so the probability is 3/6.
01:43Likewise, the probability of no is 3/6,
01:46which is 0.5.
01:47So, to calculate the Gini index:
01:491 minus (probability of yes squared plus probability of no squared),
01:55where yes and no are the class labels of the target column.
02:02Substituting the values:
02:061 minus (0.25 + 0.25)
02:08is 1 minus 0.5,
02:09which gives 0.5.
02:11So the Gini index for this data is 0.5.
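To make the formula concrete, here is a minimal Python sketch of the Gini index (the function name gini is our own choice, not from the video), checked against the apple example above:

```python
def gini(counts):
    """Gini index: 1 minus the sum of squared class probabilities."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

# Apple example from the video: 3 "yes" and 3 "no" out of 6 rows.
print(gini([3, 3]))  # 1 - (0.5**2 + 0.5**2) = 0.5
```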
02:18Now let us consider a dataset,
02:20and with it we will build the tree structure based on the Gini index.
02:23So, first, consider this weekend data.
02:26The feature columns are weather, parents, and money.
02:29Based on them, we make a decision
02:31about what to do on the weekend:
02:33go to the cinema, play tennis, stay in,
02:35or go shopping.
02:37So, now, what we will do is
02:40create a tree structure.
02:41What do we do in the tree structure?
02:43We create a root node,
02:44then decision nodes,
02:48and finally, based on those,
02:49we reach the decision.
02:51Correct?
02:52So, first, the root node:
02:53we have to select the root node,
02:54and we do not yet know which column it should be,
02:56weather, parents, money, etc.
02:58So, for the three different columns,
03:00we calculate the Gini index,
03:02and whichever has the minimum value
03:04is decided as the root node.
03:06So, first, step 1:
03:08find the Gini index for this entire dataset.
03:10In the entire dataset of ten rows,
03:12six rows are cinema,
03:14two rows are tennis,
03:16one row is shopping,
03:17and one row is stay in.
03:19Then, if we calculate the Gini index,
03:22the value is 0.58.
03:24Now, first, we choose the root node.
03:26Looking at the entire data, there is 58% disorder.
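The same kind of sketch reproduces the 0.58 figure; the Gini index extends to any number of class labels, here the four decisions counted above:

```python
def gini(counts):
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

# Whole weekend dataset: 6 cinema, 2 tennis, 1 shopping, 1 stay in.
print(round(gini([6, 2, 1, 1]), 2))  # 1 - (0.36 + 0.04 + 0.01 + 0.01) = 0.58
```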
03:32So we build the tree structure from the root node. First, we consider the weather column.
03:37In this weather column, what are the possible values?
03:42Sunny, rainy, and windy.
03:47Sunny has three rows,
03:51windy has four, and rainy has three. First, consider the sunny data.
03:58The sunny rows are w1, w2, w3.
04:02For these three rows, the decisions are
04:06cinema, tennis, and tennis: two tennis and one cinema.
04:13So, in the sunny data, there are three instances;
04:16the probability of cinema is 1/3,
04:18and tennis is 2/3.
04:20So we check the Gini index for sunny.
04:23Weather has several categories,
04:26and for each category we calculate the Gini split index separately.
04:29So, considering the sunny data:
04:311 minus ((1/3) squared plus (2/3) squared).
04:37Calculating the value,
04:38it is 0.44.
04:41Next comes windy.
04:42In windy, there are three cinema and one shopping.
04:45So cinema is 3/4
04:46and shopping is 1/4.
04:47This gives the Gini split index for windy.
04:50The same is done for rainy.
04:52So we calculate Gini for the three different categories of weather.
04:56Finally, the weighted Gini for weather:
04:59weighted Gini for weather is the probability of sunny
05:02times the Gini value of sunny,
05:06plus the probability of windy times the Gini value of windy,
05:11plus the probability of rainy times the Gini value of rainy.
05:14So we substitute the values.
05:16Sunny, in the first place, has three rows,
05:19so its probability is 3/10.
05:20Its Gini value, as we found,
05:23is 0.44.
05:25So we multiply them.
05:27Then, the probability of windy:
05:29there are four windy rows,
05:32so 4/10,
05:33times the Gini split index of windy, and likewise for rainy.
05:35Adding these up,
05:36we get 41.4% disorder for weather.
05:41So, we check that.
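As a check on this weighted-Gini calculation, here is a short sketch using the per-category decision counts quoted in the video (the variable names are our own). Note that the video's 0.414 comes from rounding each category's Gini to two decimals first; the exact value is about 0.417:

```python
def gini(counts):
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

# Decision counts within each weather category, as stated in the video.
weather = {
    "sunny": [1, 2],  # 1 cinema, 2 tennis   -> Gini ~0.44
    "windy": [3, 1],  # 3 cinema, 1 shopping -> Gini 0.375
    "rainy": [2, 1],  # 2 cinema, 1 stay in  -> Gini ~0.44
}
n = sum(sum(c) for c in weather.values())  # 10 rows in total
weighted = sum(sum(c) / n * gini(c) for c in weather.values())
print(round(weighted, 3))  # ~0.417 (0.414 in the video after rounding)
```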
05:42Next is the parents column.
05:44So, for the next node,
05:46we examine the parents column.
05:47What are its classes? Yes or no.
05:49So, first, yes.
05:50There are five instances,
05:51and in all five instances, the decision is the same (cinema).
05:53So, 1 minus (5/5) squared:
05:581 minus 1 is zero.
06:00That means the splitting is proper (pure);
06:02the Gini split index value is zero
06:04for parents = yes.
06:06Then, the weighted Gini split index of parents:
06:08the probability of yes times the Gini split index of yes,
06:11plus the probability of no times the Gini value of no.
06:13Check it.
06:14Then, similarly, there is the money column.
06:18The money column is rich or poor,
06:20so we find the Gini values for rich and poor based on the decision column.
06:24Check that too.
06:25Now, the final decision for the root node:
06:29compare the Gini split index of weather, parents, and money.
06:33Parents has the minimum value,
06:36so parents is the root node.
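The root-node comparison can be sketched the same way from the counts read out in the video (the money column's counts are not spelled out, but it is handled identically and, per the video, comes out higher than parents):

```python
def gini(counts):
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

def weighted_gini(groups):
    """Weighted Gini of a split: sum of (group share) * (group Gini)."""
    n = sum(sum(c) for c in groups.values())
    return sum(sum(c) / n * gini(c) for c in groups.values())

weather = {"sunny": [1, 2], "windy": [3, 1], "rainy": [2, 1]}
parents = {"yes": [5],           # all five rows are cinema -> Gini 0
           "no":  [1, 2, 1, 1]}  # 1 cinema, 2 tennis, 1 shopping, 1 stay in

print(round(weighted_gini(weather), 3))  # ~0.417
print(round(weighted_gini(parents), 3))  # 0.36 -> minimum, so parents is the root
```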
06:38Now, finally, I will show the decision tree.
06:41This is the first layer of the tree.
06:45Parents is in the first layer:
06:48parents is the root node.
06:50For parents = yes,
06:51the Gini split index is zero,
06:54so that branch is settled in the first layer itself.
06:58We will see the other branch written out next.
07:20Second, further splitting is needed for the parents = no data.
07:24For the parents = no rows,
07:27the first split was created based on parents;
07:29the next one will be created based on weather or money.
07:33So, we consider
07:38the weather attribute on the parents = no data.
07:44The weather attribute has sunny, rainy, and windy.
07:47Separating them, there are two instances, one instance, and two instances.
07:51So for sunny, rainy, and windy separately, we calculate the Gini split index.
07:55Consider the sunny data: both decisions are tennis,
08:00so the Gini split value is zero.
08:03For rainy, the decision is stay in; as above, the value is 0.
08:06For windy, the decisions are one cinema and one shopping,
08:10so the Gini split index value is 0.5.
08:12These are the plain Gini values;
08:14now we calculate the weighted Gini index.
08:17It is the probability of sunny times the Gini value of sunny,
08:25but here the probability is the probability of sunny given parents = no,
08:32that is, taken within the parents = no data,
08:35times the Gini value of sunny given parents = no.
08:44Plus the probability of windy given parents = no times its Gini value, and so on;
08:50we calculate each term separately.
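The weighted Gini of weather within the parents = no branch works out as below, again a sketch from the counts in the video:

```python
def gini(counts):
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)

# Within parents = no (5 rows): decision counts per weather value.
weather_given_no = {
    "sunny": [2],     # tennis, tennis    -> Gini 0
    "rainy": [1],     # stay in           -> Gini 0
    "windy": [1, 1],  # cinema, shopping  -> Gini 0.5
}
n = sum(sum(c) for c in weather_given_no.values())
print(sum(sum(c) / n * gini(c) for c in weather_given_no.values()))  # 2/5 * 0.5 = 0.2
```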
08:53Then, we check the money column in the same way:
08:57check the money column once,
08:59and analyze
09:02which one gives the minimum value.
09:04Weather gives the minimum,
09:06so in depth layer two of the tree,
09:08we create the split based on weather.
09:10Under parents = no,
09:12we add the sunny, rainy, and windy branches.
09:14On the sunny branch,
09:16both decisions are tennis,
09:18so we decide tennis.
09:20The Gini index value is 0:
09:22no further splitting.
09:24Then consider the rainy data:
09:26again no further splitting,
09:28since the Gini index value is 0.
09:30The windy branch has one cinema
09:32and one shopping,
09:34so we have to split it further,
09:40based on money:
09:42whether rich or poor.
09:44So, we have created a decision tree
09:46based on the Gini split index.
09:52This is the Gini splitting criterion.
09:54We will see you in the next video.
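To try the whole procedure in code, here is a minimal scikit-learn sketch with criterion="gini". The ten rows below are a plausible reconstruction consistent with the counts quoted in the video, not the exact table shown on screen; also, scikit-learn grows binary splits (CART) rather than the multiway branches drawn in the video, so the printed tree will look slightly different while using the same Gini criterion:

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical weekend table, reconstructed to match the counts in the video.
df = pd.DataFrame({
    "weather":  ["sunny", "sunny", "windy", "rainy", "rainy",
                 "rainy", "windy", "windy", "windy", "sunny"],
    "parents":  ["yes", "no", "yes", "yes", "no",
                 "yes", "no", "no", "yes", "no"],
    "money":    ["rich", "rich", "rich", "poor", "rich",
                 "poor", "poor", "rich", "rich", "rich"],
    "decision": ["cinema", "tennis", "cinema", "cinema", "stay in",
                 "cinema", "cinema", "shopping", "cinema", "tennis"],
})

X = pd.get_dummies(df[["weather", "parents", "money"]])  # one-hot encode features
clf = DecisionTreeClassifier(criterion="gini", random_state=0).fit(X, df["decision"])
print(export_text(clf, feature_names=list(X.columns)))
```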