Python - Splitting Criteria: Gini | Python Courses in Tamil | Skillfloor
Skillfloor
2 months ago
Learn the concept of Gini impurity, a key splitting criterion used in Decision Tree algorithms, with this easy-to-follow tutorial in Tamil!
Our Website:
Visit 🔗 http://www.skillfloor.com
Our Blogs:
Visit 🔗 https://skillfloor.com/blog/
DEVELOPMENT TRAINING IN CHENNAI
https://skillfloor.com/development-training-in-chennai
DEVELOPMENT TRAINING IN COIMBATORE
https://skillfloor.com/development-training-in-coimbatore
Our Development Courses:
Certified Python Developer
Visit 🔗https://skillfloor.com/certified-python-developer
Certified Data BASE Developer
Visit 🔗https://skillfloor.com/certified-data-base-developer
Certified Android App Developer
Visit 🔗https://skillfloor.com/certified-android-app-developer
Certified IOS App Developer
Visit 🔗https://skillfloor.com/certified-ios-app-developer
Certified Flutter Developer
Visit 🔗https://skillfloor.com/certified-flutter-developer
Certified Full Stack Developer
Visit 🔗https://skillfloor.com/certified-full-stack-developer
Certified Front End Developer
Visit 🔗https://skillfloor.com/certified-front-end-developer
Our Classroom Locations:
Bangalore - https://maps.app.goo.gl/ZKTSJNCKTihQqfgx6
Chennai - https://maps.app.goo.gl/36gvPAnwqVWWoWD47
Coimbatore - https://maps.app.goo.gl/BvEpAWtdbDUuTf1G6
Hyderabad - https://maps.app.goo.gl/NyPwrN35b3EoUDHCA
Ahmedabad - https://maps.app.goo.gl/uSizg8qngBMyLhC76
Pune - https://maps.app.goo.gl/JbGVtDgNQA7hpJYj9
Our Additional Course:
Analytics Course
https://skillfloor.com/analytics-courses
https://skillfloor.com/analytics-training-in-bangalore
Artificial Intelligence Course
https://skillfloor.com/artificial-intelligence-courses
https://skillfloor.com/artificial-intelligence-training-in-bangalore
Data Science Course
https://skillfloor.com/data-science-courses
https://skillfloor.com/data-science-course-in-bangalore
Digital Marketing
https://skillfloor.com/digital-marketing-courses
https://skillfloor.com/digital-marketing-courses-in-bangalore
Ethical Hacking
https://skillfloor.com/ethical-hacking-courses
https://skillfloor.com/cyber-security-training-in-bangalore
#giniimpurity #decisiontree #pythontutorial #machinelearning #tamilcoding #skillfloor #datascience #pythonintamil #mlalgorithm #splittingcriteria #classification #datasciencetamil #machinelearningtamil #pythonprogramming #mlmodels #techeducation #codingintamil #featureselection #pythoncourse #tamiltech
Category
🦄
Creativity
Transcript
00:00
Hello everyone. In this video, we will talk about the Gini splitting criterion.
00:07
We will look at what Gini means in a decision tree and how to use it.
00:18
Among the splitting criteria, we will first look at the Gini index.
00:22
The Gini index measures the randomness, or disorder, in the dataset.
00:30
If the Gini index is 0, the data is properly ordered.
00:34
That is, all the records fall under a single class.
00:38
There is no randomness.
00:40
If the Gini index is high, the classes are mixed and the data needs to be split.
00:45
When the splitting in the decision tree is done properly,
00:50
the final decision is reliable
00:52
and the tree gives good accuracy.
00:55
That is why we use the Gini index
00:58
to build the tree.
00:59
The formula for the Gini index is
01:02
1 minus the sum of the squared probabilities of each class label.
01:06
The class labels come from the target column.
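Written out, the formula described in the transcript can be expressed as follows. The symbols p_i (the proportion of records belonging to class i) and k (the number of class labels in the target column) are notation assumed here for illustration; they are not named in the video.

```latex
\[
\text{Gini} = 1 - \sum_{i=1}^{k} p_i^{2}
\]
```

For a yes/no target this reduces to 1 minus (p_yes squared plus p_no squared), which is the form used in the apple example.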
01:08
First, we will take a simple example.
01:11
This is the target column: whether the item is an apple or not.
01:14
The value is yes or no.
01:16
The feature columns are color and size.
01:21
The Gini index is calculated on the entire data.
01:24
Here, the yes and no values are mixed up,
01:27
so the target column has disorder.
01:32
Now, we find the probability of each value in the target column, yes and no.
01:37
So, first, the probability of yes.
01:39
How many yes records are there?
01:40
Three records,
01:41
out of six records.
01:42
So the probability is 3 by 6.
01:43
Similarly, the probability of no is 3 by 6.
01:46
Each is 0.5.
01:47
So, to calculate the Gini index:
01:49
1 minus (probability of yes squared plus probability of no squared).
01:54
Okay?
01:55
In the formula, the terms come from the target column,
01:57
that is, the class labels of the target.
02:01
Okay?
02:02
So, we substitute the values:
02:06
1 minus (0.25 plus 0.25),
02:08
which is 1 minus 0.5.
02:09
We get 0.5.
02:11
That is how we calculate the Gini index.
02:14
Okay?
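The apple calculation can be reproduced with a few lines of Python. This is a minimal sketch, not code from the video: the function name `gini` and the list `target` are illustrative, and the only data used is the three yes and three no records mentioned in the transcript.

```python
from collections import Counter


def gini(labels):
    """Gini index of a list of class labels: 1 - sum of squared class proportions."""
    total = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Target column from the apple example: three "yes" and three "no" records.
target = ["yes"] * 3 + ["no"] * 3
print(gini(target))  # 0.5
```

A column that contains only one class would give 0, which matches the statement that a Gini index of 0 means no disorder.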
02:18
Now, we will consider a dataset.
02:20
We will build the tree structure using the Gini index.
02:23
So, first, we will consider the weekend data.
02:26
The feature columns are weather, parents, and money.
02:29
The target column is the decision,
02:31
that is, what we decide to do on the weekend:
02:33
go to the cinema, play tennis, stay in,
02:35
or go shopping.
02:37
So, now, what we will do is
02:40
create a tree structure.
02:41
So, what goes into the tree structure?
02:43
We create a root node.
02:44
Then, we create decision nodes.
02:48
Then, at the end,
02:49
we reach the final decision.
02:51
Correct?
02:52
So, first, the root node.
02:53
We have to select the root node.
02:54
We do not yet know which column should be the root node:
02:56
weather, parents, or money.
02:58
So, for each of the three columns,
03:00
we calculate the Gini index.
03:02
Whichever column has the minimum value
03:04
is chosen as the root node.
03:06
So, first, step 1.
03:08
We calculate the Gini index for the entire dataset.
03:10
In the entire dataset,
03:11
six records
03:12
are cinema,
03:14
two records are tennis,
03:16
one record is shopping,
03:17
and one record is stay in.
03:19
Then, if we calculate the Gini index,
03:22
the value is 0.58.
03:24
Now, we have to choose the root node.
03:26
So, if we look at the entire data, there is 58% disorder.
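The 0.58 figure can be checked from the class counts given in the transcript (six cinema, two tennis, one shopping, one stay in). The sketch below is illustrative only; the helper name `gini_from_counts` and the dictionary `decision_counts` are assumptions, not names from the video.

```python
def gini_from_counts(counts):
    """Gini index from per-class record counts."""
    counts = list(counts)
    total = sum(counts)
    return 1.0 - sum((n / total) ** 2 for n in counts)


# Decision column counts for the whole weekend dataset, as stated in the transcript.
decision_counts = {"cinema": 6, "tennis": 2, "shopping": 1, "stay in": 1}
overall = gini_from_counts(decision_counts.values())
print(round(overall, 2))  # 0.58
```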
03:32
To build the tree from the root node, we first consider the weather column.
03:37
So, in the weather column, what are the possible values?
03:42
Sunny, rainy, and windy.
03:47
There are three values. Sunny has three records.
03:51
Windy has four, and rainy has three. So, first, consider the sunny data.
03:58
The sunny records are W1, W2, and W3.
04:02
For these three records, the decisions are
04:06
cinema, tennis, and tennis. So, there are two tennis and one cinema.
04:13
So, sunny has three instances.
04:16
So, the probability of cinema is 1 by 3.
04:18
Tennis is 2 by 3.
04:20
So, we calculate the Gini index for sunny.
04:23
Weather has several categories.
04:26
For each category, we calculate the Gini split index.
04:29
So, we consider the sunny data:
04:31
1 minus ((1/3) squared plus (2/3) squared).
04:37
We calculate the value.
04:38
The value is 0.44.
04:40
Okay?
04:41
Next is windy.
04:42
In windy, there are three cinema and one shopping.
04:45
So, cinema is 3 by 4.
04:46
Shopping is 1 by 4.
04:47
From these, we get the Gini split index for windy.
04:50
Then comes rainy.
04:52
So, we calculate the Gini split index for all three categories of weather.
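The per-category calculation for weather can be sketched as below. The sunny and windy counts come straight from the transcript. The rainy counts are not stated there; they are inferred here by subtracting the sunny and windy counts from the overall totals (six cinema, two tennis, one shopping, one stay in), so treat them as an assumption. All names are illustrative.

```python
def gini_from_counts(counts):
    """Gini index from per-class record counts."""
    counts = list(counts)
    total = sum(counts)
    return 1.0 - sum((n / total) ** 2 for n in counts)


weather_groups = {
    "sunny": {"cinema": 1, "tennis": 2},    # stated in the transcript
    "windy": {"cinema": 3, "shopping": 1},  # stated in the transcript
    "rainy": {"cinema": 2, "stay in": 1},   # assumption: inferred from the totals
}

# Gini split index of each weather category.
weather_gini = {
    value: gini_from_counts(counts.values())
    for value, counts in weather_groups.items()
}
for value, g in weather_gini.items():
    print(f"{value} {g:.3f}")
# sunny 0.444
# windy 0.375
# rainy 0.444
```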
04:56
Finally, the weighted Gini for weather.
04:59
The weighted Gini for weather is the probability of sunny
05:02
times the Gini value of sunny,
05:06
plus the probability of windy times the Gini value of windy,
05:11
plus the probability of rainy times the Gini value of rainy.
05:14
So, we work out each term separately.
05:16
So, first, sunny: there are three records,
05:19
so it is 3 by 10.
05:20
The Gini value we found is
05:23
0.44.
05:25
So, we multiply them.
05:27
Then, the probability of windy.
05:29
There are four windy records,
05:32
so 4 by 10,
05:33
times the Gini split index for windy.
05:35
So, working it out,
05:36
we have 41.4% disorder.
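The weighting step can be sketched as follows: each category's Gini is multiplied by the share of records in that category, and the products are summed. The function names and the list layout are illustrative. The rainy counts (two cinema, one stay in) are an assumption inferred from the overall totals, since the transcript does not state them.

```python
def gini_from_counts(counts):
    """Gini index from per-class record counts."""
    total = sum(counts)
    return 1.0 - sum((n / total) ** 2 for n in counts)


def weighted_gini(groups):
    """Sum of (group size / total size) * Gini of the group."""
    total = sum(sum(g) for g in groups)
    return sum(sum(g) / total * gini_from_counts(g) for g in groups)


# Per-class counts for sunny, windy, rainy (rainy inferred, see note above).
weather_groups = [[1, 2], [3, 1], [2, 1]]
print(round(weighted_gini(weather_groups), 3))  # 0.417
```

With unrounded intermediate values the result is about 0.417. The 41.4% quoted in the video appears consistent with rounding the sunny and rainy Gini values to 0.44 before weighting (0.3 × 0.44 + 0.4 × 0.375 + 0.3 × 0.44 = 0.414).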
05:41
So, next we check
05:42
the parents column.
05:47
What are its classes? Yes and no.
05:49
So, first, yes.
05:50
There are five instances.
05:51
In all five instances, the decision is the same.
05:53
So, 1 minus (5 by 5) squared.
05:58
So, 1 minus 1 is 0.
06:00
So, this split is proper.
06:02
That is, the Gini split index value for yes is 0.
06:04
Next is the weighted Gini index of parents:
06:08
the probability of yes times the Gini split index of yes,
06:11
plus the probability of no times the Gini value of no.
06:13
Check.
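The weighted Gini for parents is not given as a number in the transcript, but it can be sketched from counts that are. The yes branch has five instances of one class. The counts for the no branch used below (one cinema, two tennis, one shopping, one stay in) are taken from the later part of the transcript that breaks down the parents = no records; the identification of the yes class as cinema is an inference from the overall totals. Names are illustrative.

```python
def gini_from_counts(counts):
    """Gini index from per-class record counts."""
    total = sum(counts)
    return 1.0 - sum((n / total) ** 2 for n in counts)


def weighted_gini(groups):
    """Sum of (group size / total size) * Gini of the group."""
    total = sum(sum(g) for g in groups)
    return sum(sum(g) / total * gini_from_counts(g) for g in groups)


parents_yes = [5]           # five instances, all the same decision
parents_no = [1, 2, 1, 1]   # cinema, tennis, shopping, stay in (see note above)

parents_weighted = weighted_gini([parents_yes, parents_no])
print(round(parents_weighted, 2))  # 0.36
```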
06:14
Then, similarly, there is the money column.
06:18
So, the money column is rich or poor.
06:20
So, for rich and poor, based on the decision column,
06:23
we calculate the Gini value.
06:24
Check.
06:25
Now, the final decision on the root node.
06:29
So, we compare the Gini split index of weather, parents, and money.
06:33
Parents has the minimum value.
06:36
So, parents is the root node.
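Choosing the split column amounts to taking the minimum of the weighted Gini values. The sketch below is hedged in two ways: `choose_split` is a hypothetical helper name, and the dictionary holds only the two values that can be tied to the transcript (0.414 for weather as quoted, 0.36 for parents as derived from the stated counts). The transcript gives no number for money, so it is left out here; a full comparison would add it from the dataset.

```python
def choose_split(weighted_gini_by_column):
    """Return the column name with the smallest weighted Gini value."""
    return min(weighted_gini_by_column, key=weighted_gini_by_column.get)


# Money is omitted because the transcript does not state its value.
candidates = {"weather": 0.414, "parents": 0.36}
print(choose_split(candidates))  # parents
```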
06:38
Now, finally, I will show the decision tree.
06:41
So, this is the first layer.
06:45
The first layer is parents.
06:48
Parents is the root node.
06:50
For yes,
06:51
the Gini split index is zero.
06:58
The other two columns are still left to consider.
07:20
Second, further splitting is needed for the no data,
07:24
that is, the data where parents equals no.
07:27
The first split was created based on parents.
07:29
The next split
07:30
is created based on weather or money.
07:34
So, we can consider
07:38
the weather attribute
07:39
for the data where parents equals no.
07:44
The weather attribute has
07:45
sunny, rainy, and windy.
07:47
Taken separately, sunny and windy have two instances each, and rainy has one instance.
07:51
For sunny, rainy, and windy, we calculate the Gini split index separately.
07:55
Consider the sunny data:
07:57
both decisions are tennis,
08:00
so the Gini split value is zero.
08:03
For rainy, the decision is stay in,
08:05
so the value is also 0.
08:06
For windy, one decision is cinema
08:08
and one is shopping,
08:10
so the Gini split index value is 0.5.
08:12
Now, we calculate the weighted Gini index.
08:17
It starts with the probability of sunny
08:23
times the Gini value of sunny.
08:25
Here, that is the probability of sunny
08:27
given parents equals no,
08:31
taken from the data,
08:35
times the Gini value of sunny
08:40
given parents equals no.
08:43
Okay? In the same way,
08:44
plus the probability of
08:47
windy given parents equals no, and so on.
08:50
We calculate each term separately.
08:52
Okay?
08:53
Then, check the money column
08:57
in the same way,
09:00
and analyze
09:02
which column has the minimum value.
09:04
We split based on weather.
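The second-level calculation restricts the data to the records where parents equals no and repeats the weighted Gini for weather. The counts below are the ones given in the transcript for that subset (sunny: two tennis; rainy: one stay in; windy: one cinema and one shopping). The resulting 0.2 is computed here and is not quoted in the video; names are illustrative.

```python
def gini_from_counts(counts):
    """Gini index from per-class record counts."""
    total = sum(counts)
    return 1.0 - sum((n / total) ** 2 for n in counts)


def weighted_gini(groups):
    """Sum of (group size / total size) * Gini of the group."""
    total = sum(sum(g) for g in groups)
    return sum(sum(g) / total * gini_from_counts(g) for g in groups)


# Records with parents = no, grouped by weather.
weather_given_no_groups = {
    "sunny": [2],     # both tennis   -> Gini 0
    "rainy": [1],     # stay in       -> Gini 0
    "windy": [1, 1],  # cinema, shopping -> Gini 0.5
}

weather_given_no = weighted_gini(list(weather_given_no_groups.values()))
print(round(weather_given_no, 2))  # 0.2
```

Only the windy branch contributes disorder, which is why it is the only branch that needs a further split.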
09:06
In the next layer, at depth two,
09:08
we create the weather-based split
09:10
under the no branch,
09:12
with sunny, rainy, and windy.
09:14
For sunny,
09:16
both decisions are tennis,
09:18
so we decide tennis.
09:20
The Gini index value is 0.
09:22
No further splitting.
09:24
Next, consider the rainy data.
09:26
It also needs no further splitting.
09:28
The Gini index value is 0.
09:30
For windy, one decision is cinema
09:32
and one is shopping.
09:34
So, we split it further.
09:40
The split is based on money:
09:42
who is rich and who is poor.
09:44
So, this is how we create a decision tree
09:46
based on the Gini split index.
09:52
That is the Gini splitting criterion.
09:54
We will continue in the next video.