Modelling and Evaluation of RF in Python | Python Courses in Tamil | Skillfloor
Skillfloor
2 months ago
Explore the powerful world of Random Forest (RF) modelling and evaluation in Python—now in Tamil!
Start learning machine learning in your own language, Tamil, today!
Our Website:
Visit 🔗 http://www.skillfloor.com
Our Blogs:
Visit 🔗 https://skillfloor.com/blog/
DEVELOPMENT TRAINING IN CHENNAI
https://skillfloor.com/development-training-in-chennai
DEVELOPMENT TRAINING IN COIMBATORE
https://skillfloor.com/development-training-in-coimbatore
Our Development Courses:
Certified Python Developer
Visit 🔗https://skillfloor.com/certified-python-developer
Certified Database Developer
Visit 🔗https://skillfloor.com/certified-data-base-developer
Certified Android App Developer
Visit 🔗https://skillfloor.com/certified-android-app-developer
Certified iOS App Developer
Visit 🔗https://skillfloor.com/certified-ios-app-developer
Certified Flutter Developer
Visit 🔗https://skillfloor.com/certified-flutter-developer
Certified Full Stack Developer
Visit 🔗https://skillfloor.com/certified-full-stack-developer
Certified Front End Developer
Visit 🔗https://skillfloor.com/certified-front-end-developer
Our Classroom Locations:
Bangalore - https://maps.app.goo.gl/ZKTSJNCKTihQqfgx6
Chennai - https://maps.app.goo.gl/36gvPAnwqVWWoWD47
Coimbatore - https://maps.app.goo.gl/BvEpAWtdbDUuTf1G6
Hyderabad - https://maps.app.goo.gl/NyPwrN35b3EoUDHCA
Ahmedabad - https://maps.app.goo.gl/uSizg8qngBMyLhC76
Pune - https://maps.app.goo.gl/JbGVtDgNQA7hpJYj9
Our Additional Courses:
Analytics Course
https://skillfloor.com/analytics-courses
https://skillfloor.com/analytics-training-in-bangalore
Artificial Intelligence Course
https://skillfloor.com/artificial-intelligence-courses
https://skillfloor.com/artificial-intelligence-training-in-bangalore
Data Science Course
https://skillfloor.com/data-science-courses
https://skillfloor.com/data-science-course-in-bangalore
Digital Marketing
https://skillfloor.com/digital-marketing-courses
https://skillfloor.com/digital-marketing-courses-in-bangalore
Ethical Hacking
https://skillfloor.com/ethical-hacking-courses
https://skillfloor.com/cyber-security-training-in-bangalore
#randomforest #machinelearning #pythontutorial #tamilcoding #skillfloor #datascience #pythonintamil #ensemblelearning #mlalgorithm #decisiontree #datasciencetamil #machinelearningtamil #pythonprogramming #mlmodels #techeducation #codingintamil #dataanalysis #modelassessment #rfmodelling #pythoncourse
Transcript
00:00
Hello everyone. In this video, we will talk about modelling and evaluation of Random Forest in Python.
00:10
We will see how to implement a Random Forest on a heart disease dataset: we decide whether a particular person has heart disease, so this is a classification problem. In the back end, a Random Forest builds many decision tree classifiers.
00:34
To implement it, we first import the libraries — NumPy, Pandas, Matplotlib, and Seaborn — and then load the dataset.
00:48
Like, if there are any columns, age.
00:50
Then, gender, female and male.
00:52
We will import the 0, female, 1, male.
00:55
Then, chest pain type.
00:56
There are 4 different types.
00:58
0, 1, 2, 3.
00:59
0 is typical angina.
01:00
1 is atypical angina.
01:02
2 is non-anginal pain.
01:04
3 is asymptomatic.
01:06
So, in this case, we have chest pain type.
01:09
Then, chest BP.
01:10
Blood pressure.
01:11
Then, cholesterol levels.
01:12
Then, fasting blood sugar rate.
01:14
So, fasting blood sugar rate.
01:16
1 is 120.
01:17
1 is low.
01:18
Then, resting blood sugar.
01:19
Fasting blood sugar rate.
01:20
1 is low.
01:21
Then, resting ECG.
01:22
So, ECG.
01:23
There is a PQRS wave.
01:25
So, there is a PQRS wave.
01:26
So, the PQRS wave.
01:28
We are normal.
01:29
1 is abnormal.
01:30
2 is abnormal.
01:31
2 is left ventricular hypotrophy according to S criteria.
01:35
So, there is maximum heart rate.
01:42
So, when we say that, we are stressed.
01:46
So, when we say that, we provide value 2.
01:49
So, we are separated from one column.
01:52
So, based on that particular person, we decide the target column.
01:58
So, that means that there is no heart disease.
02:02
So, when we say CA, we say number of major vessels.
02:09
Then, TAL is a blood disorder.
02:12
So, 1 is normal.
02:15
2 is the blood flow in particular heart.
02:18
3 is the reversible defect.
02:20
So, when we say the issue of blood flow, we say reverse.
02:26
So, in that part, we do this.
02:29
So, all of these values are complete data set.
02:33
So, there are actually 14 columns.
02:37
So, target is our final output column.
02:40
So, this is what we predict.
02:42
Then heart.shape for a basic information check: 303 rows and 14 columns. Checking for nulls, all 303 entries in every column are non-null.
02:56
Then we describe the data and move on to visualisation.
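A minimal sketch of these basic checks with pandas. In the video the data comes from a CSV file; the filename "heart.csv" is an assumption, so a tiny in-memory frame with a few of the columns stands in here:

```python
import pandas as pd

# In practice you would load the file, e.g.:
#   heart = pd.read_csv("heart.csv")   # filename assumed
# Tiny stand-in frame with a few of the 14 columns:
heart = pd.DataFrame({
    "age":      [63, 37, 41, 56],
    "sex":      [1, 1, 0, 1],          # 0 = female, 1 = male
    "cp":       [3, 2, 1, 1],          # chest pain type, 0-3
    "trestbps": [145, 130, 130, 120],  # resting blood pressure
    "chol":     [233, 250, 204, 236],  # cholesterol level
    "target":   [1, 1, 1, 0],          # 1 = heart disease, 0 = none
})

print(heart.shape)           # (rows, columns) — 303 x 14 for the real dataset
print(heart.isnull().sum())  # null count per column (all zero here)
print(heart.describe())      # summary statistics
```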
02:59
Then we do some basic EDA checks. EDA starts with univariate analysis — looking at one column at a time, for example the count of each target value with a histplot. Then bivariate analysis — for example chest pain type (cp) plotted against the target, so we see how the 0 and 1 targets break down across the chest pain types. We analyse each and every column this way, and we also analyse the correlations.
03:26
Then we look for missing data — this dataset has no missing values.
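A sketch of those univariate and bivariate checks. The video draws them with seaborn plots; here the same counts are computed directly with pandas (the mini-frame is hypothetical):

```python
import pandas as pd

# Hypothetical mini-frame standing in for the heart dataset.
heart = pd.DataFrame({
    "cp":     [0, 2, 1, 0, 3, 2, 0, 1],
    "target": [0, 1, 1, 0, 1, 1, 0, 0],
})

# Univariate: count of each target value (the video's histplot of target).
print(heart["target"].value_counts())

# Bivariate: chest pain type broken down by target (the video's countplot with hue).
print(pd.crosstab(heart["cp"], heart["target"]))

# Correlation between the columns.
print(heart.corr())
```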
03:31
Then heart.duplicated().sum() shows there is one duplicate row — a row where every important column is exactly replicated. That matters, so we remove it with heart.drop_duplicates(inplace=True), and then check again that the duplicate is gone.
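The duplicate check and removal can be sketched like this (the three-row frame is a made-up example whose last row repeats the first):

```python
import pandas as pd

df = pd.DataFrame({"age": [63, 37, 63], "target": [1, 0, 1]})  # last row duplicates the first

print(df.duplicated().sum())      # -> 1 duplicate row found
df.drop_duplicates(inplace=True)  # remove it in place
print(df.duplicated().sum())      # -> 0
print(df.shape)                   # -> (2, 2)
```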
03:50
Next, the Random Forest classifier implementation. From sklearn.ensemble we import RandomForestClassifier — if this were a regression problem we would use RandomForestRegressor instead. First we create the Random Forest classifier model, then we fit it on the training data and predict on the testing data. Finally we calculate the accuracy from y_test and y_pred: we get 86% accuracy.
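The baseline model from this step can be sketched as follows. Synthetic data from make_classification stands in for the heart dataset (303 rows, 13 features), so the accuracy printed will not match the video's 86%:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the heart data.
X, y = make_classification(n_samples=303, n_features=13, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

rf = RandomForestClassifier(random_state=42)  # 100 trees by default
rf.fit(X_train, y_train)                      # fit on training data
y_pred = rf.predict(X_test)                   # predict on testing data

acc = accuracy_score(y_test, y_pred)          # compare y_test and y_pred
print(f"Accuracy: {acc:.2f}")
```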
04:22
Then we check the classification report, which gives precision, recall, F1-score, and support for each class. Class 0 is a person not having heart disease and class 1 is a person having heart disease. Comparing the two, the model learns the heart-disease class a little better than the no-disease class.
04:52
So overall we have 86% accuracy.
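A small example of the classification report; the labels and predictions below are hypothetical stand-ins for y_test and y_pred:

```python
from sklearn.metrics import classification_report

# Hypothetical true labels and predictions (0 = no heart disease, 1 = disease).
y_test = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 1, 1, 0, 1, 0]

# Per-class precision, recall, F1-score, and support.
print(classification_report(y_test, y_pred,
                            target_names=["no disease", "disease"]))
```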
04:54
Can we still improve on that? What we use is hyperparameter tuning — searching for the best parameter values. The basic tools for this are GridSearchCV and RandomizedSearchCV, which both rely on cross-validation. How does cross-validation work?
05:16
We divide the entire data into folds — say 4 folds. The first time, the first block is the testing data and the remaining blocks are the training data; the next time, the second block becomes the testing data and the rest is training, and so on. With 4 folds, each run trains on 75% of the data and tests on 25%. For every combination of parameters we provide, the search keeps the combination that scores highest.
06:01
So which parameters do we consider for a Random Forest? n_estimators, the number of decision trees — the default is 100, and we provide 100, 200, 300. Then max_features — auto, sqrt, and log2. Then the depth of the trees, max_depth — we provide the list 10, 20, 30, None. Then min_samples_split, the minimum number of samples required before a node is split — say 5 and 10. Finally min_samples_leaf, the minimum number of samples allowed in a leaf (final) node.
06:54
We put all of these into one dictionary, random_grid — n_estimators: [100, 200, 300], and so on. Next we create a Random Forest classifier model, rf1, and then a GridSearchCV. The first argument is the model, rf1. We set scoring='f1', so candidates are ranked by F1 score — which is built from precision and recall — rather than plain accuracy. Then param_grid=random_grid so it considers all the values we listed, cv=3 so the data is divided into 3 folds, and verbose=2 so it prints progress lines such as "Fitting 3 folds for each of 144 candidates". We also set n_jobs.
07:56
The 144 candidates are all the possible combinations: n_estimators 100 with max_features auto and max_depth 10, then 100, sqrt, 10, then 100, log2, 10, then 100, auto, 20, then 100, sqrt, 20, and so on. GridSearchCV fits each combination on the three folds, scores it by F1, and from that decides the best parameters.
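A runnable sketch of the grid search. The grid here is a trimmed-down version of the one in the video so it runs quickly (the full 3 × 3 × 4 × 2 × 2 grid gives the 144 candidates mentioned); note also that max_features="auto" was removed in recent scikit-learn releases, so "sqrt" and "log2" are used:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the heart data.
X, y = make_classification(n_samples=200, n_features=13, random_state=0)

# Trimmed-down parameter grid (the video's grid has more values per key).
random_grid = {
    "n_estimators": [100, 200],
    "max_features": ["sqrt", "log2"],
    "max_depth": [10, None],
}

rf1 = RandomForestClassifier(random_state=0)
grid = GridSearchCV(rf1, param_grid=random_grid,
                    scoring="f1", cv=3, verbose=0)
grid.fit(X, y)  # tries every combination with 3-fold cross-validation

print(grid.best_params_)  # best combination by F1 score
```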
08:43
So here we train the GridSearchCV — we fit it, with the model, on the training data — and then extract the best parameters. The fitted search object has an inbuilt attribute for this, best_params_, and printing it shows the best max_depth, max_features, min_samples_split, and min_samples_leaf.
09:20
Based on these we create a new Random Forest classifier. The best parameters — for example max_depth 10, max_features auto, min_samples_leaf 4 — come back as a dictionary, so we pass them in with the double-star operator: RandomForestClassifier(**best_params). We create the model this way, fit it on the training data, and predict on the testing data.
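The double-star unpacking step can be sketched like this; the best_params dictionary below is a hypothetical example of what GridSearchCV's best_params_ might return:

```python
from sklearn.ensemble import RandomForestClassifier

# Hypothetical best parameters, as best_params_ might return them.
best_params = {"max_depth": 10, "min_samples_leaf": 4, "n_estimators": 100}

# ** unpacks the dictionary into keyword arguments.
rf_tuned = RandomForestClassifier(**best_params, random_state=0)
print(rf_tuned.get_params()["max_depth"])  # -> 10
```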
09:58
We predict, and we find the accuracy: 88%, compared with the 86% of the default model — so the tuned model is the better predictor.
10:25
A Random Forest can overfit, and hyperparameter tuning like this is how we handle it. That is the modelling and evaluation of Random Forest in Python.