Data Science & Machine Learning
Join this channel to learn data science, artificial intelligence and machine learning with funny quizzes, interesting projects and amazing resources for free For collaborations: @love_data
Show moreπ Analytical overview of Telegram channel Data Science & Machine Learning
Channel Data Science & Machine Learning (@datasciencefun) in the English language segment is an active participant. Currently, the community unites 75 816 subscribers, ranking 2 113 in the Education category and 4 286 in the India region.
π Audience metrics and dynamics
Since its creation on Π½Π΅Π²ΡΠ΄ΠΎΠΌΠΎ, the project has demonstrated rapid growth, gathering an audience of 75 816 subscribers.
According to the latest data from 18 June, 2026, the channel demonstrates stable activity. Although there has been a change in the number of participants by 884 over the last 30 days and by 6 over the last 24 hours, overall reach remains high.
- Verification status: Not verified
- Engagement rate (ER): The average audience engagement rate is 3.25%. Within the first 24 hours after publication, content typically collects 1.38% reactions from the total number of subscribers.
- Post reach: On average, each post receives 2 462 views. Within the first day, a publication typically gains 1 043 views.
- Reactions and interaction: The audience actively supports content: the average number of reactions per post is 4.
- Thematic interests: Content is focused on key topics such as learning, accuracy, distribution, panda, dataset.
π Description and content policy
The author describes the resource as a platform for expressing subjective opinions:
βJoin this channel to learn data science, artificial intelligence and machine learning with funny quizzes, interesting projects and amazing resources for free
For collaborations: @love_dataβ
Thanks to the high frequency of updates (latest data received on 19 June, 2026), the channel maintains relevance and a high level of publication reach. Analytics show that the audience actively interacts with content, making it an important point of influence in the Education category.
# Import necessary libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
import matplotlib.pyplot as plt
import seaborn as sns
# Example data
data = {
'Age': [29, 45, 50, 39, 48, 50, 55, 60, 62, 43],
'Cholesterol': [220, 250, 230, 180, 240, 290, 310, 275, 300, 280],
'Max_Heart_Rate': [180, 165, 170, 190, 155, 160, 150, 140, 130, 148],
'Heart_Disease': [0, 1, 1, 0, 1, 1, 1, 1, 1, 0]
}
df = pd.DataFrame(data)
# Independent variables (features) and dependent variable (target)
X = df[['Age', 'Cholesterol', 'Max_Heart_Rate']]
y = df['Heart_Disease']
# Splitting the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
# Creating and training the random forest model
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
# Making predictions
y_pred = model.predict(X_test)
# Evaluating the model
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)
print(f"Accuracy: {accuracy}")
print(f"Confusion Matrix:\n{conf_matrix}")
print(f"Classification Report:\n{class_report}")
# Feature importance
feature_importances = pd.DataFrame(model.feature_importances_, index=X.columns, columns=['Importance']).sort_values('Importance', ascending=False)
print(f"Feature Importances:\n{feature_importances}")
# Plotting the feature importances
sns.barplot(x=feature_importances.index, y=feature_importances['Importance'])
plt.title('Feature Importances')
plt.xlabel('Feature')
plt.ylabel('Importance')
plt.show()
## Explanation of the Code
1. Libraries: We import necessary libraries like numpy, pandas, sklearn, matplotlib, and seaborn.
2. Data Preparation: We create a DataFrame containing features (Age, Cholesterol, Max_Heart_Rate) and the target variable (Heart_Disease).
3. Feature and Target: We separate the features and the target variable.
4. Train-Test Split: We split the data into training and testing sets.
5. Model Training: We create a RandomForestClassifier model with 100 trees and train it using the training data.
6. Predictions: We use the trained model to predict heart disease for the test set.
7. Evaluation: We evaluate the model using accuracy, confusion matrix, and classification report.
8. Feature Importance: We compute and display the importance of each feature.
9. Visualization: We plot the feature importances to visualize which features contribute most to the model's predictions.
## Evaluation Metrics
- Accuracy: The proportion of correctly classified instances among the total instances.
- Confusion Matrix: Shows the counts of true positives, true negatives, false positives, and false negatives.
- Classification Report: Provides precision, recall, F1-score, and support for each class.
Best Data Science & Machine Learning Resources: https://topmate.io/coding/914624
Credits: t.me/datasciencefun
ENJOY LEARNING ππ# Import necessary libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
import matplotlib.pyplot as plt
# Example data
data = {
'Age': [25, 45, 35, 50, 23, 37, 32, 28, 40, 27],
'Income': ['High', 'High', 'High', 'Medium', 'Low', 'Low', 'Low', 'Medium', 'Low', 'Medium'],
'Student': ['No', 'No', 'No', 'No', 'Yes', 'Yes', 'Yes', 'Yes', 'Yes', 'No'],
'Buys_Computer': ['No', 'No', 'Yes', 'Yes', 'Yes', 'No', 'Yes', 'No', 'Yes', 'Yes']
}
df = pd.DataFrame(data)
# Convert categorical features to numeric
df['Income'] = df['Income'].map({'Low': 1, 'Medium': 2, 'High': 3})
df['Student'] = df['Student'].map({'No': 0, 'Yes': 1})
df['Buys_Computer'] = df['Buys_Computer'].map({'No': 0, 'Yes': 1})
# Independent variables (features) and dependent variable (target)
X = df[['Age', 'Income', 'Student']]
y = df['Buys_Computer']
# Splitting the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
# Creating and training the decision tree model
model = DecisionTreeClassifier(criterion='gini', max_depth=3, random_state=0)
model.fit(X_train, y_train)
# Making predictions
y_pred = model.predict(X_test)
# Evaluating the model
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)
print(f"Accuracy: {accuracy}")
print(f"Confusion Matrix:\n{conf_matrix}")
print(f"Classification Report:\n{class_report}")
# Plotting the decision tree
plt.figure(figsize=(12,8))
plot_tree(model, feature_names=['Age', 'Income', 'Student'], class_names=['No', 'Yes'], filled=True)
plt.title('Decision Tree')
plt.show()
#### Explanation of the Code
1. Libraries: We import necessary libraries like numpy, pandas, sklearn, and matplotlib.
2. Data Preparation: We create a DataFrame containing features and the target variable. Categorical features are converted to numeric values.
3. Feature and Target: We separate the features (Age, Income, Student) and the target (Buys_Computer).
4. Train-Test Split: We split the data into training and testing sets.
5. Model Training: We create a DecisionTreeClassifier model, specifying the criterion (Gini impurity) and maximum depth of the tree, and train it using the training data.
6. Predictions: We use the trained model to predict whether a person buys a computer for the test set.
7. Evaluation: Evaluate the model using accuracy, confusion matrix, and classification report.
8. Visualization: Plot decision tree to visualize the decision-making process.
## Evaluation Metrics
- Accuracy
- Confusion Matrix: Shows the counts of true positives, true negatives, false positives, and false negatives.
- Classification Report: Provides precision, recall, F1-score, and support for each class.
Like if you need similar content ππ
Hope this helps you π# Import necessary libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, classification_report, roc_auc_score, roc_curve
import matplotlib.pyplot as plt
# Example data
data = {
'Hours_Studied': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
'Passed': [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
}
df = pd.DataFrame(data)
# Independent variable (feature) and dependent variable (target)
X = df[['Hours_Studied']]
y = df['Passed']
# Splitting the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
# Creating and training the logistic regression model
model = LogisticRegression()
model.fit(X_train, y_train)
# Making predictions
y_pred = model.predict(X_test)
y_pred_prob = model.predict_proba(X_test)[:, 1]
# Evaluating the model
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)
roc_auc = roc_auc_score(y_test, y_pred_prob)
print(f"Confusion Matrix:\n{conf_matrix}")
print(f"Classification Report:\n{class_report}")
print(f"ROC-AUC: {roc_auc}")
# Plotting the ROC curve
fpr, tpr, thresholds = roc_curve(y_test, y_pred_prob)
plt.plot(fpr, tpr, label='Logistic Regression (area = %0.2f)' % roc_auc)
plt.plot([0, 1], [0, 1], 'k--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic')
plt.legend(loc="lower right")
plt.show()
## Explanation of the Code
1. Libraries: We import necessary libraries like numpy, pandas, sklearn, and matplotlib.
2. Data Preparation: We create a DataFrame containing the hours studied and whether the student passed.
3. Feature and Target: We separate the feature (Hours_Studied) and the target (Passed).
4. Train-Test Split: We split the data into training and testing sets.
5. Model Training: We create a LogisticRegression model and train it using the training data.
6. Predictions: We use the trained model to predict the pass/fail outcome for the test set and also obtain the predicted probabilities.
7. Evaluation: We evaluate the model using the confusion matrix, classification report, and ROC-AUC score.
8. Visualization: We plot the ROC curve to visualize the model's performance.
## Evaluation Metrics
- Confusion Matrix: Shows the counts of true positives, true negatives, false positives, and false negatives.
- Classification Report: Provides precision, recall, F1-score, and support for each class.
- ROC-AUC: Measures the model's ability to distinguish between the classes. AUC (Area Under the Curve) closer to 1 indicates better performance.
Best Data Science & Machine Learning Resources: https://topmate.io/coding/914624
Credits: https://t.me/datasciencefun
Like if you need similar content ππ
Hope this helps you π# Import necessary libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
import matplotlib.pyplot as plt
# Example data
data = {
'Size': [1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400],
'Price': [300000, 320000, 340000, 360000, 380000, 400000, 420000, 440000, 460000, 480000]
}
df = pd.DataFrame(data)
# Independent variable (feature) and dependent variable (target)
X = df[['Size']]
y = df['Price']
# Splitting the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
# Creating and training the linear regression model
model = LinearRegression()
model.fit(X_train, y_train)
# Making predictions
y_pred = model.predict(X_test)
# Evaluating the model
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"Mean Squared Error: {mse}")
print(f"R-squared: {r2}")
# Plotting the results
plt.scatter(X, y, color='blue') # Original data points
plt.plot(X_test, y_pred, color='red', linewidth=2) # Regression line
plt.xlabel('Size (sq ft)')
plt.ylabel('Price ($)')
plt.title('Linear Regression: House Prices vs Size')
plt.show()
#### Explanation of the Code
1. Libraries: We import necessary libraries like numpy, pandas, sklearn, and matplotlib.
2. Data Preparation: We create a DataFrame containing the size and price of houses.
3. Feature and Target: We separate the feature (Size) and the target (Price).
4. Train-Test Split: We split the data into training and testing sets.
5. Model Training: We create a LinearRegression model and train it using the training data.
6. Predictions: We use the trained model to predict house prices for the test set.
7. Evaluation: We evaluate the model using Mean Squared Error (MSE) and R-squared (RΒ²) metrics.
8. Visualization: We plot the original data points and the regression line to visualize the model's performance.
#### Evaluation Metrics
- Mean Squared Error (MSE): Measures the average squared difference between the actual and predicted values. Lower values indicate better performance.
- R-squared (RΒ²): Represents the proportion of the variance in the dependent variable that is predictable from the independent variable(s). Values closer to 1 indicate a better fit.
Share this channel with your real friends: https://t.me/datasciencefun
Like if you want me to continue this series πβ€οΈ
ENJOY LEARNING ππ
Available now! Telegram Research 2025 β the year's key insights 
