Data science/ML/AI
Data science and machine learning hub Python, SQL, stats, ML, deep learning, projects, PDFs, roadmaps and AI resources. For beginners, data scientists and ML engineers 👉 https://rebrand.ly/bigdatachannels DMCA: @disclosure_bds Contact: @mldatascientist
📈 Analytical overview of Telegram channel Data science/ML/AI
Channel Data science/ML/AI (@datascience_bds) in the English language segment is an active participant. Currently, the community unites 13 660 subscribers, ranking 9 391 in the Technologies & Applications category and 31 743 in the India region.
📊 Audience metrics and dynamics
Since its creation (date unknown), the project has demonstrated rapid growth, gathering an audience of 13 660 subscribers.
According to the latest data from 07 June, 2026, the channel demonstrates stable activity. The number of participants changed by 151 over the last 30 days and by -5 over the last 24 hours, while overall reach remains high.
- Verification status: Not verified
- Engagement rate (ER): The average audience engagement rate is 7.92%. Within the first 24 hours after publication, content typically collects 2.33% reactions from the total number of subscribers.
- Post reach: On average, each post receives 1 082 views. Within the first day, a publication typically gains 318 views.
- Reactions and interaction: The audience actively supports content: the average number of reactions per post is 5.
- Thematic interests: Content is focused on key topics such as panda, learning, row, api, ethic.
📝 Description and content policy
The author describes the resource as a platform for expressing subjective opinions:
“Data science and machine learning hub
Python, SQL, stats, ML, deep learning, projects, PDFs, roadmaps and AI resources.
For beginners, data scientists and ML engineers
👉 https://rebrand.ly/bigdatachannels
DMCA: @disclosure_bds
Contact: @mldatasci...”
Thanks to the high frequency of updates (latest data received on 08 June, 2026), the channel maintains relevance and a high level of publication reach. Analytics show that the audience actively interacts with content, making it an important point of influence in the Technologies & Applications category.
scikit-learn library on the famous Iris dataset:
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.manifold import TSNE
# Load the Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target
# Apply t-SNE
tsne = TSNE(n_components=2, perplexity=30, random_state=42)
X_embedded = tsne.fit_transform(X)
# Plotting the results
plt.figure(figsize=(8, 6))
scatter = plt.scatter(X_embedded[:, 0], X_embedded[:, 1], c=y, cmap='viridis')
plt.title('t-SNE Visualization of Iris Dataset')
plt.xlabel('t-SNE Component 1')
plt.ylabel('t-SNE Component 2')
plt.colorbar(scatter, label='Species')
plt.show()
In this example, we load the Iris dataset, apply t-SNE to reduce its four dimensions down to two, and then visualize the results. The colors represent different species of iris flowers, showing how well t-SNE can separate them based on their features.
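To put a rough number on "how well" the embedding holds up, one option is scikit-learn's trustworthiness score, which checks whether points that are neighbors in the 2D plot were also neighbors in the original four-dimensional data. This is a minimal sketch reusing the same settings as above; the choice of n_neighbors=5 is an assumption, and the score measures neighborhood preservation rather than species separation as such.

```python
from sklearn import datasets
from sklearn.manifold import TSNE, trustworthiness

# Same data and t-SNE settings as the example above
iris = datasets.load_iris()
X = iris.data

tsne = TSNE(n_components=2, perplexity=30, random_state=42)
X_embedded = tsne.fit_transform(X)

# Trustworthiness lies in [0, 1]; values near 1 mean the local
# neighborhoods of the original data survive in the embedding.
# n_neighbors=5 is an illustrative choice, not a recommendation.
score = trustworthiness(X, X_embedded, n_neighbors=5)
print(f"Trustworthiness (k=5): {score:.3f}")
```

Rerunning this with a few different perplexity values is a quick way to see how sensitive the picture is to that setting.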
▎Limitations of t-SNE
While t-SNE is powerful, it has some limitations:
• Computationally Intensive: It can be slow for very large datasets due to its complexity.
• Non-Deterministic: Different runs can yield different results unless you set a random seed.
• Difficulty in Interpreting Distances: The distances in the lower-dimensional space do not have a direct interpretation; they are more about relative positioning than absolute distances.
import matplotlib.pyplot as plt
# Days of the week
days = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
# Coffee cups consumed
cups = [2, 3, 4, 1, 5, 6, 3]
plt.bar(days, cups, color='brown')
plt.title('Weekly Coffee Consumption')
plt.xlabel('Days')
plt.ylabel('Cups of Coffee')
plt.show()
With this simple code, you’ve transformed boring numbers into a visual that tells a story about your caffeine habits!
▎Conclusion
Data visualization isn’t just about making pretty pictures; it’s about making data accessible and understandable. It helps you tell stories that resonate with your audience and empowers them to make decisions based on insights rather than just raw numbers. So next time you have data to share, think about how you can visualize it; your audience will thank you!
scikit-learn library to perform linear regression:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
# Sample data
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([2, 3, 5, 7, 11])
# Split data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create and train the model
model = LinearRegression()
model.fit(X_train, y_train)
# Make predictions
predictions = model.predict(X_test)
# Plot results
plt.scatter(X, y, color='blue', label='Data Points')
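# Draw the fitted line across all of X: with 5 samples and test_size=0.2,
# the test split holds a single point, which cannot form a visible line
plt.plot(X, model.predict(X), color='red', label='Predicted Line')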
plt.legend()
plt.show()
(TP+TN) / Total - Avoid for imbalanced data!
• Precision: TP / (TP + FP)
• Meaning: Out of all times it said "Positive," how many were truly positive?
• Use When: False Positives (FP) are very costly (e.g., wrongly flagging a healthy person as sick).
• Recall: TP / (TP + FN)
• Meaning: Out of all actual positives, how many did it catch?
• Use When: False Negatives (FN) are very costly (e.g., missing a real fraud, not detecting a tumor).
• F1-Score: Balances Precision and Recall.
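"Balances" here means the harmonic mean of the two: F1 = 2 × Precision × Recall / (Precision + Recall). A minimal sketch of the formulas above in plain Python; the counts (5 TP, 2 FP, 5 FN) are made up for illustration, and the helper name is hypothetical.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from raw confusion counts.

    Returns 0.0 for any metric whose denominator would be zero.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative counts (assumed): 5 true positives, 2 false alarms, 5 misses
p, r, f1 = precision_recall_f1(tp=5, fp=2, fn=5)
print(f"Precision: {p:.2f}, Recall: {r:.2f}, F1: {f1:.2f}")
# Precision: 0.71, Recall: 0.50, F1: 0.59
```

Because it is a harmonic mean, F1 stays low whenever either Precision or Recall is low, so a model cannot score well by maximizing only one of them.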
🐍 Code Example: The 99% Accurate Lie
from sklearn.metrics import accuracy_score, precision_score, recall_score
import numpy as np
y_true = np.concatenate([np.zeros(990), np.ones(10)]) # 1000 samples, 1% positive
# Model 1: Always predicts '0' (no disease)
y_pred_bad = np.zeros(1000)
print(f"Model 1 (Always No Disease):\n Accuracy: {accuracy_score(y_true, y_pred_bad):.2f}")
print(f" Precision: {precision_score(y_true, y_pred_bad, zero_division=0):.2f}") # 0.00!
print(f" Recall: {recall_score(y_true, y_pred_bad):.2f}\n") # 0.00!
# Model 2: Catches 5 positives, 2 false alarms (Better!)
y_pred_better = np.zeros(1000)
y_pred_better[990:995] = 1 # 5 True Positives
y_pred_better[100:102] = 1 # 2 False Positives
print(f"Model 2 (Actually Catches Some):\n Accuracy: {accuracy_score(y_true, y_pred_better):.2f}")
print(f" Precision: {precision_score(y_true, y_pred_better, zero_division=0):.2f}") # 0.71
print(f" Recall: {recall_score(y_true, y_pred_better):.2f}") # 0.50
# Both models print 0.99 accuracy (0.990 vs 0.993), but Precision/Recall show Model 2 is far superior!
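To see what each model's mistakes actually cost, it can help to pull the raw counts out of a confusion matrix and weight them. A rough sketch that rebuilds the same two models; the cost values (1 per false alarm, 50 per missed case) and the total_cost helper are assumptions chosen purely for illustration.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Same setup as above: 1000 samples, 1% positive
y_true = np.concatenate([np.zeros(990, dtype=int), np.ones(10, dtype=int)])

y_pred_bad = np.zeros(1000, dtype=int)      # Model 1: always predicts 0
y_pred_better = np.zeros(1000, dtype=int)   # Model 2
y_pred_better[990:995] = 1                  # 5 True Positives
y_pred_better[100:102] = 1                  # 2 False Positives

# Hypothetical costs: a missed case is assumed 50x worse than a false alarm
COST_FP = 1
COST_FN = 50


def total_cost(y_true, y_pred):
    """Weight false positives and false negatives by their assumed costs."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return int(fp * COST_FP + fn * COST_FN)


print("Model 1 cost:", total_cost(y_true, y_pred_bad))     # 10 FN -> 500
print("Model 2 cost:", total_cost(y_true, y_pred_better))  # 2 FP + 5 FN -> 252
```

With these assumed weights, Model 2 is about half as costly even though both models show the same rounded accuracy.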
🎯 Today's Goal (What you should do)
✔️ Recognize accuracy's flaw for imbalanced data.
✔️ Pick Precision when False Positives hurt most.
✔️ Pick Recall when False Negatives hurt most.
✔️ Understand what your model's mistakes truly cost.
Pandas, NumPy, scikit-learn, and TensorFlow for machine learning, as well as Tableau and Matplotlib for data visualization. Online courses, tutorials, and coding bootcamps can provide structured learning paths.
2. Identify Your Niche
Data science spans various industries, including healthcare, finance, marketing, and technology. Explore these fields to determine where your interests lie. Understanding the specific challenges and data types in your chosen industry will help you tailor your learning and make you more effective in your future role.
3. Build a Strong Portfolio
Start working on small projects that demonstrate your skills and knowledge. These could include data analysis tasks, machine learning models, or visualizations based on publicly available datasets. Use platforms like GitHub to showcase your work, and consider writing blog posts or creating presentations to explain your projects. A well-rounded portfolio not only highlights your technical capabilities but also reflects your problem-solving approach.
4. Engage with the Community
Join data science communities online (like Kaggle, Stack Overflow, or LinkedIn groups) to connect with professionals in the field. Participating in discussions, attending webinars, and contributing to open-source projects can enhance your learning experience and expand your network.
5. Pursue Continuous Learning
Data science is an ever-evolving field, so staying updated with the latest trends, techniques, and tools is crucial. Follow relevant blogs, podcasts, and research papers. Consider pursuing advanced certifications or degrees to deepen your expertise.
6. Gain Practical Experience
Look for internships, volunteer opportunities, or part-time positions that allow you to apply your skills in real-world scenarios. Practical experience will not only reinforce your learning but also give you insights into the day-to-day responsibilities of a data scientist.
By following these steps, you can build a solid foundation in data science and position yourself for success in this dynamic and rewarding field.
