Data science/ML/AI


Data science and machine learning hub Python, SQL, stats, ML, deep learning, projects, PDFs, roadmaps and AI resources. For beginners, data scientists and ML engineers 👉 https://rebrand.ly/bigdatachannels DMCA: @disclosure_bds Contact: @mldatascientist

Network: Programming, data science, ML - free courses by Big Data Specialist
India: 31 551
Technologies & Applications: 9 384...

📈 Analytical overview of Telegram channel Data science/ML/AI

Channel Data science/ML/AI (@datascience_bds) in the English language segment is an active participant. Currently, the community unites 13 684 subscribers, ranking 9 384 in the Technologies & Applications category and 31 551 in the India region.

📊 Audience metrics and dynamics

Since its creation (date unknown), the project has grown to an audience of 13 684 subscribers.

According to the latest data from 11 June 2026, the channel shows stable activity. The number of subscribers grew by 150 over the last 30 days and by 11 over the last 24 hours, and overall reach remains high.

Verification status: Not verified
Engagement rate (ER): The average audience engagement rate is 8.13%. Within the first 24 hours after publication, content typically collects 2.20% reactions from the total number of subscribers.
Post reach: On average, each post receives 1 112 views. Within the first day, a publication typically gains 301 views.
Reactions and interaction: The audience actively supports content: the average number of reactions per post is 5.
Thematic interests: Content is focused on key topics such as panda, learning, row, api, ethic.

📝 Description and content policy

The author describes the channel as follows:
“Data science and machine learning hub Python, SQL, stats, ML, deep learning, projects, PDFs, roadmaps and AI resources. For beginners, data scientists and ML engineers 👉 https://rebrand.ly/bigdatachannels DMCA: @disclosure_bds Contact: @mldatasci...”

Thanks to the high frequency of updates (latest data received on 12 June, 2026), the channel maintains relevance and a high level of publication reach. Analytics show that the audience actively interacts with content, making it an important point of influence in the Technologies & Applications category.

Subscribers: 13 684 (+11 in 24 hours, +22 in 7 days, +150 in 30 days)
Post views: 1 112 (~301 in 24 hours, ~435 in 48 hours)
Engagement rate: 8.13%
Posts per day: ~1

Posts Archive


Interesting Terminologies to Understand in Machine Learning
Bag of words: A technique used to extract features from text. It counts how many times each word appears in a document (corpus) and then transforms that information into a dataset.
Categorical label: A label with a discrete set of possible values, such as "is a cat" and "is not a cat."
Clustering: An unsupervised learning task that helps to determine whether there are any naturally occurring groupings in the data.
CNN: Convolutional Neural Networks (CNN) represent nested filters over grid-organized data. They are by far the most commonly used type of model when processing images.
Continuous (regression) label: A label that does not have a discrete set of possible values, so the number of possible values is potentially unlimited.
Data vectorization: A process that converts non-numeric data into a numerical format so that it can be used by a machine learning model.
Discrete: A term taken from statistics referring to an outcome taking on only a finite number of values (such as days of the week).
FFNN: The most straightforward way of structuring a neural network, the Feed Forward Neural Network (FFNN) structures neurons in a series of layers, with each neuron in a layer containing weights to all neurons in the previous layer.
Hyperparameters: Settings on the model which are not changed during training but can affect how quickly or how reliably the model trains, such as the number of clusters the model should identify.
Log loss: Used to calculate how uncertain your model is about the predictions it is generating.
Hyperplane: A mathematical term for the generalization of a plane to spaces with more than two dimensions.
Impute: A common term referring to different statistical tools which can be used to estimate missing values in your dataset.
Label: Data that already contains the solution.
Loss function: Used to codify the model's distance from its goal.
Machine learning (ML): A modern software development technique that enables computers to solve problems by using examples of real-world data.
Model accuracy: The fraction of predictions a model gets right.
Continuous: Floating-point values with an infinite range of possible values. The opposite of categorical or discrete values, which take on a limited number of possible values.
Model inference: When the trained model is used to generate predictions.
Model: An extremely generic program, made specific by the data used to train it.
Model parameters: Settings or configurations the training algorithm can update to change how the model behaves.
Model training algorithms: Work through an iterative process in which the current model iteration is analyzed to determine what changes can be made to get closer to the goal. Those changes are made and the iteration continues until the model is evaluated to meet the goals.
Neural networks: A collection of very simple models connected together. These simple models are called neurons. The connections between these models are trainable model parameters called weights.
Outliers: Data points that are significantly different from others in the same sample.
Plane: A mathematical term for a flat surface (like a piece of paper) on which two points can be joined by a straight line.
Regression: A common task in supervised machine learning.
Reinforcement learning: The algorithm figures out which actions to take in a situation to maximize a reward (in the form of a number) on the way to reaching a specific goal.
RNN/LSTM: Recurrent Neural Networks (RNN) and the related Long Short-Term Memory (LSTM) model types are structured to effectively represent for loops in traditional computing, collecting state while iterating over some object. They can be used for processing sequences of data.
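As a rough illustration of the bag-of-words entry in the glossary, here is a minimal sketch in plain Python. The function name `bag_of_words`, the whitespace tokenization, and the two sample sentences are assumptions made for this example; real pipelines usually rely on a library tokenizer.

```python
from collections import Counter


def bag_of_words(documents):
    """Return (vocabulary, matrix); matrix[i][j] counts vocabulary[j] in documents[i]."""
    # Naive tokenization (an assumption for brevity): lowercase, split on whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    # The vocabulary is every distinct word seen in any document, in sorted order.
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Counter returns 0 for words that do not occur in this document.
        matrix.append([counts[word] for word in vocabulary])
    return vocabulary, matrix


docs = ["the cat sat", "the cat saw the dog"]
vocab, rows = bag_of_words(docs)
print(vocab)
print(rows)
```

Each row is a numeric feature vector for one document, which is also a simple case of the data vectorization entry in the same glossary.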


Model Evaluation Metrics
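The image attached to this post is not preserved in the archive. As a hedged sketch of two metrics the channel's glossary does define, model accuracy and log loss, the snippet below computes both for a binary classifier. The helper names, the 0.5 decision threshold, and the sample labels and probabilities are assumptions made for illustration only.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; y_prob holds predicted probabilities of class 1."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        # Clip probabilities so that log(0) can never occur.
        p = min(max(p, eps), 1 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


labels = [1, 0, 1, 1]           # made-up ground truth
probs = [0.9, 0.2, 0.6, 0.4]    # made-up predicted probabilities
preds = [1 if p >= 0.5 else 0 for p in probs]  # assumed 0.5 threshold

print(accuracy(labels, preds))
print(log_loss(labels, probs))
```

Accuracy only looks at the thresholded predictions, while log loss also penalizes predictions that are correct but unconfident, so the two can rank models differently.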


Different Data Sources and How They Are Collected
1) Company Data Sources: Web Events, Survey Data, Customer Data, Logistics Data and Financial Transactions.
2) Open Data Sources: Public Data APIs and Public Records. APIs request data over the internet. Interesting APIs include Twitter, Wikipedia, Yahoo Finance, Google Maps, etc. Public records data can be collected by international organisations like the World Bank, UN and WTO.
3) National Statistical Offices: Censuses, Surveys.
4) Government Agencies: Weather Data, Environment Data, Population Data.
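To make "APIs request data over the internet" a little more concrete, here is a minimal sketch of the two steps involved: building a query URL and decoding a JSON response. The endpoint `api.example.org`, its parameters, and the sample response body are all hypothetical and do not describe any of the services named in the post. No network call is made.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration; it is not a real service.
BASE_URL = "https://api.example.org/v1/indicators"


def build_request_url(base_url, **params):
    """Append query parameters (sorted by name) to the base URL."""
    return f"{base_url}?{urlencode(sorted(params.items()))}"


def parse_records(body):
    """Decode a JSON response body into a list of (year, value) tuples."""
    payload = json.loads(body)
    return [(item["year"], item["value"]) for item in payload["data"]]


url = build_request_url(BASE_URL, country="IN", indicator="population")

# Made-up placeholder body standing in for what such an API might return.
sample_body = '{"data": [{"year": 2020, "value": 1.0}, {"year": 2021, "value": 2.0}]}'
records = parse_records(sample_body)

print(url)
print(records)
```

In practice the body would come from an HTTP request to a real API, and each provider documents its own URL scheme, authentication, and response layout.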


A Guide to Understanding Different Types of Data
Hey There😃!! Do you know the different formats your data can be in and how to identify them?😌 Here's a guide that can help you😉
Structured Data: It is in a standardized format, has a well-defined structure, complies with a data model, follows a persistent order, and is easily accessed by humans and programs. This data type is generally stored in a database, normally in a table or a number of tables.
Examples: data from surveys, different sensors, point-of-sale details, and financial information.
Unstructured Data: It does not conform to any data model and has no easily identifiable structure. There is no organization to it, it has no rules or format, it does not fit into a conventional database structure, and it cannot be easily used by programs.
Examples: raw videos from surveillance cameras, reports, file shares with corporate documents, images, and memos.
Semi Structured Data: It is not in a relational database and does not conform to a strict data model, but has some elements of structure. It does not fit neatly into rows and columns. This data contains metadata and tags which help it to be grouped appropriately and describe the way it is stored. Semi-structured data is organized hierarchically, although the entities within a group may not have the same properties or attributes. It is difficult to automate and manage and is harder for programs to access than structured data.
Examples: Wikipedia pages with links, a collection of scientific papers in JSON format with authors, emails, zipped files, web files, and binary executables.
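The difference between structured and semi-structured data can be sketched with two tiny samples. The CSV and JSON snippets below are invented for this example: in the CSV every row shares the same columns, while the JSON records carry their own tags and need not share the same fields.

```python
import csv
import io
import json

# Structured: every row has the same columns, so it maps directly onto a table.
csv_text = "order_id,amount\n1,9.50\n2,12.00\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: keys act as tags, but records need not share the same fields.
json_text = '[{"title": "Paper A", "authors": ["X"]}, {"title": "Paper B", "year": 2020}]'
records = json.loads(json_text)

# Count the distinct "shapes" (sets of field names) found in each dataset.
columns = {tuple(row.keys()) for row in rows}
fields = {tuple(sorted(record.keys())) for record in records}

print(len(columns))
print(len(fields))
```

The structured sample has a single shape, whereas the semi-structured one has as many shapes as the records choose to use, which is why it usually needs extra handling before it can be loaded into a table.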




A Guide to Understanding Different Types of Data
Hey There😃!! Do you know the different formats your data can be in and how to identify them?😌 Here's a guide that can help you😉
Structured Data: It is in a standardized format, has a well-defined structure, complies with a data model, follows a persistent order, and is easily accessed by humans and programs. This data type is generally stored in a database, normally in a table or a number of tables.
Examples: data from surveys, different sensors, web logs, point-of-sale details, and financial information.
Unstructured Data: It does not conform to any data model and has no easily identifiable structure. There is no organization to it, it has no rules or format, it does not fit into a conventional database structure, and it cannot be easily used by programs.
Semi Structured Data: It is not in a relational database and does not conform to a strict data model, but has some elements of structure. It does not fit neatly into rows and columns. This data contains metadata and tags which help it to be grouped appropriately and describe the way it is stored. Semi-structured data is organized hierarchically, although the entities within a group may not have the same properties or attributes. It is difficult to automate and manage and is harder for programs to access than structured data.


🚀 Join us this week for FREE webinars and explore the fields of tech! You will find the answers to all your questions at our webinars. Open the link https://crst.co/PSO8Z, make your choice and apply now while there are still seats available. See you there!
▶️ July 12, 11:00 AM PT - Tech Jobs for Beginners: Become a Software Tester
▶️ July 13, 11:00 AM PT - Best Remote Tech Jobs in 2022: Career Guidance for Everyone
▶️ July 13, 12:00 PM PT - Tech Sales Career Path to Secure Your Future in 2022
▶️ July 14, 11:00 AM PT - Career Change: Get a Remote Job as a Software Tester
▶️ July 17 - Manual QA. First Free lesson
▶️ July 18 - Tech Sales Training. First Free lesson
▶️ July 19 - Tech Sales Program. First Free lesson
Special offer for all participants!
✅ Apply via https://crst.co/PSO8Z


Fundamentals of Data Visualization: A primer on making informative and compelling figures
Author: Claus O. Wilke
Book Link; Read Me!


Tools Regularly Used By Data Scientists


**A List Of Free Data Science Tutorials**
🔘 Python for Data Science - Great Learning
Rating ⭐️: 4.2 out of 5
Duration ⏰: 1 hour 55 mins on-demand video
Students 👨‍🏫: 25,605
Created by: Bharani Akella
🔗 Course link
🔘 A - Z™ Python crash course for Data Science 2021
Rating ⭐️: 4.4 out of 5
Duration ⏰: 2 hours on-demand video
Students 👨‍🏫: 7,012
Created by: Abb Selec
🔗 Course link
🔘 An Athlete’s Guide To Data Science
Rating ⭐️: 3.0 out of 5
Duration ⏰: 1 hour 1 min on-demand video
Students 👨‍🏫: 1,975
Created by: Jon pierre Jones
🔗 Course link
🔘 NumPy for Data Science Beginners: 2021
Rating ⭐️: 4.0 out of 5
Duration ⏰: 1 hour 51 mins on-demand video
Students 👨‍🏫: 11,535
Created by: Abb Selec
🔗 Course link
🔘 Learn Data Science With R Part 1 of 10
Rating ⭐️: 4.1 out of 5
Duration ⏰: 8 hours 42 mins on-demand video
Students 👨‍🏫: 32,824
Created by: Ram Reddy
🔗 Course link
🔘 Data Science with Analogies, Algorithms and Solved Problems
Rating ⭐️: 4.1 out of 5
Duration ⏰: 1 hour 19 mins on-demand video
Students 👨‍🏫: 15,706
Created by: Ajay Dhruv, Neha Mayekar, Shreya Pattewar, Shubham Patil
🔗 Course link
🔘 Data Science, Machine Learning, Data Analysis, Python & R
Rating ⭐️: 3.8 out of 5
Duration ⏰: 8 hours 7 mins on-demand video
Students 👨‍🏫: 89,564
Created by: DATAhill Solutions Srinivas Reddy
🔗 Course link
🔘 Intro to Data for Data Science
Rating ⭐️: 4.6 out of 5
Duration ⏰: 1 hour 1 min on-demand video
Students 👨‍🏫: 9,727
Created by: Matthew Renze
🔗 Course link
🔘 Learn NumPy Fundamentals (Python Library for Data Science)
Rating ⭐️: 4.3 out of 5
Duration ⏰: 1 hour 49 mins on-demand video
Students 👨‍🏫: 27,038
Created by: Derrick Sherrill
🔗 Course link
#datascience #datanalysis #python #numpy #pandas #machinelearning
➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖
Join @datascience_bds for more cool data science materials.
*This channel belongs to @bigdataspecialist group


Just wanted to share this here as well in case somebody is interested.


Repost from Programming, data science, ML - free courses by Big Data Specialist

Want to make sure your Spark applications reach the best performance? We invite you to our Dynamic Talks #90 | Spark performance mastery!
⏰ Date and time: July 20, 6:30 pm (CET)
The speaker is Iñigo San Aniceto Orbegozo, Staff Big Data Engineer at Grid Dynamics.
💻 Participation is free but registration is required: https://forms.gle/UVvfWG5LeZAXTuNQ6
More about the event: https://fb.me/e/1U9Vq4epw


The Scikit-Learn Guide
Looking to improve your knowledge of machine learning algorithms? There's no better place to start than the sklearn documentation. There is a lot of interesting information you can gain there:
https://scikit-learn.org/stable/


Introduction to Machine Learning, IIT Kharagpur 🆓 Free Online Course 💻 44 Lecture Videos 🏃‍♂️ Self paced Teacher 👨‍🏫 : Prof. S. Sarkar 🔗 https://nptel.ac.in/courses/106105152


Text Classification with TensorFlow This is an intermediate-level Python course taught by MIT grad student Kylie Ying. You can code along at home in your browser. You'll use TensorFlow to train Neural Networks, visualize a diabetes dataset, and perform Text Classification on wine reviews. (2 hour YouTube course) Link: https://www.freecodecamp.org/news/text-classification-tensorflow/


Machine Learning with Python: Zero to GBMs
This is a practical and beginner-friendly introduction to supervised machine learning, decision trees, and gradient boosting using Python. This is a self-paced course where you can:
👌 Watch hands-on coding-focused video tutorials
👌 Practice coding with cloud Jupyter notebooks
👌 Build an end-to-end real-world course project
👌 Earn a verified certificate of accomplishment
👌 Interact with a global community of learners
👌 Solve 2 coding assignments and build a course project where you'll train ML models using large real-world datasets
Link: https://jovian.ai/learn/machine-learning-with-python-zero-to-gbms


SQL Interview Questions