ru
Feedback
Data Science & Machine Learning

Data Science & Machine Learning

Открыть в Telegram

Join this channel to learn data science, artificial intelligence and machine learning with funny quizzes, interesting projects and amazing resources for free For collaborations: @love_data

Больше

📈 Аналитический обзор Telegram-канала Data Science & Machine Learning

Канал Data Science & Machine Learning (@datasciencefun) языкового сегмента Английский является активным участником. Сейчас сообщество объединяет 75 860 подписчиков, занимая 2 107 место в категории Образование и 4 219 место в регионе Индия.

📊 Показатели аудитории и динамика

С момента создания невідомо проект демонстрирует стремительный рост, собрав аудиторию из 75 860 подписчиков.

Согласно последним данным от 22 июня, 2026, канал показывает стабильную активность. За последние 30 дней изменение числа участников составило 728, а за последние 24 часа — -2, при этом общий охват остаётся высоким.

  • Статус верификации: Не верифицирован
  • Уровень вовлечённости (ER): Средний показатель вовлечённости аудитории составляет 3.00%. В первые 24 часа после публикации контент обычно набирает 1.05% реакций от общего числа подписчиков.
  • Охват публикаций: В среднем каждый пост получает 2 278 просмотров. В течение первых суток публикация набирает 794 просмотров.
  • Реакции и взаимодействия: Аудитория активно поддерживает контент: среднее количество реакций на один пост — 3.
  • Тематические интересы: Контент сосредоточен на ключевых темах, таких как learning, accuracy, distribution, panda, dataset.

📝 Описание и контентная политика

Автор описывает ресурс как площадку для выражения субъективного мнения:
Join this channel to learn data science, artificial intelligence and machine learning with funny quizzes, interesting projects and amazing resources for free For collaborations: @love_data

Благодаря высокой частоте обновлений (последние данные получены 23 июня, 2026) канал поддерживает актуальность и высокий уровень охвата публикаций. Аналитика показывает, что аудитория активно взаимодействует с контентом, что делает его важной точкой влияния в категории Образование.

75 860
Подписчики
-224 часа
+637 дней
+72830 день
Архив постов
Is accuracy always a good metric? Accuracy is not a good performance metric when there is imbalance in the dataset. For example, in binary classification with 95% of A class and 5% of B class, a constant prediction of A class would have an accuracy of 95%. In case of imbalance dataset, we need to choose Precision, recall, or F1 Score depending on the problem we are trying to solve. What are precision, recall, and F1-score? Precision and recall are classification evaluation metrics: P = TP / (TP + FP) and R = TP / (TP + FN). Where TP is true positives, FP is false positives and FN is false negatives In both cases the score of 1 is the best: we get no false positives or false negatives and only true positives. F1 is a combination of both precision and recall in one score (harmonic mean): F1 = 2 * PR / (P + R). Max F score is 1 and min is 0, with 1 being the best.

Which website is best to learn Programming and Data science according to you?
Anonymous voting

How do we evaluate classification models? Depending on the classification problem, we can use the following evaluation metrics: Accuracy Precision Recall F1 Score Logistic loss (also known as Cross-entropy loss) Jaccard similarity coefficient score

Highly recommended courses with real world projects to advance your career in Data Science 👉Become a Data scientists https://bit.ly/2TRo9P7 👉Intermediate Python https://bit.ly/3z7YPnZ Use coupon code SAVE75 for extra 75% off on above courses Note: If you are a beginner in Python or Data science, then I will recommend you to enroll in the courses given in the pinned message instead of above courses Offer valid till tomorrow only

Where to get data for your next machine learning project? An overview of 5 amazing resources to accelerate your next project with data! 📌 Google Datasets Easy to search Datasets on Google Dataset Search engine as it is to search for anything on Google Search! You just enter the topic on which you need to find a Dataset. 📌 Kaggle Dataset Explore, analyze, and share quality data. 📌 Open Data on AWS This registry exists to help people discover and share datasets that are available via AWS resources 📌 Awesome Public Datasets A topic-centric list of HQ open datasets. 📌 Azure public data sets Public data sets for testing and prototyping.

What is sigmoid? What does it do? A sigmoid function is a type of activation function, and more specifically defined as a squashing function. Squashing functions limit the output to a range between 0 and 1, making these functions useful in the prediction of probabilities. Sigmod(x) = 1/(1+e^{-x})

What is overfitting? When your model perform very well on your training set but can't generalize the test set, because it adjusted a lot to the training set.

What is the bias-variance trade-off? • Bias is the error introduced by approximating the true underlying function, which can be quite complex, by a simpler model. Variance is a model sensitivity to changes in the training dataset. • Bias-variance trade-off is a relationship between the expected test error and the variance and the bias - both contribute to the level of the test error and ideally should be as small as possible: ExpectedTestError = Variance + Bias² + IrreducibleError • But as a model complexity increases, the bias decreases and the variance increases which leads to overfitting. And vice versa, model simplification helps to decrease the variance but it increases the bias which leads to underfitting.

Can you explain how cross-validation works? Cross-validation is the process to separate your total training set into two subsets: training and validation set, and evaluate your model to choose the hyperparameters. But you do this process iteratively, selecting differents training and validation set, in order to reduce the bias that you would have by selecting only one validation set What is K-fold cross-validation? K fold cross validation is a method of cross validation where we select a hyperparameter k. The dataset is now divided into k parts. Now, we take the 1st part as validation set and remaining k-1 as training set. Then we take the 2nd part as validation set and remaining k-1 parts as training set. Like this, each part is used as validation set once and the remaining k-1 parts are taken together and used as training set. It should not be used in a time series data.

How to validate your models? One of the most common approaches is splitting data into train, validation and test parts. Models are trained on train data, hyperparameters (for example early stopping) are selected based on the validation data, the final measurement is done on test dataset. Another approach is cross-validation: split dataset into K folds and each time train models on training folds and measure the performance on the validation folds. Also you could combine these approaches: make a test/holdout dataset and do cross-validation on the rest of the data. The final quality is measured on test dataset.

Offer extended till 8 June So use code SAVE75 to get instant 75% discount in all the above courses before 8 June

Why is it require to split our data into three parts: train, validation, and test? • The training set is used to fit the model, i.e. to train the model with the data. • The validation set is then used to provide an unbiased evaluation of a model while fine-tuning hyperparameters. This improves the generalization of the model. • Finally, a test data set which the model has never "seen" before should be used for the final evaluation of the model. This allows for an unbiased evaluation of the model. The evaluation should never be performed on the same data that is used for training. Otherwise the model performance would not be representative.

Best Channel To Download All Programming Books And Courses t.me/DataScience_Books