Data science/ML/AI

Data science and machine learning hub Python, SQL, stats, ML, deep learning, projects, PDFs, roadmaps and AI resources. For beginners, data scientists and ML engineers 👉 https://rebrand.ly/bigdatachannels DMCA: @disclosure_bds Contact: @mldatascientist

Network: Programming, data science, ML - free courses by Big Data Specialist | India: 31 551 | Technologies & Applications: 9 384...

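📈 Analytics overview of the Telegram channel Data science/ML/AI

The channel Data science/ML/AI (@datascience_bds) is an active participant in the English-language segment. The community currently has 13 684 subscribers, ranks 9 384th in the Technologies & Applications category, and ranks 31 551st in India.

📊 Audience metrics and growth dynamics

Since its creation (date unknown), the project has maintained rapid growth, attracting 13 684 subscribers.

According to the latest data as of 11 June 2026, the channel is operating steadily. The subscriber count changed by 150 over the past 30 days and by 11 over the past 24 hours, and overall reach remains considerable.

Verification status: not verified
Engagement rate (ER): the average audience engagement rate is 8.13%. Within 24 hours of publication, a post typically draws reactions from 2.20% of all subscribers.
Post reach: each post averages 1 112 views and typically accumulates 301 views on its first day.
Interaction and feedback: the audience participates actively, with an average of 5 reactions per post.
Topic focus: content centers on core topics such as panda, learning, row, api, ethic.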

📝 Description and content strategy

The author positions the channel as a platform for expressing subjective opinions:
“Data science and machine learning hub Python, SQL, stats, ML, deep learning, projects, PDFs, roadmaps and AI resources. For beginners, data scientists and ML engineers 👉 https://rebrand.ly/bigdatachannels DMCA: @disclosure_bds Contact: @mldatasci...”

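With frequent updates (latest data collected on 12 June 2026), the channel stays fresh and keeps its reach high. The analysis shows active audience engagement, making it a key point of influence in the Technologies & Applications category.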

Subscribers: 13 684 (+11 in 24 hours, +22 in 7 days, +150 in 30 days)
Post views: 1 112 (~301 in 24 hours, ~435 in 48 hours)
Engagement rate: 8.13%
Posts per day: ~1

Post archive

Interesting Terminologies to Understand in Machine Learning

Bag of words: A technique used to extract features from text. It counts how many times each word appears in a document and turns those counts into a dataset.
Categorical label: A label with a discrete set of possible values, such as "is a cat" and "is not a cat."
Clustering: An unsupervised learning task that helps determine whether there are any naturally occurring groupings in the data.
CNN: Convolutional Neural Networks (CNN) represent nested filters over grid-organized data. They are by far the most commonly used type of model when processing images.
Continuous (regression) label: A label that does not have a discrete set of possible values, which means there may be an unlimited number of possibilities.
Data vectorization: A process that converts non-numeric data into a numerical format so that it can be used by a machine learning model.
Discrete: A term taken from statistics referring to an outcome taking on only a finite number of values (such as days of the week).
FFNN: The most straightforward way of structuring a neural network, the Feed Forward Neural Network (FFNN) structures neurons in a series of layers, with each neuron in a layer containing weights to all neurons in the previous layer.
Hyperparameters: Settings on the model that are not changed during training but can affect how quickly or how reliably the model trains, such as the number of clusters the model should identify.
Log loss: Used to calculate how uncertain your model is about the predictions it is generating.
Hyperplane: A mathematical term for a flat surface with one dimension fewer than the space that contains it. It generalizes a line in two dimensions and a plane in three dimensions to higher-dimensional spaces.
Impute: A common term referring to different statistical tools that can be used to estimate missing values in your dataset.
Label: Data that already contains the solution.
Loss function: Used to codify the model's distance from its goal.
Machine learning (ML): A modern software development technique that enables computers to solve problems by using examples of real-world data.
Model accuracy: The fraction of predictions a model gets right.
Continuous: Floating-point values with an infinite range of possible values. The opposite of categorical or discrete values, which take on a limited number of possible values.
Model inference: When the trained model is used to generate predictions.
Model: An extremely generic program, made specific by the data used to train it.
Model parameters: Settings or configurations the training algorithm can update to change how the model behaves.
Model training algorithms: These work through an iterative process in which the current model iteration is analyzed to determine what changes can be made to get closer to the goal. Those changes are made and the iteration continues until the model is evaluated to meet the goals.
Neural networks: A collection of very simple models connected together. These simple models are called neurons. The connections between these models are trainable model parameters called weights.
Outliers: Data points that are significantly different from others in the same sample.
Plane: A mathematical term for a flat surface (like a piece of paper) on which two points can be joined by a straight line.
Regression: A common task in supervised machine learning in which the model predicts a continuous value.
Reinforcement learning: The algorithm figures out which actions to take in a situation to maximize a reward (in the form of a number) on the way to reaching a specific goal.
RNN/LSTM: Recurrent Neural Networks (RNN) and the related Long Short-Term Memory (LSTM) model types are structured to effectively represent for loops in traditional computing, collecting state while iterating over some object. They can be used for processing sequences of data.
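
A few of the terms above (bag of words, model accuracy, log loss) can be made concrete with a small sketch. This is an illustration only, not taken from the post: the function names, the whitespace tokenization, the sample sentences, and the 0.5 decision threshold are all assumptions chosen for the example.

```python
import math
from collections import Counter


def bag_of_words(documents):
    """Turn a list of texts into word-count vectors over a shared vocabulary."""
    # Assumption: lowercase + whitespace splitting is enough for this demo.
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def log_loss(y_true, y_prob, eps=1e-15):
    """Average negative log-likelihood for binary labels (0 or 1)."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip so log(0) never occurs
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical sample documents.
docs = ["the cat sat", "the dog sat on the cat"]
vocab, vectors = bag_of_words(docs)
print(vocab)    # ['cat', 'dog', 'on', 'sat', 'the']
print(vectors)  # [[1, 0, 0, 1, 1], [1, 1, 1, 1, 2]]

# Hypothetical labels and predicted probabilities for the positive class.
y_true = [1, 0, 1, 1]
y_prob = [0.9, 0.2, 0.6, 0.4]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # assumed 0.5 threshold
print(accuracy(y_true, y_pred))  # 0.75
print(log_loss(y_true, y_prob))
```

Accuracy only looks at the final yes/no decision, while log loss also penalizes the model for being unsure (the 0.6) or confidently wrong (the 0.4), which is why the two numbers can tell different stories about the same predictions.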

Model Evaluation Metrics

Different Data Sources and How They Are Collected

1) Company Data Sources: web events, survey data, customer data, logistics data, and financial transactions.
2) Open Data Sources: public data APIs and public records. APIs request data over the internet; interesting APIs include Twitter, Wikipedia, Yahoo Finance, and Google Maps. Public records data is collected by international organisations such as the World Bank, UN, and WTO.
3) National Statistical Offices: censuses and surveys.
4) Government Agencies: weather data, environment data, and population data.
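
To illustrate what "APIs request data over the internet" looks like in practice, here is a minimal sketch using only the Python standard library. It assumes Wikipedia's public REST page-summary endpoint and its `title` and `extract` response fields; the helper names, the User-Agent string, and the sample payload are hypothetical, and other APIs may need keys or different headers.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumed endpoint: Wikipedia's public REST API page-summary route.
BASE_URL = "https://en.wikipedia.org/api/rest_v1/page/summary/"


def build_summary_url(title):
    """Build the request URL for a page title, escaping special characters."""
    return BASE_URL + quote(title.replace(" ", "_"), safe="")


def parse_summary(payload):
    """Keep only the fields of interest from the JSON response body."""
    data = json.loads(payload)
    return {"title": data.get("title"), "extract": data.get("extract")}


def fetch_summary(title, timeout=10):
    """Send the request over the internet (requires network access)."""
    request = Request(
        build_summary_url(title),
        headers={"User-Agent": "example-script/0.1"},  # hypothetical client name
    )
    with urlopen(request, timeout=timeout) as response:
        return parse_summary(response.read().decode("utf-8"))


# Offline demonstration with a hand-written payload (not real API output).
sample_payload = '{"title": "Data science", "extract": "Sample text.", "lang": "en"}'
print(build_summary_url("Data science"))
print(parse_summary(sample_payload))
```

Calling `fetch_summary("Data science")` would perform the live request; it is left out of the demonstration so the sketch runs without a network connection.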

A Guide to Understanding Different Types of Data

Hey There😃!! Do you know the different formats your data can be in and how to identify them?😌 Here's a guide that can help you😉

Structured Data: It is in a standardized format, has a well-defined structure, complies with a data model, follows a persistent order, and is easily accessed by humans and programs. This data type is generally stored in a database, normally in a table or a number of tables.
Examples: data from surveys, different sensors, web logs, point-of-sale details, and financial information.

Unstructured Data: It does not conform to any data model and has no easily identifiable structure. It has no predefined organization, so it does not fit into the rows and columns of a database table, follows no fixed rules or format, and cannot be easily used by programs.
Examples: raw videos from surveillance cameras, reports, file shares with corporate documents, images, and memos.

Semi-Structured Data: It is not stored in a relational database and does not conform to a strict data model, but it has some elements of structure. It does not fit neatly into rows and columns. It contains metadata and tags that help group it appropriately and describe the way it is stored. Semi-structured data is organized hierarchically, although the entities within a group may not have the same properties or attributes. It is harder to automate, manage, and access programmatically than structured data.
Examples: Wikipedia pages with links, a collection of scientific papers in JSON format with authors, emails, zipped files, web files, and binary executables.
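
The three categories can be seen side by side in a short sketch. The sample data below is made up for illustration (hypothetical orders, papers, and a memo); it only aims to show how much structure a program can rely on in each case.

```python
import csv
import io
import json

# Structured: every row follows the same columns, so it maps onto a table.
csv_text = "order_id,amount\n1,9.99\n2,24.50\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))
total = sum(float(row["amount"]) for row in rows)

# Semi-structured: tagged and hierarchical, but records may differ in fields.
json_text = (
    '[{"title": "Paper A", "authors": ["Ann", "Bo"]},'
    ' {"title": "Paper B", "year": 2021}]'
)
papers = json.loads(json_text)
# The second record has no "authors" key, so a default is needed.
author_counts = [len(paper.get("authors", [])) for paper in papers]

# Unstructured: free text with no fields; any structure has to be inferred.
memo = "Reminder: the quarterly review moved to Friday. Bring the sales figures."
word_count = len(memo.split())

print(rows)
print(round(total, 2))  # 34.49
print(author_counts)    # [2, 0]
print(word_count)       # 11
```

With the CSV, column names can be used directly. With the JSON, the code has to tolerate missing keys. With the memo, only generic operations such as word counting are available until further processing extracts meaning.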

🚀 Join us this week for FREE webinars and explore the fields of tech! You will find the answers to all your questions at our webinars. Open the link https://crst.co/PSO8Z, make your choice, and apply now while there are still seats available. See you there!

▶️ July 12, 11:00 AM PT - Tech Jobs for Beginners: Become a Software Tester
▶️ July 13, 11:00 AM PT - Best Remote Tech Jobs in 2022: Career Guidance for Everyone
▶️ July 13, 12:00 PM PT - Tech Sales Career Path to Secure Your Future In 2022
▶️ July 14, 11:00 AM PT - Career Change: Get a Remote Job as a Software Tester
▶️ July 17 - Manual QA. First Free lesson
▶️ July 18 - Tech Sales Training. First Free lesson
▶️ July 19 - Tech Sales Program. First Free lesson

Special offer for all participants!
✅ Apply via the link: https://crst.co/PSO8Z

Fundamentals of Data Visualization
A primer on making informative and compelling figures
Author: Claus O. Wilke
Book link: Read Me!

Tools Regularly Used By Data Scientists

A List Of Free Data Science Tutorials

🔘 Python for Data Science - Great Learning
Rating ⭐️: 4.2 out of 5
Duration ⏰: 1 hour 55 mins on-demand video
Students 👨‍🏫: 25,605
Created by: Bharani Akella
🔗 Course link

🔘 A - Z™ Python crash course for Data Science 2021
Rating ⭐️: 4.4 out of 5
Duration ⏰: 2 hours on-demand video
Students 👨‍🏫: 7,012
Created by: Abb Selec
🔗 Course link

🔘 An Athlete’s Guide To Data Science
Rating ⭐️: 3.0 out of 5
Duration ⏰: 1 hour 1 min on-demand video
Students 👨‍🏫: 1,975
Created by: Jon pierre Jones
🔗 Course link

🔘 NumPy for Data Science Beginners: 2021
Rating ⭐️: 4.0 out of 5
Duration ⏰: 1 hour 51 mins on-demand video
Students 👨‍🏫: 11,535
Created by: Abb Selec
🔗 Course link

🔘 Learn Data Science With R Part 1 of 10
Rating ⭐️: 4.1 out of 5
Duration ⏰: 8 hours 42 mins on-demand video
Students 👨‍🏫: 32,824
Created by: Ram Reddy
🔗 Course link

🔘 Data Science with Analogies, Algorithms and Solved Problems
Rating ⭐️: 4.1 out of 5
Duration ⏰: 1 hour 19 mins on-demand video
Students 👨‍🏫: 15,706
Created by: Ajay Dhruv, Neha Mayekar, Shreya Pattewar, Shubham Patil
🔗 Course link

🔘 Data Science, Machine Learning, Data Analysis, Python & R
Rating ⭐️: 3.8 out of 5
Duration ⏰: 8 hours 7 mins on-demand video
Students 👨‍🏫: 89,564
Created by: DATAhill Solutions Srinivas Reddy
🔗 Course link

🔘 Intro to Data for Data Science
Rating ⭐️: 4.6 out of 5
Duration ⏰: 1 hour 1 min on-demand video
Students 👨‍🏫: 9,727
Created by: Matthew Renze
🔗 Course link

🔘 Learn NumPy Fundamentals (Python Library for Data Science)
Rating ⭐️: 4.3 out of 5
Duration ⏰: 1 hour 49 mins on-demand video
Students 👨‍🏫: 27,038
Created by: Derrick Sherrill
🔗 Course link

#datascience #datanalysis #python #numpy #pandas #machinelearning
➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖➖
Join @datascience_bds for more cool data science materials.
*This channel belongs to @bigdataspecialist group

Just wanted to share this here as well in case somebody is interested.

Repost from Programming, data science, ML - free courses by Big Data Specialist

Want to make sure your Spark applications reach the best performance? We invite you to our Dynamic Talks #90 | Spark performance mastery!
⏰ Date and time: July 20, 6:30 pm (CET)
The speaker is Iñigo San Aniceto Orbegozo, Staff Big Data Engineer at Grid Dynamics.
💻 Participation is free but registration is required: https://forms.gle/UVvfWG5LeZAXTuNQ6
More about the event: https://fb.me/e/1U9Vq4epw

The Scikit-Learn Guide
Looking to improve your knowledge of machine learning algorithms? There's no better place to start than the sklearn documentation. There is a lot of interesting information to be gained there: https://scikit-learn.org/stable/

Introduction to Machine Learning, IIT Kharagpur
🆓 Free Online Course
💻 44 Lecture Videos
🏃‍♂️ Self-paced
Teacher 👨‍🏫: Prof. S. Sarkar
🔗 https://nptel.ac.in/courses/106105152

Text Classification with TensorFlow
This is an intermediate-level Python course taught by MIT grad student Kylie Ying. You can code along at home in your browser. You'll use TensorFlow to train Neural Networks, visualize a diabetes dataset, and perform Text Classification on wine reviews. (2 hour YouTube course)
Link: https://www.freecodecamp.org/news/text-classification-tensorflow/

Machine Learning with Python: Zero to GBMs
This is a practical and beginner-friendly introduction to supervised machine learning, decision trees, and gradient boosting using Python. This is a self-paced course where you can:
👌 Watch hands-on, coding-focused video tutorials
👌 Practice coding with cloud Jupyter notebooks
👌 Build an end-to-end real-world course project
👌 Earn a verified certificate of accomplishment
👌 Interact with a global community of learners
👌 Solve 2 coding assignments and build a course project where you'll train ML models using large real-world datasets
Link: https://jovian.ai/learn/machine-learning-with-python-zero-to-gbms

SQL Interview Questions