Data Science & Machine Learning

The first channel on Telegram that offers exciting questions, answers, and tests in data science, artificial intelligence, machine learning, and programming languages. For promotions: @love_data

📈 Telegram channel analytics: Data Science & Machine Learning

The channel Data Science & Machine Learning (@datascienceinterviews) is an active player in the English-language segment. The community currently has 27,265 subscribers and ranks 7,190 in the Education category and 15,948 in the India region.

📊 Audience metrics and dynamics

Since its creation (date unknown), the project has grown quickly and attracted 27,265 subscribers.

According to the latest data as of 14 June 2026, the channel shows stable activity. The subscriber count changed by 142 over the past 30 days and by 10 over the past 24 hours, and the channel still maintains wide reach.

  • Verification status: not verified
  • Engagement rate (ER): average audience engagement is 0.56%, and in the first 24 hours after publication, content typically earns reactions from 0.53% of all subscribers.
  • Post reach: each post receives 152 views on average. About 144 of those views are typically collected on the first day.
  • Reactions and engagement: the audience actively supports the channel; the average number of reactions per post is 1.
  • Topical interests: content focuses on key topics such as insidead, mining, pinix, learning, neo.

📝 Description and content policy

The author describes this space as a place for expressing personal views:
The first channel on Telegram that offers exciting questions, answers, and tests in data science, artificial intelligence, machine learning, and programming languages. For promotions: @love_data

Thanks to frequent updates (latest data as of 15 June 2026), the channel stays current and has high reach. The analytics show that the audience actively engages with the content, making the channel an important point of influence in the Education category.

Subscribers: 27,265 (+10 in 24 hours, +40 in 7 days, +142 in 30 days)
Post archive
Coffee Break NumPy Christian Mayer, 2018

What is feature selection? Why do we need it? Feature selection is the process of choosing the features that are relevant for the model to train on. We need it to remove irrelevant features, which can cause the model to under-perform.
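As a rough illustration of the idea above, here is a minimal sketch of one simple filter-style approach: keep only the features whose absolute Pearson correlation with the target exceeds a threshold. The function names, the toy columns, and the 0.3 threshold are all assumptions made for this example; real projects often use other selection methods.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)

def select_features(columns, target, threshold=0.3):
    """Return the names of columns whose |correlation| with the target exceeds the threshold."""
    return [name for name, values in columns.items()
            if abs(pearson(values, target)) > threshold]

# Hypothetical toy data: only "hours_studied" is related to the target.
columns = {
    "hours_studied": [1, 2, 3, 4, 5, 6],
    "shoe_size": [42, 38, 38, 42, 42, 38],
    "constant": [7, 7, 7, 7, 7, 7],
}
target = [52, 55, 61, 70, 74, 79]

selected = select_features(columns, target)
print(selected)
```

A correlation filter only detects linear, one-feature-at-a-time relationships, so it is best read as a first pass rather than a complete selection strategy.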

What are the main parameters of the decision tree model? • maximum tree depth • minimum samples per leaf node • impurity criterion
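For illustration, the three parameters listed above map onto constructor arguments of scikit-learn's `DecisionTreeClassifier`. This is a minimal sketch, assuming scikit-learn is installed; the iris dataset and the specific values (depth 3, 5 samples per leaf) are arbitrary choices for the example, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

tree = DecisionTreeClassifier(
    max_depth=3,          # maximum tree depth
    min_samples_leaf=5,   # minimum samples per leaf node
    criterion="gini",     # impurity criterion ("entropy" is another option)
    random_state=0,       # fixed seed so the example is reproducible
)
tree.fit(X, y)

depth = tree.get_depth()
n_leaves = tree.get_n_leaves()
print(depth, n_leaves)
```

Lowering `max_depth` or raising `min_samples_leaf` makes the tree simpler, which is the usual way to trade some training accuracy for better generalization.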

What are decision trees? A decision tree is a supervised learning algorithm that is mostly used for classification problems, although it works for both categorical and continuous dependent variables. In this algorithm, we split the population into two or more homogeneous sets, based on the most significant attributes (independent variables), so that the resulting groups are as distinct as possible. A decision tree is a flowchart-like tree structure, where each internal node (non-leaf node) denotes a test on an attribute, each branch represents an outcome of the test, and each leaf node (or terminal node) holds a value for the target variable. Common splitting criteria include Gini impurity, information gain (based on entropy), and chi-square.
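To make the splitting criteria mentioned above concrete, here is a minimal sketch of Gini impurity, entropy, and information gain for a single candidate split. The function names and the toy labels are assumptions for this example.

```python
import math
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions (0 means a pure node)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits (0 means a pure node)."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n) for count in Counter(labels).values())

def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

# Hypothetical node with a perfectly balanced binary target, and a split that separates it fully.
parent = ["yes"] * 4 + ["no"] * 4
left, right = ["yes"] * 4, ["no"] * 4

print(gini(parent), entropy(parent), information_gain(parent, left, right))
```

A tree-building algorithm evaluates many candidate splits in this way and keeps the one with the lowest impurity (or highest gain) at each node.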

Why is it required to split our data into three parts: train, validation, and test?
• The training set is used to fit the model, i.e. to train the model with the data.
• The validation set is then used to evaluate the model while fine-tuning hyperparameters, on data that was not used for fitting. This improves the generalization of the model.
• Finally, a test set that the model has never "seen" before should be used for the final evaluation. This allows for an unbiased evaluation of the model.
The evaluation should never be performed on the same data that is used for training. Otherwise the measured performance would not be representative.
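The three-way split described above can be sketched in a few lines. This is a minimal illustration; the function name, the 60/20/20 proportions, and the fixed seed are assumptions for the example.

```python
import random

def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle the rows once, then cut them into disjoint train/validation/test parts."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # fixed seed keeps the split reproducible
    n_test = int(len(rows) * test_frac)
    n_val = int(len(rows) * val_frac)
    test = rows[:n_test]
    val = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]
    return train, val, test

train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))
```

The model is fit on `train`, hyperparameters are compared on `val`, and `test` is touched only once, for the final evaluation.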

Can you explain how cross-validation works? Cross-validation is the process of separating your total training set into two subsets, a training set and a validation set, and evaluating your model on them to choose the hyperparameters. You repeat this process iteratively, selecting different training and validation sets each time, in order to reduce the bias you would have by selecting only one validation set.

What is K-fold cross-validation? K-fold cross-validation is a method of cross-validation where we select a hyperparameter k and divide the dataset into k parts. We take the 1st part as the validation set and the remaining k-1 parts as the training set. Then we take the 2nd part as the validation set and the remaining k-1 parts as the training set. In this way, each part is used as the validation set once, and the remaining k-1 parts are combined and used as the training set. Standard K-fold should not be used on time series data, because the model would be trained on observations that come later than the ones it is validated on.
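The K-fold procedure described above can be sketched as an index generator. This is a minimal illustration without shuffling; the function name is an assumption, and in practice the model training and scoring would go inside the loop.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs so that each sample is validated exactly once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

folds = list(k_fold_indices(10, 3))
for train_idx, val_idx in folds:
    print("train:", train_idx, "validate:", val_idx)
```

The final cross-validation score is usually the average of the k validation scores, one per fold.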

What is the bias-variance trade-off?
• Bias is the error introduced by approximating the true underlying function, which can be quite complex, by a simpler model. Variance is the model's sensitivity to changes in the training dataset.
• The bias-variance trade-off describes how the expected test error depends on both the bias and the variance. Both contribute to the test error and ideally should be as small as possible: ExpectedTestError = Variance + Bias² + IrreducibleError
• As model complexity increases, the bias decreases and the variance increases, which leads to overfitting. Conversely, simplifying the model decreases the variance but increases the bias, which leads to underfitting.
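Written out for squared-error loss, the decomposition above takes the following form. This is a sketch under the usual assumptions, none of which are stated in the original post: the data follow y = f(x) + ε with zero-mean noise of variance σ², f̂ is the fitted model, and the expectation is taken over training sets and noise.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{Bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{Variance}}
  + \underbrace{\sigma^2}_{\text{IrreducibleError}}
```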

What is sigmoid? What does it do? The sigmoid function is a type of activation function, more specifically a squashing function. It maps any real-valued input to an output between 0 and 1, which makes it useful for predicting probabilities. Sigmoid(x) = 1/(1+e^{-x})
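As a small illustration of the formula above, here is a minimal sketch of the sigmoid in plain Python. The two-branch form is one common way to avoid overflow for large negative inputs; it is an implementation choice for this example, not part of the definition.

```python
import math

def sigmoid(x):
    """Logistic sigmoid 1 / (1 + e^(-x)), computed in a numerically stable way."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # for negative x, exp(x) is small, so nothing overflows
    return z / (1.0 + z)

for value in (-5, 0, 5):
    print(value, sigmoid(value))
```

The output is 0.5 at x = 0 and approaches 0 or 1 as x becomes very negative or very positive.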

What is overfitting? Overfitting occurs when your model performs very well on the training set but can't generalize to the test set, because it has adapted too closely to the training data.
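An extreme case of this can be sketched with a 1-nearest-neighbor classifier on labels that are pure noise. This is a hypothetical toy setup chosen for the example: the model memorizes every training point, so training accuracy is perfect, while accuracy on new points should stay close to chance.

```python
import random

rng = random.Random(0)

def make_points(n):
    # Two random features and a random 0/1 label: there is nothing real to learn.
    return [((rng.random(), rng.random()), rng.randint(0, 1)) for _ in range(n)]

def squared_distance(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def predict_1nn(train, point):
    # Return the label of the closest training point (pure memorization).
    nearest = min(train, key=lambda row: squared_distance(row[0], point))
    return nearest[1]

def accuracy(train, data):
    return sum(predict_1nn(train, x) == y for x, y in data) / len(data)

train, test = make_points(200), make_points(200)
train_acc = accuracy(train, train)
test_acc = accuracy(train, test)
print("train accuracy:", train_acc, "test accuracy:", test_acc)
```

The gap between the two accuracies is the practical signature of overfitting.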

How do we evaluate classification models? Depending on the classification problem, we can use the following evaluation metrics: accuracy, precision, recall, F1 score, logistic loss (also known as cross-entropy loss), and the Jaccard similarity coefficient score.
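Several of the metrics listed above can be computed directly from the confusion-matrix counts. This is a minimal sketch for the binary case; the function name and the toy labels are assumptions for the example, and logistic loss is left out because it needs predicted probabilities rather than hard labels.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, F1 and Jaccard for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    tn = len(pairs) - tp - fp - fn                               # true negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    jaccard = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "jaccard": jaccard,
    }

y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Which metric matters most depends on the problem: for example, recall is usually prioritized when missing a positive case is costly, and precision when false alarms are costly.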

Data Science Interview questions.pdf (17.59 MB)

🎓 Amazing opportunity to start your career in Data Analytics & Data Science 🚀
👩‍💻 Who: students and graduates with a graduation year of 2025 or earlier (B.Tech/B.Sc/B.E/BCA/MCA/M.Tech)
📅 Date: 22nd June 2024
🕔 Time: 5PM - 7PM
💡 What: Compete in a Data Analytics Coding Contest. The top 3 performers get internship/job referrals from partner companies.
Apply Link: https://bit.ly/3z7pYMc
Don't miss out on this incredible opportunity! 🌟

[Compilation] 1000+ Data Science Interview Questions/Preparation Resources
Compilation created by Kaggle users
1. GIT interview questions for DS and SQL Interview questions
2. 50 ML questions
3. Four years on interview questions
4. Compilation of pandas interview questions
5. Difference between common ML algorithms
6. Scenario based Data questions
7. Top python interview questions
8. Internship questions for DS interns
9. Questions from DS- Netflix
10. India specific Data science interview questions
11. R interview questions
12. Explain a project in Data science
13. A great collection of cheatsheets, analyzed here
14. A collection of questions on Github here
15. Cheat Sheets for Machine Learning Interview Topics
16. Compiled list of 600+ Q&As for Data Science interview prep 🎉
17. Approaching almost any ML Problem, originally shared on Kaggle
18. A Basics refresher
19. A notebook
20. Companies and Data Science Interview questions Megathread
21. Data Scientist - Interview Question Bank
22. ML Interview questions
23. Machine Learning Interviews Book
https://www.kaggle.com/discussions/questions-and-answers/239533
Best Data Science & Machine Learning Resources: https://topmate.io/coding/914624
Credits: https://t.me/datasciencefun
Like if you need similar content 😄👍 Hope this helps you 😊

Are you looking to become a machine learning engineer? The algorithm brought you to the right place! 📌 I created a free and comprehensive roadmap. Let's go through this thread and explore what you need to know to become an expert machine learning engineer:

Math & Statistics
Just like most other data roles, machine learning engineering starts with strong foundations in math, specifically linear algebra, probability, and statistics. Here are the units you will need to focus on:
• Basic probability concepts and statistics
• Inferential statistics
• Regression analysis
• Experimental design and A/B testing
• Bayesian statistics
• Calculus
• Linear algebra

Python
You can choose Python, R, Julia, or any other language, but Python is the most versatile and flexible language for machine learning.
• Variables, data types, and basic operations
• Control flow statements (e.g., if-else, loops)
• Functions and modules
• Error handling and exceptions
• Basic data structures (e.g., lists, dictionaries, tuples)
• Object-oriented programming concepts
• Basic work with APIs
• Detailed data structures and algorithmic thinking

Machine Learning Prerequisites
• Exploratory Data Analysis (EDA) with NumPy and Pandas
• Basic data visualization techniques to visualize the variables and features
• Feature extraction
• Feature engineering
• Different types of encoding data

Machine Learning Fundamentals
Using the scikit-learn library in combination with other Python libraries for:
• Supervised Learning (Linear Regression, K-Nearest Neighbors, Decision Trees)
• Unsupervised Learning (K-Means Clustering, Principal Component Analysis, Hierarchical Clustering)
• Reinforcement Learning (Q-Learning, Deep Q Network, Policy Gradients)
Solving two types of problems:
• Regression
• Classification

Neural Networks
Neural networks are like computer brains that learn from examples, made up of layers of "neurons" that handle data. They learn without explicit instructions.
Types of Neural Networks:
• Feedforward Neural Networks: the simplest form, with straight connections and no loops.
• Convolutional Neural Networks (CNNs): great for images, learning visual patterns.
• Recurrent Neural Networks (RNNs): good for sequences like text or time series, because they remember past information.
In Python, TensorFlow, Keras, and PyTorch are common choices for deeper and more complex neural network systems.

Deep Learning
Deep learning is a subset of machine learning that uses multi-layer neural networks and can also learn from data that is unstructured or unlabeled.
• Convolutional Neural Networks (CNNs)
• Recurrent Neural Networks (RNNs)
• Long Short-Term Memory Networks (LSTMs)
• Generative Adversarial Networks (GANs)
• Autoencoders
• Deep Belief Networks (DBNs)
• Transformer Models

Machine Learning Project Deployment
Machine learning engineers should also be able to dive into MLOps and project deployment. Here are the things that you should be familiar with or skilled at:
• Version Control for Data and Models
• Automated Testing and Continuous Integration (CI)
• Continuous Delivery and Deployment (CD)
• Monitoring and Logging
• Experiment Tracking and Management
• Feature Stores
• Data Pipeline and Workflow Orchestration
• Infrastructure as Code (IaC)
• Model Serving and APIs

Best Data Science & Machine Learning Resources: https://topmate.io/coding/914624
Credits: https://t.me/datasciencefun
Like if you need similar content 😄👍 Hope this helps you 😊

🚀🎢 Welcome to the Crypto Rollercoaster! 🎢🚀 Get ready for the thrill of a lifetime with $TICKET tokens! 🌟
 - High Returns: Potential gains up to 386,900% per ride!
 - Low Trading Fee: Supporting the project, marketing, and the team.
🔥 Invest Now & Secure Your Ticket to Riches! 🔥
Buy $TICKET | Twitter | Telegram Channel
https://rollercoaster.finance

Ad 👇👇

Which of the following is not a machine learning type?
Anonymous voting

What is the difference between a random forest and a gradient boosting machine?
1. Random forest is an ensemble of decision trees while gradient boosting is a single decision tree
2. Random forest combines decision trees using boosting while gradient boosting combines decision trees using bagging
3. Random forest uses bagging while gradient boosting uses boosting
4. Random forest is used for regression while gradient boosting is used for classification
✅ Correct Response: 3
Explanation: Random forest is an ensemble of decision trees that combines the results of multiple decision trees using bagging. Gradient boosting is also an ensemble of decision trees, but it combines the results of multiple decision trees using boosting.
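The difference between bagging and boosting can be sketched with one-split regression trees (stumps) on a toy 1-D problem. This is a simplified illustration, not a full random forest or gradient boosting implementation: real random forests also subsample features and grow deeper trees, and all function names, the learning rate, and the data here are assumptions for the example.

```python
import random

def fit_stump(xs, ys):
    """Fit a one-split regression tree on 1-D inputs by minimizing squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so the right side is never empty
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        error = (sum((y - left_mean) ** 2 for y in left)
                 + sum((y - right_mean) ** 2 for y in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:  # all inputs identical: predict the mean everywhere
        mean = sum(ys) / len(ys)
        return lambda x: mean
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean

def bagging(xs, ys, n_trees=20, seed=0):
    """Bagging: independent stumps, each fit on a bootstrap sample, then averaged."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(xs)) for _ in xs]  # sample with replacement
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(stump(x) for stump in stumps) / len(stumps)

def boosting(xs, ys, n_trees=20, learning_rate=0.5):
    """Boosting (squared loss): each new stump is fit to the current residuals."""
    base = sum(ys) / len(ys)
    stumps = []

    def predict(x):
        return base + learning_rate * sum(stump(x) for stump in stumps)

    for _ in range(n_trees):
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return predict

def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs = [i / 10 for i in range(40)]
ys = [x * x for x in xs]
bagged, boosted = bagging(xs, ys), boosting(xs, ys)
print("bagging MSE:", mse(bagged, xs, ys), "boosting MSE:", mse(boosted, xs, ys))
```

In bagging the trees are trained independently and could be fit in parallel, while in boosting each tree depends on the errors of the ones before it, so training is inherently sequential.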

_DATA SCIENCE INTERVIEW _ ♦️♣️.pdf (9.76 KB)