Epython Lab
Welcome to Epython Lab, where you can get learning resources, one-on-one training in machine learning, business analytics, and Python, and solutions to business problems. Buy ads: https://telega.io/c/epythonlab
6 325 subscribers
Posts Archive
Researchers have released a huge dataset of 20 million #malware samples, together with metadata, labels, and features, to support research on machine-learning-based malware detection.
Learn more about SOREL-20M here: https://thehackernews.com/2020/12/sorel-20m-huge-dataset-of-20-million.html
Pro Python 3: Features and Tools for Professional Development, Third Edition
#pythonbooks @epythonlab
New Diamond Bid Recommendation with Linear Regression
https://github.com/epythonlab/Udacity-Bertelsmann-Projects/blob/master/new_diamond_price.ipynb
#Datascience #python #LinearRegression
#KeyNote #DataAnalysisMethodology #BusinessAnalytics
Types of data analysis methodology
Predictive
Predictive analytics uses existing data to predict a future outcome. For example, a company may use predictive analytics to forecast demand or whether a customer will respond to an advertising campaign.
Geospatial
This type of analysis uses location based data to help drive your conclusions. Some examples are:
Identifying customers by a geographic dimension such as zip code, state, or county, or
Calculating the distance between addresses and your stores, or
Creating a trade area based upon your customer locations for further analysis
Some types of Geospatial analysis require the use of special software - such as software that can convert an address to Latitude & Longitude, or can calculate the drive time between two geographic points on a map.
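The distance calculation mentioned above can be sketched with the haversine formula, which gives the straight-line (great-circle) distance between two latitude/longitude points. This is only a rough illustration: the store and customer coordinates are made-up values, the function name is hypothetical, and it does not estimate drive time, which needs routing software.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical store and customer coordinates (illustrative values only)
stores = {
    "store_a": (40.7128, -74.0060),
    "store_b": (34.0522, -118.2437),
}
customer = (41.8781, -87.6298)

# Pick the store with the smallest straight-line distance to the customer
nearest = min(stores, key=lambda name: haversine_km(*customer, *stores[name]))
```

Here `nearest` holds the name of the closest store; the same function could feed a simple trade-area rule such as "customers within N km of a store".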
Segmentation
Segmentation is the process of grouping data together. Groups can range from simple ones, such as customers who have purchased different items, to more complex segmentation techniques where you identify stores that are similar based upon the demographics of their customers.
Aggregation
This methodology simply means calculating a value across a group or dimension and is commonly used in data analysis. For example, you may want to aggregate sales data for a salesperson by month - adding all of the sales closed for each month. Then, you may want to aggregate across dimensions, such as sales by month per sales territory. In this scenario, you could calculate the sales per month for each salesperson, and then add the sales per month for all salespeople in each region.
Aggregation is often done in reporting to be able to "slice and dice" information to help managers make decisions and view performance.
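The two-level aggregation described above (sales per month per salesperson, then per region) might look like this in plain Python. The names, regions, and amounts are invented for illustration.

```python
from collections import defaultdict

# Hypothetical mapping of salesperson to sales territory
person_region = {"Abel": "North", "Sara": "North", "Lily": "South"}

# Hypothetical closed sales: (salesperson, month, amount)
sales = [
    ("Abel", "2021-01", 1200.0),
    ("Abel", "2021-01", 800.0),
    ("Abel", "2021-02", 500.0),
    ("Sara", "2021-01", 700.0),
    ("Lily", "2021-01", 900.0),
    ("Lily", "2021-02", 1100.0),
]

# Step 1: aggregate sales by salesperson and month
by_person_month = defaultdict(float)
for person, month, amount in sales:
    by_person_month[(person, month)] += amount

# Step 2: roll the per-person totals up to region and month
by_region_month = defaultdict(float)
for (person, month), total in by_person_month.items():
    by_region_month[(person_region[person], month)] += total
```

In practice a library such as pandas or a SQL `GROUP BY` would usually do this work; the loops just make the grouping explicit.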
Descriptive
Descriptive statistics provide simple summaries of a data sample. Examples could be calculating average GPA for applicants to a school, or calculating the batting average of a professional baseball player. In our electricity supply scenario, we could use descriptive statistics to calculate the average temperature per hour, per day, or per date.
Some of the commonly used descriptive statistics are Mean, Median, Mode, Standard Deviation, and Interquartile range.
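As a small sketch, the statistics listed above can be computed with Python's built-in `statistics` module. The temperature readings are made-up values, and the interquartile range here relies on the module's default quantile method, which can differ slightly from other tools.

```python
import statistics

# Hypothetical hourly temperature readings (illustrative values only)
temperatures = [18.5, 21.0, 19.5, 21.0, 24.5, 30.0, 21.0, 17.5]

mean = statistics.mean(temperatures)
median = statistics.median(temperatures)
mode = statistics.mode(temperatures)
stdev = statistics.stdev(temperatures)  # sample standard deviation

# Quartiles split the sorted data into four parts; IQR = Q3 - Q1
q1, q2, q3 = statistics.quantiles(temperatures, n=4)
iqr = q3 - q1

print(mean, median, mode, stdev, iqr)
```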
#KeyNote #BusinessAnalytics #DataScience
Cross Industry Standard Process for Data Mining (CRISP-DM)
"A data mining process model that describes commonly used approaches that data mining experts use to tackle problems... it was the leading methodology used by industry data miners." -Wikipedia
CRISP-DM Steps
1. Business Issue Understanding
2. Data Understanding
3. Data Preparation
4. Analysis/Modeling
5. Validation
6. Presentation/Visualization
#KeyNote #UnsupervisedMachineLearning #Clustering #k-means
Clustering is an unsupervised machine learning technique, and there are many clustering models available. Despite its simplicity, k-means is widely used for clustering in data science applications, and it is especially useful when you need to quickly discover insights from unlabeled data.
Some real-world applications of k-means:
- Customer segmentation
- Understand what the visitors of a website are trying to accomplish
- Pattern recognition
- Machine learning
- Data compression
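To make the idea concrete, here is a minimal k-means sketch in plain Python, assuming 2-D points and squared Euclidean distance. The function names, the sample points, and the fixed seed are illustrative choices; real projects would normally use a library implementation.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest_index(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, k, iterations=20, seed=0):
    """Return (centroids, labels) after at most `iterations` update rounds."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    for _ in range(iterations):
        # Assignment step: group each point with its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_index(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster
        new_centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster))
            if cluster else centroids[i]  # keep an empty cluster's centroid
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:  # nothing moved, so we have converged
            break
        centroids = new_centroids
    labels = [nearest_index(p, centroids) for p in points]
    return centroids, labels


# Hypothetical unlabeled data: two visibly separate groups
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
```

`labels` gives a cluster index for each point. K-means can settle on different groupings for different starting centroids, so results on less clearly separated data depend on the seed.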
String Manipulation in Python 3.
For beginners
#Subscribe to receive new topics.
https://youtu.be/6Ey9bQ-KJuk
String Operations in Python 3.
For beginners
#Subscribe to receive new topics.
https://www.youtube.com/watch?v=MKtAA4ZnmkQ
This is for absolute beginners. Expressions in Python.
In this lab you will learn:
What are the expressions in Python?
How to construct expressions in Python?
#Subscribe #Share
https://youtu.be/KOA3j2tbr4M
#KeyNote #DataScience #dataanalytics #modeltrain #futureprediction
In data analytics, we often use model development to help us predict future observations from the data we have.
A model helps us understand the relationship between different variables and how these variables are used to predict the result.
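As one simple illustration of model development, the sketch below fits a straight line by ordinary least squares and uses it to predict a new observation. The advertising and sales figures are made up, and `fit_line` is a hypothetical helper name, not a library function.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations: advertising spend vs. units sold
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
units_sold = [12.0, 19.0, 29.0, 37.0, 45.0]

slope, intercept = fit_line(ad_spend, units_sold)

# Use the fitted relationship to predict an unseen observation
predicted = slope * 6.0 + intercept
```

The slope describes how the target changes per unit of the input, which is the "relationship between variables" the model captures; the prediction is only as good as the assumption that the relationship is roughly linear.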
@epythonlab
Introducing MySQL Shell: Administration Made Easy with Python
Charles Bell (2019)
@epythonlab
QUESTION OF THE DAY
ARE DATA NORMALIZATION AND DATA STANDARDIZATION THE SAME?
EXPLAIN WITH AN EXAMPLE.
#DataScience #datatransformation #datacleansing #datapreprocessing
SEND YOUR ANSWER TO @PYDISCUSSION
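As a starting point for an answer, the sketch below follows one common convention: "normalization" means min-max scaling to the range 0 to 1, and "standardization" means z-score scaling to mean 0 and standard deviation 1. Terminology varies between sources, so treat these definitions as an assumption; the function names and sample ages are illustrative.

```python
import statistics


def min_max_normalize(values):
    """Rescale values to the range [0, 1]; assumes they are not all equal."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def z_score_standardize(values):
    """Rescale values to mean 0 and population standard deviation 1."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# Hypothetical feature column (illustrative values only)
ages = [20.0, 30.0, 40.0, 50.0, 60.0]

normalized = min_max_normalize(ages)      # bounded between 0 and 1
standardized = z_score_standardize(ages)  # centered on 0, not bounded
```

Under this convention the two transformations are not the same: the first fixes the range of the data, the second fixes its mean and spread.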
Operators in Python
Don't forget to subscribe to YouTube to receive more tutorials.
https://youtu.be/HhTdMVRNO6E
#Datascience #database #sql #python
SQL is one of the most common computer languages in use for working with data today. It is a standardized language for accessing and manipulating relational databases. While it is relatively limited compared to a general programming language such as Python, it is highly optimized for efficient retrieval and aggregation of data from database tables. Its broad support and use virtually guarantee that any professional data scientist or analyst will encounter SQL eventually. Furthermore, SQL is often the paradigm used to discuss the relational data model, which has implications that apply beyond SQL-compliant databases.
Relational data model
The relational data model for the most part corresponds with our intuitive notion of a table. Each row is a record (a tuple of the relation), usually representing some object, event, or idea. Each column corresponds to an attribute that characterizes the record. In order to reduce redundancy in a database, when creating a table we typically include the minimum number of attributes required to fully define a record. This (admittedly vague) guideline is formalized in the idea of database normalization.
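A small sketch of these ideas, using Python's built-in `sqlite3` module and an in-memory database. The table names, columns, and rows are invented for illustration: customer details are stored once in `customers`, and `orders` refers to them by key instead of repeating them, which is the kind of redundancy reduction normalization aims for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Each table has only the attributes needed to describe its own rows;
# orders point at customers through customer_id instead of copying names.
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    city        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    amount      REAL NOT NULL
);
INSERT INTO customers VALUES (1, 'Abel', 'Addis Ababa'), (2, 'Sara', 'Nairobi');
INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# Join the two tables and aggregate order amounts per customer
totals = conn.execute("""
    SELECT c.name, SUM(o.amount)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

conn.close()
```

`totals` is a list of `(name, total_amount)` rows, one per customer.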
#Database #SQL #Datascience #python #DB_API
Python code to connect to a database using #DB-API
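The code from the original post is not included here, so the following is only a sketch of the usual DB-API pattern (connect, get a cursor, execute, fetch, commit, close). It assumes SQLite through the built-in `sqlite3` module as a stand-in for whatever database the post targeted; the `users` table and the `fetch_users` helper are hypothetical.

```python
import sqlite3
from contextlib import closing


def fetch_users(db_path=":memory:"):
    """Connect, create a small table, insert rows, and query one of them."""
    # closing() makes sure the connection and cursor are closed afterwards
    with closing(sqlite3.connect(db_path)) as conn:
        with closing(conn.cursor()) as cur:
            cur.execute(
                "CREATE TABLE IF NOT EXISTS users "
                "(id INTEGER PRIMARY KEY, name TEXT)"
            )
            # Parameter placeholders keep values out of the SQL string
            cur.executemany(
                "INSERT INTO users (name) VALUES (?)",
                [("Abel",), ("Sara",)],
            )
            conn.commit()  # make the inserts permanent
            cur.execute("SELECT id, name FROM users WHERE name = ?", ("Sara",))
            return cur.fetchall()


rows = fetch_users()
```

Other DB-API drivers follow the same connect/cursor/execute flow, but their `connect()` arguments and placeholder style (for example `%s` instead of `?`) differ, so check the driver's documentation before adapting this.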
