Data Science, Machine Learning, and Artificial Intelligence are niche technologies that are evolving into 21st-century skills. Data Science is a field that deals with the processes and systems used to find insights in raw data. Machine Learning is a field of study that enables computers to learn without being explicitly programmed. Machines use data science techniques to learn from data. Data science focuses on data visualization and presentation, whereas machine learning focuses more on learning.

Initial Steps to Learn Data Science or ML

Before starting to learn Data Science or Machine Learning (ML), we must understand their foundations. A combination of programming, math, and statistics skills is needed to learn Data Science and ML; these are the 3 Pillars of Data Science & ML.

The roadmap is not straightforward. There are too many resources, and many of them are expensive. So in this article, we look at some of the first steps to learn Data Science or Machine Learning.

The 3 Pillars of Data Science and Machine Learning

Essential Python Programming Skills

This covers the following skills and libraries, which are particularly needed for data science:

  • Common data structures (data types, lists, dictionaries, sets, tuples), writing functions using logic and control flow, searching and sorting algorithms, object-oriented programming (OOP), and working with external libraries.
  • Writing Python scripts to extract, format, and store data into files or back into databases.
  • Scientific computing with NumPy: handling multi-dimensional arrays, indexing, slicing, transposing, and vectorized operations.
  • Manipulating data with Pandas: creating series and data frames, indexing into a data frame, merging data frames, mapping, and performing exploratory analysis.
  • Data visualization using Matplotlib: adding styles, colors, and markers to a plot and knowing when to use each plot type (line plots, bar plots, scatter plots, histograms, boxplots), plus Seaborn for more advanced plotting.
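
As a rough illustration of the NumPy and Pandas items in this list, here is a minimal sketch. The arrays, the customers and orders tables, and their column names are all made up for the example; they are not taken from any real dataset.

```python
import numpy as np
import pandas as pd

# NumPy: a small 2x3 array (made-up values) to show the basic operations.
a = np.array([[1, 2, 3],
              [4, 5, 6]])
first_row = a[0]          # indexing: picks the first row
last_two_cols = a[:, 1:]  # slicing: every row, columns 1 onward
transposed = a.T          # transposing: shape goes from (2, 3) to (3, 2)
doubled = a * 2           # vectorized operation: no explicit Python loop

# Pandas: two hypothetical data frames that share a "customer_id" key.
customers = pd.DataFrame({"customer_id": [1, 2, 3],
                          "name": ["Ana", "Ben", "Chen"]})
orders = pd.DataFrame({"customer_id": [1, 1, 3],
                       "amount": [20.0, 35.0, 15.0]})

# A left merge keeps every customer, including those without orders.
merged = customers.merge(orders, on="customer_id", how="left")

# A simple exploratory step: total order amount per customer.
totals = merged.groupby("name")["amount"].sum()
print(totals)
```

The left merge is a deliberate choice here: it keeps customers with no orders in the result, which is often what you want when exploring data for gaps.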

Essential Mathematics

ML is inherently data-driven: data is at the heart of machine learning. We can think of data as vectors, objects that follow arithmetic rules such as addition and scaling. Linear algebra is used to represent data, and calculus is used to train ML models.

  • Basic algebra: equations and functions (linear, exponential, logarithmic, and so on).
  • Linear algebra: vectors, the dot product, representing linear equations in matrix notation, and solving a linear regression problem using vectors and matrices.
  • Calculus: derivatives and limits, partial derivatives (to compute gradients), convexity of functions, local and global minima, gradient descent, and training an ML model.
  • Integrals: the area under a curve.
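
To make the link between these topics concrete, the sketch below fits a straight line to a handful of made-up points in two ways: with gradient descent on the mean squared error, and with NumPy's least-squares solver. The data, the function name gradient_descent, the learning rate, and the step count are all assumptions chosen for illustration.

```python
import numpy as np

# Made-up data that follows y = 2x + 1 exactly.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Matrix notation: a column of ones (intercept) next to x,
# so that the predictions are simply X @ w.
X = np.column_stack([np.ones_like(x), x])


def gradient_descent(X, y, learning_rate=0.05, steps=5000):
    """Minimize the mean squared error of X @ w against y."""
    w = np.zeros(X.shape[1])
    n = len(y)
    for _ in range(steps):
        residual = X @ w - y                     # prediction error
        gradient = (2.0 / n) * (X.T @ residual)  # partial derivatives w.r.t. w
        w -= learning_rate * gradient            # step against the gradient
    return w


w_gd = gradient_descent(X, y)

# The same problem solved directly with linear algebra (least squares).
w_exact, *_ = np.linalg.lstsq(X, y, rcond=None)

print("gradient descent:", w_gd)
print("least squares:   ", w_exact)
```

Both approaches should agree on an intercept close to 1 and a slope close to 2. The mean squared error is a convex function of the weights, which is why gradient descent reaches the global minimum here instead of getting stuck in a local one.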

Essential Statistics

Every organization today is striving to become data-driven, that is, to use its data in different ways for decision making. Descriptive statistics enables you to turn the observations in your data into insights that make sense.

It focuses on:

  • Estimates of location (mean, median etc.) and variability
  • Correlation and covariance
  • Random variables — discrete and continuous
  • Conditional probability and Bayesian statistics.
  • Data distributions: Gaussian, Binomial, Poisson, and Exponential.
  • Important theorems — Law of large numbers and Central limit theorem.
  • Inferential Statistics.
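
A few of these ideas can be tried out with Python's standard library alone. The sketch below computes estimates of location and variability for a made-up list of observations, then simulates the central limit theorem by averaging samples drawn from an exponential distribution. The data values, the seed, the sample size, and the number of repetitions are arbitrary choices for illustration.

```python
import random
import statistics

# Made-up observations, used only to illustrate the descriptive measures.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)      # estimate of location
median = statistics.median(data)  # estimate of location, robust to outliers
spread = statistics.pstdev(data)  # population standard deviation (variability)
print(mean, median, spread)

# Central limit theorem by simulation: the exponential distribution with
# rate 1 is skewed (mean 1, standard deviation 1), yet the means of repeated
# samples cluster around 1 with a spread close to 1 / sqrt(sample_size).
rng = random.Random(0)  # fixed seed so the run is repeatable
sample_size = 50
sample_means = [
    statistics.mean([rng.expovariate(1.0) for _ in range(sample_size)])
    for _ in range(2000)
]

mean_of_means = statistics.mean(sample_means)
spread_of_means = statistics.stdev(sample_means)
print(mean_of_means, spread_of_means, 1 / sample_size ** 0.5)
```

Plotting a histogram of sample_means would show the familiar bell shape even though the underlying distribution is far from Gaussian.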

It is evident that programming, math, and statistics skills are the 3 pillars, or foundational elements, for learning data science or ML. The checklists for each are discussed above in detail. However, how they are applied and programmed so that a machine learns from data has to be covered in depth in training courses.