Python has many machine learning libraries, but the most widely used are the following:
- NumPy: NumPy is a Python library for working with arrays. It also provides functions for linear algebra, Fourier transforms, and matrix operations. Usage:
pip install numpy
import numpy as np
- Scikit-learn: Scikit-learn is a Python module for machine learning built on top of SciPy and distributed under the 3-Clause BSD license. SciPy, in turn, provides a variety of scientific computing tools, including statistical distributions, linear algebra, and a variety of specialised mathematical functions. Usage:
pip install scikit-learn
import sklearn
It’s worth noting that Scikit-learn has many sub-modules and classes that provide specific machine learning algorithms and tools. You can import these components individually as needed. For example, if you want to use the DecisionTreeClassifier class, you can import it like this:
from sklearn.tree import DecisionTreeClassifier
- Pandas: pandas provides key data structures such as the DataFrame. It also supports reading and writing data in different formats. Usage:
pip install pandas
import pandas as pd
- Matplotlib: Matplotlib is a comprehensive library for creating static, animated, and interactive visualisations in Python. Usage:
pip install matplotlib
import matplotlib.pyplot as plt
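To give a feel for how the four libraries fit together, here is a minimal sketch. The data (study hours and a pass/fail label) is made up purely for illustration, and the variable names are hypothetical; it is only meant to show one typical way the imports above get used, not a recommended workflow.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeClassifier

# NumPy: build a small, made-up array of study hours.
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])

# pandas: put the data in a DataFrame with a hypothetical pass/fail label.
df = pd.DataFrame({"hours": hours, "passed": [0, 0, 0, 0, 1, 1, 1, 1]})

# scikit-learn: fit a decision tree on the single "hours" feature.
model = DecisionTreeClassifier(random_state=0)
model.fit(df[["hours"]], df["passed"])

# Predict the label for two new, made-up values.
new_data = pd.DataFrame({"hours": [2.5, 6.5]})
predictions = model.predict(new_data)
print(predictions)

# Matplotlib: scatter plot of the toy data.
fig, ax = plt.subplots()
ax.scatter(df["hours"], df["passed"])
ax.set_xlabel("hours")
ax.set_ylabel("passed")
```

To view the plot, call plt.show() at the end when running interactively, or save it with fig.savefig and a file name of your choice.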
In future posts, we’ll learn how to use these libraries to perform basic data science tasks.
Further reading:
Kaggle – Pandas
Happy Programming!