Exploring the Best Python Libraries for Machine Learning – 2024

Best Python Libraries for Machine Learning

In our previous tutorial we covered basic machine learning concepts. Now we will look at the best Python libraries for machine learning, which make it much easier to build models. So why use Python rather than another programming language? We have discussed this in detail elsewhere, but in short: “Python’s simplicity, rich ecosystem of libraries, community support, flexibility, performance, and industry adoption make it the preferred language for machine learning.” With Python and its associated libraries, developers can unlock the full potential of machine learning, democratizing access to artificial intelligence and driving innovation across diverse domains and industries. The list below covers the most useful libraries for machine learning.

 

  1. Scikit-learn
  2. TensorFlow
  3. PyTorch
  4. Keras
  5. Pandas
  6. NumPy
  7. Matplotlib
  8. Seaborn

Scikit-learn

Scikit-learn is one of the most widely used Python libraries for machine learning. It provides simple and efficient tools for data mining and data analysis, and supports a range of machine learning algorithms for classification, regression, clustering, and dimensionality reduction. Scikit-learn is known for its clean, consistent API and extensive documentation, making it an excellent choice for beginners and experts alike.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

iris = load_iris()
X = iris.data
y = iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

y_pred = knn.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
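The same fit/predict/transform pattern carries over to the other algorithm families mentioned above. As a minimal sketch (the choice of PCA with two components and KMeans with three clusters is an assumption made for illustration, not a recommendation), the following reduces the Iris features to two dimensions and then clusters them:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

X = load_iris().data  # 150 samples, 4 features

# Dimensionality reduction: project the 4 features onto 2 principal components
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

# Clustering: group the projected points into 3 clusters (illustrative choice)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X_2d)

print("Reduced shape:", X_2d.shape)
print("Cluster labels of first five samples:", labels[:5])
```

Note that every estimator here is created, fitted, and applied in the same way as the KNeighborsClassifier above, which is what makes swapping algorithms straightforward.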

TensorFlow

TensorFlow, developed by Google Brain, is an open-source machine learning framework that excels at building and training deep learning models. It offers a flexible architecture that allows developers to deploy models on a wide range of platforms, from desktops to mobile devices to distributed systems. TensorFlow’s high-level API, Keras, provides an intuitive interface for building neural networks, making it accessible to both beginners and experts.

import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten

# Load MNIST and scale pixel values to the range [0, 1]
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train, X_test = X_train / 255.0, X_test / 255.0

# A small fully connected network for 28x28 grayscale images
model = Sequential([
    Flatten(input_shape=(28, 28)),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(X_train, y_train, epochs=5, batch_size=32, validation_split=0.2)

test_loss, test_acc = model.evaluate(X_test, y_test)
print('Test accuracy:', test_acc)

PyTorch

PyTorch is another powerful deep learning framework that has gained popularity for its dynamic computation graph and seamless integration with Python. Developed by Facebook’s AI Research lab, PyTorch is known for its ease of use and flexibility, allowing researchers and developers to experiment with new ideas and algorithms easily. PyTorch’s strong community support and rich ecosystem of libraries make it a favorite among researchers and practitioners alike.


import torch
import torch.nn as nn
import torch.optim as optim

# Step 1: Prepare data
# Example data: y = 2x + 1
x_train = torch.tensor([[1.0], [2.0], [3.0], [4.0]])
y_train = torch.tensor([[3.0], [5.0], [7.0], [9.0]])

# Step 2: Define the model
class LinearRegression(nn.Module):
    def __init__(self):
        super(LinearRegression, self).__init__()
        self.linear = nn.Linear(1, 1)  # One input feature, one output

    def forward(self, x):
        return self.linear(x)

model = LinearRegression()

# Step 3: Define loss function and optimizer
criterion = nn.MSELoss() # Mean Squared Error loss
optimizer = optim.SGD(model.parameters(), lr=0.01) # Stochastic Gradient Descent optimizer

# Step 4: Train the model
num_epochs = 100
for epoch in range(num_epochs):
    # Forward pass
    outputs = model(x_train)
    loss = criterion(outputs, y_train)

    # Backward pass and optimization
    optimizer.zero_grad()  # Clear gradients
    loss.backward()  # Compute gradients
    optimizer.step()  # Update weights

    # Print progress
    if (epoch+1) % 10 == 0:
        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

# Step 5: Test the model
# Predict output for a new input
x_test = torch.tensor([[5.0]])
predicted = model(x_test)
print(f'Prediction after training: {predicted.item():.4f}')
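The "dynamic computation graph" mentioned above means the graph is recorded while ordinary Python code runs, so loops and `if` statements can shape the computation. A small sketch of the idea, using a hypothetical helper `repeated_square` invented here for illustration:

```python
import torch

def repeated_square(x, times):
    # The graph is recorded as this Python loop executes,
    # so its depth depends on the runtime value of `times`.
    for _ in range(times):
        x = x * x
    return x

x = torch.tensor(2.0, requires_grad=True)
y = repeated_square(x, times=2)  # y = x^4
y.backward()                     # dy/dx = 4 * x^3

print("y:", y.item())
print("dy/dx:", x.grad.item())
```

Calling the function with a different `times` value builds a different graph on the next run, with no separate compilation step.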

Keras

Keras is a high-level neural networks API written in Python. Older releases could run on top of TensorFlow, Theano, or Microsoft Cognitive Toolkit (CNTK); Keras 3 instead supports TensorFlow, JAX, and PyTorch as backends. It was designed with user-friendliness, modularity, and extensibility in mind, allowing for fast prototyping and experimentation with deep learning models. Keras provides a simple and consistent interface for building various types of neural networks, from simple feedforward networks to complex architectures like recurrent and convolutional neural networks.


import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Step 1: Prepare data
# Example data: y = 2x + 1
x_train = np.array([[1.0], [2.0], [3.0], [4.0]])  # Shape (4, 1): four samples, one feature
y_train = np.array([[3.0], [5.0], [7.0], [9.0]])

# Step 2: Define the model
model = Sequential()
model.add(Dense(units=1, input_shape=[1])) # One input feature, one output

# Step 3: Compile the model
model.compile(optimizer='sgd', loss='mean_squared_error') # Stochastic Gradient Descent optimizer, Mean Squared Error loss

# Step 4: Train the model
model.fit(x_train, y_train, epochs=100)

# Step 5: Test the model
# Predict output for a new input
x_test = np.array([[5.0]])  # Shape (1, 1): one sample, one feature
predicted = model.predict(x_test)
print(f'Prediction after training: {predicted[0][0]:.4f}')

Pandas

Pandas is a powerful data manipulation and analysis library that provides data structures and functions for working with structured data. It is built on top of NumPy and provides easy-to-use data structures like DataFrame and Series, which are ideal for cleaning, transforming, and analyzing data before feeding it into machine learning models. Pandas’ intuitive syntax and rich functionality make it an essential tool in the data scientist’s toolkit.


import pandas as pd

# Step 1: Create a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emily'],
        'Age': [25, 30, 35, 40, 45],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Boston']}
df = pd.DataFrame(data)

# Step 2: Display the DataFrame
print("DataFrame:")
print(df)
print()

# Step 3: Accessing data
print("Accessing data:")
print("First two rows:")
print(df.head(2)) # Display first two rows
print("Age column:")
print(df['Age']) # Display 'Age' column
print()

# Step 4: Adding a new column
df['Gender'] = ['Female', 'Male', 'Male', 'Male', 'Female']
print("DataFrame after adding 'Gender' column:")
print(df)
print()

# Step 5: Filtering data
print("Filtering data:")
print("People aged 30 and above:")
print(df[df['Age'] >= 30])
print()

# Step 6: Sorting data
print("Sorting data by age:")
print(df.sort_values(by='Age'))
print()

# Step 7: Grouping data
print("Grouping data by city and calculating average age:")
print(df.groupby('City')['Age'].mean())
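The cleaning and transforming steps mentioned above usually come before any model sees the data. Here is a minimal sketch, assuming a hypothetical `raw` DataFrame with one missing age and a categorical `City` column; filling with the mean and one-hot encoding are illustrative choices, not the only options:

```python
import numpy as np
import pandas as pd

# Hypothetical raw data with a missing age and a categorical column
raw = pd.DataFrame({
    'Age': [25, np.nan, 35, 40],
    'City': ['New York', 'Chicago', 'Chicago', 'Boston'],
})

# Cleaning: fill the missing age with the column mean
clean = raw.copy()
clean['Age'] = clean['Age'].fillna(clean['Age'].mean())

# Transforming: one-hot encode the categorical column
features = pd.get_dummies(clean, columns=['City'])

print(features)
```

The resulting `features` table contains only numeric or boolean columns, which is the form most machine learning libraries expect as input.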

NumPy

NumPy is the fundamental package for scientific computing in Python. It provides support for multidimensional arrays and matrices, along with a collection of mathematical functions to operate on these arrays efficiently. NumPy’s array-oriented computing capabilities make it indispensable for tasks like data manipulation, numerical computing, and linear algebra operations, which are common in machine learning workflows.


import numpy as np

# Step 1: Create an array
arr = np.array([3, 1, 4, 1, 5, 9, 2, 6])

# Step 2: Sort the array
sorted_arr = np.sort(arr)

# Step 3: Display the sorted array
print("Original array:", arr)
print("Sorted array:", sorted_arr)
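To illustrate the linear algebra side, here is a short sketch that fits a straight line by least squares. It reuses the y = 2x + 1 data from the PyTorch and Keras examples; the variable names are chosen here for illustration:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])

# Design matrix: a column of x values and a column of ones (for the intercept)
A = np.column_stack([x, np.ones_like(x)])

# Solve the least-squares problem A @ [slope, intercept] ≈ y
coeffs, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
slope, intercept = coeffs

# Vectorized prediction for several new inputs at once
x_new = np.array([5.0, 6.0])
y_new = slope * x_new + intercept

print("Slope:", slope, "Intercept:", intercept)
print("Predictions:", y_new)
```

Because the data lie exactly on a line, the recovered slope and intercept should be very close to 2 and 1, without any training loop.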

Matplotlib

Matplotlib is a versatile plotting library for Python that produces publication-quality figures in a variety of formats and interactive environments. It provides a MATLAB-like interface for creating static, animated, and interactive visualizations, making it suitable for exploring and presenting data during the model development and evaluation process. Matplotlib’s extensive gallery of examples and customization options make it a favorite among data scientists and researchers.


import matplotlib.pyplot as plt

# Step 1: Prepare data
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]

# Step 2: Create a plot
plt.plot(x, y)

# Step 3: Customize the plot
plt.title('Prime Numbers')
plt.xlabel('Index')
plt.ylabel('Value')

plt.legend(['Prime Numbers'], loc='upper left')
plt.tight_layout()

# Step 4: Show the plot
plt.show()
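Besides the MATLAB-like `plt` interface used above, Matplotlib also offers an object-oriented interface, and a figure can be written out in several formats. A minimal sketch, writing to in-memory buffers so that no files are created (in practice you would more likely pass a file name of your choice to `savefig`):

```python
import io
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]

# Object-oriented interface: explicit Figure and Axes objects
fig, ax = plt.subplots(figsize=(4, 3))
ax.plot(x, y, marker='o', label='Prime Numbers')
ax.set_xlabel('Index')
ax.set_ylabel('Value')
ax.legend(loc='upper left')
fig.tight_layout()

# Write the same figure in two formats to in-memory buffers
png_buffer = io.BytesIO()
svg_buffer = io.BytesIO()
fig.savefig(png_buffer, format='png', dpi=150)
fig.savefig(svg_buffer, format='svg')

print("PNG bytes:", png_buffer.getbuffer().nbytes)
print("SVG bytes:", svg_buffer.getbuffer().nbytes)
```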

For more details, visit the Matplotlib documentation.

Seaborn

Seaborn is a statistical data visualization library based on Matplotlib that provides a high-level interface for drawing attractive and informative statistical graphics. It builds on top of Matplotlib’s functionality and simplifies the process of creating complex visualizations like heatmaps, violin plots, and pair plots. Seaborn’s built-in themes and color palettes make it easy to produce aesthetically pleasing plots with minimal effort.


import seaborn as sns
import matplotlib.pyplot as plt

# Step 1: Load dataset
tips = sns.load_dataset("tips")

# Step 2: Create a histogram
sns.histplot(tips['total_bill'], bins=10, kde=True)

# Step 3: Customize the plot
plt.title('Distribution of Total Bill')
plt.xlabel('Total Bill ($)')
plt.ylabel('Frequency')

# Step 4: Show the plot
plt.show()
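Heatmaps, mentioned above, are a common way to inspect correlations between features. The sketch below uses synthetic data generated on the spot (an assumption for illustration, so that nothing needs to be downloaded); the column names only loosely mirror the tips dataset:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data: the tip is roughly 15% of the bill plus some noise
rng = np.random.default_rng(0)
total_bill = rng.uniform(10, 50, size=100)
df = pd.DataFrame({
    'total_bill': total_bill,
    'tip': 0.15 * total_bill + rng.normal(0, 1, size=100),
    'party_size': rng.integers(1, 6, size=100),
})

# Pairwise correlations between the numeric columns
corr = df.corr()

# Draw the correlation matrix as an annotated heatmap
ax = sns.heatmap(corr, annot=True, cmap='coolwarm', vmin=-1, vmax=1)
ax.set_title('Correlation Matrix')
plt.tight_layout()
plt.show()
```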

Datasciinsight
https://datasciinsight.com
